I think "Chat driven programming" is the most common type of the most hyped LLM-based programming I see on twitter that I just can't relate to. I've incorporated LLMs mainly as auto-complete and search; asking ChatGPT to write a quick script or to scaffold some code for which the documentation is too esoteric to parse.
But having the LLM do things for me, I frequently run into issues where it feels like I'm wasting my time with an intern. "Chat-based LLMs do best with exam-style questions" really speaks to me, however I find that constructing my prompts in such a way where the LLM does what I want uses just as much brainpower as just programming the thing my self.
I do find ChatGPT (o1 especially) really good at optimizing existing code.
There's an art to cost-effectively coaxing useful answers (useful drafts of code) from an LLM, and there's an art to noticing the most productive questions to put to that process. It's a totally different way of programming than having an LLM looking over your shoulder while you direct, function by function, type by type, the code you're designing.
If you feel like you're wasting your time, my bet is that you're either picking problems where there isn't enough value to negotiate with the LLM, or your expectations are too high. Crawshaw mentions this in his post: a lot of the value of this chat-driven style is that it very quickly gets you unstuck on a problem. Once you get to that point, you take over! You don't convince the LLM to build the final version you actually commit to your branch.
Generating unit test cases is a perfect example of where that cost/benefit can come out nicely, particularly test cases that reconcile against unsophisticated, brute-force, easily validated reference implementations of algorithms.
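To make the pattern concrete, here is a minimal sketch under some assumptions: the function names and the max-subarray example are hypothetical stand-ins, not from the comment above. The idea is that the LLM can cheaply draft the brute-force reference and the random-input harness, while the optimized implementation stays yours.

```python
# Hypothetical example of reconciling an optimized implementation against a
# brute-force reference; names and the problem choice are illustrative only.
import random

def max_subarray_brute(xs):
    """Obviously-correct O(n^2) reference: try every non-empty contiguous slice."""
    return max(sum(xs[i:j]) for i in range(len(xs)) for j in range(i + 1, len(xs) + 1))

def max_subarray_fast(xs):
    """Kadane's algorithm, the optimized implementation under test."""
    best = cur = xs[0]
    for x in xs[1:]:
        cur = max(x, cur + x)
        best = max(best, cur)
    return best

def test_against_reference():
    random.seed(0)
    for _ in range(1000):
        xs = [random.randint(-10, 10) for _ in range(random.randint(1, 20))]
        assert max_subarray_fast(xs) == max_subarray_brute(xs), xs
```

The reference is slow and dumb on purpose: it only needs to be easy to validate by eye, and the random inputs do the work of finding disagreements.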
My technique is to feed it a series of intro questions that prepare it for the final task. Chat the thing into a proper comfort level, and then, with that context at hand, ask it to help solve the real problem. It definitely feels like a new kind of programming model, because it's still very programming-esque.
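If you drive this through an API rather than the web UI, the warm-up amounts to accumulating turns in the message history before asking the real question. A hedged sketch using the OpenAI Python SDK; the model name, system prompt, and intro questions are assumptions chosen for illustration:

```python
# Illustrative sketch of "warming up" a chat with intro questions before the
# real task; the specific prompts and model are placeholders, not prescriptive.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

messages = [{"role": "system", "content": "You are a careful Python programmer."}]

def ask(prompt):
    """Append a user turn, get the assistant's reply, and keep both in context."""
    messages.append({"role": "user", "content": prompt})
    reply = client.chat.completions.create(model="gpt-4o", messages=messages)
    answer = reply.choices[0].message.content
    messages.append({"role": "assistant", "content": answer})
    return answer

# Intro questions that build shared context.
ask("Summarize how Python's heapq module represents a priority queue.")
ask("What edge cases come up when items can have equal priorities?")

# The real problem, asked once the transcript already contains the groundwork.
print(ask("Given that, sketch a small scheduler that pops the highest-priority job."))
```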
> "Chat-based LLMs do best with exam-style questions" really speaks to me, however I find that constructing my prompts in such a way where the LLM does what I want uses just as much brainpower as just programming the thing my self.
It speaks to me too, because my mechanical writing style (as opposed to creative prose) could best be described as what I learned in high school AP English/Literature and the rest of the California education system. For whatever reason, that writing style dominated the training data, and LLMs just happen to be easy for me to use because I came out of the same education system as many of the people working at OpenAI/Anthropic.
I’ve had to stop using several generic turns of phrase, like “in conclusion”, because they made my writing look too much like ChatGPT.
I’ve found that everything just works (more or less) since switching to Cursor. Agent-based Composer mode is magical. Just give it a few files for context and ask it to do what you want.
It's interesting that you find it useful for optimization. I've found that they're barely capable of anything more than shallow optimization in my stuff without significant direction.
What I find useful is that I can keep thinking at one abstraction level without hopping back and forth between algorithm and codegen. The chat is also a written artifact I can use the faster language parts of my brain on instead of the slower abstract thought parts.