The vibe coding maximalist position can be stated in information-theoretic terms: That there exists a decoder that can decode the space of useful programs from a much smaller prompt space.
The compression ratio is the vibe coding gain.
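As a toy way of making that "gain" concrete (a sketch of my own; I use zlib-compressed length as a crude stand-in for information content, and both strings are made up):

```python
import zlib

# Hypothetical example: a short prompt, and the much longer program
# a capable decoder might produce from it.
prompt = "A stack class with O(1) push/pop, backed by a linked list, in Python."
program = """
class Node:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

class Stack:
    def __init__(self):
        self._head = None

    def push(self, value):
        self._head = Node(value, self._head)

    def pop(self):
        if self._head is None:
            raise IndexError("pop from empty stack")
        value = self._head.value
        self._head = self._head.next
        return value
"""

def info(s: str) -> int:
    # Compressed length as a rough proxy for information content.
    return len(zlib.compress(s.encode()))

gain = info(program) / info(prompt)
print(f"vibe coding gain ~ {gain:.1f}x")
```

The ratio is obviously a caricature of Kolmogorov complexity, but it shows the shape of the claim: the decoder turns few prompt bits into many program bits.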
I think that way of phrasing it makes it easier to think about boundaries of vibe coding.
"A class that represents (A) concept, using the (B) data structure and (C) algorithms for methods (D), in programming language (E)."
That's decodable, at least down to a narrow enough distribution.
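For instance, filling in the slots (my own toy instantiation: A = a LIFO stack, B = a Python list, C = append/pop from the end, D = push/pop/peek, E = Python), the target distribution is narrow enough that most decodings look like:

```python
class Stack:
    """A LIFO stack (concept A) backed by a Python list (B)."""

    def __init__(self):
        self._items = []

    def push(self, item):
        # Method D1: O(1) amortized append at the end (algorithm C).
        self._items.append(item)

    def pop(self):
        # Method D2: O(1) removal from the end (algorithm C).
        if not self._items:
            raise IndexError("pop from empty stack")
        return self._items.pop()

    def peek(self):
        # Method D3: read the top without removing it.
        if not self._items:
            raise IndexError("peek at empty stack")
        return self._items[-1]

s = Stack()
s.push(1)
s.push(2)
assert s.pop() == 2
assert s.peek() == 1
```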
"A commercially successful team communication app built around the concept of channels, like in IRC."
Without already knowing Slack, that's not decodable.
Thinking about what is missing is very helpful. Obviously: the business/strategic positioning, non-technical stakeholder inputs, UX design.
But I think it goes beyond that: In sufficiently complex apps, even purely technical "software engineering" decisions are to some degree learnt from experiment.
This also makes it clearer how to use AI coding effectively:
* Prompt in increments of components that can be encoded in a short prompt.
* If possible, add pre-existing information to the prompt (documentation, prior attempts at implementation).
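A sketch of what those two points look like mechanically (the file names and helper are hypothetical; the point is the shape of the prompt, not the API):

```python
from pathlib import Path

def build_prompt(component_spec: str, context_files: list[str]) -> str:
    """Assemble an incremental prompt: one small component, plus any
    pre-existing information (docs, prior attempts) that narrows the
    decoder's target distribution."""
    parts = []
    for name in context_files:
        path = Path(name)
        if path.exists():  # silently skip context that isn't there
            parts.append(f"--- {name} ---\n{path.read_text()}")
    parts.append(f"Task: {component_spec}")
    return "\n\n".join(parts)

# Hypothetical usage: one component at a time, prior context attached.
prompt = build_prompt(
    "Add a retry wrapper around the HTTP client in client.py.",
    ["docs/http-client.md", "client.py"],
)
```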
"Informally, from the point of view of algorithmic information theory, the information content of a string is equivalent to the length of the most-compressed possible self-contained representation of that string. A self-contained representation is essentially a program—in some fixed but otherwise irrelevant universal programming language—that, when run, outputs the original string."
Where it gets tricky is the "self-contained" bit. It's only true with the model weights as a code book, e.g. to allow the LLM to "know about" Slack.
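In Kolmogorov-complexity terms (my own notation for the point above): the relevant quantity is not the plain complexity of the program $x$, but its complexity conditional on the model weights $W$, with $U$ a fixed universal machine:

```latex
K(x) = \min \{\, |p| : U(p) = x \,\}
\qquad\text{vs.}\qquad
K(x \mid W) = \min \{\, |p| : U(p, W) = x \,\}
```

The vibe coding gain is then roughly $K(x) - K(x \mid W)$: a prompt can be short exactly when the weights already carry most of the bits (e.g. everything the model "knows about" Slack).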
> That there exists a decoder that can decode the space of useful programs from a much smaller prompt space.
I love this. I've been circling this idea for a while and you put into words what I've struggled to describe.
> "A commercially successful team communication app built around the concept of channels, like in IRC."
> Without already knowing Slack, that's not decodable.
I would like to suggest that implicit shared context matters here. Or rather, humans tend to assume more shared context than LLMs actually have, and that misleads us when it comes to assessing the aforementioned decoder.
But I think it also suggests that there is a system that could be built with strong constraints and saliency that could really explode the compression ratio of vibe coding.
It's not necessarily just the terseness. Terseness might be a selling point for people who have already invested in training themselves to be fluent with programming languages and the associated ecosystem of tooling.
But there is an entire cohort of people who can think about specifying systems but lack the training to do so using the current methods, and who see a lower barrier to entry in natural language.
That doesn't mean the LLM is going to think on your behalf (although there is also a little bit of that involved, and that's where stuff gets confusing), but it surely provides a completely different interface for turning your ideas into working machinery.
"[T]here is an entire cohort of people who can think about specifying systems but lack the training to do so using the current methods and see a lower barrier to entry in natural language."
"Specifying" is the load-bearing term there. They are describing what they want to some degree, but how specifically?
> But there is an entire cohort of people who can think about specifying systems but lack the training to do so using the current methods
Nah, it would be extremely surprising if even one such person exists.
On the other hand, there are lots of people that can write code, but still can't specify a system. In fact, if you keep increasing the size of the system, you will eventually fit every single programmer in that category.
Could you say more on how the tasks where it works vs. doesn't work differ? Just the fact that it's both small and greenfield in the one case and presumably neither in the other?
I'd be curious to hear your thoughts on how the "fixer", who sounds rather ineffective as an executive, came into this position, in what sounds like overall a rather effective organization.
I've been personally thinking quite a bit about what makes organizations work or not work recently, and your story is quite interesting to me as a glimpse into a kind of organization that I've never seen from the inside myself.
This is a good question. I do want to point out that these are somewhat hazy memories from years ago, so take everything with a grain of salt (as usual). Also, a lot of this is going to sound like nepotism, which it most likely was, but it's hearsay from other people.
My understanding of how the "fixer" came into their position is a somewhat circuitous route. From what I heard (not from the "fixer" directly, but from people who spent far more time with them than I did), the "fixer" had spent about a decade out of the workforce prior to joining Tesla, raising kids while also dealing with aging parents. We'll just call this time the "fixer"'s work hiatus.
Prior to the hiatus, the "fixer" had moved into a small-team managerial role at a large, name-brand tech company during the late 90s/early 2000s. At the end of the hiatus, they leveraged some connections and somehow attained a director position at Tesla managing a team of about 30-40 people straight out of the hiatus.
From my understanding, the first team the "fixer" managed at Tesla didn't like working for them, and after about 18 months the team basically forced the "fixer" out. I'm not exactly sure what the team did to push the person out, but from what I heard, work basically ground to a halt for the entire team because they refused to work for the "fixer".
This was around the same time that the two projects I mentioned went sideways, so the director I reported to was on the outs and the director's manager (a VP) was looking for someone who could step into the role. The VP somehow connected with the "fixer" and they worked out a deal where the "fixer" would lead the team on a 3-month probation period while the VP continued to look for someone else, which also gave the "fixer" a chance to earn the role.
(Side note: One other bit of context I want to provide is that the team I was on was about 50-60 or so people at this time right before the "fixer" came on. The "fixer" also did not have any sort of technical background and this team consisted of probably ~90% software professionals in some capacity. A lot of the conversations were very technical in nature, and the "fixer" did A LOT of delegating and "just tell me what decision you'd make and we'll do that" leadership.)
During this probation period, I thought the "fixer" actually did a good job getting the lay of the land and the social dynamics at play, and helped work out some inefficiencies. However, a lot of this improvement was done by bringing in consultants to do the deep dive, discover problems, and provide guidance to the "fixer" on how to address them.
Once the probation period was over, the consultants left and the "fixer" was in charge. Pretty quickly, the firings began, and over the course of the next 5-6 months more than 70% of the team under the "fixer" was replaced. At the same time, the team I was working on merged with another team, and the team size under the "fixer" shot up to about 100-120 people post-merge (I forget the exact number). The "fixer" also hired quite a few more people, thinking more people would get the same projects done faster.
To say the least, it was a pretty chaotic time because the entire team was under a lot of pressure with in-flight projects, not knowing if they were going to randomly be fired or not, new people to mentor/gel with, and lots of random projects being thrown at us.
About 6 months after I left, the "fixer" was fired and someone else with extensive experience was brought in to right the ship. From what I heard from people who were still working there about a year after the "fixer" left, the new person was very successful and had done a good job leading the team. Also, the person I found as my replacement stayed nearly 7 years at Tesla, so I guess I did a good job with that one.
I use VS Code in a beefy Codespace, with GitHub Copilot (Opus 4.5).
I have a single instruction file telling the AI to always run "lake build ./lean-file.lean" to get feedback.
This is very similar to how I worked with Lean a year ago (of course in a much simpler domain) - mostly manual editing, sometimes accepting an inline completion or next edit suggestion.
However, with agentic AI that can run lean via CLI my workflow changed completely and I rarely write full proofs anymore (only intermediate lemma statements or very high level calc statements).
There have been bugs in Lean that allowed people to prove False, from which you can prove anything (they have been fixed).
Otherwise, if you check that no custom axiom has been used (via "#print axioms"), the proof is valid.
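A minimal sketch of that check in Lean 4 (my own toy lemma; Nat.add_comm is a core library theorem):

```lean
-- A small proof, then ask the kernel which axioms it depends on.
theorem my_add_comm (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b

#print axioms my_add_comm
-- A clean result lists no custom axioms; the standard ones
-- (propext, Quot.sound, Classical.choice) are fine.
```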
It's easy to construct such an example: prove that for all a, b, c, and n between 3 and 10^5, a^n = b^n + c^n has no solution.
The unmeaningful proof would enumerate all ~10^20 cases and prove them individually. The meaningful (and probably even shorter) proof would derive this from Fermat's Last Theorem after proving that one.
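A sketch of how the bounded statement might be written (my own formalization attempt; sorry marks the unproved body, which is exactly what "#print axioms" would expose as sorryAx):

```lean
-- The bounded claim: no solutions with a, b, c, n all in [3, 10^5]
-- (reading "for all a, b, c and n between 3 and 10^5" as bounding all
-- four variables, which is what makes the ~10^20-case enumeration finite).
theorem bounded_fermat :
    ∀ a b c n : Nat,
      3 ≤ a → a ≤ 10 ^ 5 →
      3 ≤ b → b ≤ 10 ^ 5 →
      3 ≤ c → c ≤ 10 ^ 5 →
      3 ≤ n → n ≤ 10 ^ 5 →
      a ^ n ≠ b ^ n + c ^ n := by
  sorry

#print axioms bounded_fermat  -- reports sorryAx until a real proof lands
```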
My understanding is that all recent gains are from post training and no one (publicly) knows how much scaling pretraining will still help at this point.
Happy to learn more about this if anyone has more information.
I still remember gemini 1.5 ultra and gpt 4.5 as extremely strong in some areas that no benchmark captures. It was probably not economical to serve them on a $20 subscription, but they felt different, and smarter in some ways. The benchmarks seem to be missing something, because flash 3 was very close to 3 pro on some benchmarks, but much, much dumber.