I don't think there's much recursive improvement yet.
I'd say it's a combination of:
A) Before, new model releases were mostly a new base model trained from scratch, with more parameters and more tokens, which takes many months. Now that RL is used so heavily, you can make endless tweaks to the RL setup and, in just a month, get a better model from the same base model.
B) There's more compute online.
C) Competition is fiercer.
It's a wiki. Maybe you lose the edit history and stuff like that, but the actual content, which is what matters, should be very easy to recreate from those sources.
The agents are not that good on their own yet, but with human supervision they're already there.
I've forked a couple of npm packages and have agents implement the changes I want, plus keep them in sync with upstream. Without agents I wouldn't have done that, because it's too much of a hassle.
The 'global view' doc should live in DESIGN.md so that humans know to look for it there, and AGENTS.md should point to it. The same goes for other concerns. Unless something really is solely of interest to robots, it shouldn't live directly in AGENTS.md, AIUI.
You can't possibly cram everything into AGENTS.md. Also, LLMs still don't give equal weight to everything in their context, i.e. they still ignore instructions.
Perhaps I’m not using the latest and greatest in terms of models. I tend to avoid using tools that require excessive customization like this.
I find it infinitely frustrating trying to make these piece-of-shit “agents” do basic things like running the unit/integration tests after making changes.
That’s not what Claude and Codex put there when you ask them to init it. Also, the global view is most definitely bigger than their tiny, lorem-ipsum-on-steroids context, so what do you do then?
You know you can put anything there, not just what they init, right? And you can reference other doc files.
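As a minimal sketch of what that can look like (every file name and command here is hypothetical, not something Claude or Codex generates by default), an AGENTS.md can be little more than a pointer file plus house rules:

    <!-- Illustrative sketch: all paths and commands below are hypothetical. -->
    # AGENTS.md

    Read the linked docs before changing anything.

    - Global view / architecture: DESIGN.md
    - API conventions: docs/api-conventions.md

    House rules:
    - After every change, run the unit and integration tests: `npm test`
    - If a task needs context that isn't linked here, ask before guessing.

An explicit, always-loaded rule like the test-running one tends to work better than asking in the moment, though as noted above, instructions still get ignored sometimes.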
I should probably stop commenting on AI posts, because when I try to help others get the most out of agents I usually just get downvoted, like now. People want to hate on AI, not learn how to use it.
Gen AI for art was different because it would just output a final image with basically zero control for the artist. It's as if AI programming output a binary instead of source code.
Programming languages need to give the developer a way to iterate (map, fold, for-loop, whatever) over a collection of items. Over time we've come up with more elegant ways of doing this, but as a programmer, until LLMs, you've still had to be actively involved in the control logic. My point is that a developer's relationship with the code is very different now, in a way that wasn't true with previous low-to-high level language climbs.
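To make that concrete, here is the same transformation at two of those abstraction levels, in TypeScript (a generic illustration, not tied to any real codebase):

    // Explicit control logic: the programmer manages the loop and the accumulator.
    const prices: number[] = [9.99, 4.5, 12.0];
    const withTaxLoop: number[] = [];
    for (let i = 0; i < prices.length; i++) {
      withTaxLoop.push(prices[i] * 1.08);
    }

    // Higher-level iteration: map owns the control flow, you supply the per-item intent.
    const withTaxMap = prices.map((p) => p * 1.08);

Each step up the ladder hands more of the "how" to the language, but the programmer still writes the line. With an agent, even that line is described in prose and generated, which is the change in relationship.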
If you can't deliver features faster with AI assistance then you're either using it wrong or working on very specialized software that AI can't handle yet.
I've built a SaaS (with paying customers) in a month that would easily have taken me 6 months to build at this level of quality and features. AI wrote, I'd say, 99.9% of the code. Without AI I wouldn't even have attempted it, because it would have been too large a task.
In addition, for my old product, which is 5+ years old, AI now writes 95%+ of the code for me. The programming itself takes a small percentage of my time, freeing me up for other tasks.
Quality is better from both a user and a code perspective.
From a user perspective, I often implement a feature and then just throw it away without worry, because I can reimplement it in an hour based on my findings. No sunk cost. I can also implement very small details that I'd otherwise have to backlog. This leads to a higher-quality product for the user.
From a code standpoint, I frequently do large refactors that would never have been worth it by hand, and I have a level of test coverage that would be infeasible for a one-man show.
It's boring, glorified CRUD for SMBs in a certain industry, focused on compliance and workflows specific to my country. Think your typical inventory, ticketing, and CRM, plus industry-specific features.
Boring from a programming standpoint, but it helps businesses, so they pay for it.
People in NYC use it because the alternatives are either slower or much more expensive. I'm sure they'd rather use a Waymo if it were as fast and cheap as the subway.
Using Lyft, Uber, or Waymo in San Francisco is slow, especially during peak times. Compared to going across town by train in NYC, covering the same distance by car in SF takes 5-10 times as long. If you have to cross a bridge or tunnel, it's even longer during peak times.
That's the whole problem. Car transportation simply doesn't scale, so there will never be an option to use Waymo that's as fast and cheap as the subway. It's worth calling out that an efficient train system is vital to keeping car traffic moving quickly, because once everyone is in a car, it's gridlock.
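Back-of-envelope, using rough, commonly cited transit-planning figures (ballpark assumptions, not measurements):

    One freeway lane:  ~2,000 vehicles/hour x ~1.5 occupants     ≈  3,000 people/hour
    One subway track:  ~30 trains/hour    x ~1,000 riders/train  ≈ 30,000 people/hour

That's roughly an order of magnitude per lane versus per track, which is why a fleet of Waymos hits gridlock long before it matches a train.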