I didn't dig into what the actual repository was doing, but personally, I took some inspiration from the idea after reading about it and realizing that I might have been underestimating the ability of LLMs. I put a bit more work into a performance harness I was using locally and set some agents to brainstorming, and they did seem to find some great stuff. So I don't really have a stance one way or another on this specific repo, but the general idea seems like a really good one.
Could you elaborate specifically on how you had been underestimating models? Do you mean just using a tighter harness to make them work in a structured, agentic way, or something else?
For the specific code I was working on, I had a general idea of the sort of performance improvement that would be possible. I just thought that it would be too hard for the models to figure out without a lot of hand-holding.
But it ended up being not "too hard ever", but more like, in 1 out of every 5 tries, the model did in fact manage to get a large refactoring to the point where it improved performance. So I set it up to try something, run the perf test, see if it worked, and if not, throw it away and repeat. Then it started, slowly, finding some useful things.
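For illustration, a try/measure/keep-or-discard loop like that could be sketched roughly as below. This is not the actual harness: `keep_if_faster`, `propose`, `measure`, and `min_gain` are all hypothetical names, with `propose` standing in for "ask an agent for a refactoring" and `measure` standing in for "run the perf test".

```python
from typing import Callable, Optional, Tuple, TypeVar

T = TypeVar("T")  # whatever represents one version of the code (a patch, a branch, ...)


def keep_if_faster(
    baseline: T,
    propose: Callable[[T], T],
    measure: Callable[[T], Optional[float]],
    attempts: int = 20,
    min_gain: float = 0.02,
) -> Tuple[T, float]:
    """Repeatedly propose a change, measure it, and keep it only if it is faster.

    `measure` returns a runtime in seconds, or None when the candidate fails
    to build or fails its correctness checks (assumed behavior of the
    hypothetical perf test). It should use fixed inputs and seeds so a
    candidate cannot "win" by changing them.
    """
    best = baseline
    best_time = measure(baseline)
    if best_time is None:
        raise ValueError("the baseline itself must pass the perf test")

    for _ in range(attempts):
        candidate = propose(best)
        candidate_time = measure(candidate)
        if candidate_time is None:
            continue  # broken candidate: throw it away
        # Require a margin so measurement noise is not mistaken for a win.
        if candidate_time < best_time * (1.0 - min_gain):
            best, best_time = candidate, candidate_time
        # Otherwise the candidate is discarded and the next attempt
        # starts again from the current best.
    return best, best_time


if __name__ == "__main__":
    # Toy stand-ins: a "version" is just its runtime, and only every
    # fifth proposal is actually an improvement.
    calls = {"n": 0}

    def toy_propose(current: float) -> float:
        calls["n"] += 1
        return current * 0.9 if calls["n"] % 5 == 0 else current * 1.1

    print(keep_if_faster(100.0, toy_propose, lambda version: version, attempts=10))
```

Most attempts get thrown away, which is fine: the loop only has to succeed occasionally, since each kept change becomes the starting point for the next attempt.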
Just remember that they will do clever but useless things to improve the score, like changing the random seed, as in autoresearch's hero image. lol! IMO, out-of-the-box thinking is needed.