
The main thing that was dumbing me down (and burning me out) was having to babysit LLMs on anything except basic tasks if I care about code quality/structure/maintainability.

I love coding, it always felt like Legos for adults. Not that Legos aren't also Legos for adults.

But there's no fighting the fact that we won't be writing 99% of the code anymore, so I take pleasure in crafting the specs and requirements clearly. That's where I put the effort.

And then to avoid having to babysit the agents to get them to stick to the plan, I built a super robust external orchestrator that forces multiple review and fix rounds until I get the result I want.

I'll also be fully open sourcing that soon: https://engine.build


Coinbase, Cloudflare, Cisco.

Another round of layoffs at CrowdStrike would fit the pattern nicely.


Meta, MS (soft), Google.

I immediately closed the page as soon as I saw that pop up at the very top.

I've never enjoyed AWS more than with LLMs managing infra as code through sst.dev

I'm working on https://engine.build

It's a durable orchestration system for AI code generation. It solves the problem of not being able to trust LLMs to complete long-running, high-quality implementations unless you babysit them and monitor the process, which I think is the most exhausting part of coding with AI.

You start with a spec or programmatic task list and the engine runs the whole workflow: implementation, verification, review, fixes, and finalization.

It treats agentic coding like a durable CI-style process, with state, retries, reviewer feedback, commits, and auditability built in. It's externally orchestrated, meaning the agent isn't the one running the loop. Agents are simply used as tools, spawned in the loop as needed without any awareness of the loop itself.
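To make the "durable" part concrete, here is a minimal sketch and not the engine's actual code. The phase names come from the description above; `run_workflow`, `run_phase`, and the JSON state file are hypothetical names chosen for illustration. The idea is that progress is checkpointed outside the agent, so a crashed run resumes at the phase that failed instead of starting over.

```python
import json
from pathlib import Path

# Phase names are taken from the description above; the rest is hypothetical.
PHASES = ["implementation", "verification", "review", "fixes", "finalization"]


def run_workflow(state_path, run_phase):
    """Run each phase in order, persisting progress so a crashed run can resume."""
    path = Path(state_path)
    # Load the checkpoint from a previous run, or start fresh.
    state = json.loads(path.read_text()) if path.exists() else {"done": []}
    for phase in PHASES:
        if phase in state["done"]:
            continue  # already completed in an earlier run, skip on resume
        # The agent is invoked here as a tool; it has no knowledge of the loop.
        run_phase(phase)
        state["done"].append(phase)
        path.write_text(json.dumps(state))  # checkpoint after every phase
    return state["done"]
```

If `run_phase` raises partway through, calling `run_workflow` again with the same state file picks up at the failed phase.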

It's going to be open sourced soon, and it's not meant to replace your IDE or agentic harness of choice. You keep using Codex / Claude Code / OpenCode / Cursor / Pi, whatever you want, and simply delegate the actual implementation to the engine through MCP, the CLI, and other integration points.

It supports any LLM provider, so you can have, for example, GPT 5.5 implementing and a mix of Opus 4.7 / DeepSeek V4 Pro / GPT 5.5 reviewing at every phase.

Sign up on the website or follow us on https://x.com/enginedotbuild or me personally on https://x.com/aljosa (desperately need more followers :D)


Sounds great, matches my philosophies and approaches I've been wanting to follow. Signed up, gave you a follow on twitter and am curious about the open source angle!

Thank you!

I've got a lot to say on the topic but I'll be making a video for launch showcasing everything.

For the open source angle, I think it's just a net positive for more people to have access to a way to build with LLMs without being exposed to AI related burnout.

And for open-source projects using it, the engine can act as a quality gate for PRs by requiring contributors to go through a repo defined implementation and review process.

Looking forward to getting it out :)


I'm building a robust runtime for this.

It's externally orchestrated and managed, not by an agent running the loop.

The goal is to force LLMs to produce exactly what you want every time.

I will be open sourcing it soon. You can keep whatever harness or tools you already use; you just delegate the actual implementation to the engine.

https://engine.build


I think babysitting LLMs is exactly the thing that burns you.

Presuming you meant burns you out though.


No, "burns you" as in "play with fire and you'll get burned".

It will make a mistake and you will get burned, so you have to babysit it.


I enjoy the OpenSpec format but I think maintaining the main specs is not worth it.

I've stopped doing it entirely and just archive directly after implementation.

When you do the sync process, it just keeps drifting and drifting until you have duplication and contradictions across specs.

I agree that tying the specs and code together helps for that but it still seems like extra overhead, even if the value is better justified here.


Because I don't trust LLMs to fully implement what I want on the first try unless I babysit them. And it's the hand holding that burns people out.

I use detailed specs to implement, but I don't maintain those specs as the source of truth afterwards. The code is indeed the source of truth.

I've built a library (and products on top) that takes in requirements (programmatic or various spec formats) and forces an externally orchestrated implement -> review -> fix loop that doesn't stop until all requirements are met.
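Roughly, the shape of that loop looks like the sketch below. This is an illustration under stated assumptions, not the library's API: `orchestrate`, `implement`, `fix`, and the reviewer callables are hypothetical names, and the `max_rounds` cap is my own addition so the example always terminates.

```python
def orchestrate(spec, implement, fix, reviewers, max_rounds=5):
    """Externally driven implement -> review -> fix loop.

    implement(spec) -> code
    fix(spec, code, issues) -> code
    each reviewer(spec, code) -> list of issue strings (empty means approved)
    All of these are hypothetical stand-ins for calls to LLM agents.
    """
    code = implement(spec)
    for round_no in range(1, max_rounds + 1):
        # Collect feedback from every reviewer; the agents never see this loop.
        issues = [issue for review in reviewers for issue in review(spec, code)]
        if not issues:
            return code, round_no  # every reviewer approved
        code = fix(spec, code, issues)
    raise RuntimeError(f"requirements still unmet after {max_rounds} review rounds")
```

Each reviewer can be backed by a different model, which is how a mixed-provider review panel would plug in.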

So I'll write a detailed spec, then have GPT 5.5 implementing and a mix of Opus 4.7 / GPT 5.5 / DeepSeek V4 Pro reviewing at every phase until it produces the quality I want.

I can let it run overnight or just during the day while I'm doing stuff that doesn't burn me out and that I actually enjoy.

So, TL;DR: spec first for me, but not as the source of truth afterwards.

I'll be open sourcing and launching soon: https://engine.build


I tweeted about some implementation and review runs that used V4 Pro.

Even without the currently discounted pricing, the value is incredible.

It takes about twice as long to finish code reviews given an identical context compared to Opus 4.7 / GPT 5.5, but at 1/10 the cost or less, there's just no comparison.

https://twitter.com/aljosa/status/2049176528638902555


Did you do this test through OpenRouter?


Yes, but locked to the official DeepSeek provider since it's the only one that has the discounted pricing.

