Hacker News | rbalicki's comments

That's exactly the tradeoff I made with Barnum (https://barnum-circus.github.io/). It's just not important to optimize the performance of the Rust side, for the reason you stated. So instead, all focus goes into making it easy for an LLM to build a reliable pipeline (from which LLMs are invoked).


Hey folks! This talk is about GraphQL in a world of fullstack, rich clients and about Isograph. The question it asks is: does GraphQL need to exist? Can we get its benefits without a GraphQL schema, without a GraphQL server, and without sending GraphQL over the wire?

It's my opinion that Isograph gives us the benefit of GraphQL, without many of its limitations. Which is to say, if you start from scratch, you can avoid mistakes.

This describes a future iteration of Isograph. Currently, much of what's described here is on the roadmap. But it's coming!


You may want to check out Barnum, which is a programming language/agent orchestration tool that makes it easy to build things like /loop, or Claude Code routines. And you won't end up dependent on the specifics of how Claude Code routines work!

https://github.com/barnum-circus/barnum


If you want to feel like you're using a programming language when orchestrating agents, check out https://github.com/barnum-circus/barnum


You can lessen your dependence on the specific details of how /loop, code routines, etc. work by asking the LLM to do simpler tasks, and instead, having a proper workflow engine be in charge of the workflow aspects.

For example, this demo (https://github.com/barnum-circus/barnum/tree/master/demos/co...) converts a folder of files from JS to TS. It's something an LLM could (probably) do a decent job of on its own, but (1) not necessarily reliably. With a workflow engine, (2) you can write a much more complicated workflow (e.g. retry logic, timeout logic, additional checks like "don't use as casts", etc.), (3) you can be much more token-efficient, and (4) you can be LLM-agnostic.

So, IMO, in the presence of tools like that, you shouldn't bother using /loop, code routines, etc.
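
To make that division of labor concrete, here's a minimal sketch in Python. It is not Barnum's API and not the linked demo; the names (LeafTask, violates_checks, convert_file, convert_folder) are made up for illustration, and the "as cast" check is a deliberately crude substring test. The point is only that retries and checks live in plain code, while the LLM gets one small task per file.

```python
from typing import Callable, Optional

# Hypothetical type: any function that takes a prompt and returns model output.
# It stands in for whichever LLM client is used, which keeps the workflow LLM-agnostic.
LeafTask = Callable[[str], str]


def violates_checks(ts_source: str) -> Optional[str]:
    """Deterministic checks run by the workflow, not by the LLM.

    The " as " test is a crude substring check for illustration only; it would
    also flag things like `import * as foo`.
    """
    if not ts_source.strip():
        return "empty output"
    if " as " in ts_source:
        return "don't use as casts"
    return None


def convert_file(js_source: str, llm: LeafTask, max_attempts: int = 3) -> str:
    """Convert one file, retrying with feedback when a check fails."""
    prompt = "Convert this file from JS to TS:\n" + js_source
    for _ in range(max_attempts):
        ts_source = llm(prompt)
        problem = violates_checks(ts_source)
        if problem is None:
            return ts_source
        # Feed the failed check back in; the retry loop itself stays in plain code.
        prompt = (
            f"Previous attempt rejected ({problem}). "
            f"Convert this file from JS to TS:\n{js_source}"
        )
    raise RuntimeError(f"gave up after {max_attempts} attempts")


def convert_folder(files: dict[str, str], llm: LeafTask) -> dict[str, str]:
    """Map each .js path to converted TS source, one small LLM call per file."""
    return {
        path.removesuffix(".js") + ".ts": convert_file(source, llm)
        for path, source in files.items()
    }
```

Because the LLM is just a function from prompt to text, swapping models means passing a different callable; nothing else in the workflow changes.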


One thing my team lead is working on is using Claude to 'generate' integration tests/add new tests to e2e runs.

Straight up asking Claude to run the tests, or to generate a test, could result in inconsistencies between runs, between tests, between models, and so on. So instead he created a tool that defines a test, its inputs and outputs, and some details. Now we have a system where we have a directory full of markdown files describing a test suite, parameters, test cases, error cases, etc., and Claude generates the usage of the tool instead.

This means that whatever variation Claude, or any other LLM, might have run-to-run or drift over time, it all still has to be funneled through a strictly defined filter to ensure we're doing the same things the same way over time.
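
As a rough illustration of what such a "strictly defined filter" could look like (this is a guess at the shape, not the actual tool; the field names and the JSON format are assumptions), the LLM's output is only accepted if it parses into a known structure:

```python
import json

# Hypothetical schema: these field names are illustrative, not the real tool's.
REQUIRED_FIELDS = {"name": str, "inputs": dict, "expected_output": dict}
ALLOWED_FIELDS = set(REQUIRED_FIELDS) | {"error_cases"}


def parse_test_spec(llm_output: str) -> dict:
    """Accept LLM output only if it is a well-formed test spec; raise ValueError otherwise."""
    # json.JSONDecodeError is a subclass of ValueError, so malformed JSON is rejected too.
    spec = json.loads(llm_output)
    if not isinstance(spec, dict):
        raise ValueError("spec must be a JSON object")
    unknown = set(spec) - ALLOWED_FIELDS
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    for field, expected_type in REQUIRED_FIELDS.items():
        if not isinstance(spec.get(field), expected_type):
            raise ValueError(
                f"field {field!r} missing or not a {expected_type.__name__}"
            )
    if not isinstance(spec.get("error_cases", []), list):
        raise ValueError("error_cases must be a list")
    return spec
```

Whatever the model emits, anything that doesn't fit the schema is rejected before it reaches the test runner, so run-to-run variation can't change how tests are defined.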


I'm looking at implementing https://github.com/coleam00/Archon as a means to solve this. You can build arbitrary workflows custom to your codebase. Looks to bring a bit of much-needed determinism.


What kind of system/area (or product) are you working on?


>You can lessen your dependence on the specific details of how /loop, code routines, etc. work by asking the LLM to do simpler tasks, and instead, having a proper workflow engine be in charge of the workflow aspects.

Or, you know, by writing the code yourself?



"You can lessen your dependence on a specific LLM implementation by not using LLMs" is certainly a take but it doesn't really address the root issue of models getting nerfed to save resources after they've gained wide adoption.


A simple task ("convert this file from JS to TS, here are the types of all imported things") is much more likely to continue to work with a nerfed model compared to a complicated task ("convert this repo to TS, make sure to run tsc afterward and fix all errors"). The former is a subtask of the latter!

Taking a moment to create a workflow where these steps are separated (or rather, having an LLM build this workflow) and the LLMs are asked to just do minor leaf tasks increases your resilience to nerfed models.
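
Here's a rough Python sketch of that separation for the "run tsc afterward and fix all errors" half of the task. Everything in it is an assumption for illustration: type_check stands in for however you'd invoke the type checker and collect errors per file, and llm is any prompt-to-text function. The workflow runs the checker and decides when to stop; the model only ever sees one file and its errors.

```python
from typing import Callable

# Hypothetical signatures, for illustration only.
LeafTask = Callable[[str], str]  # prompt -> model output
TypeCheck = Callable[[dict[str, str]], dict[str, list[str]]]  # files -> errors per path


def fix_type_errors(
    files: dict[str, str],
    type_check: TypeCheck,
    llm: LeafTask,
    max_rounds: int = 3,
) -> dict[str, str]:
    """Run the checker in plain code; ask the LLM only to fix one file at a time."""
    files = dict(files)  # don't mutate the caller's mapping
    for _ in range(max_rounds):
        errors = type_check(files)
        if not errors:
            return files
        for path, messages in errors.items():
            # Leaf task: one file plus the errors reported for that file.
            prompt = (
                "Fix these type errors in the file below.\n"
                + "\n".join(messages)
                + "\n---\n"
                + files[path]
            )
            files[path] = llm(prompt)
    # Final check after the last round of fixes.
    if type_check(files):
        raise RuntimeError(f"type errors remain after {max_rounds} rounds")
    return files
```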


I'm as nerdy as they come (my current project is the fourth compiler I've worked on), and I absolutely love this new way of working. There's a lot more time spent in discussion with the agent (an extremely frustrating discussion, to be fair). All of a sudden, there's an extremely high payoff to investing in good fundamentals (namely, clarity of requirements, good tools, etc.), which are the things I want to invest in anyway! If you get these fundamentals right, you can let the agent rip and produce hundreds of PRs that are correct, create workflows that are actually not slop, or ship code that is, while not yet as high quality as if you wrote it manually, quite close, at easily five times the speed.

And throughout this, if I'm ever curious about how the ideas relate to some other topic, I can just ask the agent, "Are we designing XYZ right now? Categorically, is it this?" Lots of really cool discussions to be had.

I might be less enthusiastic if I was just shipping CSS changes and the like.


The skill atrophy point strikes me as tenuous at best. Obviously, the plural of anecdote is not data, but I find myself able to work on projects of greater complexity than I would have been able to otherwise. 90% of my time is spent going back and forth on Markdown files, discussing the architecture, trade-offs, etc. I don't think it's necessarily impossible to use all this newfound power to ship more, sloppier code. But it's clearly possible to use all this newfound power to ship better code too.


> I find myself able to work on projects of greater complexity than I would have been able to otherwise

Yes. Now turn off the LLM and make an improvement to that code.


Exactly, this is like watching YouTubers code, i.e. backseat coding. It’s easy to follow along, but taking control midway is anything but, especially in a codebase that was written by an agent and that you don’t have any muscle memory in.


I think what you're implying is that the agent ships unmaintainable slop. Certainly, if I don't pay attention and review the code line by line, it will ship slop. And even sometimes, when I'm certain that it is implemented one way, I'll come back to the code many days later and discover that it went a completely different route than I expected. Very frustrating.

But it doesn't have to be that way. You just have to put an effort into shipping fewer, better features as opposed to more features. The projects I'm working on (e.g. agent orchestration, because who isn't nowadays) have a small surface area and high payoff and thus are uniquely well positioned for this.

If I couldn't use an LLM, I would still work on this, and it would have roughly the same architecture. But because I'm able to go probably 100x as fast, I'm able to be much more ambitious. Or rather, I'm able to discover that my initial ideas were not on point and pivot, and not have any sense of sunk cost.

Anyway, to each their own.


No problem. 30+ years of experience isn't going to disappear any time soon.


No one says it's going to disappear overnight, they're saying it's going to atrophy.


Totally agree with this point about taking on projects of greater complexity. I honestly feel the sloppier code thing is going to die soon. People make mistakes too. I always see people holding the machine to a totally different standard.


I agree. I think we simply don't have the tools yet to hold agents to that high architectural standard. It simply takes a lot of focused effort and berating and close comprehension of the code at the moment to ship anything good, but there are lots of people (myself included) working on that problem. I'm pretty sure in a matter of months it will be solved.

Months! That's not a long time.


A real study from Microsoft + Carnegie Mellon University with 319 participants:

> while GenAI can improve worker efficiency, it can inhibit critical engagement with work and can potentially lead to long-term overreliance on the tool and diminished skill for independent problem-solving.

https://www.microsoft.com/en-us/research/wp-content/uploads/...

It's a real problem when applied to a population.


You may be interested in checking out https://www.youtube.com/watch?v=lhVGdErZuN4, where I talk about the benefits of Relay. This isn't (currently) possible without GraphQL, so it's a pretty compelling case for GraphQL.

But yeah, IMO, GraphQL doesn't justify itself unless you're using a client like Relay, with data masking and fragment colocation.


That is an interesting talk, thank you!


I would encourage you to write an educated person's critique of GraphQL, because OP's article + https://bessey.dev/blog/2024/05/24/why-im-over-graphql/ etc. suck up all of the oxygen, and no one hears about the genuine issues like that.

(And don't forget lack of generics, no support for interfaces with no fields, lack of closed unions/interfaces, the absolutely silly distinction between unions and interfaces, the fact that the SDL and operation language are two completely different things...)


This is a genuinely accurate critique of GraphQL. We're missing some extremely table-stakes things, like generics, discriminated unions in inputs (and in particular, discriminated unions you can discriminate and use later in the query as one of the variants), closed unions, etc.

