Interesting that what you're talking about as ASI is "as capable of handling explicit requirements as a human, but faster". Which _is_ better than a human, so fair play, but it's striking that this requirement is less about creativity than we would have thought.
The work where I've done well in my life (smashing deadlines, rescuing projects) has so often come because I've been willing to push back on - even explicitly stated - requirements. When clients have tried to replace me with a cheaper alternative (and failed) the main difference I notice is that the cheaper person is used to being told exactly what to do.
Maybe this is more anthropomorphising but I think this pushing back is exactly the result that the LLMs are giving; but we're expecting a bit too much of them in terms of follow-up like: "ok I double checked and I really am being paid to do things the hard way".
To be fair, there is likely not much training data on the difficult conversations you need to handle in a senior position, pushback being one of them. The trouble for the agents is that their pushback comes post hoc, as a rationalisation to explain themselves, rather than as a "help me understand" beforehand.
> Maybe we should just commit the signature change with a TODO
I'm fascinated that so many folks report this, I've literally never seen it in daily CC use. I can only guess that my habitually starting a new session and getting it to plan-document before action ("make a file listing all call sites"; "look at refactoring.md and implement") makes it clear when it's time for exploration vs when it's time for action (i.e. when exploring and not acting would be failing).
I think the author is looking for something that doesn't exist (yet?). I don't think there's an agent in existence that can handle a list of 128 tasks exactly specified in one session. You need multiple sessions with clear context to get exact results. Ralph loops, Gastown, taskmaster etc are built for this, and they almost entirely exist to correct drift like this over a longer term. The agent-makers and models are slowly catching up to these tricks (or the shortcomings they exist to solve); some of what used to be standard practice in Ralph loops seems irrelevant now... and certainly the marketing for Opus 4.7 is "don't tell it what to do in detail, rather give it something broad".
In fairness to coding agents, most of coding is not exactly specified like this, and the right answer is very frequently to find the easiest path that the person asking might not have thought about; sometimes even in direct contradiction of specific points listed. Human requirements are usually much more fuzzy. It's unusual that the person asking would have such a clear/definite requirement that they've thought about very clearly.
Just as a human would use a task list app or a notepad to keep track of which tasks need to be done so can a model.
You can even have a mechanism for it to look at each task with a "clear head" (empty context) with the ability to "remember" previous task execution (via embedding the reasoning/output) in case parts were useful.
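A minimal sketch of that mechanism, under loud assumptions: `run_agent` is a hypothetical callable standing in for whatever model or harness you use (one call = one empty-context session), and "memory" here is a naive word-overlap lookup rather than real embeddings. All names are illustrative, not any product's API.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class TaskRunner:
    """Runs each task with a fresh prompt, plus a few notes recalled from earlier tasks."""
    run_agent: Callable[[str], str]  # hypothetical: one call = one empty-context session
    notes: list[tuple[str, str]] = field(default_factory=list)  # (task, result) pairs

    def recall(self, task: str, k: int = 2) -> list[str]:
        # Naive stand-in for embedding search: rank old notes by shared words.
        words = set(task.lower().split())
        scored = [(len(words & set(t.lower().split())), r) for t, r in self.notes]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [r for score, r in scored[:k] if score > 0]

    def run_all(self, tasks: list[str]) -> list[str]:
        results = []
        for task in tasks:
            memory = self.recall(task)
            # The prompt is rebuilt from scratch each time: no carried-over context.
            prompt = "\n".join(["Task: " + task] + ["Earlier note: " + m for m in memory])
            result = self.run_agent(prompt)
            self.notes.append((task, result))
            results.append(result)
        return results


def fake_agent(prompt: str) -> str:
    # Stand-in for a real model call; reports how many prompt lines it was given.
    return f"done ({len(prompt.splitlines())} prompt lines)"


runner = TaskRunner(run_agent=fake_agent)
outputs = runner.run_all(["rename foo in parser", "rename foo in lexer", "update docs"])
```

The point of the design is that the task list and the notes live outside the model, so drift in one task cannot leak into the next except through notes you chose to recall.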
The article makes it seem like the author expected this without emptying context in between, which does not yet exist (actually I'm behind on playing with Opus 4.7, the Anthropic claim seems to be that longer sessions are ok now - would be interested to hear results from anyone who has).
That is probably the next step, and in practice it is much of what sub-agents already provide: a kind of tabula rasa. Context is not always an advantage. Sometimes it becomes the problem.
In long editing sessions with multiple iterations, the context can accumulate stale information, and that actively hurts model performance. Compaction is one way to deal with that. It strips out material that should be re-read from disk instead of being carried forward.
A concrete example is iterative file editing with Codex. I rewrite parts of a file so they actually work and match the project’s style. Then Codex changes the code back to the version still sitting in its context. It does not stop to consider that, if an external edit was made, that edit is probably important.
I have the same experience of reversing intentional steps I've made, but with Claude Code. I find that committing a change that I want to version control seems to stop that behaviour.
Long context as disadvantage is pretty well discussed, and agent-native compaction has been inferior to having it intentionally build the documentation that I want it to use. So far this has been my LLM-coding superpower. There are also a few products whose entire purpose is to provide structure that overcomes compaction shortcomings.
When Geoff Huntley said that Claude Code's "Ralph loop" didn't meet his standards ("this aint it") the major bone of contention as far as I can see was that it ran subagents in a loop inside Claude Code with native compaction; as opposed to completely empty context.
I do see hints that improving compaction is a major area of work for agent-makers. I'm not certain where my advantage goes at that point.
Agreed. I am asking for something beyond the current state of the art. My guess is that stronger RL on the model side, together with better harness support, will eventually make it possible. However, it's the part about framing the failure to complete a task as a communication mishap that really throws me off.
"It's more powerful and easier" is a great claim, but I need examples in this opening page to convince me of the pain I could save myself or the awesome things I'm living without.
- You aren't forced to resolve rebase/merge conflicts immediately. You can switch branches halfway through resolving conflicts and then come back later and pick up where you left off. You can also just ignore the conflicts and continue editing files on the conflicted branch and then resolve the conflicts later.
- Manipulating commits is super easy (especially with jjui). I reorder commits all the time and move them between branches. Of course you can also squash and split commits, but that's already easy in git. Back when I was using git, I would rarely touch previous commits other than the occasional squash or rename. But now I frequently manipulate the commit history of my branch to make it more readable and organized.
- jj acts as a VCS for your VCS. It has an operation log that is a history of the state of the git repository. So anything that would be destructive in git (e.g. rebase, pull, squash, etc) can be undone.
- Unnamed branches is the feature that has changed my workflow the most. It's hard to explain, so I probably won't do it justice. Basically you stop thinking about things in terms of branches and instead just see it as a graph of commits. While I'm experimenting/exploring how to implement or refactor something, I can create "sub-branches" and switch between them. Similar to stashes, but each "stash" is just a normal branch that can have multiple commits. If I want to test something but I have current changes, I just `jj new`. And if I want to go back, I just make a new commit off of the previous one. And all these commits stick around, so I can go back to something I tried before. Hopefully this made some sense.
Also note that jj is fully compatible with git. I use it at work and all my coworkers use git. So it feels more like a git client than a git replacement.
All of these features sound like the recipe for a confusing nightmare!
"You can switch branches halfway through resolving conflicts and then come back later and pick up where you left off. You can also just ignore the conflicts and continue editing files on the conflicted branch and then resolve the conflicts later."
"Similar to stashes, but each "stash" is just a normal branch that can have multiple commits. If I want to test something but I have current changes, I just `jj new`. And if I want to go back, I just make a new commit off of the previous one. And all these commits stick around, so I can go back to something I tried before."
Turns out, git sorta trains you to be very, very afraid of breaking something.
jj answers this in a few ways:
1. everything is easily reversible, across multiple axes.
2. yes, everything is basically a stash, and it's a live stash — as in, I don't have to think about it because if it's in my editor, it's already safely stored as the current change. I can switch to a different one, create a new one, have an agent work on another one, etc, all without really caring about "what if I forgot to commit or stash something". Sounds like insanity from a git POV but it really is freeing.
3. Because of 2, you can just leave conflicts alone and go work on something else (because they are, like you said, essentially stashed). It's fine and actually very convenient.
The thing the article doesn't mention, that makes this all safe, is that trunk / "main" is strictly immutable. All this flexibility is *just* for unmerged WIP. (There are escape hatches though, naturally!)
The "you don't need to worry about resolving conflicts" thing is confusing when you hear it with words, so let me show you what it looks like in practice.
Let's say I have two branches off of trunk. They each have one commit. That looks like this (it looks so much nicer with color, I'm going to cut some information out of the default log so that it's easier to read without the color):
So both `foo` and `bar` are on top of trunk, and I'm also working on a third branch on top of trunk (@). Those vvxv and such are the change ids, and you can also see the named trunk there as well.
Now, I fetch from my remote, and want to rebase my work on top of them: a `jj git fetch`, and then let's rebase `foo` first: that's `jj rebase uu -o trunk` (you only need uu instead of uuowqquz because it's a non-ambiguous prefix, just like git). Uh oh! a conflict!
Note that jj did not put us into a "hey there's a conflict, you need to resolve it" state. It just did what you asked: it rebased it, there's a conflict, it lets you know.
So why is this better? Well, for a few reasons, but I think the simplest is that we now have choice: with git, I would be forced to deal with this conflict right now. But maybe I don't want to deal with this conflict right now: I'm trying to update my branches in general. Is this conflict going to be something easy to resolve? In this case, it's one commit. But what if each of these branches had ten commits, with five of them conflicted and five not? It might be a lot of work to fix this conflict. So the cool thing is: we don't actually have to. We could continue our "let's rebase all the branches" task and rebase bar as well. Maybe it doesn't have a conflict, and we'd rather go work on bar before we come back and deal with foo. Heck, sometimes, I've had a conflicted branch, and then a newer version of trunk makes the conflict go away! I only have to choose to address the conflict at the moment I want to return to work on foo.
There are broader implications here, but in practice, it's simply nicer to have the choice.
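For anyone who thinks better in code, here is a toy model of the "record the conflict and keep going" idea. It is not jj's actual algorithm: it works on whole files rather than hunks, and `Branch`, `rebase`, and the file names are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Branch:
    name: str
    base: dict      # file -> content of the trunk snapshot the branch started from
    edits: dict     # file -> content written by the branch
    conflicted_files: tuple = ()


def rebase(branch: Branch, new_trunk: dict) -> Branch:
    """Move a branch onto new_trunk, recording conflicts instead of stopping."""
    conflicts = tuple(
        path for path, content in branch.edits.items()
        # Conflict: trunk changed the same file since the base, and differently from us.
        if new_trunk.get(path) != branch.base.get(path) and new_trunk.get(path) != content
    )
    return Branch(branch.name, dict(new_trunk), dict(branch.edits), conflicts)


old_trunk = {"a.txt": "1", "b.txt": "1"}
new_trunk = {"a.txt": "2", "b.txt": "1"}
foo = Branch("foo", old_trunk, {"a.txt": "3"})
bar = Branch("bar", old_trunk, {"b.txt": "5"})

# Rebase everything in one pass; nothing halts on the first conflict.
rebased = [rebase(b, new_trunk) for b in (foo, bar)]
ready = [b.name for b in rebased if not b.conflicted_files]
deferred = [b.name for b in rebased if b.conflicted_files]
```

Here `bar` comes out clean and `foo` carries its conflict as data, so you can go work on `bar` and come back to `foo` whenever you like.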
In practice, it isn't. What you're identifying as potentially nightmarish - and no doubt quite tedious in git - are things that JJ enables you to do with a small subset of commands that work exactly how you expect them to work _in every workflow context_ in which they are needed.
Thinking specifically about conflicts: being able to defer conflicts until you're ready to deal with them is actually great. I might not be done with what I am actually working on and might want to finish that first. Being forced into a possibly complicated conflict resolution when I'm in the middle of something is what I'd actually consider nightmarish.
When you want to solve the conflict: `jj new <rev>`, solve the conflict, then `jj squash`, your conflict resolution is automatically propagated to the chain of child commits from the conflict.
Remember when you used SVN or whatever before git, and you loved git because of how easy it is to make branches?
With branches, jj is to git what git was to SVN. It's an order of magnitude less friction to do branching in jj than git.
Not long ago, I pulled from main and rebased my branch onto it - merge conflicts. But I wanted to work on some other feature at the moment. Why should I have to fix this merge conflict to work on a feature on a totally different branch? With jj, I don't. I just switch to the other branch (that has no conflict), and code my new feature. Whenever I need to work on the conflicted branch, I'll go there and fix the conflict.
Once I started using jj, I realized how silly it was for git to have separate concepts for stash and index. And it's annoying that stash/index is not version controlled in git. Or is it? I have no idea.
In jj, a stash is simply yet another unnamed branch. Do whatever you want there. Add more commits. Then apply it to any branch(es) that you would like to. Or not.
Why does git need a separate concept of a stash? And wouldn't you like a version controlled stash in git?
Have you ever made a ton of changes, done a "git add", accidentally deleted some of the changes in one file, done a "git add", and thought "Oh crap!" I suppose that information can be recovered from the reflog. But wouldn't you wish "git add" was version controlled in the same way everything else is?
That's the appeal of jj. You get a better stash. You get a better index. And all with fewer concepts. You just need to understand what a branch (or graph) is, and you get all of it. Why give it a name like "stash" or "index"?
Why does git insist on giving branches names? Once you get used to unnamed branches, the git way just doesn't make sense. In jj you'll still give names wherever you need to.
Anonymous branches are amazing for when you are trying out a bunch of different approaches to a problem. As I search the space of possible solutions for what I'm really looking for, I end up with a tree of various approaches.
Then when you rebase, the entire tree of anonymous branches can be rebased onto main in 1 command. This is why first-class conflicts and not having to resolve conflicts immediately are so important: when I'm rebasing, an entire tree of branches is getting rebased, so if you had to resolve conflicts right away it would be incredibly cumbersome, because I'm rebasing like 30+ commits and a bunch of anonymous branches in a single operation.
I work on top of an octopus merge of all my in-flight PRs. On top of that merge commit I have a bunch of anonymous branches with various things going on. When I'm ready to submit a PR, I take one of those anonymous branches and rebase it onto main and make it an additional parent of my 'dev-base' merge commit. Then I give that branch a name and submit it as a PR.
Every day when I start working, I rebase this entire subgraph of branches in a single command onto main. All my PRs are up to date, all my anonymous branches are up to date, etc... Takes like 2 seconds. If some of my anonymous branches are in a conflicted state, that's ok, I don't have to deal with it until I want to work on that change again.
These anonymous branches aren't confusing because they all show up in the default revset that is shown when looking at the jj log. I can easily browse through them with jjui TUI and instantly see which ones are what. It's really not confusing at all.
typical for experienced git users who already 'just don't do' things which git punishes you for; after a decade it's hard to even imagine any other way, not to mention that it might be better. been there, done that, jj is legit after letting go of (some of) git.
I also like the powerful revision querying mechanisms that they pulled in from mercurial. They seem to work just like mercurial revset queries which can be used in various operations on sets of revisions.
I would like them to have mercurial's awesome hg fa --deleted when it comes to history trawling, but apparently for it to work well, they also need to swap out git's diff format for mercurial's smarter one, so I'll be waiting on that for a while I suppose.
Yeah we moved on from SVN to git because SVN branches were truly a pain in the ass to work with. I truly do not have any rough edges or big pains in my day to day git workflow.
Specific commands don't really showcase the appeal of jj. If anything they might scare someone at first glance. It's the fact that the workflows are intuitive and you never find yourself reaching for help to get something done. You really need to try it to understand it.
jj is better for some workflows, which, if you're a git expert as you claim, you consciously or subconsciously avoid as 'too much work' or 'too brittle'.
if you don't care about them after accepting this realization... it's fine. git is good enough.
I’m not a git expert by any means. The workflows being described do not appeal to me, but not because of the way git works. They sound confusing and I don’t understand what benefit I’m getting out of them. Like, it’s a solution to a problem I’m not sure exists (for me).
does trivially working on 3 PRs in a single checkout and pushing focused changes to each one independently without thinking twice count?
if you don't need this, you might not see any value in jj and that's ok. you might use magit to get the same workflow (maybe? haven't used magit personally) and that's also ok.
It might count, but it is easy with git as well, what is the feature in jj that makes this easier? Switching branches and pushing changes to remotes is the core feature of git and in my opinion really easy so I'm curious how jj improves on it.
I guess he was talking about the presentation, not what the tool can achieve. The first page offers no hard proof; that might pass as a LinkedIn pitch, but not on Hacker News.
I know how I would do this in git, but don't really see how this would be in jj. I currently don't use it in my workflow, but if it is super easy in jj then I could see myself switching.
This creates an empty commit that merges all 3 branches, you can think of this as your staging area.
When you want to move specific changes to an existing commit, let's say a commit with an ID that starts with `zyx` (all jj commands highlight the starting characters that make the commit / change unambiguous):
jj squash -i --to zyx
Then select your changes in the TUI. `-i` stands for interactive.
If you want to move changes to a new commit on one of the branches:
jj split -i -A branch1
Then select the changes you want moved. `-A` is the same as `--insert-after`, it inserts the commit between that commit and any children (including the merge commit you're on).
There's one thing that's a bit annoying, the commit is there but the head of the branch hasn't been moved, you have to move it manually (I used + to get the child to be clearer, but I usually just type the first characters of the new change id):
the beauty of it is there's not much to show; I use a crude jjui approach where I have an octopus merge working tree commit (in command line terms, jj new PR_A PR_B PR_C) and either use native jj absorb (S-A in jjui) which guesses where to squash based on the path or, when I'm feeling fancy, rebase the octopus via jjui set parents (S-M) function (also handy to clean up parents when one of the PRs gets merged).
in large enough monorepos and teams and big enough changes you either do it like this or have a humongous giga-PR which eventually starts conflicting with everything.
"jj undo" is worth the price of admission by itself.
See the current top thread on HN about backblaze not backing up .git repos. People are flaming OP like they're an idiot for putting a git repo in a bad state. With jj, it's REALLY HARD to break your repo in a way that can't be fixed by just running "jj undo" a couple times.
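To make the "undo anything" idea concrete, here is a toy model of an operation log. This is only a sketch of the concept, not how jj is implemented: `ToyRepo` and its methods are invented for illustration, and unlike this toy, the real operation log treats an undo as an operation in its own right.

```python
import copy


class ToyRepo:
    """Toy model only: every mutating operation first snapshots the whole state."""

    def __init__(self):
        self.state = {"commits": {}, "bookmarks": {}}
        self.op_log = []  # snapshots taken before each operation

    def _record(self, description):
        self.op_log.append((description, copy.deepcopy(self.state)))

    def commit(self, commit_id, message):
        self._record(f"commit {commit_id}")
        self.state["commits"][commit_id] = message

    def abandon(self, commit_id):
        # "Destructive" on the surface, but the prior snapshot is kept in op_log.
        self._record(f"abandon {commit_id}")
        del self.state["commits"][commit_id]

    def undo(self):
        # Restore the snapshot taken just before the most recent operation.
        description, previous = self.op_log.pop()
        self.state = previous
        return description


repo = ToyRepo()
repo.commit("a", "first")
repo.commit("b", "second")
repo.abandon("b")
undone = repo.undo()
```

After the `undo()`, commit `b` is back, which is the whole point: a command that looks destructive only ever adds to the history of repository states.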
Consider using the table of contents on the left of the page to view "Real World Workflows", "Branching, Merging, and Conflicts", and then "Sharing Your Code with Others" and then evaluate how JJ does things against your current git workflow. This requires some minor effort on your part.
What makes jj better requires a mindset change. When you start with jj, you use it as an alternate porcelain for git, meaning you just use the jj commands that map to the way you used git. You have to let go of that mindset. Until you do, you are still using those old git commands; they just have prettier clothing. The prettier clothing is not worth the effort.
I don't know how to explain a mindset to you, so I'll give one example of something that sounds so grand, it seems impossible. (There are so many unusual aspects to jj, but hopefully this is one you can immediately relate to.) Git famously makes it hard to lose work, but nonetheless there are commands like `git reset --hard` that make you break out in a sweat. There is no jj command that destroys information another jj command can't bring back. And before you ask - yes of course jj has the equivalent of `git reset --hard`.
The general idea here is that jj has fewer and more orthogonal concepts than git. This makes it more regular, which is what I mean by "easy."
So for example, there is no index as a separate concept. But if you like to stage changes, you can accomplish this through a workflow, rather than a separate feature. This makes various things less complex: the equivalent of git reset doesn't need --hard, --soft, --mixed, because the index isn't a separate concept: it's just a commit. This also makes it more powerful: you can use any command that works on commits on your index.
Hard perhaps but it feels a lot easier now than three years ago. Or so my backlog of personal projects outside of my most familiar stack would suggest.