
GodelNumbering · 2026-04-12T14:16:32 1776003392

In anticipation of a future where:

a) quotas will get restricted

b) the subscription plan prices will go up

c) all LLMs will become good enough at coding tasks

I just open sourced a coding agent https://github.com/dirac-run/dirac

The entire goal is to be token efficient (over 50% cheaper) and, by extension, to take advantage of LLMs' better reasoning at shorter context lengths.

This really started as an internal side project that made me more productive; I hope it will help others too. Apache 2.0.

Currently it still can't compete with the subsidized coding plan rates when using Anthropic API pricing, though (even though it beats CC when both use an API key), which tells me that all subscription plan operators are losing money on such plans.

GodelNumbering · 2026-04-09T12:08:41 1775736521

Highlights:

- Uses a novel approach to hash-anchoring that reduces the overhead of hash anchors to a minimum

- Uses AST searches and edits (builds a local sqlite3 db)

- A large number of performance improvements and aggressive bloat removal

- Completely gutted mcp and enterprise features

- Last I checked, 40k+ lines were removed and another 64k lines were either added or changed

Please give this a try!
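For readers wondering what "AST searches backed by a local sqlite3 db" can look like in practice, here is a minimal sketch. It is not Dirac's actual schema or code; the table layout and the function names `build_symbol_index` and `find_symbol` are made up for illustration, and it only handles Python sources via the standard `ast` module. The idea it shows is that an agent can look up a symbol's exact line range and read only those lines instead of the whole file.

```python
import ast
import sqlite3


def build_symbol_index(sources: dict[str, str]) -> sqlite3.Connection:
    """Index every function and class definition into an in-memory SQLite table.

    `sources` maps a file path to its source text (hypothetical input shape).
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE symbols ("
        "file TEXT, name TEXT, kind TEXT, start_line INTEGER, end_line INTEGER)"
    )
    for path, code in sources.items():
        tree = ast.parse(code)
        for node in ast.walk(tree):
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
                kind = "class" if isinstance(node, ast.ClassDef) else "function"
                conn.execute(
                    "INSERT INTO symbols VALUES (?, ?, ?, ?, ?)",
                    (path, node.name, kind, node.lineno, node.end_lineno),
                )
    conn.commit()
    return conn


def find_symbol(conn: sqlite3.Connection, name: str) -> list[tuple]:
    """Return (file, kind, start_line, end_line) rows for a symbol name."""
    return conn.execute(
        "SELECT file, kind, start_line, end_line FROM symbols "
        "WHERE name = ? ORDER BY file, start_line",
        (name,),
    ).fetchall()


if __name__ == "__main__":
    demo = {"a.py": "def foo():\n    return 1\n\nclass Bar:\n    def baz(self):\n        pass\n"}
    index = build_symbol_index(demo)
    print(find_symbol(index, "baz"))
```

With the line range in hand, a tool call can return just that slice of the file, which is where the token savings would come from under this assumption.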

GodelNumbering · 2026-04-07T21:04:23 1775595863

Priced at $25/$125 per million input/output tokens. Makes you wonder whether it makes more financial sense to hire 1-2 engineers in a low cost-of-living country who use much cheaper LLMs.

arm32 · 2026-04-07T21:13:54 1775596434

The issue is that those engineers have to have good taste, but yes—absolutely. Ah, industrialization.

GodelNumbering · 2026-03-30T14:08:43 1774879723

> Today, unlike in the Luddites’ time, we are already seeing skilled workers replaced not with lower-wage human labor, but with AI.

To me this is the weakest claim of the article. This claim has been thrown around endlessly without proof.

https://fred.stlouisfed.org/series/IHLIDXUSTPSOFTDEVE

Software engineer job openings, for instance, are at a 2-year high (still far lower than the covid dislocations, though), but arguably all enterprise AI was built or deployed in the last two years. We should have seen a crash in job openings if the AI job replacement claim were correct.

This is something I've spent some time thinking about (personally written article, not AI slop): https://www.signalbloom.ai/posts/why-task-proficiency-doesnt...

GodelNumbering · 2026-03-25T21:59:54 1774475994

Off topic but I have been following your Twitter for a while and your posts specifically about the nature of intelligence have been a read.

GodelNumbering · 2026-03-12T19:32:24 1773343944

As an inference hungry human, I am obviously hooked. Quick feedback:

1. The models/pricing page should perhaps be linked from the top, as that is the most interesting part to most users. You have mentioned some impressive numbers (e.g. GLM5~220 tok/s $1.20 in · $3.50 out) but those are way down the page and many would miss them

2. When looking for inference, I always look at 3 things: which models are supported, at which quantization, and what the cached input pricing is (this is way more important than headline pricing for agentic loops). You have the info about the first on the site but not 2 and 3. Would definitely like to know!
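To illustrate why cached input pricing matters so much for agentic loops, here is a rough cost sketch. It assumes every turn resends the full conversation history and that the resent history is billed at the cached rate when one exists. The $1.20 in / $3.50 out figures are the GLM5 numbers from point 1; the $0.20 cached rate, the turn count, and the token counts are invented for the example, not the provider's real numbers.

```python
def loop_cost(turns, new_in, out, in_price, out_price, cached_price=None):
    """Dollar cost of an agentic loop; prices are per million tokens.

    Each turn resends the accumulated history plus `new_in` fresh input tokens
    and produces `out` output tokens. Without a cached rate, the history is
    billed at the full input price.
    """
    if cached_price is None:
        cached_price = in_price
    total = 0.0
    history = 0  # tokens already seen by the model in earlier turns
    for _ in range(turns):
        total += history * cached_price / 1e6  # resent history
        total += new_in * in_price / 1e6       # fresh input this turn
        total += out * out_price / 1e6         # generated output
        history += new_in + out
    return total


if __name__ == "__main__":
    # Hypothetical workload: 50 turns, 2k new input and 500 output tokens per turn.
    uncached = loop_cost(50, 2_000, 500, 1.20, 3.50)
    cached = loop_cost(50, 2_000, 500, 1.20, 3.50, cached_price=0.20)
    print(f"no cache: ${uncached:.2f}, with cache: ${cached:.2f}")
```

Because the resent history grows with every turn, it quickly outweighs the fresh input and output, so the cached rate ends up driving most of the bill.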

2uryaa · 2026-03-12T21:00:43 1773349243

Thank you for the feedback! I think we will definitely redo the info on the frontpage to reorg and show quantizations better. For reference, Kimi and Minimax are NVFP4. The rest are FP8. But I will make this more obvious on the site itself.

bethekind · 2026-03-13T00:05:00 1773360300

I love the phrase "inference hunger"

GodelNumbering · 2026-03-11T20:31:14 1773261074

Even if people try to bypass it, having the official rule matters a lot.

@dang, if you read this, why don't we implement honeypots to catch bots? Like having an empty or invisible field while posting/commenting that a human would never fill in
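A minimal sketch of the server-side half of that idea, assuming the comment form includes an extra field hidden from humans with CSS. The field name `website` and the function name `is_probable_bot` are hypothetical, and this says nothing about how HN actually handles submissions.

```python
def is_probable_bot(form: dict[str, str], honeypot_field: str = "website") -> bool:
    """Return True when the hidden honeypot field was filled in.

    A human never sees the field, so any non-blank value suggests an
    automated submitter that fills every input it finds.
    """
    return bool(form.get(honeypot_field, "").strip())


if __name__ == "__main__":
    human = {"text": "Nice article", "website": ""}
    bot = {"text": "Great post!", "website": "http://spam.example"}
    print(is_probable_bot(human), is_probable_bot(bot))
```

This only catches submitters that blindly fill every field; anything that renders the page or reads the CSS would sail past it.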

tomasz-tomczyk · 2026-03-11T20:42:01 1773261721

It's likely going to be a game of whack-a-mole, especially with AI as opposed to simple bots/scripts. Not that they shouldn't try to prevent it, but not entirely sure what the solution is.

tavavex · 2026-03-11T20:51:26 1773262286

There's probably no solution, but at least this gives a reason to go after the lowest hanging fruit - the zero-effort, obvious, low-quality output.

GodelNumbering · 2026-03-11T13:04:40 1773234280

I imagine that would cause a backlash from the website owners trusting cloudflare to keep their content 'safe'

GodelNumbering · 2026-03-03T17:52:23 1772560343

That's a 150% increase in input costs and a 275% increase in output costs over the same-sized previous-generation (2.5-flash-lite) model.
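For anyone converting those percentages into multipliers, a quick sketch (the helper name `multiplier_from_increase` is made up here; only the two percentages come from the comment above):

```python
def multiplier_from_increase(percent_increase: float) -> float:
    """Convert a percent increase into a price multiplier (new / old)."""
    return 1 + percent_increase / 100


if __name__ == "__main__":
    print(multiplier_from_increase(150), multiplier_from_increase(275))
```

So a 150% increase means the new input price is 2.5 times the old one, and a 275% increase means the output price is 3.75 times the old one.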

GodelNumbering · 2026-02-26T07:06:22 1772089582

It is probably the first-time aha moment the author is talking about. But under the hood, it is probably not as magical as it appears to be.

Suppose you prompted the underlying LLM with "You are an expert reviewer in..." and a bunch of instructions followed by the paper. The LLM knows from training that 'expert reviewer' is an important term (skipping over details and oversimplifying here), so it frames its response as what it knows an expert reviewer would write. LLMs are good at picking up (or copying) the patterns of a response, but the underlying layer that evaluates things against a structural and logical understanding is missing. So, in corner cases, you get responses that are framed impressively but do not contain any meaningful input. This trait makes LLMs great at demos but weak at consistently finding novel, interesting things.

If the above is true, the author will find after several reviews that the agent they use keeps picking up on the same/similar things (collapsed behavior that makes it good at coding type tasks) and is blind to some other obvious things it should have picked up on. This is not a criticism, many humans are often just as collapsed in their 'reasoning'.

LLMs are good at 8 out of 10 tasks, but you don't know which 8.

Kim_Bruning · 2026-02-26T07:58:43 1772092723

In your model, explain the old trick "think step by step"

GodelNumbering · 2026-02-26T17:15:36 1772126136

It simply forces the model to adopt an output style known to be conducive to systematic thinking without actually thinking. At no point has it thought the thing through (unless there are separate thinking tokens).