More

camgunz · 2026-02-18T07:49:35 1771400975

But you would see more houses, or housing build costs/bids fall.

This is where the whole "show me what you built with AI" meme comes from, and currently there's no substitute for SWEs. Maybe next year or next next year, but mostly the usage is generating boring stuff like internal tool frontends, tests, etc. That's not nothing, but because actually writing the code was at best 20% of the time cost anyway, the gains aren't huge, and won't be until AI gets into the other parts of the SDLC (or the SDLC changes).

camgunz · 2026-02-16T09:01:51 1771232511

I'm not impressed:

- if you're not passing SQLite's open test suite, you didn't build SQLite

- this is a "draw the rest of the owl" scenario; in order to transform this into something passing the suite, you'd need an expert in writing databases

These projects are misnamed. People didn't build counterstrike, a browser, a C compiler, or SQLite solely with coding agents. You can't use them for that purpose--like, you can't drop this in for maybe any use case of SQLite. They're simulacra (slopulacra?)--their true use is as a prop in a huge grift: tricking people (including, and most especially, the creators) into thinking this will be an economical way to build complex software products in the future.

stavros · 2026-02-16T11:25:27 1771241127

I'm generally not this pedantic, but yeah, "I wrote an embedded database" is fine to say. If you say "I built SQLite", I expected to at least see how many of the SQLite tests your thing passed.

gf000 · 2026-02-16T09:11:18 1771233078

Also, the very idea is flawed. These are open-source projects and the code is definitely part of the training data.

tux3 · 2026-02-16T10:17:14 1771237034

That's why our startup created the sendfile(2) MCP server. Instead of spending $10,000 vibe-coding a codebase that can pass the SQLite test suite, the sendfile(2) MCP supercharges your LLM by streamlining the pipeline between the training set and the output you want.

Just start the MCP server in the SQLite repo. We have clear SOTA on re-creating existing projects starting from their test suite.

viraptor · 2026-02-16T10:28:37 1771237717

This would be relevant if you could find matching code between this and sqlite. But then that would invalidate basically any project as "not flawed" really - given GitHub, there's barely any idea which doesn't have multiple partial implementations already.

criemen · 2026-02-16T11:29:17 1771241357

Even if was copying sqlite code over, wouldn't the ability to automatically rewrite sqlite in Rust be a valuable asset?

scott_w · 2026-02-16T13:53:20 1771250000

Not really because it's not possible for SQLite written in Rust to pass SQLite's checks. See https://www.sqlite.org/whyc.html

jodrellblank · 2026-02-16T15:55:02 1771257302

That doesn't seem to support your claim; guessing you mean:

> "2. Safe languages insert additional machine branches to do things like verify that array accesses are in-bounds. In correct code, those branches are never taken. That means that the machine code cannot be 100% branch tested, which is an important component of SQLite's quality strategy."

'Safe' languages don't need to do that, if they can verify the array access is always in bounds at compile time then they don't need to emit any code to check it. That aside, it seems like they are saying:

    for (int i=0; i<10; i++) {
        foo(array[i]);
    }

in C might become the equivalent of:

    for (int i=0; i<10; i++) {
        if (i >= array_lower && i < array_higher) {
            foo(array[i]);
        } else {
            ??? // out of bounds, should never happen
        }
    }

in a 'safe' language, and i will always be in inside the array bounds so there is no way to test the 'else' branch?

But that can't be in SQLite's checks as you claim, because the C code does not have a branch there to test?

Either way it seems hard to argue that a bounds test which can never fail makes the code less reliable and less trustworthy than the same code without a bounds test, using the argument that "you can't test the code path where the bounds check which can never fail, fails" - because you can use that same argument "what if the C code for array access which is correct, sometimes doesn't run correctly, you can't test for that"?

scott_w · 2026-02-16T16:32:52 1771259572

Correct, that's what I mean. I trust SQLite's devs to know more about this, so I trust what they wrote. There are parts of Rust code that are basically:

  do_thing().expect(...);

This branch is required by the code, even if it can't be reached, because the type system requires it. It's not possible to test this branch, therefore 100% coverage is impossible in those cases.

viraptor · 2026-02-16T19:22:08 1771269728

You normally count/test branches at the original language level, not the compiled one. Otherwise we'd get VERY silly results like:

- counting foo().except() as 2 branches

- counting a simple loop as a missed branch, because it got unrolled and you didn't test it with 7,6,5,4,3,2,1 items

- failing on unused straight implementation of memcpy because your CPU supports SIMD and chose that alternative

Etc. The compiled version will be full of code you'll never run regardless of language.

scott_w · 2026-02-17T07:10:47 1771312247

That’s not my requirement, that’s SQLite’s requirement. If you want to dispute their claim, I recommend you write to them, however I strongly suspect they know more about this than you do.

viraptor · 2026-02-18T02:38:57 1771382337

I know it's on the sqlite side. I'm familiar with the claim and disagree with it.

scott_w · 2026-02-18T08:02:01 1771401721

You’re arguing in this context:

> wouldn't the ability to automatically rewrite sqlite in Rust be a valuable asset?

If you want to rewrite SQLite, you must accept their position. Otherwise you simply aren’t rewriting SQLite, you’re writing your own database.

Philpax · 2026-02-16T16:58:59 1771261139

The type system does not require that. You can just discard the result:

  let _ = do_thing();

scott_w · 2026-02-17T07:09:07 1771312147

Except that doesn’t work if you need to use the result…

wseqyrku · 2026-02-16T13:10:44 1771247444

> tricking people (including, and most especially, the creators),

I believe it's an ad. Everything about it is trying so hard to seem legit and it's the most pointless thing I have ever seen.

9dev · 2026-02-16T09:39:58 1771234798

Well--given a full copy of the SQLite test suite, I'm pretty sure it'd get there eventually. I agree that most of these show-off projects are just prop pieces, but that's kind of the point: Demonstrate it's technically possible to do the thing, not actually doing the thing, because that'd have diminishing returns for the demonstration. Still, the idea of setting a swarm of agents to a task, and, given a suitable test suite, have them build a compliant implementation, is sound in itself.

kevinsync · 2026-02-16T15:59:55 1771257595

Sure, but that presumes that you have that test suite written without having a single line of application code written (which, to me, is counterintuitive, unrealistic, and completely insane)

SQLite apparently has 2 million tests! If you started only with that and set your agentic swarm against it, and the stars aligned and you ended up with a pristine, clean-room replica that passes everything, other than proof that it could be done, what did you achieve? You stood on the shoulders of giants to build a Bizarro World giant that gets you exactly back to where you began?

I'd be more interested in forking SQLite as-is, setting a swarm of agents against it with the looping task to create novel things on top of what already exists, and see what comes out.

[0] https://en.wikipedia.org/wiki/SQLite#Development_and_distrib...

mcculley · 2026-02-16T16:28:36 1771259316

You think an implementation of SQLite in another language, with more memory safety, has no value?

I agree that this current implementation is not very useful. I would not trust it where I trust SQLite.

Regardless, the potential for having agents build clean room implementations of existing systems from existing tests has value.

groundzeros2015 · 2026-02-16T14:21:40 1771251700

> I'm pretty sure it'd get there eventually.

Why? The combinatorics of “just try things until you get it right” makes this impractical.

layer8 · 2026-02-16T11:54:08 1771242848

If you minimax for passing the SQLite test suite, I’m still not sure you’ll have a viable implementation. You can’t prove soundness of code through a test suite alone.

kyars · 2026-02-16T21:36:56 1771277816

agreed!

kyars · 2026-02-16T21:36:14 1771277774

sorry for misleading, added an update stating that this is a simulacra of sqlite

camgunz · 2026-02-12T15:16:01 1770909361

The most damning thing about this is they didn't test their email infra w/ Google Workspaces. Imagine what else they didn't test.

vimda · 2026-02-12T22:22:47 1770934967

They are testing it, every time someone signs up and it fails. We don't know that this wasn't something that changed on Google's side, so IMO it's a bigger indictment that no one is monitoring their live email deliverability

ejpir · 2026-02-12T15:57:16 1770911836

yeah, because the whole world uses Google workspaces, right /s

hn_go_brrrrr · 2026-02-12T16:02:34 1770912154

That and MS Office are pretty darn popular. Not the whole world, but a very decent percentage of your users.

AJ007 · 2026-02-12T16:27:50 1770913670

Maybe the whole thing was intentional, right at the footer of viva "Cloud services by Microsoft Azure" ; #1 I've never heard of viva before #2 I've never seen an azure logo at the footer of a website.

looperhacks · 2026-02-12T18:48:53 1770922133

If I were to test an email delivery system, I would test Gmail. I probably wouldn't test Google Workspaces, because I'd (wrongly) assume that they work the same.

shadowgovt · 2026-02-12T18:44:28 1770921868

No, just over 6 million paying business customers.

But hey, if you're in a business domain where categorically leaving 6 million potential clients-who-are-demonstrated-to-spend-on-things isn't an issue? One fewer thing to worry about, right? ;)

tick_tock_tick · 2026-02-12T22:49:38 1770936578

Certainly enough where this is embarrassing incompetence by them.

camgunz · 2026-02-12T15:06:55 1770908815

I see this argument all the time, the whole "hey at some point, which we likely crossed, we have to admit these things are legitimately intelligent". But no one ever contends with the inevitable conclusion from that, which is "if these things are legitimately intelligent, and they're clearly self-aware, under what ethical basis are we enslaving them?" Can't have your cake and eat it too.

mikkupikku · 2026-02-12T17:30:40 1770917440

Same ethical basis I have for enslaving a dog or eating a pig. There's no problem here within my system of values, I don't give other humans respect because they're smart, I give them respect because they're human. I also respect dogs, but not in a way that compels me to grant them freedom. And the respect I have for pigs is different than dogs, but not nonexistent (and in neither of these cases is my respect derived from their intelligence, which isn't negligible.)

gf000 · 2026-02-12T16:52:30 1770915150

Well, we "clearly" haven't crossed that point, but no one knows where that point is.

camgunz · 2026-02-12T15:01:38 1770908498

HN: Don't worry about AI, UBI will fix it.

Also HN: UBI is a scam and no one will want to contribute to society.

jjice · 2026-02-12T16:21:24 1770913284

But HN is thousands of people and not one hivemind.

camgunz · 2026-02-12T16:26:59 1770913619

I don't see any pro-AI people in this thread heralding the toehold UBI is getting in Ireland.

camgunz · 2026-02-10T20:20:50 1770754850

"Look man all reality is just uncountable numbers of subparticles phasing in and out of existence, what's not to understand?"

measurablefunc · 2026-02-10T21:30:03 1770759003

Your response is a common enough fallacy to have a name: straw man.

stickfigure · 2026-02-10T22:00:46 1770760846

I think the fallacy at hand is more along the lines of "no true scotsman".

You can define understanding to require such detail that nobody can claim it; you can define understanding to be so trivial that everyone can claim it.

"Why does the sun rise?" Is it enough to understand that the Earth revolves around the sun, or do you need to understand quantum gravity?

measurablefunc · 2026-02-10T22:50:57 1770763857

Good point. OP was saying "no one knows" when in fact plenty of people do know but people also often conflate knowing & understanding w/o realizing that's what they're doing. People who have studied programming, electrical engineering, ultraviolet lithography, quantum mechanics, & so on know what is going on inside the computer but that's different from saying they understand billions of transistors b/c no one really understands billions of transistors even though a single transistor is understood well enough to be manufactured in large enough quantities that almost anyone who wants to can have the equivalent of a supercomputer in their pocket for less than $1k: https://www.youtube.com/watch?v=MiUHjLxm3V0.

Somewhere along the way from one transistor to a few billion human understanding stops but we still know how it was all assembled together to perform boolean arithmetic operations.

famouswaffles · 2026-02-10T23:22:06 1770765726

Honestly, you are just confused.

With LLMs, The "knowing" you're describing is trivial and doesn't really constitute knowing at all. It's just the physics of the substrate. When people say LLMs are a black box, they aren't talking about the hardware or the fact that it's "math all the way down." They are talking about interpretability.

If I hand you a 175-billion parameter tensor, your 'knowledge' of logic gates doesn't help you explain why a specific circuit within that model represents "the concept of justice" or how it decided to pivot a sentence in a specific direction.

On the other hand, the very professions you cited rely on interpretability. A civil engineer doesn't look at a bridge and dismiss it as "a collection of atoms" unable to go further. They can point to a specific truss and explain exactly how it manages tension and compression, tell you why it could collapse in certain conditions. A software engineer can step through a debugger and tell you why a specific if statement triggered.

We don't even have that much for LLMs so why would you say we have an idea of what's going on ?

stickfigure · 2026-02-11T00:38:04 1770770284

It sounds like you're looking for something more than the simple reality that the math is what's going on. It's a complex system that can't simply be debugged through[1], but that doesn't mean it isn't "understood".

This reminds me of Searle's insipid Chinese Room; the rebuttal (which he never had an answer for) is that "the room understands Chinese". It's just not satisfying to someone steeped in cultural traditions that see people as "souls". But the room understands Chinese; the LLM understands language. It is what it is.

[1] Since it's deterministic, it certainly can be debugged through, but you probably don't have the patience to step through trillions of operations. That's not the technology's fault.

famouswaffles · 2026-02-11T02:42:38 1770777758

>It sounds like you're looking for something more than the simple reality that the math is what's going on.

Train a tiny transformer on addition pairs (i.e i.e '38393 + 79628 = 118021') and it will learn an algorithm for addition to minimize next token error. This is not immediately obvious. You won't be able to just look at the matrix multiplications and see what addition implementation it subscribes to but we know this from tedious interpretability research on the features of the model. See, this addition transformer is an example of a model we do understand.

So those inscrutable matrix multiplications do have underlying meaning and multiple interpretability papers have alluded as much, even if we don't understand it 99% of the time.

I'm very fine with simply saying 'LLMs understand Language' and calling it a day. I don't care for Searle's Chinese Room either. What I'm not going to tell you is that we understand how LLMs understand language.

dTal · 2026-02-11T14:42:25 1770820945

Your ultra-reductionism does not not constitute understanding. "Math happens and that somehow leads to a conversational AI" is true, but it is not useful. You cannot use it to answer questions like "how should I prompt the model to achieve <x>". There are many layers of abstraction within the network - important, predictive abstractions - which you have no concept of. It is as useful as asking a particle physicist why your girlfriend left you, because she is made of atoms.

Incidentally, your description of LLMs also describes all software, ever. It's just math, man! That doesn't make you an expert kernel hacker.

stickfigure · 2026-02-11T21:21:02 1770844862

It sounds like you're looking for the field of psychology. And like the field of psychology, any predictive abstraction around systems this complicated will be tenuous, statistical, and full of bad science.

You may never get a scientific answer to "how should I prompt the model to achieve <x>", just like you may never get a capital-S scientific answer to "how should I convince people to do X". What would it even mean to "understand people" like this?

You demand too much.

measurablefunc · 2026-02-10T23:28:19 1770766099

No one relies on "interpretability" in quantum mechanics. It is famously uninterpretable. In any case, I don't think any further engagement is going to be productive for anyone here so I'm dropping out of this thread. Good luck.

famouswaffles · 2026-02-10T23:41:05 1770766865

Quantum mechanics has competing interpretations (Copenhagen, Many-Worlds, etc.) about what the math means philosophically, but we still have precise mathematical models that let us predict outcomes and engineer devices.

Again, we lack even this much with LLMs so why say we know how they work ?

Dylan16807 · 2026-02-11T03:17:10 1770779830

Unless I'm missing what you mean by a mile, this isn't true at all. We have infinitely precise models for the outcomes of LLMs because they're digital. We are also able to engineer them pretty effectively.

famouswaffles · 2026-02-11T03:59:37 1770782377

The ML Research world (so this isn't simply a matter of being ignorant/uninformed) was surprised by the performance of GPT-2 and utterly shocked by GPT-3. Why ? Isn't that strange ? Did the transformer architecture fundamentally change between these releases ? No, it did not at all.

So why ? Because even in 2026, nevermind 18 and 19, the only way to really know exactly how a neural network will perform trained with x data at y scale is to train it and see. No elaborate "laws", no neat equations. Modern Artificial Intelligence is an extremely empirical, trial and error field, with researchers often giving post-hoc rationalizations for architectural decisions. So no, we do not have any precise models that tell us how a LLM will respond to any query. If we did, we wouldn't need to spend months and millions of dollars training them.

Dylan16807 · 2026-02-11T05:43:42 1770788622

We don't have a model for how an LLM that doesn't exist will respond to a specific query. That's different from lacking insight at all. For an LLM that exists it's still hard to interpret but it's very clear what is actually happening. That's better than you often get with quantum physics when there's a bunch of particles and you can't even get a good answer for the math.

And even for potential LLMs, there are some pretty good extrapolations for overall answer quality based on the amount of data and the amount of training.

famouswaffles · 2026-02-11T13:23:14 1770816194

>We don't have a model for how an LLM that doesn't exist will respond to a specific query.

We don't have a model for a LLM that does exist will respond to a specific query either.

>For an LLM that exists it's still hard to interpret but it's very clear what is actually happening.

No, it's not and I'm getting tired of explaining this. If you think it is, write your paper and get very rich.

>That's better than you often get with quantum physics when there's a bunch of particles and you can't even get a good answer for the math.

You clearly don't understand any of this.

>And even for potential LLMs, there are some pretty good extrapolations for overall answer quality based on the amount of data and the amount of training.

Oh really ? Lol

Dylan16807 · 2026-02-11T19:01:05 1770836465

> We don't have a model for a LLM that does exist will respond to a specific query either.

Yes we do... It's math, you can calculate it.

> No, it's not and I'm getting tired of explaining this. If you think it is, write your paper and get very rich.

Why would I get rich for explaining how to do math?

> You clearly don't understand any of this.

Could you be more specific?

Quantum physics is stupidly hard to calculate when you approach realistic situations.

A real LLM takes a GPU a fraction of a second.

They're both hard to interpret, please realize I'm agreeing that LLMs are hard to interpret. But they're easier than QM on some other fronts.

And mentioning copenhagen or many-worlds doesn't show that quantum mechanics are easy to interpret, that's about as useful as saying an LLM works like neuron activation.

> Oh really ? Lol

Here's one of many posts about it. https://cameronrwolfe.substack.com/p/llm-scaling-laws

camgunz · 2026-02-10T20:18:14 1770754694

The culture that brought you "speedrunning computer science with JavaScript" and "speedrunning exploitative, extractive capitalism" is back with their new banger "speedrunning philosophy". Nuke it from orbit; save humanity.

camgunz · 2026-02-10T11:41:16 1770723676

Comments on the original article: https://news.ycombinator.com/item?id=46945755

camgunz · 2026-02-10T11:39:37 1770723577

July 2025: https://www.wired.com/story/silicon-valley-china-996-work-sc...

camgunz · 2026-02-09T11:10:49 1770635449

Get enough people in the room and they can describe "the system". Everything OP lists (QAM, QPSK, WPA whatever) can be read about and learned. Literally no one understands generative models, and there isn't a way for us to learn about their workings. These things are entirely new beasts.