Ok, I’ll bite: how is that different from humans?

strken · 2026-03-07T15:47:21 1772898441

Human behaviour is goal-directed because humans have executive function. When you turn off executive function by going to sleep, your brain will spit out dreams. Dream logic is famous for being plausible but unhinged.

I have the feeling that LLMs are effectively running on dream logic, and everything we've done to make them reason properly is insufficient to bring them up to human level.

seanmcdirmid · 2026-03-07T16:07:36 1772899656

Isn’t a modern LLM with thinking tokens fairly goal directed? But yes, we hallucinate in our sleep while LLMs will hallucinate details if the prompt isn’t grounded enough.

zarzavat · 2026-03-07T16:17:22 1772900242

The thing about dream logic is that it can be a completely rational series of steps, but there's usually a giant plot hole which you only realise the second you wake up.

This definitely matches my experience of talking to AI agents and chatbots. They can be extremely knowledgeable on arcane matters yet need to have obvious (to humans) assumptions pointed out to them, since they only have book smarts and not street smarts.

tovej · 2026-03-07T16:16:30 1772900190

Assuming this is not a rhetorical question: no, it is not. The only "goal" is to maximize plausibility.

seanmcdirmid · 2026-03-07T16:30:53 1772901053

Again, how is that different from humans? I’m not going around trying to prove my code correct when I write it manually.

tovej · 2026-03-07T22:10:17 1772921417

I write code to solve a problem. Not code that looks like it solves the problem if a non-technical client squints at it.

And if you don't prove your code, do you not design at all then? Do you never draw state diagrams?

Every design is an informal proof of the solution. I rarely write formal proofs. Most of the time I write down enough for myself to be convinced that the design solves the problem.

seanmcdirmid · 2026-03-08T00:32:19 1772929939

Yes, you can dedicate extra tokens to drawing state diagrams; the LLM can actually do that. If you don't have it generate one or more design documents before you write code, you are doing it wrong. I still don't get how that is different from what humans are doing.

> Most of the time I write down enough for myself to be convinced that the design solves the problem.

Again, why do you assume we aren't doing the same thing with LLMs?

1. Spec given

2. Ask LLM to write a bunch of design documents based off of spec

3. Ask LLM to identify edge cases

4. Ask LLM to divide the edge cases into a test plan involving N tests

5. Ask LLM to write tests

6. Ask LLM to write commented code

7. Ask LLM to run the tests on the code and, for each failing test, determine whether the test or the code is wrong; go back to the appropriate step to fix the test and/or code.
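As a rough sketch of the steps above, assuming a generic `llm` callable that maps a prompt to text and a `run_tests` callable that returns the names of failing tests. Both are hypothetical placeholders, not any particular vendor's API, and the prompts are only illustrative:

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical interfaces: plug in whatever model client and test harness you use.
LLM = Callable[[str], str]                      # prompt -> completion
TestRunner = Callable[[str, str], list[str]]    # (code, tests) -> failing test names


@dataclass
class Artifacts:
    design: str = ""
    edge_cases: str = ""
    test_plan: str = ""
    tests: str = ""
    code: str = ""
    rounds: int = 0
    passed: bool = False


def run_workflow(spec: str, llm: LLM, run_tests: TestRunner,
                 max_rounds: int = 3) -> Artifacts:
    a = Artifacts()
    # Steps 2-6: each artifact is generated from the one before it.
    a.design = llm(f"Write design documents for this spec:\n{spec}")
    a.edge_cases = llm(f"Identify edge cases for this design:\n{a.design}")
    a.test_plan = llm(f"Divide these edge cases into a test plan:\n{a.edge_cases}")
    a.tests = llm(f"Write tests for this test plan:\n{a.test_plan}")
    a.code = llm(f"Write commented code for this design:\n{a.design}")

    # Step 7: run the tests, decide whether test or code is at fault, fix, repeat.
    for round_no in range(1, max_rounds + 1):
        a.rounds = round_no
        failing = run_tests(a.code, a.tests)
        if not failing:
            a.passed = True
            break
        verdict = llm(
            f"Tests {failing} fail. Answer 'test' or 'code': which one is wrong?\n"
            f"Design:\n{a.design}\nTests:\n{a.tests}\nCode:\n{a.code}"
        )
        if verdict.strip().lower() == "test":
            a.tests = llm(f"Fix the failing tests {failing}:\n{a.tests}")
        else:
            a.code = llm(f"Fix the code so that {failing} pass:\n{a.code}")
    return a
```

The `max_rounds` cap is my own addition so the loop terminates if the tests never go green; a human still reviews the resulting artifacts either way.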

Whenever I hear someone here on HN imply that the only way to code with an AI is via vibe coding I just die a bit more inside.

tovej · 2026-03-08T08:04:21 1772957061

You completely misunderstood what I wrote.

It was a response to you saying: "I'm not going around trying to prove my code correct when I write it manually."

How did you manage to forget what you wrote previously?

Also, in this post you are now suddenly taking the exact opposite position, contradicting your previous point.

seanmcdirmid · 2026-03-08T17:49:22 1772992162

I did not contradict my previous point. But now I’m confused about how you think we use LLMs to write code. You made it sound like we just get it to dump out code without any process in between.

tovej · 2026-03-09T21:16:45 1773091005

You most definitely did contradict yourself. First you said you don't prove anything about the code you write, then you said you do. But that's fine. We can agree to disagree.

And I have not made any statements about how you use LLMs, only about how the LLMs produce code. All statements about how you use LLMs have been made by you, not me. I haven't discussed it since it is not related to the arguments, which are: 1) whether LLMs are goal-oriented and 2) whether humans and LLMs both merely maximize plausibility when writing/generating code.

Both claims that you made. Note, however, that if you are correct in your own points, then you should indeed be able to "just dump out code without any process in between". So if anyone is claiming this, it's you.

abdullahkhalids · 2026-03-08T00:15:30 1772928930

You are correct. However, humans sometimes do write stuff that "looks like it solves the problem". A prime example of this is a student who doesn't know how to answer a question. So they make up a plausible sounding answer.

As an exam grader, you can easily tell when a student has the mindset of "solving a problem" but made a mistake, and when they had the mindset of "looks like it solves the problem" and just wrote some stuff.

tsunamifury · 2026-03-07T15:59:28 1772899168

It’s amazing how much you get wrong here. LLM attention layers are stacked goal functions.

What they lack is multi-turn, long-walk goal functions, which is being solved to some degree by agents.

strken · 2026-03-07T23:37:09 1772926629

I don't argue that thinking and attention are missing. I argue that they are trying to do the job of human executive function but aren't as good at it.

nemo44x · 2026-03-07T15:58:43 1772899123

LLMs are literally goal machines. It’s all they do. So it’s important that you input specific goals for them to work towards. It’s also why logically you want to break the problem into many small problems with concrete goals.

andai · 2026-03-07T16:00:38 1772899238

Do you only mean instruct-tuned LLMs? Or the base (pretrained) model too?

nemo44x · 2026-03-07T17:03:33 1772903013

The entire system and the agent loop allows for more complex goal resolution. The LLM models language (obviously) and language is goal oriented so it models goal oriented language. It’s an emergent feature of the system.

satvikpendem · 2026-03-07T15:56:46 1772899006

A prompt for an LLM is also a goal direction and it'll produce code towards that goal. In the end, it's the human directing it, and the AI is a tool whose code needs review, same as it always has been.

basch · 2026-03-07T16:54:10 1772902450

I'd argue humans have some sort of parallelism going on that machines don't yet. Thoughts happening at multiple abstraction levels simultaneously. As I am doing something, I am also running the continuous improvement cycle in my head, at all four steps concurrently. Is this working, is this the right direction, does this validate?

You could build layers and layers of LLMs watching the output of each other's thoughts and offering different commentary as they go, folding all the thoughts back together at the end. Currently, a group of agents acts more like a discussion than something somewhat omnipotent or omnitemporal.

whoamii · 2026-03-07T15:56:43 1772899003

Some of my best code comes from my dreams though.

spiderfarmer · 2026-03-07T15:57:09 1772899029

And yet LLMs are incredibly useful as they are right now.

strken · 2026-03-07T23:40:54 1772926854

And yet they're going to be better in a decade, which will require understanding why they aren't perfect today.

apical_dendrite · 2026-03-07T15:49:54 1772898594

The volume is different. Someone submitted a PR this week that was 3800 lines of shell script. Most of it was crap and none of it should have been in shell script. He's submitting PRs with thousands of lines of code every day. He has no idea how any of it actually works, and it completely overwhelms my ability to review.

Sure, he could have submitted an ill-considered 3800-line PR five years ago, but it would have taken him at least a week and there probably would have been opportunities to submit smaller chunks along the way or discuss the approach.

switchbak · 2026-03-07T16:19:11 1772900351

It’s harder when the person doing what you describe has the ability to have you fired. Power asymmetry + irresponsible AI use + no accountability = a recipe for a code base going right to hell in a few months.

I think we’re going to see a lot of the systems we depend on fail a lot more often. You’d often see an ATM or flight status screen have a BSOD - I think we’re going to see that kind of thing everywhere soon.

satvikpendem · 2026-03-07T15:54:04 1772898844

Just block that user, that seems to be the way.

somewhereoutth · 2026-03-07T15:50:03 1772898603

Humans have a 'world model' beyond the syntax - for code, an idea of what the code should do and how it does it. Of course, some humans are better than others at this, they are recognized as good programmers.

satvikpendem · 2026-03-07T15:54:29 1772898869

Papers show that AI also has a world model, so I don't think that's the right distinction.

tovej · 2026-03-07T16:20:40 1772900440

Could you please cite these papers. If by AI you mean LLMs, that is not supported by what I know. If you mean a theoretical world-model-based AI, that's just a tautological statement.

satvikpendem · 2026-03-07T16:31:03 1772901063

https://arxiv.org/abs/2305.11169

https://arxiv.org/abs/2506.02996

salawat · 2026-03-07T17:38:43 1772905123

Their world model is completely a byproduct of language though, not experience. Furthermore, they by deliberate design do not maintain any form of self-recognition or narrative tracking, which is the necessary substrate for developing validating experience. The world model of an LLM is still a map. Not the territory. Even though ours has some of the same qualities arguably, the identity we carry with us and our self-narrative are incredibly powerful in terms of allowing us to maintain alignment with the world as she is without munging it up quite as badly as LLMs seem prone to.

satvikpendem · 2026-03-07T17:42:14 1772905334

How do you know ours is any different, that we are not in a simulation or a solipsistic scenario? The truth is that one cannot know, it's a philosophical quandary that's been debated for millennia.

topaz0 · 2026-03-07T17:56:03 1772906163

It is absolutely obvious how different it is from interacting with any LLM about the ways that it is wrong.

satvikpendem · 2026-03-07T19:34:38 1772912078

Nope, appeal to obviousness is not a sound argument. There are many things people thought were obvious that were wrong.

topaz0 · 2026-03-07T19:36:54 1772912214

It wasn't an argument. There isn't much point in going to a lot of trouble to make an argument to someone so clearly determined to ignore the truth. It is nevertheless true.

satvikpendem · 2026-03-07T19:46:14 1772912774

Just saying something is true doesn't make it so. Truth requires justification, and if you can't provide that, then there's no reason to believe it's true. For someone making a claim, the onus is on them to provide evidence.

Otherwise I'll just say I'm right and you're wrong, after all, that's what you're saying.

salawat · 2026-03-08T22:03:23 1773007403

Simple. I have two sets of data I can pull from to validate a claim an LLM makes. I have the linguistic corpora we produce (artificial memory, analogous to latent space built by an LLM). You are correct in that this modality is shared. I also, however, have internal self-narrative and experiential state that is non-linguistic, but sensory/perception driven. An LLM can try to convince me that a bunch of mathematicians would come up with a system that requires one to make many copies of the same bitwise representation of a block for loading by the execution framework due to munging of the latent space via quantization. However, I have recollections of my time amongst mathematicians and theorists. I can replay my lived perceptions of those times, and analyze and extract new meaning from them as my neural hardware evolves. Therefore, when that claim is made, my validation of the world as she is comes to a screeching halt to the tune of a recollection of a calculus class where the entire point is to pound into you the utility of fungibility of mathematical representations (substitution), and a further connection to optimization (replace entire cluster of an equation with a letter to process other things first and deal with the internal details later). That also synthesizes to the principle that mathematicians are both lazy and clever. Alias that bitch, and moving right along. LLMs don't have that without you deliberately injecting that mechanism into their context. They'll in fact just run off the rails.

Now, could an equivalent process be modelled at some point? Probably. It'd be a conscious decision to do so on our part, and given fears over the AI Alignment quandary, it seems a rather fraught direction to carelessly proceed.

tovej · 2026-03-07T22:06:15 1772921175

One conference proceeding paper and one preprint, about LLMs encoding either relative geometric information of objects or simple 2D paths.

One of the papers calls this "programming language semantics", but it is using a 2D grid navigation DSL. The semantics of that language are nothing like actual programming language semantics.

These are not the same as the concept being discussed here, a human "world model" of a computer system, through which to interpret the semantics of a program.

satvikpendem · 2026-03-07T22:33:20 1772922800

Well I didn't find any papers off the bat for code world models but if they can create a world model for the task given, such as geometric manipulation, I don't see why they wouldn't in terms of code.

tovej · 2026-03-08T07:57:40 1772956660

Because a "world model" for relative positions in space is just a partial ordering of points.

That's not really a world model.

detourdog · 2026-03-07T16:27:40 1772900860

What surprises me about the current development environment is the acceleration of technical debt. When I was developing my skills, the nagging feeling that I didn't quite understand the technology was a big dark cloud. I felt this cloud was technical debt. This was always what I was working against.

I see current expectations that technical debt doesn't matter. The current tools embrace superficial understanding. These tools paper over the debt. There is no need for deeper understanding of the problem or solution. The tools take care of it behind the scenes.

wood_spirit · 2026-03-07T15:48:57 1772898537

It’s not. LLMs are just averaging their internet snapshot, after all.

But people want an AI that is objective and right. HN is where people who know the distinction hang out, but it’s not what the layperson thinks they are getting when they use this miraculous super hyped tool that everybody is raving about.

mrwh · 2026-03-07T16:06:24 1772899584

The etiquette, even at the bigtech place I work, has changed so quickly. The idea that it would be _embarrassing_ to send a code review with obvious or even subtle errors is disappearing. More work is being put on the reviewer. Which might even be fine if we made the further change that _credit goes to the reviewer_. But if anything we're heading in the opposite direction, lines of code pumped out as the criterion of success. It's like a car company that touts how _much_ gas its cars use, not how little.

wood_spirit · 2026-03-07T16:11:25 1772899885

Review is usually delegated to an AI too

satvikpendem · 2026-03-07T15:53:36 1772898816

By now, a few years after ChatGPT released, I don't think anyone is thinking AI is objective and right, all users have seen at least one instance of hallucination and simply being wrong.

wood_spirit · 2026-03-07T15:59:23 1772899163

Sorry I can think of so many counter examples. I also detect a lot of “well it hallucinates about subject X (that the person knows well, so can spot the hallucination)” but continue to trust it on subjects Y and Z (which the person knows less well so can’t spot the hallucinations).

YMMV.

andai · 2026-03-07T16:02:21 1772899341

> Briefly stated, the Gell-Mann Amnesia effect works as follows. You open the newspaper to an article on some subject you know well. In Murray's case, physics. In mine, show business. You read the article and see the journalist has absolutely no understanding of either the facts or the issues. Often, the article is so wrong it actually presents the story backward, reversing cause and effect. I call these the "wet streets cause rain" stories. Paper's full of them. In any case, you read with exasperation or amusement the multiple errors in a story, and then turn the page to national or international affairs, and read with renewed interest as if the rest of the newspaper was somehow more accurate about far-off Palestine than it was about the story you just read. You turn the page, and forget what you know.

-Michael Crichton

satvikpendem · 2026-03-07T16:04:43 1772899483

Sure, Gell-Mann amnesia exists, but remember that its origin is actually human, in the form of newspaper writers. So, how can we trust humans the same way? In just the same way, AI cannot also be fully trusted.

wood_spirit · 2026-03-07T16:15:18 1772900118

The current way of doing AI cannot be trusted.

That doesn’t mean the future won’t herald a way of using what a transformer is good at - interfacing with humans - to translate to and interact with something that can be a lot more sound and objective.

satvikpendem · 2026-03-07T16:29:03 1772900943

You're falling into the extrapolation fallacy, there is no reason to think that the future won't have the same issues as today in terms of hallucinations.

And even if they were solved, how would that even work? The world is not sound and objective.

wood_spirit · 2026-03-07T17:07:24 1772903244

It’s a thought experiment. I am not saying I believe it will happen.

But right now there are lots of domains where current lauded success is in treating something objective - like code - as tokens for an llm.

We could instead explore using transformers to translate human languages to a symbology that can be reasoned about and applied eg to code.

It’s the talk of conferences. But whether it works better than what we have today, or whether it aligns with the incentives of the big players, is another matter.

seanmcdirmid · 2026-03-07T16:08:50 1772899730

There are a lot of binary thinkers on HN, but they shouldn’t make up a majority.

rDr4g0n · 2026-03-07T15:57:16 1772899036

It's much easier to fire an employee who produces low quality/effort work than to convince leadership to fire Claude.

satvikpendem · 2026-03-07T16:12:17 1772899937

You can fire employees who don't review code generated though, because ultimately it's their responsibility to own their code, whether they hand wrote it or an LLM did.

It seems to me that it's all a matter of company culture, as it has always been, not AI. Those that tolerate bad code will continue to tolerate it, at their peril.