LetsGetTechnicl's comments | Hacker News

Why the fuck would we ever want an AI-first society

What we want doesn't matter, what they want does.

>The "Moloch problem" or "Moloch trap" describes a game-theoretic scenario where individual agents, pursuing rational self-interest or short-term success, engage in competition that leads to collectively disastrous outcomes. It represents a coordination failure where the system forces participants to sacrifice long-term sustainability or ethical values for immediate survival, creating a "race to the bottom".

https://www.slatestarcodexabridged.com/Meditations-On-Moloch


  It could also end up freeing us from every commercial dependency we have. Write your own OS, your own mail app, design your own machinery to farm with.

Lmfao, LLMs can barely count rows in a spreadsheet accurately; this is just batshit crazy.

edit: also, the solution here isn't that everyone writes their own software (based on open-source code available on the internet, no doubt). We just use that open-source software, and people learn to code and improve it themselves instead of off-loading it to a machine.


This is one of those things where people who don't know how to use tools think they're bad, like people who would write whole sentences into search engines in the 90s.

LLMs are bad at counting the number of rows in a spreadsheet. LLMs are great at "write a Python script that counts the number of rows in this spreadsheet".
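For illustration, a minimal sketch of the kind of script an LLM tends to get right (assuming the spreadsheet is exported as CSV with a header row; `count_rows` is a hypothetical helper, not from any comment above):

```python
import csv
import io

def count_rows(csv_text: str) -> int:
    """Count data rows in a CSV, excluding the header row."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    return max(len(rows) - 1, 0)

sample = "name,qty\napples,3\npears,5\n"
print(count_rows(sample))  # → 2
```

The point being: counting is delegated to deterministic code, so the LLM only has to produce a short, well-trodden program rather than do arithmetic token by token.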


Do you think asking any LLM in the next 100 years to "write a Python script that generates an OS" will work?

Yes, for some definition of OS. It could build a DOS-like or other TUI, or a list of installed apps that you pick from. Devices are built on specifications, so that's all possible. System API it could define and refine as it goes. General utilities like file management are basically a list of objects with actions attached. And so on... the more that is rigidly specified, the better it will do.

It'll fail miserably at making it human-friendly though, and attempt to pilfer existing popular designs. If it builds a GUI, it'd be a horrible mashup of Windows 7/8/10/11, various versions of OSX / MacOS, iOS, and Android. It won't 'get' the difference between desktop, laptop, mobile, or tablet. It might apply HIG rules, but that would end up with a clone at best.

In short, it would most likely make something technically passable but nightmarish to use.


Given 100 years though? 100 years ago we barely had vacuum tubes and airplanes.

Given a century, the only unreasonable part is oneshotting with no details, context, or follow-up questions. If you tell Linus Torvalds "write a Python script that generates an OS", his response won't be the script, it'll be "who are you and how did you get into my house".


Considering how simple "an OS" can be, yes, and in the 2020s.

If you're expecting OSX, AI will certainly be able to make that and better "in the next 100 years". Though perhaps not oneshotting off something as vague as "make an OS" without followup questions about target architecture and desired features.


Batshit crazy?

3 years ago LLMs couldn’t solve 7x8.

Now they’re building complex applications in one shot, solving previously unsolved math and science problems.

Heck, one company built a (prototype but functional) web browser

And you say it’s crazy that in the future it’ll be able to build a mail app or OS?


JFYI, LLMs still can't solve 7x8, and quite possibly never will. A more rudimentary text processor shoves that into a calculator for consumption by the LLM. There's a lot going on behind the scenes to keep the illusion flying, and that lot is a patchwork of conventional CS techniques that have nothing to do with cutting-edge research.

To many interested in actual AI research, LLMs are known as the very flawed and limiting technique they are, and the growing disconnect between that and the table stakes, where they are front and center at every AI shop, carrying a big chunk of global GDP on their backs, is annoying and borderline scary.


This is false. You can run a small open-weights model in ollama and check for yourself that it can multiply three-digit numbers correctly without having access to any tools. There's even quite a bit of interpretability research into how exactly LLMs multiply numbers under the hood. [1]

When an LLM does have access to an appropriate tool, it's trained to use the tool* instead of wasting hundreds of tokens on drudgery. If that's enough to make you think of them as a "flawed and limiting technique", consider instead evaluating them on capabilities there aren't any tools for, like theorem proving.

* Which, incidentally, I wouldn't describe as invoking a "more rudimentary text processor" - it's still the LLM that generates the text of the tool call.
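As a toy illustration of that footnote (the schema is hypothetical, loosely modeled on common function-calling APIs, not any specific vendor's), the "tool call" is itself just text the LLM emits:

```python
import json

# Hypothetical tool-call text an LLM might generate. The tool name and
# argument schema here are invented for illustration.
tool_call = {
    "name": "calculator",
    "arguments": {"expression": "7*8"},
}
print(json.dumps(tool_call))

# The surrounding runtime, not the LLM, parses this text, runs the tool
# (here, evaluating 7*8), and feeds the result back into the context.
```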

[1] https://transformer-circuits.pub/2025/attribution-graphs/bio...


> Heck, one company built a (prototype but functional) web browser

No, they built something which claimed to be a web browser but which didn't even compile. Every time someone says "look an LLM did this impressive sounding thing" it has turned out to be some kind of fraud. So yeah, the idea that these slop machines could build an OS is insane.


I personally observe AI creating phenomenally good code, much better than I can write, at insane speed, with minimal oversight. And today's AI is the worst we will ever have.

Progress in AI can easily be measured by the speed at which the goalposts move: from "it can't count" to "yeah, but the entire browser it wrote didn't compile in the CI pipeline".


I have absolutely no need and yet I want ittttt

  [...] since I work at an AI lab and stand to gain a great deal if AI follows through on its economic promise.
And there it is.

> added that user activity on the site will not be tracked.

Oh I'm sure


That's what happens when people outsource their mental capacity to a machine


Using "low cost" and LLMs in the same sentence is kind of funny to me.


Is this not a recipe for model collapse?


No, because in the process they are describing, the AIs would only post things they have found to fix their problem (i.e., it compiles and passes tests), so the contents posted in that "AI StackOverflow" would be grounded in external reality in some way. It wouldn't be the unchecked recursive loop that characterizes model collapse.

Model collapse could still happen here if some evil actor were tasked with posting made-up information or trash, though.


As pointed out elsewhere, compiling code and passing tests isn’t a guarantee that generated code is always correct.

So even “non Chinese trained models” will get it wrong.


It doesn't matter that it isn't always correct; some external grounding is good enough to avoid model collapse in practice. Otherwise training coding agents with RL wouldn't work at all.


And how do you verify that external grounding?


What precisely do you mean by external grounding? Do you mean the laws of physics still apply?


I mean it in the sense that tokens that pass some external filter (even if that filter isn't perfect) are from a very different probability distribution than those that an LLM generates indiscriminately. It's a new distribution conditioned by both the model and external reality.

Model collapse happens in the case where you train your model indefinitely with its own output, leading to reinforcing the biases that were originally picked up by the model. By repeating this process but adding a "grounding" step, you avoid training repeatedly on the same distribution. Some biases may end up being reinforced still, but it's a very different setting. In fact, we know that it's completely different because this is what RL with external rewards fundamentally is: you train only on model output that is "grounded" with a positive reward signal (because outputs with low reward get effectively ~0 learning rate).
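A toy illustration of that last point (everything here is invented for the sketch: the "model" is just a biased Gaussian sampler, and the external reward is distance to a ground-truth target). Keeping only well-rewarded samples before each update pulls the model toward reality instead of reinforcing its own bias:

```python
import random

def generate(model_bias: float) -> float:
    """Toy 'model output': a value drawn around the model's current bias."""
    return random.gauss(model_bias, 1.0)

def reward(sample: float, target: float = 0.0) -> float:
    """External grounding signal: higher when the sample is near the target."""
    return -abs(sample - target)

def filtered_training_step(model_bias: float, n: int = 200, keep: int = 20) -> float:
    """Sample, keep only the best-rewarded outputs, move the model to their mean."""
    samples = [generate(model_bias) for _ in range(n)]
    kept = sorted(samples, key=reward, reverse=True)[:keep]
    return sum(kept) / len(kept)

random.seed(0)
bias = 5.0  # the model starts far from the externally grounded target of 0
for _ in range(10):
    bias = filtered_training_step(bias)
print(round(bias, 2))  # pulled toward 0 by the reward filter
```

Dropping the `reward` filter (keeping all samples) would instead preserve the model's own distribution, which is the unchecked self-training loop that leads to collapse.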


Oh interesting. I guess that means you need to deliberately select a grounding source with a different distribution. What sort of method would you use to compare distributions for this use case? Is there an equivalent to an F-test for high dimensional bit vectors?


Should've called it Slopbook


Isn't the whole issue here that, because the agent trusted Anthropic IPs/URLs, it was able to upload data to Claude, just to a different user's storage?

