Hacker News | rstuart4133's comments

> Our neighbors are exactly the ones to blame.

This is a bad road to go down.

If you start blaming people rather than processes, the obvious fix is to disenfranchise the people (or worse). If you blame the process and then change it to get a better outcome, everyone wins.

There is a lot of low-hanging bad fruit in how the USA runs its democracy. You allow gerrymandering, and you allow politicians to make it difficult for people to vote. The small voter turnout means fringe single-issue voters get a disproportionate say. You use first past the post, which means the candidate the majority thinks is the "least worst" may not get elected. (No voting system is perfect, but FPP is by far the worst.) Your political donation laws favour corporates, who by definition have no interest in voter welfare.


> Not learning from new input may be a feature.

Learning is OpenClaw's distinguishing feature. It has an array of plugins that let it talk to various services - but lots of LLM applications have that.

What makes it unique is its memory architecture. It saves everything it sees and does. Unlike an LLM context, its memory never overflows, and it can search for relevant bits on request. Its recall is nowhere near as good as the attention heads of an LLM, but apparently good enough to make a difference. Save + Recall == memory.
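To make "Save + Recall == memory" concrete, here is a minimal sketch of a save-everything, search-on-demand store in the spirit the comment describes. This is purely illustrative (the class, method names, and keyword-overlap scoring are my inventions, not OpenClaw's actual implementation):

```python
class Memory:
    # Hypothetical sketch: an append-only log plus naive search,
    # standing in for whatever retrieval OpenClaw really uses.
    def __init__(self):
        self.entries = []

    def save(self, text):
        # Append-only: unlike an LLM context window, this never overflows.
        self.entries.append(text)

    def recall(self, query, limit=3):
        # Naive keyword-overlap scoring stands in for real retrieval.
        q = set(query.lower().split())
        ranked = sorted(self.entries,
                        key=lambda e: len(q & set(e.lower().split())),
                        reverse=True)
        return ranked[:limit]

m = Memory()
m.save("fixed the build script for the docs site")
m.save("user prefers dark mode")
```

The point of the sketch is the trade-off: saving is trivial and unbounded, and all the cleverness (and weakness, relative to attention heads) lives in `recall`.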


> Context is the plateau. It's why RAM prices are spiking.

Yes, context is the plateau. But I don't think the bottleneck is RAM. The attention mechanism described in "Attention Is All You Need" is O(N^2), where N is the size of the context window. I can "feel" this in everyday usage: as the context window grows, model responses slow down, a lot. That's due to compute being serialised because there aren't enough resources to do it in parallel. The scarce resources are more likely compute and memory bandwidth than RAM.

If there is a breakthrough, I suspect it will be models turning the O(N^2) into O(N log N), which is generally how we speed things up in computer science. That in turn implies abstracting the knowledge in the context window into a hierarchical tree, so the attention mechanism only has to look across a single level of the tree at a time. That in turn requires the model to learn and memorise all these abstract concepts.
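A back-of-envelope comparison shows why that asymptotic change would matter so much. This is just arithmetic on the two cost formulas, not a claim about any real model's implementation:

```python
import math

# Rough work estimate: full pairwise attention versus a hypothetical
# O(N log N) hierarchical scheme, for growing context sizes N (in tokens).
for n in (1_000, 100_000, 1_000_000):
    quadratic = n * n            # every token scored against every token
    nlogn = n * math.log2(n)     # one comparison per tree level (hypothetical)
    print(f"N={n:>9,}  O(N^2)={quadratic:.1e}  O(N log N)={nlogn:.1e}  "
          f"speedup ~{quadratic / nlogn:,.0f}x")
```

At a 1M-token context the hypothetical hierarchical scheme does roughly fifty-thousand-fold less scoring work, which is the difference between "infeasible" and "routine".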

When models are trained they learn abstract concepts which they near-effortlessly retrieve, but they don't do that same type of learning while in use. I presume that's because it requires a huge amount of compute, repetition, and time. If only they could do what I do: go to sleep for 8 hours a day, dream about the same events using local compute, and learn from them. :D Maybe, one day, that will happen, but not any time soon.


> It seems to me to be a 'solution' to a non-existent problem.

Electronic voting has lots of advantages. It can be end-to-end verified, it can be a great help to disadvantaged people (blind, illiterate), it can deliver results faster, and it can probably be made more robust to retail-level tampering than paper ballots, provided a paper audit trail is kept (as all electronic voting systems designed with security in mind do).
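The flavour of end-to-end verification can be hinted at with a toy ballot receipt. This is a deliberately over-simplified illustration, not a real voting protocol; `ballot_id`, `choice`, and `nonce` are hypothetical names, and real schemes (mixnets, homomorphic tallying) involve far more machinery:

```python
import hashlib

def receipt(ballot_id: str, choice: str, nonce: str) -> str:
    # The voter keeps the nonce secret; only the hash is published on a
    # public bulletin board. The voter can recompute the hash to confirm
    # their ballot was recorded, while onlookers learn nothing about the
    # choice from the hash alone.
    return hashlib.sha256(f"{ballot_id}|{choice}|{nonce}".encode()).hexdigest()

published = receipt("B-1024", "candidate-a", "s3cret-nonce")
# Later, the voter recomputes with the same inputs and checks for a match.
assert published == receipt("B-1024", "candidate-a", "s3cret-nonce")
```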

The one question mark in my mind: the current US system resisted Trump's efforts to corrupt it pretty well. I think that was because of the inertia created by all the people involved in staffing the polling stations and counting and verifying the votes. Because the machinations of the electoral college are so visible, people doing the wrong thing remain at high risk for decades after Trump leaves the stage.

An automated electronic system could remove a lot of that human inertia. Efficiency is not an advantage in an electoral system; it's a weakness. You want as many people involved as possible.


> I fear things have changed and Trump'ism is here to stay.

The first time they got Trump lite. They didn't like him, and ditched him after one term.

The next president had the misfortune of getting elected during a worldwide recession. They liked the recession even less than they liked Trump lite, so they re-elected Trump lite.

But they've now found the person they re-elected was Trump heavy. He's doubled down on all the things they disliked about Trump lite, and will probably land them in a recession entirely of his own making.

I'd be stunned if Trump heavy or anybody following in his footsteps won an election for a decade or two. Memories will have to fade.


Guess we'll find out at the midterms if you're right; I certainly hope so. We're at the point where even the major policy differences between Democrats and Republicans are insubstantial compared to just getting people into office who aren't openly, mustache-twirlingly corrupt.

I came here to say a similar thing.

There would be no GPL if anybody could have cheaply and trivially reproduced the software for the printers and Lisp machines Stallman was denied access to. There is no reason to force someone to give you the source code if it takes no effort to reproduce it.

Mind you, that isn't what happened here. The effort involved in getting an LLM to write software comes from three things: writing a clear, unambiguous spec that also gives you a clean exported API; more clean, unambiguous specs for the APIs you use; and a test suite the LLM can use to verify it has implemented the exported API correctly. Dan got them all for free from the previous implementation, which I'm sure included good documentation. That means his contribution to this new code consisted of little more than pressing the button.

Sadly, if you wrote some GPL software with excellent documentation, a thorough test suite, a clean API, and an implementation using well-understood libraries, the cost of creating a cleanroom reproduction has indeed gone to near zero over the past 24 months. The GPL licence is irrelevant.

Welcome to the brave new world.

PS: SQLite keeping their test suite proprietary is looking like a prescient masterstroke.

PPS: The recent ruling that an API isn't copyrightable just took on a whole new dimension.


I swear it's the modern version of the "imagine a Beowulf cluster" comment on Slashdot.

They also got very tiresome long before fading away.


I'm in the hard disagree camp. I'm heading towards late 60s now, and have been writing software for all of my working life.

I am wondering how your conclusions came to be so different from mine. One possibility is that you only write "in the small" [0]. LLMs are at least as good as a human at turning out "web grade" software in the small. Claude CLI is as good an example of this sort of software as anything; every week or two I hit some small bug in it. This type of software doesn't need a "principal software engineer".

The second possibility is that you've never used an LLM to write software in the large. LLMs are amazing things, far better than humans at reading and untangling code. You can give them some obfuscated JavaScript and they regurgitate commented code with self-explanatory variable and function names in a minute or two. Give them a task and they will happily spit out thousands of lines of code in 10 minutes or so, which is astonishing.

Then you look closer, and it's spaghetti. The LLM has no trouble understanding the spaghetti of course, and if you are happy to trust your tests and let the LLM maintain the thing from then on, it's a workable approach.

Until, that is, it gets large enough for a few compile loops to exceed the LLM's context window; then it turns to crap. At that point you have to decompose it into modules the LLM can handle. It turns out decomposition is something current LLMs (and junior devs) are absolutely hopeless at. But it's what a principal software engineer is paid to do.

The spaghetti code is the symptom of that same deficiency. If they decide they need code to do X while working on concept Y, they will drop the code for X right beside the code for Y, borrowing state from Y as needed. The result is a highly interconnected ball of mud. Which the LLM will understand perfectly until it falls off the context window cliff, then all hope is lost.

While LLMs remain unable to implement a complex request as simple, isolated parts, a principal engineer's job is safe. In fact, given LLMs are accelerating things, my guess is that demand will only grow. But I suspect the LLM developers are working hard at removing this limitation.

[0] https://en.wikipedia.org/wiki/Programming_in_the_large_and_p...


> During our conversation, she emphasized repeatedly that Microsoft does not primarily view its offerings as consumer products.

Nowadays phones, tablets and game consoles are the consumer products. They currently outnumber consumer Windows desktops by about 6 or 7 to 1 [0]; 5 years ago it was about 3.5 to 1.

Microsoft doesn't seem to have much control over this. They are executing a pivot in response.

[0] https://learn.g2.com/operating-system-statistics#:~:text=Mic...


> Mac OS X's market share in the US climbed from 12.17% in December 2022 to 22.08% in May 2023. The market share was consistent for the next few months until it dropped to 14.03% in December 2023.

What? macOS doubled from December to the next May, then got cut in half by the following December? I'm skeptical. It also talks about approval ratings for OS X Lion, from 14 years ago. I think that site is powered by a set of dice.

I think your overall point is correct, but I’m doubting that reference as an accurate data source.


> I think my question at this point is what about this is specific to LLMs. Humans should not be forced to wade through reams of garbage output either.

Beware: I'm a complete AI layman. All this is from background reading of popular articles. It may well be wrong, and it's definitely out of date.

It has to do with how the attention heads work. Attention heads (the idea originated in the "Attention Is All You Need" paper, arguably the single most important AI paper to date) direct the LLM to the most relevant parts of the conversation. If you want a human analogue, it's your attention that is tracking the interesting points in a conversation.

The original attention heads output a relevance score for every pair of words in the context window. Thus in "Time flies like an arrow", it's the attention heads that spot that the word "Time" is very relevant to "arrow", but not to "flies". The implication is that an attention head does O(N^2) work. It does not scale well to large context windows.
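That pairwise scoring can be sketched in a few lines of plain Python. This is a toy rendition of scaled dot-product attention over made-up 2-d vectors, not any real model's implementation:

```python
import math

def attention_weights(queries, keys):
    # Score every query token against every key token: N tokens produce
    # an N x N score matrix, which is where the O(N^2) cost comes from.
    d = len(queries[0])
    scores = [[sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
               for k in keys] for q in queries]
    # Softmax each row so every token's attention weights sum to 1.
    weights = []
    for row in scores:
        m = max(row)
        exps = [math.exp(s - m) for s in row]
        z = sum(exps)
        weights.append([e / z for e in exps])
    return weights

# Three toy tokens => a 3x3 weight matrix (9 pairwise scores).
w = attention_weights([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
                      [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
```

Double the number of tokens and the score matrix quadruples, which is the whole scaling problem in miniature.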

Nonetheless, you see claims of "large" context windows in LLM marketing. (Large is in quotes because even a 1M-token context window begins to feel very cramped in a write/test/fix loop.) But a 1M-token context window would require an attention head with a 1-trillion-element score matrix. That isn't feasible. The industry even has a name for the window size they quote in their marketing: the effective context window. Internally they have another metric that measures the real amount of compute they throw at attention: the physical context window. The bridge between the two is some proprietary magic that discards tokens in the context window that are likely to be irrelevant. In my experience that bridge is pretty good at its job, where "pretty good" means up to human standards.

But eventually (actually, quickly in my experience), you fill up even the marketed size of the context window, because it remembers every word said, in the order it was said. If the LLM reads code it has written in order to debug it, that code appears twice in the context window. All compiler and test output ends up there too. Once the context window fills up they take drastic action, because it's like letting malloc fail: even reporting a malloc failure is hard, because the reporting usually needs more malloc. Anthropic calls it compacting. It throws away 90% of your tokens. It turns your helpful LLM into a goldfish with dementia. It is nowhere near as good as a human is at remembering what happened. Not even close.
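A crude mental model of compacting, purely hypothetical: Anthropic's actual algorithm is proprietary, and the keep-the-recent-tail policy plus the `keep_ratio` parameter are my inventions for illustration:

```python
def compact(history, keep_ratio=0.1):
    # Toy model: discard the oldest ~90% of tokens and keep only the most
    # recent tail. Real compaction also summarises what it throws away,
    # but either way most of the detail is gone for good.
    keep = max(1, int(len(history) * keep_ratio))
    return history[-keep:]

history = [f"token{i}" for i in range(1000)]
survivors = compact(history)  # 100 of the original 1000 tokens remain
```

Run it on a 1000-token history and only the last 100 tokens survive, which is why the model suddenly forgets everything decided early in the session.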

