Hacker News | jmacd's comments

I went through the setup process for Openclaw. Near the end I felt like I had wrestled more with setting it up than I would have had to if I had just built it from the ground up. So I pointed Pi at Nanoclaw and asked it to review it and build me a minimal clone. It took a few minutes and I had the core of something that is easier to maintain (for me) than some unknown large and cumbersome system, or whatever Openclaw is.

To each their own.


One concern I have is API key management.

.env files or injecting secrets at startup via a secret manager still risks leaking keys.

I vaguely recall an implementation that substitutes secret placeholders with real secrets only during outgoing calls to approved domains which sounds better. However, you're still trusting an agent on your machine with command execution.
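That pattern can be sketched in a few lines. Everything here is hypothetical (the placeholder syntax, the allowlist, the env-var-backed store are all invented for illustration); the point is only that the agent never holds the real key:

```python
import os
import re

# Hypothetical sketch: the agent only ever sees opaque placeholders like
# {{SECRET:MY_API_KEY}}. A local proxy swaps in the real value just before
# the request leaves the machine, and only for hosts on an allowlist.
APPROVED_HOSTS = {"api.openai.com", "api.anthropic.com"}
PLACEHOLDER = re.compile(r"\{\{SECRET:([A-Z0-9_]+)\}\}")

def resolve_secrets(headers: dict, host: str) -> dict:
    """Replace placeholders with real values only for approved hosts."""
    def substitute(value: str) -> str:
        def lookup(match: re.Match) -> str:
            real = os.environ.get(match.group(1))
            if real is None:
                raise KeyError(f"unknown secret {match.group(1)}")
            return real
        return PLACEHOLDER.sub(lookup, value)

    if host not in APPROVED_HOSTS:
        # Leave placeholders intact for unknown hosts, so a prompt-injected
        # request can't exfiltrate the key to an attacker's domain.
        return dict(headers)
    return {k: substitute(v) for k, v in headers.items()}
```

Even with this, as noted above, an agent with command execution can still read whatever the substituting process can read, so the allowlist only narrows the exfiltration path rather than closing it.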


Funny enough, openclaw is based on Pi.

I’m kind of curious what you do with it. I feel like the real value is integrating it with everything, but then, even if it’s nanoclaw or something simpler, the majority of the worthwhile things are on the unsafe side.

Would love to hear about your experience, as I’m planning to do exactly the same.


The most interesting thing for me is that I built an extension for Pi that has it recognize when it does not know how to do something I am asking and it then attempts to make its own extension and/or skill to enable whatever that functionality is. Best example there is I just told it to make a todo list for me, and so it made a skill that uses a local file to track todos and follow up on them. I instructed it to make an LLM call to find the best suggested follow up timing to remind me.

So... the real value so far is I find it fun? It isn't the "life changing need to go make a tweet!!" level for me.


Pi is great so it's sad to see that it only gained momentum because some trash tool like openclaw uses it.

I think the developers of Pi and openclaw are friends. Not sure if it matters, but Pi also has its own small following. I agree with you; it’s such an elegant project with an awesome clean architecture. (Also see: oh my pi)

The real lesson is if you ignore security and data disasters agentic AI is easier than anyone expected.


Yeah, it does seem a little fragile. I'm still battling to work out why it pegs the CPU at 100% permanently on a VPS I tried, literally just from installing the base.

I wonder how long npm/pip etc. will even make sense.

Dependencies introduce unnecessary LOC and features which are, more and more, just written by LLMs themselves. It is often easier to just write the necessary functionality directly. Whether that is more maintainable is a bit YMMV at this stage, but I would wager it is improving.


What a bizarre comment. Take something like NumPy: it has a hard dependency on BLAS implementations, where numerical correctness is highly valued and correct implementation requires deep thinking, for accuracy as well as performance. It's written in a different language, again for performance, so an LLM would have to implement all of those things too. What's the utility in burning energy to regenerate all this when implementations already exist?


What do supply chain attacks look like against one of these containers?


Interesting thought (I think recently more than ever it's a good idea to question assumptions) - but IMO abstractions are important as ever.

Maybe the smallest/most convenient packages (looking at you, is-even) are obsolete, but meaningful packages still abstract a lot of complexity that IMO isn't easier to one-shot with an LLM.


Concretely, when you use Django, underneath you have CPython, then C, then assembly, and finally machine code. I believe LLMs have been trained much better on each layer individually than on going end-to-end.


The most popular modules downloaded off pip and npm are not singular simple functions and cannot easily be rewritten by an LLM.

Scikit-learn

Pandas

Polars


This is like saying Wikipedia doesn't make sense because there's now Grokipedia


there are people (on Hacker News Dot Com, even) who believe this without a shred of shame or irony.


I consider packages with over 100k downloads production-tested. Sure, an LLM can roll some itself, but if edge cases appear (which public packages may already handle), you will need to handle them yourself.


Don't base anything on download numbers alone: not only are they easily gameable, it's enough for three small companies to push commits individually, with CI triggering on every new commit, for that number to lose any sort of meaning.

Vanity metrics should not be used for engineering decisions.
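To make that concrete, here's back-of-envelope arithmetic with assumed numbers (team sizes, commit rates, and CI matrix width are all invented for illustration):

```python
# Illustrative arithmetic (all numbers assumed): a handful of small teams
# whose CI installs dependencies on every commit can generate over ten
# thousand "downloads" a month with no real-world usage behind them.
companies = 3
devs_per_company = 5
commits_per_dev_per_day = 4
workdays_per_month = 22
ci_jobs_per_commit = 8  # e.g. a test matrix across OS and runtime versions

monthly_downloads = (companies * devs_per_company * commits_per_dev_per_day
                     * workdays_per_month * ci_jobs_per_commit)
# 3 * 5 * 4 * 22 * 8 = 10,560 downloads/month from just 15 humans
```

Scale the matrix or add bots like Renovate opening PRs, and the count grows without a single new user.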


At times I wonder why a given TUI coding agent was written in JS/TS/Python; why not use Go if it's mostly LLM-coded anyway? But that's mostly my frustration at having to wait for npm to install a thousand dependencies instead of one executable plus some config files. There are also support libraries, like terminal UI toolkits, that differ in quality between platforms.


Funny because as a non-Go user, the few Go binaries I've used also installed a bunch of random stuff.

This can be fixed in npm if you publish pre-compiled binaries but that has its own problems.


>the few Go binaries I've used also installed a bunch of random stuff.

Same goes for Rust. Sometimes one package implicitly imports another in a different version, and wading through `cargo tree` output to resolve the issue just doesn't seem very appealing.


Well, you do need to vet dependencies, and I wish there were a way to exclude purely vibe-coded dependencies that no human reviewed. But for well-established libraries, I do trust well-maintained, well-designed, human-developed code over AI slop.

Don't get me wrong, I'm not a Luddite. I use Claude Code and Cursor, but the code generated by either is nowhere near what I'd call good maintainable code, and I end up rewriting/refactoring a big portion before it's in any halfway decent state.

That said, with the most egregious packages like left-pad in the Node.js world, it was always a better idea to build your own instead of depending on them.


I've been copy-pasting small modules directly into my projects. That way I can look them over to see if they're OK, and it saves me an install and possible future npm-jacking. There's a whole ton of small things that rarely need any maintenance, and if they do, they're small enough that I can fix them myself. Worst case, I paste in the new version. (I press 'y' on GitHub and paste the link at the top of the file so I can find it again.)
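A minimal sketch of that workflow, assuming a commit-pinned raw.githubusercontent.com URL (the kind the 'y' shortcut produces); the function names and comment style are made up for illustration:

```python
import urllib.request

def with_provenance(source: str, raw_url: str, comment_prefix: str = "//") -> str:
    """Prepend a provenance comment so the pinned source URL travels with
    the vendored file (default prefix suits JS; pass '#' for Python)."""
    return f"{comment_prefix} Vendored from: {raw_url}\n{source}"

def vendor_file(raw_url: str, dest_path: str, comment_prefix: str = "//") -> None:
    """Fetch a small module and copy it into the project, recording where
    it came from so the exact upstream version can be found again."""
    with urllib.request.urlopen(raw_url) as resp:
        source = resp.read().decode("utf-8")
    with open(dest_path, "w", encoding="utf-8") as f:
        f.write(with_provenance(source, raw_url, comment_prefix))
```

Because the URL is pinned to a commit, the header doubles as a lockfile entry for the one file you vendored.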


As long as "don't roll your own crypto" is considered good advice, you'll have at least a few packages/libraries that'll need managing.

For a decent number of relatively pedestrian tasks though, I can see it.


LLMs are great at the roll-your-own-crypto footgun: they will tell you to remember all these things that are important, and then ignore their own tips.


Tokens are expensive and downloading is cheap. I think the opposite is probably true, really, and more packages will be written specifically for LLMs to use, because their APIs cost fewer tokens.


It still takes a little bit of time for an LLM to rewrite all the software in existence from scratch.


That was already the case for a lot of things like is-even.


You have insane delusions about how capable LLMs are, but even assuming it's somehow true: downloading deps instead of hallucinating more code saves you tokens.


And your opinions on how average people use these tools are 100% accurate?


If average people try vibecoding their dependencies, they’ll fail, simple as that. We’ve already seen how that looks with the “web browsers” that have recently been vibecoded.


There's a new web browser project today that's a heck of a lot more impressive than the previous ones - ~20,000 lines of dependency-free Rust (though it uses system libraries for image and text rendering), does a good job of the Hacker News homepage: https://news.ycombinator.com/item?id=46779522


Thanks for the heads up, that does look much more interesting.

I don't think it really affects the point discussed above for now, because we were discussing average users, and by definition, the first person to code a plausible web browser with an agent isn't an average user - unless of course that can be reliably replicated with any average user.

But on that note, the takeaways on the post you linked are relevant, because the author bucked a few trends to do this, and concluded among other things that "The human who drives the agent might matter more than how the agents work and are set up, the judge is still out on this one."

This will obviously change, but the areas that LLMs need to improve on here are ones they're notoriously weak on, so it could take a while.


at least 5% more accurate than average LLM


best to write assembly instead.


Docker desktop has a pretty nice sandbox feature that will also store your CC (and other) credentials, so you don't have to re-auth every time you create a new container.


Funnily enough, we shipped the Docker Desktop VM a decade ago now (experience report at https://dl.acm.org/doi/10.1145/3747525). The embedded VM in DD is much more stripped down than the one in Claude Cowork (it's based on https://github.com/linuxkit/linuxkit), and it's more specialised to container workloads, rather than just using bubblewrap for sandboxing (system services run in their own isolated namespaces).

Given how many products seem to be using this shipping-Linux-as-a-library-VM trick these days, it's probably a good time for an open source project to step up to supply a more reusable way of assembling this layer into a proper Mac library...


This is one of those announcements that actually just excites me as a consumer. We give our children HomePods as their first device when they turn 8 years old (Apple Watch at 10 years, laptop at 12) and in the 6 years I have been buying them, they have not improved one ounce. My kids would like to listen to podcasts, get information, etc. All stuff that a voice conversation with Chatgpt or Gemini can do today, but Siri isn't just useless-- it's actually quite frustrating!


Siri still can't play an Apple Music album when there is a song of the same name.

Even "Play the album XY" leads to Siri only playing the single song. It's hilariously bad.


Or the even more frustrating:

Me: "Hey Siri, play <well-known hit song from a studio album that sold 100m copies>"

Siri: "OK, here's <correct song but a live version nobody ever listens to, or some equally obscure remix>"

Being these things are at their core probability machines, ... How? Why?


> Being these things are at their core probability machines, ... How? Why?

Is Siri a probability machine? I didn't think it was an LLM at all right now? I thought it was some horrendous tree of switch statements, hence the difficulty of improving it.

Apple search is comically bad, though. Type in some common feature or app, and it will yield the most obscure header file inside the build deps directory of some Xcode project you forgot existed.


It’s absolutely insane that you can’t say “Siri, play my audiobook” and it play the last audiobook you listened to. Like, come on.


Or when you are driving, someone sends a yes-no question where the answer is no.

Siri: Would you like to answer?

Me: Yes

Siri: ...

Me: No + more words

Siri: Ok (shuts off)


Not exactly the same, but kinda: my gen 1 Google Home just got Gemini and it finally delivers on the promise of like 10 years ago! Brought new life to the thing beyond playing music, setting timers, and occasionally asking really basic questions


It remains to be seen what the existing HomePods will support. There’s been a HomePod hardware update in the pipeline for quite some time, and it appears like they are waiting for the new Siri to be ready.


It's not going to help them. For Siri to be really useful, it would need deep system integration, and an external model is not going to provide that. People didn't believe me when I said the same about Apple Intelligence with OpenAI.


That's what you get for buying into one ecosystem and sticking with it. All that stuff has been available on Alexa for a decade.


You can tell just from the title alone.

I am currently employing a consultant for something. It's something I don't want to do myself and they are doing what I need, but it's so painfully obvious they are just vanilla ChatGPTing everything it's almost funny at this point.


How can you tell from title alone? I am clueless.


The median age for MAID recipients is over 77.


Both can be true, I think.


50% is not a vast majority, so that's a red herring


50% can be a massive majority if there is a plethora of opinions


How much of your daily intake comes from food that has a barcode on it?


I typically just measure ingredients and log it in Cronometer.


Nearly all food in the US has barcodes.

Produce, meat, etc. might not always, but simply search for those the same way you'd ask ChatGPT.


Personally, everything that is not produce.

Even the meat I usually buy packaged.


Any IDE-based editor feels like a stopgap to me. We may not be there yet, but I feel that in the future a "vibe coder" isn't going to look at much code at all. Much of what developers relying on Cursor, Windmill, Replit, etc. are doing is performative as it relates to code. There is just a lot of copy/pasting of console errors and asking for things one way or another.

Casual or "vibe" coding is all about the output. Doesn't work? Roll back. Works well? Keep going. Feeling gutsy? Single shot.


Vibe coding is just a prototyping tool / "dev influencer" gimmick. No one serious is using Cursor for vibe coding, nor will anyone serious ever vibe code. It's for AI assisted development-- in other words, a more powerful intellisense.


I vibed this puzzle game into existence with two breaks* from vibe coding midway through to get it out of a rut: https://love-15.com/

It builds for PC, web, iOS and Android.

It's a simple sliding block puzzle game with a handful of additional game mechanics which you can see if you go into settings to unlock all levels, saved progress and best times/move counts, a level editor, daily puzzles with share results, and theme selection.

I think I found the current limits of vibe coding. There's one bug that I know of which I don't think can be fixed with vibe coding, and so I haven't fixed it as this was largely an experiment to see how far you could get with vibe coding.

I've since inspected the code and I believe the code is just too bad for the LLM to get anywhere at this point. Looking at the git history - I had it commit every time a feature was complete and verified working by me - the code started OK but really went downhill as it got bigger, and it got worse faster over time.

(When I first broke from vibe coding it was hitting a brick wall on progress earlier than expected and I needed to guide it to break the project up into more files, which it is terrible at by the way; I think the one giant file was hitting context length limits, which were smaller at the time than they are now. The second break was at the end to get it over the finish line when it just could not fix some save bugs without introducing new ones, and I did just barely enough technical guidance to help it finish. In neither case did I write code, but I did read code in both cases.)


The people who outsource their jobs to mechanical Turks will be using it until their employer(s) find out.


I felt the same way for a while, but I am really not so sure now. Cursor is definitely drawing on the influencer/growth well to drive some portion of these #s.

It's a lot easier and more scalable to get 1,000 people "vibe coding" than it is to get 10 experienced engineers using you for autocomplete.


Have you tried Bolt.new? You can vibe code with it.


Cursor isn't for vibe coding. I use it. I ask the AI to do something I know how to do, but it can do it faster. I check the changes to make sure everything looks good.


Cursor is also great for vibe coding though, and is even referenced in the original tweet that introduced the term.


But this sums up so well why I think the valuation is so riskily high. You're saying that right now IDE UX is so slow and bad that there are often changes you know how to make, but it would literally be too many keystrokes for you to want to do them yourself.

As far as I can tell if people like you just had a way to express code ideas with fewer keystrokes, a lot of Cursor's market would pretty much just dry up.


I am currently dealing with a relatively complex legal agreement. It's about 30 pages. I have a lawyer working on it who I consider the best in the country for this domain.

I was able to pre-process the agreement, clearly understand most of the major issues, and come up with a proposed set of redlines all relatively easily. I then waited for his redlines and then responded asking questions about a handful of things he had missed.

I value a lawyer being willing to take responsibility for their edits, and he also has a lot of domain specific transactional knowledge that no LLM will have, but I easily saved 10 hours of time so far on this document.


I am a semi-retired blue collar electrician. Higher IQ, but lowly certifications (definitely not a lawyer).

Currently I have initiated a lawsuit in my US state's small claims civil court over a relatively simple payment dispute. Without the ability to bounce legal questions, tactics, and procedural points off of Perplexity, I wouldn't have felt comfortable enough to represent myself in court.

Even if I were to need a lawyer on this simple case, the majority of the "leg work" has already been completed by free, non-pay LLMs.

My court date is early June; I'm both nervous and excited (for restitution)!

----

My brother is a judge, and I have been arguing for years that law clerking is probably in its last gasps as a career entry point. Chief Justice Roberts's end-of-2023 SCOTUS report (which argued that LLMs will make the judiciary more accessible to commoners) was a refreshing read to share among family members.

Personally, I already would rather have a jury of LLMs deciding most legal outcomes (albeit they would need to be impartially programmed, if that's even possible). It definitely would make for better democratic accessibility.

I found Bruce Schneier's recent article "Reimagining Democracy" [1] quite an interesting thought experiment (which is about his hosting intellectuals in their discussions of creating entirely new democracies utilizing modern technologies). It'd be super fair if a trusted AI government could lead to better democracies than "modern capitalism" can / has.

[1] schneier.com/blog/archives/2025/04/reimagining-democracy-2.html


This is super interesting, because I've been in similar conflicts (as a renter trying to recover security deposits in court) and been screwed over by the lawyers I've retained (like, literally not even showing up in court) and I probably could have done all this with an LLM myself. When the stakes are low, why not?


I've also had a lawyer not show up (me as criminal defendant), and then try to fleece me for more money (than initially agreed) because we (I) had to reschedule my court date — only to eventually reach a simple plea agreement which any public defender could have secured. LLMs didn't exist when this occurred, well over a decade ago.

>similar conflicts (as a renter trying to recover security deposits in court)

This is basically my current scenario. LL sold the rental I was living in, which I had pre-paid for an entire year, because the septic tank went out. We mutually agreed to end our lease... he then wrote me a check for overpayment... he then canceled the check (without even telling me). As an added bonus, he tried nothing to fix the tank... then sold the disaster to somebody else (I found out only when the new owner showed up on my/his doorstep).

Not my first time in court, but is my first time as Plaintiff. I'm very excited to (potentially) get awarded TREBLE DAMAGES on my few-thousand-dollar initial claim/dispute.

The era of "a lawyer who represents himself has a fool for a client" is rapidly approaching its end, particularly within small claims civil courts. I'd love to see entire branches of government replaced with machine-learnt judges.

I've already decided that if the Defendant (in my case) chooses to appeal to our higher court (i.e. not small claims, which he is entitled to do) I will retain an attorney, only because civil procedure is so nuanced.

But I'm trying first, and most of the legwork is already formulated.


I agree. I found it unusable for anything but casual usage due to the rate limiting. I wonder if I am just missing something?


I think it's the small TPM limits. I'll be way under the 10-30 requests per minute while using Cline, but input tokens appear to count toward the rate limit, so I'll find myself limited to one message a minute if I let the conversation go on too long, ironically due to Gemini's long context window. AFAIK Cline doesn't currently offer an option to cap the context below model capacity.
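The arithmetic behind that is simple. With illustrative numbers (not Gemini's actual published limits), resending a growing context each turn eats a tokens-per-minute budget fast:

```python
def requests_per_minute(tpm_limit: int, context_tokens: int,
                        reply_tokens: int = 1_000) -> float:
    """Agent turns that fit in a tokens-per-minute budget when the full
    conversation is resent on every request. Numbers are illustrative."""
    return tpm_limit / (context_tokens + reply_tokens)

# With an assumed 1M TPM budget: a 100k-token context allows ~9.9
# requests/minute, but at 900k tokens of context you're down to about
# one request per minute, regardless of the separate RPM limit.
```

So the per-request limit is rarely the binding constraint; the conversation length is.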

