> everybody who is like me, fully onboarded into AI and agentic tools, seemingly has less and less time available because we fall into a trap where we’re immediately filling it with more things
I do wonder if productivity with AI coding has really gone up, or if it just gives the illusion of that, and we take on more projects and burn ourselves out?
> I do wonder if productivity with AI coding has really gone up
Here's the thing: we never had a remotely sane way to measure productivity of a software engineer for reasons that we all understand, and we don't have it now.
Even if we had it, it's not the sort of thing that management would even use: they decide how productive you are based on completely unrelated criteria, like willingness to work long hours and keeping your mouth shut when you disagree.
If you ask those types whether productivity has gone up with AI, they'll probably say something like "of course, we were able to let go a third of our programmers and nothing really seems to have changed."
"Productivity" became a poisoned word the moment that the suits realized what a useful weapon it was, and that it was impossible to challenge.
>"Productivity" became a poisoned word the moment that the suits realized what a useful weapon it was, and that it was impossible to challenge.
Not impossible to challenge. But most people don't have the legal funds to do so. Those that do tend to get a cushy severance bribe to stay quiet and they move on elsewhere.
That's also why it's a long process to "fire" someone but easy to "lay off" instead. Layoffs are never about productivity (so it doesn't matter anyway), and the US, unlike most of the world, does absolutely nothing to protect against them.
What society, and America in particular, is about to realize is that it really doesn't matter how productive you are at software and technological innovation when systemic things outside the economic system are eroding.
It doesn't matter how fast we can make our widgets and chatbots when what you need is a self-sufficient workforce. We have outsourced everything material and valuable for society. Now we are left with industries of gambling, ad machines, and pharmaceuticals, a government that is functionally bankrupt, and politicians that have completely sold out.
> I do wonder if productivity with AI coding has really gone up, or if it just gives the illusion of that, and we take on more projects and burn ourselves out?
It definitely hasn't for me. I spent about an hour today trying to use AI to write something fairly simple and I'm still no further forward.
I don't understand what problem AI is supposed to solve in software development.
> I don't understand what problem AI is supposed to solve in software development.
When Soviet troops invaded Germany during WWII, some of them (who had never seen a toilet) thought that toilets were advanced potato-washing machines, and were rightfully pissed when their potatoes were flushed away and didn't come back.
Sounds like you're feeling a similar frustration with your problem.
At some point, hearing "you're holding it wrong" and "here's a metaphor for why you're dumb" in response to real shortcomings with AI, and the manic hype behind it, becomes repetitive, and it starts to feel like there really aren't good arguments or evidence against those shortcomings and that hype.
Well, following advice from folk on here earlier, I thought I'd start small and try to get it to write some code in Go that would listen on a network socket, wait for a packet with a bunch of messages (in a known format) to come in, and split those messages out from the packet.
I ended up having to type hundreds of lines of description to get thousands of lines of code that doesn't actually work, when the one I wrote myself is about two dozen lines of code and works perfectly.
It just seems such a slow and inefficient way to work.
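For a sense of scale, the hand-written version being described could plausibly look something like the sketch below. The thread doesn't give the actual wire format, so the transport (UDP) and the delimiter (newline) are assumptions made purely for illustration:

```go
package main

import (
	"bytes"
	"fmt"
	"net"
)

// splitMessages splits one packet into its component messages.
// The real delimiter depends on the wire format; '\n' is a placeholder.
func splitMessages(packet []byte) [][]byte {
	var msgs [][]byte
	for _, m := range bytes.Split(packet, []byte{'\n'}) {
		if len(m) > 0 {
			msgs = append(msgs, m)
		}
	}
	return msgs
}

func main() {
	// Listen on a UDP socket on an ephemeral loopback port.
	conn, err := net.ListenPacket("udp", "127.0.0.1:0")
	if err != nil {
		panic(err)
	}
	defer conn.Close()

	// Self-contained demo: send ourselves one packet containing
	// three fake messages.
	client, err := net.Dial("udp", conn.LocalAddr().String())
	if err != nil {
		panic(err)
	}
	client.Write([]byte("call\nstatus\nack"))
	client.Close()

	// Read the single packet and split out its messages.
	buf := make([]byte, 65535)
	n, _, err := conn.ReadFrom(buf)
	if err != nil {
		panic(err)
	}
	for _, msg := range splitMessages(buf[:n]) {
		fmt.Printf("message: %q\n", msg)
	}
}
```

The real "known format" would replace splitMessages; everything else is the couple dozen lines of plumbing the poster is alluding to.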
tbh that's not a helpful thing to say. I think a more productive thing would be to ask "What model are you using?" "Are you using it in chat mode or as a dedicated agent?" "Do you have an AGENTS.md or CLAUDE.md?"
I've also been underwhelmed with its ability to iterate, as it tends to pile on hacks. So another useful question is "did you try having it write again with what you/it learned?"
Agreed, that was a bit rough. Yes, they are not great at iterating and keeping long contexts, but look at what he's describing and you have to agree that's exactly the type of problem LLMs excel at.
Shouldn’t have to baby step through the basics when the author is clearly not interested in learning himself
> Shouldn’t have to baby step through the basics when the author is clearly not interested in learning himself
I'd rather assume good faith, because when I first started using LLMs I was incredibly confused what was going on, and all the tutorials were grating on me because the people making the tutorials were clearly overhyping it.
It was precisely the measured and detailed HN comments that I read that convinced me to finally try out Claude, so I do my best to pay it forward :)
I totally agree, and myself have gone through that cycle.
But the guy is being adversarial and antagonistic. It's a two-way street; sometimes you have to call people out on their BS. I'm not seeing someone argue in good faith, but rather someone claiming superior knowledge because he's working on an esoteric protocol, as if people here don't know how packet headers work.
I don't read it as superiority, perhaps bitterness would be the closest word to what I'm reading.
> sometimes you have to call people out on their BS
That's true, but I think the right moment for it often comes much later than some people assume. Someone can be bitter and still have good points. It's very dangerous to preemptively dismiss points, because it means that I won't listen to anyone who disagrees with me. I'm willing to put in the work to interpret someone's response in a productive light, because there's often something to find.
There's a framework that I work within when I'm in a discussion. There's three elements: arguments, values, and assumptions. An argument is the face value statements. But those statements come from the values and assumptions of the person.
Values are what people consider most important. In most cases, our values are the same, which is good!
The biggest difference is assumptions. For example, one assumption I have is that free markets are the best method we have to lift individuals out of poverty. This colors how I talk about AI. Another person might assume that free markets have failed, and we need to use a different approach. This colors how they would view AI. So we'll completely talk past each other when arguing AI, because it's more of a proxy war of our assumptions.
>Shouldn’t have to baby step through the basics when the author is clearly not interested in learning himself
Okay. Whip up your favorite model and report back to us with your prompts. I'm pretty anti-AI, but you're going to attract more bees with honey than with smoke.
> I think a more productive thing would be to ask "What model are you using?" "Are you using it in chat mode or as a dedicated agent?" "Do you have an AGENTS.md or CLAUDE.md?"
In my case I'd have to say "Don't know, whatever VS Code's bot uses", and "no idea what those are or why I have to care".
The reason I ask about the model is that I initially dismissed AI-generated code because I was not impressed with the models I was trying. I decided that if I was going to evaluate it fairly, though, I would need to try a paid product. I ended up using Claude Sonnet 4.5, which is much better than the quick-n-cheap models. I still don't use Claude for large stuff, but it's pretty good at one-off scripts and providing advice. Chances are VS Code is using a crappy model by default.
> no idea what those are or why I have to care
For the difference between chat mode and agent mode, chat mode is the online interface where you can ask it questions, but you have to copy the code back and forth. Agent mode is where it's running an interface layer on your computer, so the LLM can view files, run commands, save files, etc. I use Claude in agent mode via Claude Code, though I still check and approve every command it runs. It also won't change any files without your permission by default.
AGENTS.md and CLAUDE.md are pretty much files that the LLM agent reads every time it starts up. That's where you put your style guide, and also where you add corrections for things it consistently messes up on. It's not as important at the beginning, but it's helpful for me to keep it consistent about its style (well, as consistent as I can get it). Here's an example from a project I'm currently working on: https://github.com/smj-edison/zicl/blob/main/CLAUDE.md
I know there's lots of other things you can do, like create custom tools, things to run every time, subagents, plan mode, etc. I haven't ever really tried using them, because chances are a lot of them will be obsolete/not useful, and I'd rather get stuff done.
I'm still not convinced they speed up most tasks, but it's been really useful to have it track down memory leaks and silly bugs.
Heh, I'm a college student, so I can't help with that...
You could also try Gemini 3 pro with Gemini's CLI which is free, though it's not as good at using tools. But, it sounds like you're not interested, which is fine!
Just please don't continue to argue with finer points if you're not interested. I've done my best to engage with your points, but I get the sense that it doesn't matter what I say.
I am curious though, why do you feel so strongly about LLM products?
I should note that I'm not the same person you were talking to in the chain, so I hope we're not mixing up conversations and people. I don't think I've said that much in this chain, so I can't answer much.
But sure:
>why do you feel so strongly about LLM products?
Personally, I work in games. So pretty much everything in the discourse of LLMs and Gen AI has been amplified 5x for me. The layoffs, the gamers' reaction to stuff utilizing AI, the impact on hardware prices, the politics, etc.
There's a war between consumers and executives, and I'm trapped in the middle taking heat from both. It's tiring, and it's clear who's to blame for all of this. I want all of this to pop so that true innovation can rise out of it, instead of the gold rush going on right now.
Also, game code is very performance-sensitive. It's not like a website or app where I can just "add 5 seconds to a load time" unless I'm working on a simple 2D game, nor can I throw more hardware at it to improve performance. Even if LLMs could code up the game, I'd spend more time optimizing what they make than they saved. It simply doesn't help for the kind of software I work with.
I have worked in games in the past, and currently work in games-adjacent. I'm sympathetic to the concerns you've mentioned, especially given how controversial it is (the recent reveal of DLSS5, which I find directionally interesting but executed poorly, is but one of many examples.)
From speaking to my friends in the industry, it seems like uptake for code is happening slowly, but unevenly, and the results are largely dependent on the level of documentation, which is often lacking. (I know of a few people using AI for (high-quality!) work on Godot, and their AIs struggle with many of the implicit conventions present in the codebase.)
With that being said, I would say that LLMs have generally been quite the boon for the (limited) gameplay work that I have done of recent. Because the cost of generation is so cheap [0], it is trivial to try something out, experiment with variations, and then polish it up or discard it entirely.
This also applies to performance work: if it's a metric that the AI can see and autonomously work on, it can be optimised. This is, of course, not always possible - it's hard to tell your AI to optimise arbitrary content - but it's often more possible than not, especially if you get creative. (Asking it to extract a particularly hot loop out from the code it resides within, and then optimising that, for example: entirely feasible.)
I think there are still growing pains, but I'm confident that LLMs will rock the world of gamedev, just like they're doing to other more well-attested fields of programming.
Yeah, that sums up a lot of my thoughts with AI c. 2026.
I do take some schadenfreude in knowing that AI training also struggles with the utter lack of documentation here. That may be a win in and of itself, if this paradigm forces the games industry to properly care about tech writing.
>Because the cost of generation is so cheap [0], it is trivial to try something out, experiment with variations, and then polish it up or discard it entirely.
Well, that's another thing I'm less confident about. The cost is low, for now. But we also know these companies are in loss leader mode. It'll probably always be cheap for a company to afford agents, but I fear reliance on these giant server models will quickly price out ICs and smaller work environments.
That might be something China beats us at. They seem to be focusing, out of necessity, on optimizing models that work on local machines, as opposed to running tens of billions of dollars of compute. My other big bias is wanting to properly own as much of my pipeline as possible (to the point where my eventual indie journey is planned around open-source tools and engines, despite my experience in both Unity and UE), and the current incentives for these companies don't point that way.
Crap, you're right. I swear, tiny usernames are both a boon and a curse...
> Personally, I work in games. So pretty much everything in the discourse of LLMs and Gen AI has been amplified 5x for me. The layoffs, the gamers' reaction to stuff utilizing AI, the impact on hardware prices, the politics, etc.
> There's a war between consumers and executives, and I'm trapped in the middle taking heat from both. It's tiring, and it's clear who's to blame for all of this. I want all of this to pop so that true innovation can rise out of it, instead of the gold rush going on right now.
That makes a lot of sense. I've been pretty fed up with the hyperbole and sliminess, and I can't imagine how difficult it is to be squeezed between angry gamers and naive and dense executives.
When you say "true innovation", is that in terms of non-AI innovation, or non-slimy AI innovation? I guess I personally still believe that LLMs are useful, but only as another tool amongst many others.
I'm also a big believer in human centered UX design, and it's kinda sad that the dominant experience is all textual.
> Also, game code is very performance-sensitive
It does seem like game programming is the last bastion of performance, at least in terms of normal hardware, since the game has to go to the consumer's hardware. The "silver bullet" mentality drives me a little crazy because it clearly doesn't work in all situations.
Anyways, I don't know if this response really has a point, but I wanted to at least acknowledge your experience.
>When you say "true innovation", is that in terms of non-AI innovation, or non-slimy AI innovation?
A bit of both. Similar to other tech investment, all the gaming-centric accelerators are looking for is AI pitches. Makes me wonder what innovations of the past few years have been overlooked in favor of the AI gold rush.
But I can see the long-term (likely 5+ years out) potential of AI as well. Once we stop using it as a means to steal from and replace artists, I can see all kinds of tedious asset problems that AI can accelerate. Generative fill is a glimpse of a genuinely useful tool that helps artists instead of pretending to be an artist itself.
Can it eventually write performant code? Maybe. The other big issue is that 1) a lot of code isn't online to train on, and 2) a lot of that code is still a mess to process, with few standards to follow. Maybe it can help with graphics code (which is much more structured) in the near future.
The problem is that I want something that listens on a TCP connection for GD92 packets, and when they arrive send appropriate handshaking to the other end and parse them into Go structs that can be stuffed into a channel to be dealt with elsewhere.
And, of course, something to encode them and send them again.
How would I do that with whatever AI you choose?
I'm pretty certain you can't solve this with AI because there is literally no published example of code to do it that it can copy from.
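For what it's worth, the overall shape being asked for (a listener goroutine that decodes incoming messages and pushes them onto a channel) is standard Go; the hard part is the GD92-specific framing and handshaking. Since that spec isn't public here, the Message struct, parseMessage, and the newline framing below are pure placeholders, not actual GD92:

```go
package main

import (
	"bufio"
	"fmt"
	"net"
)

// Message stands in for a decoded GD92 message; the real struct fields
// would come from the spec.
type Message struct {
	Type    string
	Payload []byte
}

// parseMessage is a stand-in decoder. GD92's actual framing and field
// layout must come from the spec; this just wraps the raw bytes.
func parseMessage(raw []byte) Message {
	return Message{Type: "unknown", Payload: raw}
}

// serve accepts connections and pushes decoded messages onto out, so the
// rest of the program can consume them from a channel elsewhere.
func serve(ln net.Listener, out chan<- Message) {
	for {
		conn, err := ln.Accept()
		if err != nil {
			return
		}
		go func(c net.Conn) {
			defer c.Close()
			sc := bufio.NewScanner(c) // newline framing is an assumption
			for sc.Scan() {
				out <- parseMessage(append([]byte(nil), sc.Bytes()...))
				// A real implementation would write the protocol's
				// acknowledgement back to c here.
			}
		}(conn)
	}
}

func main() {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		panic(err)
	}
	msgs := make(chan Message, 16)
	go serve(ln, msgs)

	// Self-contained demo: connect and send two fake "messages".
	c, err := net.Dial("tcp", ln.Addr().String())
	if err != nil {
		panic(err)
	}
	fmt.Fprintf(c, "MSG1\nMSG2\n")
	c.Close()

	for i := 0; i < 2; i++ {
		m := <-msgs
		fmt.Printf("got %d bytes\n", len(m.Payload))
	}
}
```

The encode/send direction would mirror parseMessage with a serializer writing back out over the same connection; none of this addresses the actual GD92 handshaking, which is exactly the part that needs the spec.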
No idea what you're talking about, but if it has a spec then it doesn't matter whether it's trained on it. Break the problem down into small enough chunks. Give it examples of expected input and output, and then any LLM can reason about it. Use a planning mode and keep the context small and focused on each segment of the process.
You're describing a basic TCP exchange; learn more about the domain and how packets are structured, and the problem will become easier by itself. LLMs struggle with large codebases, which pollute the context, not straightforward apps like this.
One other thing, it might be worthwhile having the spec fresh in the LLM's context by downloading it and pointing the agent at it. I've heard that that's a fruitful way to get it to refresh its memory.
> GD92 packets? No idea what you're talking about, but if it has a spec then it doesn't matter whether it's trained on it.
Okay, so you're running into the same problem that LLMs are.
> Break the problem down into small enough chunks. Give it examples of expected input and output, and then any LLM can reason about it.
So I have to do lots of grunt work?
> You're describing a basic TCP exchange; learn more about the domain and how packets are structured, and the problem will become easier by itself
I've written dozens of things that deal with TCP. I already have a fully-working example of what I want. The idea was to test if I could recreate it using LLMs.
How is it supposed to work? How does it put in the code I already know I want?
>Okay, so you're running into the same problem that LLMs are.
I can't tell if you're a troll or not, but you can't complain that nobody understands your intentionally vague and obtuse description of the problem while using it to pretend you're superior.
You have to rename the file ending to PDF. It's probably the wrong spec, because I'm basing this research on literally four letters that could mean anything since there is zero context given here. I've also found some German documents about chemistry.
If your argument is that LLMs and humans are stupid because they don't know what a "GD92" is, then yeah maybe it's a you problem.
Go and throw the spec into openai codex inside limactl (get it from GitHub) and use zed (the editor) and a SSH remote project to get inside the VM, don't forget to enable KVM for performance. The free tier for openai is fine, but make sure to use codex 5.2.
First ask questions on what the binary encoding is based on. It's probably X.400, then once you've asked enough questions, tell it to implement it. You probably won't have to read the spec at all yourself.
I started the task 56 minutes ago with one prompt, and now I have an implementation I can show you. There's plenty to quibble about - the files splayed over the main directory are quite ugly, and there is no actual test data that we can use on the public internet - but these are all trivially resolvable issues.
I didn't do any additional research for this. I gave it the spec PDF, your instructions upthread, and told it to build a library. You can also consult the transcripts (linked in the README) to see that I have no tricks up my sleeve. I didn't need to decompose the task in any meaningful way: the only input I provided was on minor matters of taste.
As a 20+ year Windows user now happily running desktop Linux for about a year - too little, too late. This company has completely lost my trust over the past 5+ years, with all the ads, upsells, silent telemetry, bugs, background process mess, forced updates, poor performance, etc.
It's not only steered me off of Windows, but Azure, Office, and anything else with the Microsoft name on it. I'll do my best to steer family and business customers off likewise.
Trust is earned over years, and whoever the execs are that pushed all these shitty short-term squeezes on their customers, the company now gets to pay the reputational price.
MSFT doesn't earn trust; they actively butcher it and see who'd stay. It's a bit like scam letters that are full of spelling mistakes: that's the filter. Those who'd still fall for it are exactly the people they're looking for. The strategy has worked very well for a very long time. It's so successful that they don't really know what broke the current users, and you can see that in this release: they analysed the feedback, and the number 1 fix is to "raise the bar", literally, by allowing users to position the bar anywhere. Nothing organisational, nothing cultural, just closing random tickets.
Arch is really good. Wayland works. Gaming works. For any apps that don't work, like Visual Studio, I run KVM/QEMU + an unlicensed Win11 VM. I do not give a shit that my VM says "activate windows".
Not sure I follow - is the car the coding agent, and the developer the driver?
Agree with OP here, if AI coding tools are as intelligent and amazing as AI influencers and CEOs are saying, just prompt them to "Remake UV but faster & better".
> Agree with OP here, if AI coding tools are as intelligent and amazing as AI influencers and CEOs are saying, just prompt them to "Remake UV but faster & better".
If average dev is more intelligent and amazing than any coding model, just hire a team of average devs and “remake UV but faster & better”.
Average dev might be more intelligent, but likely neither will produce something of UV's quality. Either AI coding claims are way overblown and OpenAI can't easily remake UV, or OpenAI is buying the ecosystem & mindshare rather than the code, probably to lock in users, enshittify, and try to squeeze a profit out of a so-far money-losing business (AI, not Astral, though true for both I guess).
Side note: everyone's talking about having AI agents "conform to the spec" these days. Am I in my own bubble, or, who the hell these days gets The Spec as a well-formed document? Let alone a good document, something that can be formally verified, thoroughly test-cased, can christen the software "complete" when all its boxes are ticked, etc.?
This seems like 1980's corporate waterfall thinking, doesn't jibe with the messy reality I've seen with customers, unclear ideas, changing market and technical environments, the need for iteration and experimentation, mid-course correction, etc.
> who the hell these days gets The Spec as a well-formed document?
The PMs asked ChatGPT to write a well-formed spec.
Sadly, true in too many companies right now.
I do agree with your general point that The Spec can become a crutch for washing your hands of any responsibility for knowing the product, the goals, the company's business, and other contexts. I like to defuse these ideas by reminding the engineers that The Spec is a living document and they are partially responsible for it, too. Once everyone learns that The Spec isn't a crutch for shifting all blame to the product manager, they become more involved in making sure it's right.
As a veteran freelance developer - aside from some occasional big wins, I'd say it's been net neutral or even net negative to my productivity. When I review AI-generated code carefully (and if I'm delivering it to clients I feel that's my responsibility) I always find unnecessary complexity, conceptual errors, performance issues, looming maintainability problems, etc. If I were to let it run free, these would just compound.
A couple "win" examples: add in-text links to every term in this paragraph that appears elsewhere on the page, plus corresponding anchors in the relevant page parts. Or, replace any static text on this page with any corresponding dynamic elements from this reference URL.
Lose examples: constant edit-format glitches (not matching the searched text; even the venerable Opus 4.6 constantly screws this up), unnecessary intermediate variables, ridiculously over-cautious exception handling, failing to see opportunities to isolate repeated code into a function or to use an existing function that exactly implements said N lines of code, etc.
It can only result in more work if you freelance, because if you disclose that you used LLMs, then you did it faster than usual and presumably at lower quality, so you have to deliver more to retain the same income. Except now you're paying all the providers for all the models because you start hitting usage limits, and Claude sucks on the weekends, and your drive is full of "artifacts", which incurs mental overhead that is exacerbated by your crippling ADHD.
And then all of a sudden you're just arguing with the terminal all day: the specs are written by GPT, delivered in the email written by GPT. Sometimes they don't even take the time to trim their prompt from the edges of the paste, but the only thing I can think is "I need to make the most of 0.5x off-peak Claude rates".
The amount of boilerplate people talk about seems like the fault of these big modern frameworks honestly. A good system design shouldn't HAVE so much boilerplate. Think people would be better off simplifying and eliminating it deterministically before reaching for the LLM slot machine.
I'm not so sure I agree. To me it's somewhat magical that I can write even this amount of code and have this stuff just magically work on pretty much every platform via docker, the web platform, etc. Maybe this again is me having started with embedded, but I am blown away at the ratio of actual code to portability we currently have.
Elsewhere I've seen a post from the author talking about how his old articles hit so many of Wikipedia's identified signs of AI-generated text. As somebody whose own style hits many of those same stylistic/rhetorical techniques, I definitely sympathize.
Every "classic computing" language mentioned, and pretty much in history, is highly deterministic, and mind-bogglingly, huge-number-of-9s reliable (when was the last time your CPU did the wrong thing on one of the billions of machine instructions it executes every second, or your compiler gave two different outputs from the same code?)
LLMs are not even "one 9" reliable at the moment. Indeed, each token is a freaking RNG draw off a probability distribution. "Compiling" is a crap shoot, a slot machine pull. By design. And the errors compound/multiply over repeated pulls as others have shown.
I'll take the gloriously reliable classical compute world to compile my stuff any day.