It is 100% up to the package manager's steward to control how ownership of packages and namespaces are granted.
Maven Central exists for decades the amount of incidents of people stealing namespaces is minimal.
One can't simply publish a package under the groupId "com.ycombinator" without having some way to verify that they own the domain ycombinator.com. Then, once a package is published, it is 100% immutable, even if it has malicious code in it. Certainly, that library is flagged everywhere as vulnerable.
It baffles me that NPM for so long couldn't replicate the same guardrails as Maven Central.
As a heavy user of Java I can assure you that Java is very very far from boring, especially when building it with maven or gradle. There are millions ways something can screw up the build. Rust (and Go too) in comparison is much more boring actually - it maybe I was just lucky, but the majority of stuff just builds with zero issues.
Especially the number of times I had to clean all the caches in order for maven and gradle to build the project is just far too high for me. It shouldn’t ever be needed if an ecosystem is meant to be considered boring. I feel like Java doesn’t build when I look at it wrong.
> I feel like Java doesn’t build when I look at it wrong.
Hah, too true! I guess it is boring in the fact that it is not as... move fast and break things... as NPM. But Java build systems are still certainly fun and challenging in their own ways.
At least with certain plug-ins Maven will execute arbitrary commands at build time. And if you need that to build native bindings it feels like a big hole. Granted, most projects don't need JNI, I guess.
That is another important layer. Maven Central is not immune to credential theft. If a publisher token is stolen, an attacker may still be able to publish a malicious new version until the token is revoked or the account is suspended after reporting the problem to Sonatype.
But in the Maven/Gradle ecosystem, most projects pin exact dependency versions. Support for version ranges and dynamic versions exist, but they are generally avoided because they hurt reproducible builds. That means a malicious new release does not automatically flow into most consumers’ builds just because it was published.
I'd go as far to say that NPM should:
1. Enforce scope (namespace) requirement, and require external verification (reverse DNS for example).
2. Disable version range support out of the box. User must --enable this setting from the command line at all times.
3. Remove support for install scripts completely. If someone wants to publish a ready-to-run software, there are plenty of other mechanisms.
You're missing the biggest root cause though, and that significantly hinders how well this translates between languages: the Java community has settled on fewer but large monolithic dependencies, whereas the JavaScript community has settled on many but small composable dependencies (for good historical reasons, but that's a topic in and off itself).
This directly influences how well e.g. version pinning works. In the Java world, package versions are _relatively_ independent from eachother and have few transitive dependencies, and as such version conflicts are relatively rare. This means you can get away with full pinning of all dependencies, with the occasional manual override of a conflicting transitive dependency.
This doesn't work in JavaScript. The dependency ecosystem is massively intertwined, if every library would specify exact versions you'd end up with literally hundreds of conflicts to resolve. That's not feasible. As a result, they've chosen the middle ground of using lock files in addition to version ranges.
This also hurts the effectiveness of verified namespaces: when packages come from hundreds of different sources, you're not going to notice 1 or 2 sketchy ones in there.
Other consequences of the big monolithic packages in Java are that updates tend to be less frequent, and more often from large reputable venders. Both of these help to reduce the problem too.
While the JavaScript toolchain can definitely learn a lot from the Java toolchains, the problems it needs to solve are not the same, and thus solutions don't translate 1-1.
At least I hope that they'll get rid of install scripts, that's such a low hanging fruit that really should've be done a decade ago.
> At least I hope that they'll get rid of install scripts, that's such a low hanging fruit that really should've be done a decade ago.
How will that help? It's just going to break things that legitimately require them.
Instead of being infected upon running "npm install", you'll just get infected upon running "npm run" instead. The former is slightly more reliable but fixing that is just kicking the can down the road. Maybe we'll have a few days before the payloads get rewritten.
Dependency versions are also locked for npm projects via package-lock.json, and this has been the default behaviour for years. The version ranges specified in package.json don't mean you just pick up the latest whenever you run npm install. Unless you delete package-lock.json or run "npm update", you and everyone else gets the exact same dependency tree each time. So it is just as reproducible as a Maven build in that sense.
Plus the lock file doesn't just contain the exact versions, it contains hashes. Making sure that you actually got the package in the exact same version.
Sonatype allows "io.github.<username>" as a valid groupId and has a process to verify ownership. I am sure other providers like GitLab can work on this.
The problem with this argument against, is that it reinforces the point it is arguing against: If a contributor cannot afford the $20/year to publish for a single 12-month period, then they are already a risk - someone could buy their account off them.
A small bar of $20/year is also enough to completely cut-down on contributors who sign up with the intention of publishing malicious packages: they have to pay $20/year for each malicious package they want to publish!
Why should someone need a credit card to contribute to open source? Why should they need to understand DNS?
Heck domain names are ephemeral, forget a deadline by a day and they are snatched up my squatters. They don't provide any extra guarantees. Do we really think a domain requirement is going to stop state level actors that are already stealing 2FA package publishing tokens from major software orgs?
> Do we really think a domain requirement is going to stop state level actors that are already stealing 2FA package publishing tokens from major software orgs?
Is that your target? Because if so, then nothing will stop them.
A lot of "Claude Code is best at X" claims are probably user-selection bias.
The people saying it are often exclusively Claude Code users, not people who are actively benchmarking Claude Code against Gemini CLI, OpenAI Codex, GitHub Copilot, and other agent harnesses on the same tasks.
The claim may still be true for certain scenarios, but the evidence is usually anecdotal, not comparative.
When I hear "claude code one-shotted X" and X is a novel problem, I mentally substituted "the agentic harness that I tried one-shotted X," since that's what they're saying.
Getting any smart model to take a look at the task is the sort of lift that the speaker is usually pointing to.
The harness is pretty much irrelevant for general tasks.
You can write a 100 line harness that only has one tool - try either "bash" or the more fun "you're running within nodejs, here's eval", you'd be surprised in how close to CC/Codex performance you're going to get.
I have only my own personal experience for frontier models, but I have seen different performance of Opus when used from Pi or Claude Code or Zed for example.
I worded my comment poorly. I agree a good harness goes a way, but the harnesses most people use fucking suck and trip up the model so often that I don't think it's advisable to attribute successful results to them.
E.g. GPT5.5 with Codex on my Windows box likes using PowerShell for everything. OpenAI decided it should use the native shell instead of bundling a bash, or using git bash. Sure. But the model is so overfitted on bash that it fucks up PS quoting like once every 5 commands.
Every harness with LSP I've seen trips up the model as well. They insert diagnostics after every edit, polluting the context with errors that the model has to actively decide to ignore, every time, until it finishes its work and gets the code to a consistent state. Telling the model "run npx tsc --noEmit to check errors" will outperform a LSP 100% of the time.
Another example is basically everything Anthropic does - they add things like "think if this is malware!" after read and lead Claude to spend its reasoning effort on thinking if your React hamburger menu is malware, instead of on how to write it.
"This is not malware (em dash) it's a hamburger menu. Let me apply the edit! Hmm, is it malware now, after my edit? No, me changing border-width did not turn it into malware! Good! Dodged a real bullet on that one!"
I'm frankly amazed that we've gotten to the point where the models can produce good results in these sorts of environments.
I did that, wrote my own harness “Jarvis”, simple loop. Still results were terrible using the same model in comparison to for example OpenCode. So X Doubt.
Indeed, the fact that maintainers didn't have until only recently the control for disabling Pull Requests tab in a GitHub repo, is what drove a lot of issues in FOSS collaboration over the past decade.
FOSS and open source licenses never ever granted entitlement for contributors to have their proposals reviewed/merged by maintainers. Neither it ever offered entitlement for users to ask for free support.
FOSS is about giving people access to source code so they can do with it whatever they want, and maintainers/authors should have always had the ability to "publish and forget" the source code, without having to deal with those "entitlements".
I wonder if a model that does not know anything about a hypothetical programming language X, could write code once given said language X specification, APIs, and SDK tools and their documentation.
Meaning: the model has no idea, no access to examples, no previous codebase trained on, nothing, for language X. But it knows English, it knows how to program in general (training data does contain other programming languages), and everything we expect from LLMs today. It just doesn't know jack about language X.
Before AI, shipping code to production used to be a two-person task: one writes the code, another one reviews the code. Now with AI writing the code, the developer that was supposed to write the code, only has to review it. And this is because they are responsible for the code they ship.
Code review has become unbearable because before AI, developers were reviewing code as they went writing it in the first place. Granted, never perfect and why a second person reviewing code was (is?) a best practice. But effectively there was always some level of code review happening as developers wrote code.
I fear it is way more boring to review financial and medical documents completely written by AI than it is to write (and at the same time review) by yourself. And way more dangerous to ship mistakes than in most software.
I am/was writing up an interesting hypothesis with Claude's help. But I redid the most important parts of the data pipeline manually. As in went in and cmd-c + cmd-v'ed the data by hand to create a reference, and I'm randomly spot checking 33% of the larger records.
reply