Hacker News | Rudybega's comments

That's the nature of abstraction. Everything you create on a computer is built on a towering stack of black boxes.


yet some abstractions are more deterministic than others


and all are wrong, but some are more useful than others


Anthropic and the military had a contract. The military wanted to change the terms of that contract. Anthropic said no, which is their clearly defined contractual right. They got labeled a supply chain risk. How is this anything other than a shakedown? Does contract law mean anything to this administration?


The other such labeled companies have contracts too.


10 USC 3252 has only been used once, against Acronis AG, a Swiss company with Russian connections.

Acronis did not have DOD contracts.

Other companies (Huawei) have been deemed risks under different laws, or by Congress, but they also didn't have direct DOD contracts.

Do you have any evidence for your assertion? Did you check if it is true before posting?


no, the other such labeled companies are foreign owned firms like Huawei that the government never intended to do business with in the first place


You don't actually believe in the core tenets of the USA if you think that the government should have or should exercise unchecked, abusive power.


The power should be checked by the people and the government we have established. It shouldn't have its power checked by private corporations.


The government wants to use AI to decide who to kill. Fuck that.


1. The restriction applies even to writing documentation, adding comments, scanning for bugs, or scanning for security vulnerabilities in systems for fully autonomous weapons. As automated vulnerability discovery gets stronger and stronger, it is critical that we have the ability to mount a strong defense.

2. It is a principled stance that private companies shouldn't be deciding what their tools can and can't be used for in such an important sector.


Corporations are people my friend.


A small group of people who may hold views that conflict with those of the citizens of a country.


Then the government should end their contract with Anthropic. The terms of the contract were clear.

Designating them a supply chain risk is unprecedented authoritarian strong-arming.


Well yeah, they're running a small, outdated, older model. That's not really the point. This approach can be used for better, larger, newer models.


If only there were some way to test it, like swapping the two nouns in the sentence. Alas.


There aren't remote "drivers" in the Philippines, that's not how Fleet Response works. You can see how it works here if you're curious: https://waymo.com/blog/2024/05/fleet-response but the TLDR is that they give the Waymo driver options in confusing situations (things like, you can go use this driveway on the side to pass this blocked traffic).


Directing the movements of a car is now “fleet response”, not “driving”. Give them a fucking innovation award or something.


If a passenger says to you, "go around this car by using that private driveway", are they driving the car?


MMLU performance caps out around 90% because there are tons of errors in the actual test set. There's a pretty solid post on it here: https://www.reddit.com/r/LocalLLaMA/comments/163x2wc/philip_...

As far as I can tell for AIME, pretty much every frontier model gets 100% https://llm-stats.com/benchmarks/aime-2025


Here are the scores for new AIMEs, where we know the answers aren't in the training data.

https://matharena.ai/?view=problem&comp=aime--aime_2026

As for MMLU, is your assertion that these AI labs are not correcting for errors in these exams and then self-reporting scores less than 100%?

As implied by the video, wouldn't it then take 1 intern a week max to fix those errors and allow any AI lab to become the first to consistently 100% the MMLU? I can guarantee Moonshot, DeepSeek, or Alibaba would be all over the opportunity to do just that if it were a real problem.


I mean, Waymo gives a lot of examples of these situations in their blog post about Fleet Response, released May 21, 2024. They're very explicit that the Waymo Driver autonomous system is in control the entire time.

This isn't something new.

https://waymo.com/blog/2024/05/fleet-response


Thanks for this link. I’ve failed to find specifics on this for a while but this is pretty good, particularly the example about which lane to choose when cones are set up.


Very helpful, thank you.


There are two compilers that can handle the Linux kernel. GCC and LLVM. Both are written in C, not Rust. It's "in distribution" only if you really stretch the meaning of the term. A generic C compiler isn't going to be anywhere near the level of rigour of this one.


There is tinycc, that makes it three compilers.

There is a C compiler implemented in Rust from scratch: https://github.com/PhilippRados/wrecc/commits/master/?after=... (the very beginning of commit history)

There are several C compilers written in Rust from scratch of comparable quality.

We do not know whether Anthropic has a closed source C compiler written in Rust in their training data. We also do not know whether Anthropic validated their models on their ability to implement a C compiler from scratch before releasing this experiment.

The language I proposed, J, does not have any C compiler implemented in it at all. Idiomatic J expertise is scarce and expensive, so it would be a significant expense for Anthropic to have a C compiler in J for their training data. Being Turing-complete, J can express all the typical compiler tips and tricks from compiler books, albeit in an unusual way.


TinyCC can't compile a modern Linux kernel. It doesn't support a ton of the extensions the kernel uses. That Rust compiler similarly can't do it.

