Rudybega · 2026-04-06T22:20:15 1775514015

That's the nature of abstraction. Everything you create on a computer is built on a towering stack of black boxes.

anhner · 2026-04-06T22:50:12 1775515812

yet some abstractions are more deterministic than others

charlie0 · 2026-04-07T02:51:23 1775530283

and all are wrong, but some are more useful than others

Rudybega · 2026-03-05T20:57:21 1772744241

Anthropic and the military had a contract. The military wanted to change the terms of that contract. Anthropic said no, which is their clearly defined contractual right. They got labeled a supply chain risk. How is this anything other than a shakedown? Does contract law mean anything to this administration?

cakealert · 2026-03-05T21:04:32 1772744672

The other such labeled companies have contracts too.

mediaman · 2026-03-05T21:44:24 1772747064

10 USC 3252 has only been used once, against Acronis AG, a Swiss company with Russian connections.

Acronis did not have DOD contracts.

Other companies (Huawei) have been deemed risks under different laws, or by Congress, but they also didn't have direct DOD contracts.

Do you have any evidence for your assertion? Did you check if it is true before posting?

timmmmmmay · 2026-03-05T21:21:12 1772745672

no, the other such labeled companies are foreign owned firms like Huawei that the government never intended to do business with in the first place

Rudybega · 2026-02-28T00:51:58 1772239918

You don't actually believe in the core tenets of the USA if you think that the government should have or should exercise unchecked, abusive power.

charcircuit · 2026-02-28T01:42:15 1772242935

The power should be checked by the people and the government we have established. It shouldn't have its power checked by private corporations.

spankalee · 2026-02-28T01:43:32 1772243012

The government wants to use AI to decide who to kill. Fuck that.

charcircuit · 2026-02-28T01:49:34 1772243374

1. The restriction applies even to writing documentation, adding comments, scanning for bugs, or scanning for security vulnerabilities in systems for fully autonomous weapons. As automated vulnerability discovery gets stronger and stronger, it is critical that we have the ability to mount a strong defense.

2. It is a principled take: private companies shouldn't be making the decisions about what their tools can and can't be used for in such an important sector.

bigtex88 · 2026-02-28T02:27:59 1772245679

Corporations are people my friend.

charcircuit · 2026-02-28T02:28:43 1772245723

A small group of people who may hold views that conflict with the citizens of a country.

Rudybega · 2026-02-28T00:16:23 1772237783

Then the government should end their contract with Anthropic. The terms of the contract were clear.

Designating them a supply chain risk is unprecedented authoritarian strong-arming.

Rudybega · 2026-02-21T02:32:58 1771641178

Well yeah, they're running a small, outdated, older model. That's not really the point. This approach can be used for better, larger, newer models.

Rudybega · 2026-02-19T20:53:28 1771534408

If only there were some way to test it, like swapping the two nouns in the sentence. Alas.

Rudybega · 2026-02-12T19:51:30 1770925890

There aren't remote "drivers" in the Philippines; that's not how Fleet Response works. You can see how it works here if you're curious: https://waymo.com/blog/2024/05/fleet-response. The TLDR is that they give the Waymo Driver options in confusing situations (things like "you can use this driveway on the side to pass this blocked traffic").

badgersnake · 2026-02-13T14:42:37 1770993757

Directing the movements of a car is now “fleet response”, not “driving”. Give them a fucking innovation award or something.

Rudybega · 2026-02-13T20:35:53 1771014953

If a passenger says to you, "go around this car by using that private driveway", are they driving the car?

Rudybega · 2026-02-11T21:50:39 1770846639

MMLU performance caps out around 90% because there are tons of errors in the actual test set. There's a pretty solid post on it here: https://www.reddit.com/r/LocalLLaMA/comments/163x2wc/philip_...

As far as I can tell for AIME, pretty much every frontier model gets 100% https://llm-stats.com/benchmarks/aime-2025

RC_ITR · 2026-02-12T23:44:13 1770939853

Here are the scores for the new AIME, where we know the answers aren't in the training data.

https://matharena.ai/?view=problem&comp=aime--aime_2026

As for MMLU, is your assertion that these AI labs are not correcting for errors in these exams and then self-reporting scores less than 100%?

As implied by the video, wouldn't it then take 1 intern a week max to fix those errors and allow any AI lab to become the first to consistently 100% the MMLU? I can guarantee Moonshot, DeepSeek, or Alibaba would be all over the opportunity to do just that if it were a real problem.

Rudybega · 2026-02-10T22:55:04 1770764104

I mean, Waymo gives a lot of examples of these situations in their blog post about Fleet Response, released May 21, 2024, where they detail this. They're very explicit that the Waymo Driver autonomous system is in control the entire time.

This isn't something new.

https://waymo.com/blog/2024/05/fleet-response

pgwhalen · 2026-02-10T23:00:31 1770764431

Thanks for this link. I’ve failed to find specifics on this for a while but this is pretty good, particularly the example about which lane to choose when cones are set up.

tehjoker · 2026-02-10T23:05:10 1770764710

Very helpful, thank you.

Rudybega · 2026-02-06T16:43:13 1770396193

There are two compilers that can handle the Linux kernel: GCC and LLVM. Both are written in C and C++, not Rust. It's "in distribution" only if you really stretch the meaning of the term. A generic C compiler isn't going to be anywhere near the level of rigour of this one.

thesz · 2026-02-06T17:30:23 1770399023

There is tinycc, which makes it three compilers.

There is a C compiler implemented in Rust from scratch: https://github.com/PhilippRados/wrecc/commits/master/?after=... (the very beginning of commit history)

There are several C compilers written in Rust from scratch of comparable quality.

We do not know whether Anthropic has a closed source C compiler written in Rust in their training data. We also do not know whether Anthropic validated their models on their ability to implement a C compiler from scratch before releasing this experiment.

The language J that I proposed does not have any C compiler implemented in it at all. Idiomatic J expertise is scarce and expensive, so it would be a significant expense for Anthropic to have a C compiler in J for their training data. Being Turing-complete, J can express all the typical compiler tips and tricks from compiler books, albeit in an unusual way.

Rudybega · 2026-02-06T20:15:29 1770408929

TinyCC can't compile a modern Linux kernel. It doesn't support a ton of the extensions the kernel uses. That Rust compiler similarly can't do it.