jp0001 · 2026-05-21T19:40:57 1779392457

My guess is that there will be similar stories in the Northeast in the winter.

jp0001 · 2026-05-15T03:22:58 1778815378

LLMs are going to produce amazing Rube Goldberg-style vulnerabilities for years to come. It's already starting; this instance isn't an example of it, but it's happening.

shpx · 2026-05-15T06:47:31 1778827651

Maybe it's physically impossible to build a theoretically secure system, just as it's (presumably) impossible to have a cell that isn't susceptible to any virus. Maybe this whole time we've been getting away with a type of security by obscurity, where the obscurity is just no one having the time and focus to actually analyze the code.

JacobKfromIRC · 2026-05-15T07:43:51 1778831031

Suppose the following:

1. Any given system has a finite number of findable vulnerabilities.

2. All findable vulnerabilities are fixable (if not in software then with a new hardware revision).

3. Fixing a vulnerability while keeping the same intended functionality introduces on average less than 1 other findable vulnerability.

4. It is possible to cease adding new features to a system and from that point forward only focus on fixing vulnerabilities.

If all 4 are true, then perfect security seems possible, in some sense. I think some vulnerabilities might not be fixable, if you include things like the idea that users can be tricked into revealing their passwords. If you restrict the definition of vulnerability to some narrower meaning that still captures most of what people mean when they say computer vulnerability, then I think those 4 statements are probably true.

Perfect security might be near impossible in practice because vulnerabilities will get more difficult to find and fix over time, but I think we should expect the discovery of vulnerabilities to eventually become arbitrarily slow in a hypothetical system that prioritized security above all else.
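As a rough illustration of premise 3, here is a minimal simulation sketch. It assumes each fix introduces exactly one new findable vulnerability with probability `mean_new_per_fix` and none otherwise; that model, the function names, and the numbers are all assumptions made for illustration, not anything measured. Under this model the expected number of fixes needed to reach zero is `initial_vulns / (1 - mean_new_per_fix)`, which is finite whenever the average is below 1.

```python
import random


def total_fixes(initial_vulns: int, mean_new_per_fix: float, rng: random.Random) -> int:
    """Simulate fixing vulnerabilities until none remain and return the fix count.

    Assumption: each fix adds one new findable vulnerability with probability
    mean_new_per_fix (so the average number added per fix is below 1).
    """
    if not 0.0 <= mean_new_per_fix < 1.0:
        raise ValueError("mean_new_per_fix must be in [0, 1) for the process to end")
    remaining = initial_vulns
    fixes = 0
    while remaining > 0:
        remaining -= 1  # one vulnerability fixed
        fixes += 1
        if rng.random() < mean_new_per_fix:
            remaining += 1  # the fix introduced a new vulnerability
    return fixes


def expected_total_fixes(initial_vulns: int, mean_new_per_fix: float) -> float:
    """Closed-form expectation under the same assumption: N / (1 - r)."""
    return initial_vulns / (1.0 - mean_new_per_fix)
```

With 100 starting vulnerabilities and an average of 0.5 new ones per fix, the formula gives 200 fixes in expectation. As the average approaches 1 the count grows without bound, which is why the argument depends on premise 3.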

saagarjha · 2026-05-15T08:40:43 1778834443

Systems generally evolve to add vulnerabilities.

fsflover · 2026-05-15T12:46:09 1778849169

It's probably impossible to achieve security through correctness, but security through compartmentalization can work. See: https://qubes-os.org.

lowdude · 2026-05-15T08:04:31 1778832271

I would rather claim that building a theoretically secure system is prohibitively expensive. At the end of the day, Mythos et al. are just better tools for finding vulnerabilities that will eventually be available to both offensive and defensive actors.

If you imagine you had a vulnerability scanner as fast and convenient as a linter, it would be much cheaper to write secure code right away. Probably not perfectly secure, but still secure enough to make sure finding exploits stays expensive.
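To make the linter comparison concrete, here is a toy sketch of a check that runs at lint speed. It only flags direct calls to `eval` and `exec` in Python source using the standard `ast` module. The rule set and the function name are hypothetical, and a real vulnerability scanner would need far more than this.

```python
import ast

# Hypothetical rule set for illustration: direct calls to these builtins.
RISKY_CALLS = {"eval", "exec"}


def find_risky_calls(source: str) -> list[tuple[int, str]]:
    """Return (line number, function name) for each direct eval/exec call."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in RISKY_CALLS
        ):
            findings.append((node.lineno, node.func.id))
    return sorted(findings)
```

A check like this can run on every save, which is the workflow the comparison is pointing at; the open question is whether deeper analysis can ever be made that cheap.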

lugu · 2026-05-15T08:12:15 1778832735

I would find it funny if one day we found it irresponsible to write hand-generated production code. Just like it would be irresponsible to build a significant building without running numerical simulations.

cheevly · 2026-05-15T12:06:51 1778846811

This day is probably not long off. My prediction is before the end of 2028.

mixdup · 2026-05-15T13:18:16 1778851096

it's probably less about how you write the code to begin with and more about letting a tool hammer on it

if you want to be a one man show handcrafting an artisan iOS app that will be fine, but you should probably let Claude bang against it for a while to shake out whatever bugs

txhwind · 2026-05-15T07:06:54 1778828814

Another "obscurity": I'm not valuable enough to be worth the cost of attacking. But what if that cost has been reduced a lot?

tweakimp · 2026-05-15T06:31:54 1778826714

Do you mean by vibecoding these vulnerabilities into the kernel or by finding them?

nashadelic · 2026-05-15T12:39:46 1778848786

hyperbolic but it might be safe to assume any local data on a connected device is going to be accessible.

iknowSFR · 2026-05-15T12:46:30 1778849190

Genuine question as I’m far less technical than the crowd here. Has this not always been the case?

jp0001 · 2026-05-01T14:35:52 1777646152

That website was not for me.

jp0001 · 2026-04-25T14:45:25 1777128325

Counts 1 and 2 make sense. A good lawyer can get 3-5 thrown out in a plea deal.

jp0001 · 2026-04-25T14:42:17 1777128137

I'm still having problems trusting my compiler.

jp0001 · 2026-04-25T13:45:41 1777124741

Max x20 user here. As long as Opus 4.6 is available and they fix Opus 4.7, I'll stay with Anthropic. Tho, I'd imagine in 5 years we'll have Opus 4.6-equivalent performance available in an at-home consumer model.

jp0001 · 2026-04-16T16:09:30 1776355770

It's easier to produce vulnerable code than it is to use the same Model to make sure there are no vulnerabilities.

Kim_Bruning · 2026-04-17T11:29:28 1776425368

> It's easier to produce vulnerable code than it is to use the same Model to make sure there are no vulnerabilities.

I once had a car where the engine was more powerful than the brakes. That was one heck of an interesting ride.

So now we have a company that supplies a good chunk of the world's software engineering capability.

They're choosing a global policy that works the same as my fun car. Powerful generative capacity; but gating the corrective capacity behind forms and closed doors.

Anthropic themselves are already predicting big trouble in the near term[1], but imo they've gone and done the wrong thing.

Pandora is an interesting parable here: Told not to do it, she opens the box anyway, releases the evils, then slams the lid too late and ends up trapping hope inside.

Given their model naming scheme, they should read more Greek Mythos. (and it was actually a jar ;-)

[1] https://thehill.com/policy/technology/5829315-anthropic-myth...

velcrovan · 2026-04-16T16:20:46 1776356446

It's not likely that reviewing your own code for vulnerabilities will fall under "prohibited uses" though.

convnet · 2026-04-16T18:03:43 1776362623

> its cyber capabilities are not as advanced as those of Mythos Preview (indeed, during its training we experimented with efforts to differentially reduce these capabilities)

I wonder if this means that it will simply refuse to answer certain types of questions, or if they actually trained it to have less knowledge about cyber security. If it's the latter, then it would be worse at finding vulnerabilities in your own code, assuming it is willing to do that.

Kim_Bruning · 2026-04-17T10:13:34 1776420814

I can confirm from experience that reviewing your own code for vulnerabilities has fallen under "prohibited uses" starting with Opus 4.6 as recently as April 10; forcing me to spend a day troubleshooting and quarantining state from my search system.

"This request triggered restrictions on violative cyber content and was blocked under Anthropic's Usage Policy. To learn more, provide feedback, or request an exemption based on how you use Claude, visit our help center: https://support.claude.com/en/articles/8241253-safeguards-wa..."

"stop_reason":"refusal"

To be fair, they do provide a form at https://claude.com/form/cyber-use-case which you can use, and in my case Anthropic actually responded within 24 hours, which I did not expect.

I admit I'm now once bitten twice shy about security testing though.

Opus 4.7 was still 'pausing' (refusing) random things on the web interface when I tested it yesterday, so I'm unable to confirm whether the form applies to 4.7 or how narrow the exemptions are.
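For anyone scripting against responses like the one quoted above, a minimal sketch of detecting the block might look like this. The only detail taken from the quoted response is the `"stop_reason":"refusal"` field; the function name and the assumption that the body is a JSON object are illustrative, so treat this as a sketch rather than the API's documented handling.

```python
import json


def is_refusal(response_body: str) -> bool:
    """Return True if a JSON response body carries stop_reason == "refusal".

    Assumption: the body is a JSON object with a top-level "stop_reason" key,
    as in the snippet quoted above. Other fields are ignored.
    """
    payload = json.loads(response_body)
    return payload.get("stop_reason") == "refusal"
```

Checking this field before parsing the rest of the response lets a pipeline quarantine or retry a blocked request instead of treating empty output as a result.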

vorticalbox · 2026-04-17T12:59:55 1776430795

I've not had the issue with Codex. I was testing a public API I work on for issues; Codex was happy to attempt to break it, but did refuse to create a script that would automate the issue it found.

nicce · 2026-04-16T18:46:12 1776365172

There is no way the model can know the origin of the code.

xlbuttplug2 · 2026-04-16T17:22:41 1776360161

May not be very effective if so.

I'm assuming finding vulnerabilities in open source projects is the hard part and what you need the frontier models for. Writing an exploit given a vulnerability can probably be delegated to less scrupulous models.

whatisthiseven · 2026-04-16T17:41:31 1776361291

Currently 4.7 is suspicious of literally every line of code. It may be a bug, but it shows you how much they care about end users when something with such a massive impact ships and no one cares before release.

Good luck trying to do anything about securing your own codebase with 4.7.

jp0001 · 2026-04-16T16:07:31 1776355651

WTF. `Opus 4.7 is the first such model: its cyber capabilities are not as advanced as those of Mythos Preview (indeed, during its training we experimented with efforts to differentially reduce these capabilities). We are releasing Opus 4.7 with safeguards that automatically detect and block requests that indicate prohibited or high-risk cybersecurity uses. `

Seriously? You're degrading Opus 4.7's cybersecurity performance on purpose. Absolute shit.

zb3 · 2026-04-16T16:37:12 1776357432

And since Opus 4.7 has degraded cybersecurity skills, using it might result in writing actually less safe code, since practically, in order to write secure code you need to understand cybersecurity. Outstanding move.

jp0001 · 2026-04-15T21:06:06 1776287166

I'm starting to think that Opus and Mythos are the same model (or collection of models), except that Mythos has better backend workflows than Opus 4.6. I have not used Mythos, but at work I have a five-figure monthly token budget to find vulnerabilities in closed-source code. I'm interested in Mythos and will use it when it's available, but for now I'm trying to reverse engineer how I can get the same output with Opus 4.6, and the answer to me is more tokens.

jp0001 · 2026-04-07T17:44:15 1775583855

I took three weeks off from tech, read books from last century, and travelled Europe. Coming back, reading LLM generated content and code feels like nails on a chalkboard. Taste, it does not have taste.

PunchyHamster · 2026-04-07T18:28:17 1775586497

It is so tiring...

literallyroy · 2026-04-07T18:46:28 1775587588

It’s strange how easy it is to spot.