They're probably going to do an aerial insertion via helicopter (Ospreys technically), which doesn't require transiting Hormuz. These big amphibious assault ships are built for both maritime and aerial insertions.
You're right, you are using it wrong. An LLM can read code faster than you can, write code faster than you can, and knows more things than you do. By "you" I mean you, me, and anyone with a biological brain.
Where LLMs are behind humans is depth of insight. Doing anything non-trivial requires insight.
The key to effectively using LLMs is to provide the insight yourself, then let the LLM do the grunt work. Kind of like paint by numbers. In your case, I would recommend some combination of defining the API of the library you want yourself manually, thinking through how you would implement it and writing down the broad strokes of the process for the LLM, and collecting reference materials like a format spec, any docs, the code that's creating these packets, and so on.
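To make that concrete, here's a hypothetical sketch of what "define the API yourself" might look like. Everything here is invented for illustration (the `Packet` layout, the names, the field widths); the point is that you pin down the surface and the docstrings, then hand the LLM your spec and reference code and let it fill in the long tail of message types.

```python
# Hypothetical example: you write this skeleton by hand, the LLM fills in
# the rest. The packet layout (1-byte version, 1-byte type, 2-byte
# big-endian length, then payload) is an invented stand-in for your format.
import struct
from dataclasses import dataclass

@dataclass
class Packet:
    version: int
    msg_type: int
    payload: bytes

def decode(raw: bytes) -> Packet:
    """Decode one packet from raw bytes; raise ValueError if truncated."""
    version, msg_type, length = struct.unpack_from(">BBH", raw, 0)
    payload = raw[4:4 + length]
    if len(payload) != length:
        raise ValueError("truncated packet")
    return Packet(version, msg_type, payload)
```

With a skeleton like this plus the format spec as reference material, the LLM's job stops being "design a library" and becomes "fill in grunt work against a fixed contract," which is where it's strongest.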
> An LLM can read code faster than you can, write code faster than you can, and knows more things than you do.
I don't agree. It can't write code at all; it can only copy things it's already seen. Besides, if that were true, why can't it solve my problem?
> The key to effectively using LLMs is to provide the insight yourself, then let the LLM do the grunt work
Okay, so how do I do that? Remember, I want to do ZERO TYPING. I do not want to type a single character that is not code. I already know what I want the code to do, I just want it typed in.
I just don't think AI can ever solve a problem I have.
You're intentionally missing the point. Every time a bomb drops we're rolling the dice. Hits on civilian targets are inevitable, just like bugs are inevitable. The only solution is not to go to war at all. Don't blame the person who dropped the bomb, blame the people who ordered the bombs to be dropped.
There's a hell of a difference between "we don't like your terms, so we're going to use a different supplier" and "we don't like your terms, so we're going to use the power of the federal government to compel you to change them". The president is the commander-in-chief of the military, but Anthropic is not part of the military! Outside serving the public interest in a crisis, the president has no right to compel Anthropic to do anything. We are clearly not in a crisis, much less a crisis that demands kill bots and domestic surveillance. This is clear overreach, and claiming a constitutional justification for it is a mockery.
I'd encourage you to look up the Defense Production Act. Its powers are probably broad enough that the President could unilaterally force Anthropic to do this whether or not it wants to. It's the same logic that would allow him to force an auto manufacturer to produce tanks. And the law doesn't care whether we are in a crisis or not. It's enough that he determine (on his own) that this action is "necessary or appropriate to promote the national defense."
However, it looks like Trump isn't going to go that route: they're just going to add Anthropic to a no-buy list and use a different AI provider.
Ok? And? Trump could use the DPA to force Ford to make tanks in a war, just as he could use it to force Anthropic to make AI in a war. Are we in a war? No. We are not in a crisis.
Yes. "Show Code", not "Show CPU cycles". There's a difference. Writing code is not the same as running code. It looks to you like it ran the code, but you have no proof that it did. More than once I've seen LLM systems, from companies claiming their models run code and return the output, show "output" that was not what the displayed code actually produced when run.
In my experience, models don't tend to write their own HTML output. They emit something like Markdown, or a modified version of it, and any raw HTML they did write wouldn't be parsed as HTML by the browser anyway.
What, in your view, does sending one markup language instead of another markup language tell you about whether the back-end executed some code or only pretended to?
The front-end display is a representation of what the back-end sends it. Saying "but the back-end doesn't send HTML" is as meaningless as saying that about literally any other SPA website that builds its display from API requests that respond with JSON.
You cannot know that anything it shows you was generated by executing the code and isn't merely a simulacrum of execution output. That includes images.
I have a custom skill-creator skill that contains this:
> A common pitfall is for Claude to create skills and fill them up with generated information about how to complete a task. The problem with this is that the generated content is all content that's already inside Claude's probability space. Claude is effectively telling itself information that it already knows!
> Instead, Claude should strive to document in SKILL.md only information that:
> 1. Is outside of Claude's training data (information that Claude had to learn through research, experimentation, or experience)
> 2. Is context specific (something that Claude knows now, but won't know later after its context window is cleared)
> 3. Aligns future Claude with current Claude (information that will guide future Claude in acting how we want it to act)
> Claude should also avoid recording derived data. Lead a horse to water, don't teach it how to drink. If there's an easily available source that will tell Claude all it needs to know, point Claude at that source. If the information Claude needs can be trivially derived from information Claude already knows or has already been provided, don't provide the derived data.
Sincerely: perhaps you should publish this on arXiv before a researcher reads it, runs it, and writes the study themselves.
It's fairly common to see threads like this, where one thing is being postulated and then there are comments upon comments from doers showing what they have already done.
The AI world moves at a blistering pace. Academic publishing does not. In this particular case the "random dude on HN" is probably six to nine months ahead of the academic publication, not in the sense of being that much smarter but literally just being that much further progressed through time relative to the academic publication pipeline.
Accuracy is relevant though, and testing your assumptions before heading out, or keeping track of the particular changes (if any) around what you're publishing, is another thing.
Still, yours is the more valid point :). Publishing is about publishing, not necessarily progress.
I just want folks on HN to remember they might be the cutting edge, or the tip of the arrow more times than they realize.
Does this not assume that Claude can pick out the best of what it knows?
Claude's training data is the internet, and the internet is full of Express tutorials that use app.use(cors()) with no origin restriction, Stack Overflow answers that store JWTs in localStorage, and so on.
Claude's probability space isn't a clean hierarchy of "best to worst." It's a weighted distribution shaped by frequency in training data.
So even though it "knows" this stuff, it doesn't necessarily know what you want, or what a professional would do in a production environment.
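As a toy illustration of that point (the snippet strings and frequency counts below are invented, not real training-data statistics): if recall is weighted by how often a pattern appears, the single most likely completion is the tutorial default, not the hardened production version.

```python
from collections import Counter

# Invented counts standing in for how often each CORS pattern might appear
# in training data: permissive tutorial code vastly outnumbers hardened code.
cors_patterns = Counter({
    "app.use(cors())": 900,                             # permissive default
    "app.use(cors({ origin: ALLOWED_ORIGINS }))": 90,   # origin-restricted
})

# A greedy "most likely" pick mirrors what frequency-weighted recall favors:
# the permissive pattern wins, even though a professional would restrict it.
most_likely = cors_patterns.most_common(1)[0][0]
print(most_likely)
```

That's the sense in which "knowing" both patterns doesn't help: without your insight steering it, the distribution pulls toward whatever was most common, not what's best.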
This is really good! I like how it reads like a blog post, it feels like I'm learning a skill on how to write good skills. Maybe that's another heuristic, a skill should read like an interesting blog post, highlighting non-obvious information.