alyxya · 2026-05-23T09:37:17 1779529037

An alternative I've tried is to ask Claude Code to create programming exercises or challenges on some topic and then ask it to check my work after every step while giving feedback. It worked surprisingly well, so this is the right call for CodeCrafters because AI is far more effective and cheaper at creating educational content and being a personal tutor.

0xpgm · 2026-05-25T10:38:24 1779705504

It would depend on the complexity of the challenge. Knowing that a knowledgeable human took the time to verify the project/exercise gives a learner confidence that they are on the right track and not going down a path of hallucinations.

alyxya · 2026-05-22T17:50:46 1779472246

Once they have their own coding agent which they seem to be working towards, I may start predominantly using their models. They seem to be doing all the "right" things, open sourcing models, publishing research, and keeping prices low for everyone.

ammar_x · 2026-05-22T18:25:23 1779474323

You can use V4 Pro with Claude Code [1].

I tried it and it's impressive.

[1]: https://api-docs.deepseek.com/quick_start/agent_integrations...

KronisLV · 2026-05-22T20:11:56 1779480716

I'm working on a custom launcher for hooking up Claude Code with various providers (groups env variables in profiles) cause DeepSeek doesn't have vision and sometimes I need browser use with screenshots or Opus reasoning, for other tasks it's fine: https://ccode.kronis.dev/

  # After installed (or when run portably with ./ccode)
  ccode init-config
  ccode edit-config
  
  # Run with default profile
  ccode
  # Run with named profile
  ccode --deepseek
  
  # Set default profile
  ccode set-default-profile deepseek

Also turns out that with a local proxy you can get Remote Control working and see the DeepSeek sessions in the desktop app, screenshots on the page. Other than that, I'm happy that it works pretty well and the discount is enough to make me consider going from Anthropic's Max subscription to Pro and using it only where DeepSeek is insufficient. With that proxy I eventually hope to be able to transparently switch models mid-task, if I need Opus for like 5 turns or something.

Overall though I'm not sure exactly how well Claude Code would stack up against OpenCode, since the latter overall feels a bit less hacky with 3rd party models and is even getting niche but nice features like a locally runnable web version: https://opencode.ai/docs/web/

BiraIgnacio · 2026-05-23T01:27:55 1779499675

I've been using V4 flash consistently with Claude. Pretty great, fast, and darn cheap. I use it about 3h/day and so far haven't crossed $1 USD/week.

FWIW, this is what I have in my settings.json:

  "env": {
    "ANTHROPIC_AUTH_TOKEN": "sk-nope_not_real",
    "ANTHROPIC_BASE_URL": "https://api.deepseek.com/anthropic",
    "ANTHROPIC_MODEL": "deepseek-v4-flash",
    "ANTHROPIC_DEFAULT_OPUS_MODEL": "deepseek-v4-flash",
    "ANTHROPIC_DEFAULT_SONNET_MODEL": "deepseek-v4-flash",
    "ANTHROPIC_DEFAULT_HAIKU_MODEL": "deepseek-v4-flash",
    "CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC": "1",
    "CLAUDE_CODE_EFFORT_LEVEL": "low",
    "CLAUDE_CODE_DISABLE_ADAPTIVE_THINKING": "1",
    "CLAUDE_CODE_DISABLE_THINKING": "0",
    "CLAUDE_CODE_ENABLE_AWAY_SUMMARY": "0",
    "CLAUDE_CODE_SUBAGENT_MODEL": "deepseek-v4-flash",
    "CLAUDE_CODE_MAX_OUTPUT_TOKENS": "8000",
    "CLAUDE_CODE_FILE_READ_MAX_OUTPUT_TOKENS": "4000",
    "BASH_MAX_OUTPUT_LENGTH": "20000",
    "CLAUDE_AUTOCOMPACT_PCT_OVERRIDE": "60",
    "CLAUDE_CODE_AUTO_COMPACT_WINDOW": "200000",
    "CLAUDE_CODE_DISABLE_GIT_INSTRUCTIONS": "1"
  }
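
As a rough illustration of how an env block like the one above could be kept as a named profile and layered over the normal environment before launching the CLI, here is a minimal Python sketch. The `PROFILES` table and the `build_env` helper are hypothetical names invented for this example; only the two DeepSeek values are copied from the settings shown above, and this is not how any particular launcher is actually implemented.

```python
import os

# Hypothetical profile table: each profile is just a group of env variables.
# The "deepseek" values are copied from the settings.json fragment above.
PROFILES = {
    "default": {},
    "deepseek": {
        "ANTHROPIC_BASE_URL": "https://api.deepseek.com/anthropic",
        "ANTHROPIC_MODEL": "deepseek-v4-flash",
    },
}


def build_env(profile, base=None):
    """Return a copy of the base environment with the profile's variables applied."""
    if profile not in PROFILES:
        raise KeyError(f"unknown profile: {profile}")
    env = dict(os.environ if base is None else base)
    env.update(PROFILES[profile])  # profile values win over inherited ones
    return env
```

Passing the result to something like `subprocess.run(["claude"], env=build_env("deepseek"))` would then start the CLI with that profile, assuming the `claude` binary is on your PATH.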

oezi · 2026-05-23T07:16:26 1779520586

3h/day and how many parallel agents? 1/3/10?

I think out tokens would be a better metric.

BiraIgnacio · 2026-05-26T00:04:08 1779753848

Max 2 parallel agents, usually.

As for out tokens, it's about 200k/day

hawtads · 2026-05-23T03:08:35 1779505715

Why not use higher thinking effort?

BiraIgnacio · 2026-05-26T00:04:59 1779753899

Just cause that level seems to be working fine for me and it's usually faster.

ed_mercer · 2026-05-23T03:37:31 1779507451

Hi, is it comparable to Opus?

chewz · 2026-05-23T09:33:01 1779528781

V4 Pro is between Sonnet and Opus. But it is cheap. Slow but very cheap. Very diligent.

I run a proxy that allows me switching back to Opus when necessary.

Deepseek isn't like Z.ai, which is a bit cheaper only on the surface. Or like Qwen 3.7 Max, which is Opus-level but very expensive.

Deepseek has been my favorite since V3, but V4 is definitely catching up to newer Anthropic models

itsthecourier · 2026-05-23T09:34:14 1779528854

thank you so much for sharing it

rjh29 · 2026-05-22T20:39:42 1779482382

How does the cost compare using the API vs the $20/month plans with other providers?

I did some back of the envelope calculations and it seems like you would pay $5/month using DeepSeek directly or $15-20 with OpenRouter or similar. But would be interested to hear real world usage.
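
A back-of-the-envelope estimate like this can be written down as a tiny Python function. This is only a sketch: the function name is made up for illustration, the per-million-token prices in the example call are placeholders rather than any provider's real rates, and it ignores things like cache-hit discounts that can change the result a lot.

```python
def monthly_api_cost(in_tokens_per_day, out_tokens_per_day,
                     usd_per_m_in, usd_per_m_out, days=30):
    """Estimate pay-as-you-go spend in USD for a month of steady usage."""
    daily = (in_tokens_per_day / 1_000_000) * usd_per_m_in \
          + (out_tokens_per_day / 1_000_000) * usd_per_m_out
    return daily * days


if __name__ == "__main__":
    # Placeholder usage and prices, NOT real provider rates.
    estimate = monthly_api_cost(2_000_000, 200_000,
                                usd_per_m_in=0.10, usd_per_m_out=0.30)
    print(f"~${estimate:.2f}/month")
```

Plugging in your own measured token counts and the provider's current price sheet gives a number you can compare directly against a flat subscription.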

0xbadcafebee · 2026-05-22T22:26:17 1779488777

It is still more expensive per-request than the common Anthropic and OpenAI subscriptions, but the math changes a lot based on your specific use case. https://codeberg.org/mutablecc/calculate-ai-cost/src/branch/...

But as usual, there are far cheaper subscriptions with higher limits than Anthropic and OpenAI, that also provide DeepSeek v4 Pro. So you should use those subscriptions first until you max them out, then look at a different subscription.

iammrpayments · 2026-05-23T07:08:36 1779520116

I don’t even use Claude that much and was hitting limits on the $20 plan using Sonnet. I’ve deposited $5 with DeepSeek and haven’t hit the limit after spending 60 million+ tokens. So no way it’s more expensive.

nchmy · 2026-05-24T06:29:12 1779604152

The link you shared is just a large table of data, which is hard to browse on a phone.

Could you please elaborate on the far cheaper subscriptions that we should be using?

stavros · 2026-05-22T22:52:21 1779490341

I've been using it pretty extensively over a month and I'm at maybe $7. It thinks for quite a while, but the results have been better than Sonnet for me.

maxdo · 2026-05-22T21:22:14 1779484934

I'm curious what tasks you tested it for. I'm working on a coding agent that writes code dynamically on request for customers. I'd say the code itself is very simple and aggressively cached and patternized, e.g. we add lots of hints to the system.

The only model families that really worked were Claude and OpenAI; surprisingly, for tasks that need more speed, GPT 5.4 is very impressive. DeepSeek was very average, performing somewhere in Gemini Flash 3.0 territory.

thisisit · 2026-05-22T19:08:14 1779476894

I am curious - Is there a way to switch between models depending on the task? Because I believe Deepseek V4 is not multimodal and it will be good to switch back to Claude if vision or other capabilities are required.

mewse-hn · 2026-05-22T20:34:16 1779482056

I was looking into something similar because I wanted to test a local model for doing basic coding and a smart model (deepseek) for planning.

It's basically not possible with claude code, the api endpoint is a single environment variable and whatever models are on that endpoint are what's available.

HOWEVER, if you run a proxy like LiteLLM, you can configure it to send requests to different api endpoints on the back end and expose them as different "models" on the front end, then configure claude code to switch between those virtual models.
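
The routing idea described above can be sketched in a few lines of Python. To be clear, this is not LiteLLM's API or configuration format; `Backend`, `ROUTES`, and `route` are hypothetical names, and the local endpoint URL and local model name are placeholders. It only shows the core trick: the client picks a "virtual" model name, and the proxy maps it to a real endpoint plus a real upstream model.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Backend:
    base_url: str        # where the proxy forwards the request
    upstream_model: str  # model name the backend actually understands


# Hypothetical routing table: virtual model name -> real backend.
ROUTES: Dict[str, Backend] = {
    "planner": Backend("https://api.deepseek.com/anthropic", "deepseek-v4-flash"),
    "local-coder": Backend("http://localhost:8000/v1", "my-local-model"),
}


def route(request: dict) -> Tuple[str, dict]:
    """Pick a backend from the requested virtual model and rewrite the body."""
    backend = ROUTES[request["model"]]  # raises KeyError for unknown names
    upstream_request = {**request, "model": backend.upstream_model}
    return backend.base_url, upstream_request
```

A real proxy would also have to handle authentication, streaming, and translation between API formats when the backends don't speak the same protocol; the sketch leaves all of that out.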

thisisit · 2026-05-22T21:16:04 1779484564

Found this: https://github.com/farion1231/cc-switch

It allows for switching models in Claude Code.

mewse-hn · 2026-05-22T21:21:05 1779484865

Right that says it has a proxy feature so it can probably do what I was describing with LiteLLM

mvanbaak · 2026-05-23T00:47:38 1779497258

Check out the project called superpowers. It can use different models for different agents. I use it with opencode to have different models for research, planning, execution, testing, etc.

longsword · 2026-05-22T23:08:58 1779491338

There is a tool called deepclaude, which runs a proxy in the background capable of doing this, by simply doing /model in Claude.

maxdo · 2026-05-22T21:25:36 1779485136

I've been trying that. In reality, every time you try to save money this way, it's not worth it: the cost of a mistake is so high. You can spend 2-3h on just one wrong assumption, and then you've lost your time and all the burned tokens.

firecall · 2026-05-23T02:04:24 1779501864

It seems you can use the Claude Code CLI harness without a Claude Pro subscription now, which I don't think you could before?

I've been using Deepseek v4 with Cline in VS Code as a replacement for Github Copilot, and it's not been too bad.

hbarka · 2026-05-22T21:45:54 1779486354

The npm install of Claude Code has been deprecated since Feb 2026.

Scarbutt · 2026-05-22T18:35:59 1779474959

Surprised Anthropic hasn't done anything to restrict Claude Code from using other providers.

cortesoft · 2026-05-22T18:44:38 1779475478

At this point in the AI wars, it is probably better to have more users of Claude code rather than restrict which LLMs it can connect to. Claude code is probably (currently at least) stickier than the LLM model itself. Getting people into the Claude code ecosystem is worth it.

Later, they can always lock it down more or add Claude LLM only features to it.

wolttam · 2026-05-22T19:06:19 1779476779

The value of Claude Code the harness isn't that great. There's a lot of other good harnesses out there.

crooked-v · 2026-05-22T19:36:45 1779478605

And it gets dragged down by Anthropic actively injecting unhelpful things into prompts without telling users about them (https://github.com/anthropics/claude-code/issues/58262).

rane · 2026-05-22T19:47:04 1779479224

I thought so, and then I tried Opencode and Codex and started to appreciate Claude Code a lot more. They've actually done great work with the small details.

intuxikated · 2026-05-22T23:18:52 1779491932

I actually haven't looked back since trying opencode. The ability to properly see what the agent is doing in tool calls and subagents is really unmatched. CC strips all reasoning and return values, only displaying tool calls, and you're unable to expand a single subagent: it's either expand everything and scroll endlessly, or show everything collapsed with basically no info at all (read x files, ran x commands). It just seems like extremely basic features are missing.

chandureddyvari · 2026-05-22T19:34:27 1779478467

What’s your favourite harness? Are there any benchmarks for harnesses, like LLMs have with SWE-bench Verified?

Mkengin · 2026-05-23T18:00:16 1779559216

There seem to be more and more harness benchmarks out there. Pretty interesting read:

https://neuralnoise.com/2026/harness-bench-wip/

wolttam · 2026-05-22T20:59:34 1779483574

You can check my profile for which one I like most :) I do think there have been efforts to benchmark different harnesses.

Personally I'm not going to choose one harness or another based on +/- a few percentage points in a benchmark. I'm going to use the one that I find the most ergonomic, that isn't too bloated, etc. The models are the primary lever, not the harness.

koolba · 2026-05-22T19:14:26 1779477266

Good or better? Curious which would be in either bucket.

wolttam · 2026-05-22T19:19:24 1779477564

Probably a matter of taste. I prefer the harness I wrote, I don't want to go near Anthropic's bloated mess of a harness with a 10-meter pole.

odiroot · 2026-05-23T15:56:23 1779551783

IMHO the ergonomics of their tooling are not great. I'd rather use Codex or even OpenCode. Configuration alone is very arcane with lacking documentation. Sandboxing/permission system is quite confusing too.

HWR_14 · 2026-05-22T21:42:15 1779486135

It went the other way, you can't use other harnesses to connect to the cheaper versions of Claude. So clearly they think their current moat is Claude Code use, not the LLM itself.

wiradikusuma · 2026-05-22T19:13:07 1779477187

That's interesting. I thought Claude Code is not as good, therefore people want to use Claude model with other alternatives. This is the other way around.

Which begs the question, regardless of the model, which Claude Code alternative is better? (I keep saying "Claude Code alternative" because I don't know the term... LLM CLI?)

flexagoon · 2026-05-22T20:11:42 1779480702

AFAIK the two most popular open source harnesses right now are OpenCode and Pi. They take a pretty different approach, OpenCode includes a lot of features while Pi is very minimal by design and focused on extensibility, to the point where many people are just asking Pi to write a plugin for itself whenever they want it to have a new feature. I personally like Pi's philosophy more and I think its developer justified the choices really well in his blog post:

https://mariozechner.at/posts/2025-11-30-pi-coding-agent/#to... (the pi-coding-agent section)

rjh29 · 2026-05-22T20:44:41 1779482681

Author blocks referrals from HN, weirdly dramatic, especially considering they have 1086 karma here. I wonder what we did to them.

flexagoon · 2026-05-22T22:13:35 1779488015

Oh damn, I haven't noticed because my browser removes the referer header. But I think the image on the block page is a pretty good answer to why he did that.

SturgeonsLaw · 2026-05-23T01:38:20 1779500300

What's the image trying to convey? Genuine question, I just come here to read nerd stuff and I'm not aware of any controversy

flexagoon · 2026-05-23T02:26:36 1779503196

The image shows Garry Tan, the CEO of Y Combinator. He has lately been on a huge AI psychosis streak, bragging about things like "shipping 37000 lines of code every day" and "using Claude Code so much it burned out his USB-C power connectors". He's in a lobster suit because he's talking about OpenClaw, an AI agent assistant which those same AI psychosis types lean into too much by giving it full read-write access to all their life and then getting surprised when it accidentally deletes all of their emails.

Pi's developer is obviously not anti-AI, and he definitely doesn't hate OpenClaw, since it's based on Pi. But there's a growing number of people who take those things too far, and a lot of them are on HN. You can easily find them in the comments of any AI-related post here. I assume that's the type of people the image is portraying.

SturgeonsLaw · 2026-05-30T00:23:03 1780100583

Thank you for the explanation!

wrs · 2026-05-22T19:22:36 1779477756

The common term for a tool that wraps an LLM with a workflow is “harness”.

jijji · 2026-05-23T00:12:37 1779495157

I've seen good results with opencode connected to glm 5.1 on ollama cloud... for $20 a month you get performance similar to what you get with opus 4.7

copperx · 2026-05-22T21:19:53 1779484793

I love oh-my-pi, but I'm not sure if it's "better". Maybe just as good.

g023 · 2026-05-22T21:38:06 1779485886

I use DeepSeek v4 flash with CoPilot and it works pretty good.

jdasdf · 2026-05-24T21:55:35 1779659735

In my experience claude code is kind of shit.

Pi works very well with deepseek though

LaurensBER · 2026-05-22T19:27:14 1779478034

It works very well with OpenCode. My team keeps hitting the 5h limits on other subscriptions and it's pretty good to have Deepseek as a backup. I just put 50 bucks on there and it feels like it'll never run out.

It's not good enough to fully replace any of the frontier models yet but it's definitely great to have as a backup!

lambda · 2026-05-22T17:51:36 1779472296

Why do you need them to provide a coding agent? Just use their model with any off the shelf coding agent. I happen to prefer Pi, but use whatever works for you.

hootz · 2026-05-22T17:56:45 1779472605

Yeah, I'm using Pi with their models through an OpenCode Go subscription and it works pretty well. 10 bucks and V4-Flash is virtually infinite.

alyxya · 2026-05-22T18:15:43 1779473743

I probably have an unfounded assumption that whatever coding agent they make will work really well with their models, better than external harnesses. I don't have a good sense for how all the model + harness combinations compare, nor any good way to compare them myself, but generally believe model companies train their models to work best with their own harness.

wolttam · 2026-05-22T19:08:31 1779476911

I've noticed that models have gotten less finicky with this over time. Harnesses don't need to be complex to get good coding performance from models, they just need to implement some sane primitives for code exploration and editing.

wyre · 2026-05-22T22:08:28 1779487708

It is in the model provider's interest for you to believe this because they get to lock you into their harness and inference. As models get better they will get better at using any harness; it comes down to how well the harness is actually engineered. I highly recommend you take an hour or two and check out Pi to either solidify or change your assumption. The harness is essentially just another developer tool and can be as opinionated, over-engineered, or minimal as anything else. I would think for DeepSeek, especially, their efforts are much better spent researching how to make their LLMs better instead of engineering a harness that might get some marginal gain from being built for their models.

Edit: here is a really good twitter thread about this exact topic: https://xcancel.com/kunchenguid/status/2057700714626105412

apitman · 2026-05-22T18:50:16 1779475816

What's the best way to use it with Pi, OpenRouter?

schaefer · 2026-05-22T20:23:18 1779481398

> What's the best way to use it with Pi, OpenRouter?

I can't claim it's "the best"...

But the Pi.dev and OpenRouter combo is what I'm doing at home, and I love it. Setup was easy, I can use /model to switch between any of the openrouter models and whatever I'm hosting locally via VLLM.

brianwawok · 2026-05-23T00:35:21 1779496521

Open router is a 5% tax? If you use it seriously may as well skip it

schaefer · 2026-05-24T03:44:30 1779594270

I don't have an LLM-positive culture at work. I'm on a bit of an island. Or under a rock.

Anyhow, I'm pulling myself up by my own bootstraps.

For me a 5% overhead is fine... if it gives me better visibility of this rapidly moving field.

the-pellmeister · 2026-05-26T15:26:33 1779809193

OpenCode Go is hard to beat, $10/mo ($5 first month) for up to $60 worth of tokens if used regularly. And no, you don't need to use opencode, they allow any software to use it.

lambda · 2026-05-22T20:20:32 1779481232

I only use local models myself personally. But yeah, OpenRouter would probably be a good option.

lofaszvanitt · 2026-05-22T22:56:50 1779490610

Qwen cli

satvikpendem · 2026-05-22T18:40:40 1779475240

RL with the harness inputs and outputs of users is one of the primary improvers of model performance, a self perpetuating flywheel.

smoe · 2026-05-22T20:27:41 1779481661

Earlier this week I started testing Chinese models on my codebase. I haven’t really looked at interactive coding yet, but more at issue triage, bug auto-fixing, log analytics, etc.

I used DeepSeek, Kimi, GLM, Qwen, and MiMO against GPT-5.5 high as reference, all running in Pi harness without anything installed.

So far, Kimi and MiMO look the most promising to me. I haven’t tested them rigorously enough to make a strong statement, but my first impression is that, in practice, all those models may be less behind on typical daily tasks than people think.

They are a bit “work hard, not smart". Getting to same-ish results more slowly and using more tokens, but at a fraction of the price

try-working · 2026-05-23T02:10:24 1779502224

I just did a little comparison using benchmarks for GPT 5.1 through 5.4 to map out the equivalent capability-level of some of the Chinese models.

Based on these benchmarks, here's a rough mapping:

- Qwen 3.7 ~= GPT 5.3

- Kimi K2.6 ~= GPT 5.15

- DS V4 ~= GPT 5.1

So yes, we have GPT 5 at home now. No need to pay the Legacy Labs anymore.

Here's the benchmark I used since I can't post images here: https://x.com/trydotworks/status/2058004995195490706?s=20

_under_scores_ · 2026-05-22T22:37:42 1779489462

I switched to predominantly using mimo this week, mostly out of curiosity to see how dependent I was on frontier models. Honestly I can't really tell the difference. I would say I work on pretty average codebases with well-known frameworks doing pretty typical things, and my initial impression is that mimo, kimi and deepseek can probably handle what I need more or less the same as gpt5.5 or claude.

c0rruptbytes · 2026-05-22T20:58:37 1779483517

I personally really like DS4 Flash - it's the largest I can run locally with decent speeds and I feel like it's good enough to maintain a codebase with less effort

r0b05 · 2026-05-23T04:18:34 1779509914

What hardware and quant do you run it with?

maxdo · 2026-05-22T21:28:20 1779485300

Maybe I need to give it a second chance. Surprisingly, Kimi 2.6 consistently fails to even generate a valid JSON plan, where Gemma 4 was doing really well, but slowly.

JSR_FDED · 2026-05-23T12:49:49 1779540589

Are you going through OpenRouter or direct? I’ve had nothing short of excellent results from Kimi.

jdboyd · 2026-05-23T02:02:07 1779501727

I would prefer a coding agent to be somewhat independent of the model provider. Providers are trading off on quality, features, and price so frequently, and I don't want to keep changing my agent every time.

I am looking forward to things slowing down and stabilizing. I'm not saying that should happen today, just I am looking forward to it.

gaolei8888 · 2026-05-23T02:42:45 1779504165

I think this will happen much sooner than we thought. Maybe it will happen in next 6 months

akritid · 2026-05-23T15:29:44 1779550184

You can take Codex today and ask it to rewrite itself to work with any API

hawtads · 2026-05-23T03:09:17 1779505757

There is OpenCode and Pi, they both work pretty well

tequila_shot · 2026-05-22T18:24:33 1779474273

You no longer need "their coding agent". You can hook up claude code to use Deepseek. Works perfectly.

minimaxir · 2026-05-22T21:02:13 1779483733

Zed's Agent natively supports a DeepSeek API key now. (do not use it through OpenRouter if you want to save the most cost)

potsandpans · 2026-05-22T20:48:15 1779482895

Give pi a try if you haven't already. Avoid vendor harness lock-in.

vinhnx · 2026-05-23T00:45:10 1779497110

You can use DeepSeek with my coding agent VT Code. Recently I've added DeepSeek V4 Pro and DeepSeek V4 Flash support with all providers, via: Official DeepSeek API, HuggingFace, Ollama Cloud, OpenRouter providers.

> https://github.com/vinhnx/vtcode

zozbot234 · 2026-05-22T18:36:06 1779474966

antirez's ds4-agent works quite fine. It runs on any Apple Silicon device with 96GB RAM or more.

rjh29 · 2026-05-22T20:18:59 1779481139

I wonder how many years it'll take for the API token cost to exceed the money spent on ram.
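
For anyone who wants to put numbers on that, the break-even point is simple division. The sketch below uses a hypothetical helper name and deliberately ignores electricity, resale value, and future price changes, so treat the result as an order-of-magnitude guess at best.

```python
def breakeven_years(hardware_cost_usd, api_usd_per_month):
    """Years of API spend needed to equal a one-off hardware purchase."""
    if api_usd_per_month <= 0:
        raise ValueError("monthly API spend must be positive")
    return hardware_cost_usd / (api_usd_per_month * 12)
```

With a hypothetical $2,000 spent on extra RAM and the roughly $7/month API spend one commenter reports elsewhere in this thread, `breakeven_years(2000, 7)` comes out at about 24 years.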

zozbot234 · 2026-05-22T21:46:39 1779486399

The DS4 folks are unofficially testing ways to run the model with lower performance on lower-RAM machines. Similar efforts are going on with llama.cpp. The results are a bit of a challenge, prefill time tends to explode which is a limitation if you care about agentic workflows.

vrganj · 2026-05-23T07:03:34 1779519814

Anything that runs with 64?

zozbot234 · 2026-05-23T07:18:21 1779520701

You can just try it yourself, it will probably run with a heavy slowdown using SSD offload.

raincole · 2026-05-22T19:47:04 1779479224

All the major coding agents already support DeepSeek.

azinman2 · 2026-05-24T23:48:23 1779666503

And not letting you opt out of being their training data.

cultofmetatron · 2026-05-22T18:25:29 1779474329

open code works with them today. I've been using it fulltime for 2 weeks so far.

sunaookami · 2026-05-22T18:47:24 1779475644

Using it with Pi and can only report good things so far. I'm very impressed by how good it is (also it's way slower than Claude Sonnet and GPT-5.5 and often thinks "too much" before starting).

teekert · 2026-05-23T07:30:16 1779521416

Why not OpenCode? Genuine question, not an expert..

ReptileMan · 2026-05-22T20:20:06 1779481206

Pi, opencode and zed all work amazingly well with deepseek.

Guillaume86 · 2026-05-22T21:40:24 1779486024

You seem to have tried a few things, if you don't mind I have a few questions as someone currently on Claude Code but would prefer to not lock myself in a commercial ecosystem (and their pricing change regarding headless usage is annoying me):

- how do/would you add the WebSearch tool to your harness? pay for a separate service or does deepseek offer something with their subscriptions?

- do pi/opencode support pasting images in prompts?

- how do you handle reading images? deepseek is not multi modal IIRC? do you pay for another model and route to it?

Any of these missing would really annoy me in day to day use...

wyre · 2026-05-22T22:24:08 1779488648

Brave, Exa, and Tavily all offer a free tier for websearch, after that it comes out to like 1¢/search, very easy to ask pi to build a web search tool using any of these providers.

They support image locations like a file or url, but not regular images (opencode desktop might though?)

Both pi and opencode make it very easy to change models so you can easily call to 5.4-mini or whichever multi-modal LLM for reading images. I'm sure you could even create a skill to automate the process too, having the model use the cli to send the photo to the multi-modal and give it back a description.

ReptileMan · 2026-05-22T21:48:07 1779486487

I use them for pure coding, but I think they do curls when needing something from the host machine.

Guillaume86 · 2026-05-22T22:01:29 1779487289

Yes, I'm also using it for coding: I often make the agent use WebSearch in the research phase when deciding on a stack or a library, or to research best/modern practices to achieve something. As for images, I find it super useful to be able to paste snipped screenshots to show the agent when something is wrong in a UI/frontend, or just something I can't copy-paste easily.

linzhangrun · 2026-05-23T03:40:58 1779507658

There already is an open-source deepseek-tui coding agent. Besides, you can always connect to opencode.

jack_pp · 2026-05-22T21:30:51 1779485451

i have done some amazing things for 5 dollars, using opencode. give it a shot, it is incredibly cheap

alyxya · 2026-05-22T17:31:12 1779471072

It could easily be fixed on google's side with a better prompt used for search queries.

alyxya · 2026-05-13T23:16:26 1778714186

I also encountered an issue with my credits. I was previously subscribed to the max plan, claimed credits, then downgraded to the pro plan and noticed I lost my credits. I didn't unsubscribe, just downgraded plans as I wasn't using claude enough to justify needing max.

pycassa · 2026-05-13T23:35:59 1778715359

Yes, their "contract" is insane right now, with so many edge cases giving a poor user experience. When they have so many users, these edge cases also compound. They should simplify things to give peace to their engineers.

alyxya · 2026-05-13T23:10:23 1778713823

This could end up becoming a cat-and-mouse game where users programmatically try to turn their non-interactive usage of Claude Code to appear interactive and Anthropic tries to detect and charge that under API pricing. I don't know if there's a proper solution here because there will always be borderline use cases like using Claude Code on a cloud VM, where it would be nicer to interactively do work through sending and receiving messages on a custom frontend rather than SSHing and using the CLI.

9wzYQbTYsAIc · 2026-05-13T23:38:23 1778715503

I suspect the next things to go will be loops and scheduled tasks, if they keep needing to trim usage to fit in the available compute, now that ‘claude -p’ is essentially gone.

alyxya · 2026-05-11T22:45:46 1778539546

In theory I would expect it to do everything the current frontier models are capable of but with the added benefit of real time interactivity for better collaboration. The biggest benefit may be the real time video input so it can take in that input in parallel with producing outputs steered by the input rather than taking in a video or all images at once and then producing a single output for all of that.

alyxya · 2026-05-11T22:35:47 1778538947

The noteworthy things to me are that the architecture is a transformer that takes in text, image, and audio input and produces text and audio output, all trained together, and it works in near real-time through interleaving inputs and outputs rather than pure generation of the output from a given prompt.

> Time-Aligned Micro-Turns. The interaction model works with micro-turns continuously interleaving the processing of 200ms worth of input and generation of 200ms worth of output. Rather than consuming a complete user-turn and generating a complete response, both input and output tokens are treated as streams. Working with 200ms chunks of these streams enables near real-time concurrency of multiple input and output modalities.

That's probably the main thing that distinguishes it from the multimodal models from other frontier labs as far as I can tell.
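
One way to read the quoted description is as a loop that alternates between consuming one 200ms chunk of input and emitting one 200ms chunk of output, with each output conditioned only on what has been seen and produced so far. The Python sketch below is a toy model of that scheduling pattern under that assumption; `micro_turns` and the `step` callback are hypothetical names, and nothing here reflects the actual model's internals.

```python
from typing import Callable, Iterable, Iterator, List, Tuple

CHUNK_MS = 200  # chunk length taken from the quoted description


def micro_turns(
    input_chunks: Iterable[str],
    step: Callable[[List[str], List[str]], str],
) -> Iterator[Tuple[int, str, str]]:
    """Interleave input and output chunk by chunk.

    `step` stands in for the model: it sees every input chunk received so far
    and every output chunk already produced, and returns the next output chunk.
    """
    seen: List[str] = []
    produced: List[str] = []
    for i, chunk in enumerate(input_chunks):
        seen.append(chunk)            # consume 200ms worth of input
        out = step(seen, produced)    # generate 200ms worth of output
        produced.append(out)
        yield i * CHUNK_MS, chunk, out
```

The point of the toy is that the output stream starts after the first input chunk instead of waiting for a complete user turn; the context is simply the interleaved prefix of both streams.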

NitpickLawyer · 2026-05-12T08:08:15 1778573295

What's really interesting for me about multimodal architectures from the ground up is that we might start to see applications where different modalities are "facets" of the same thing. Like a coding agent that sees "code" + "IDE" + "memory mapping" + feedback from different plugins as different modalities. And it gets to output in them as well - text where it needs to, actions (not <action>call_something(params)</action> like we have today) and so on. Being able to "sit still" until one of the modalities triggers is really interesting.

We can do these things today, but they're "bolted on" as afterthoughts. Yet they work remarkably well. I wonder how well they'd work if trained in this combined regime, from the ground up.

throwaw12 · 2026-05-12T12:59:10 1778590750

> interleaving the processing of 200ms worth of input and generation of 200ms worth of output.

How does this work? Don't LLMs/transformers need whole context to output next chunk of tokens?

alyxya · 2026-05-07T20:57:39 1778187459

I dislike the title because it doesn't clearly state it's a layoff. "Building for the future" gave me the impression that it's about some major new initiative with a roadmap outlining plans.

dang · 2026-05-08T03:37:36 1778211456

Yes. We've since changed the top link to a third-party article. We prefer to do this with corporate press releases* - this is probably the #1 exception to HN's "please post the original source" rule (https://news.ycombinator.com/newsguidelines.html). If anyone sees a better third-party article, we can change it again.

(Edit: it's not really an exception because the purpose of a corporate press release is usually to obscure the main story, which means it's misleading, so by HN rules we should change it.)

(Edit 2: I feel like I should add that this isn't specific to Cloudflare! It's literally a generic problem.)

* https://hn.algolia.com/?dateRange=all&page=0&prefix=true&sor...

Imustaskforhelp · 2026-05-08T04:07:49 1778213269

Thanks for changing this dang, I and all of us really appreciate the work that you do towards hackernews :-D

Have a nice day!

wavemode · 2026-05-07T21:48:55 1778190535

Maybe I've become cynical and jaded, because when I saw the title I immediately thought to myself "oh, Cloudflare's announcing a layoff."

operatingthetan · 2026-05-07T22:03:50 1778191430

The corporate speak isn't working if people instantly know what it means!

ceejayoz · 2026-05-08T00:05:59 1778198759

It's like slurs; an ever-moving target.

FeteCommuniste · 2026-05-08T00:55:37 1778201737

Even so, "Daddy needs a new yacht" might sound too insensitive.

JustSkyfall · 2026-05-07T21:00:46 1778187646

It's interesting how every time there's a layoff, the blog post always has a title like "Preparing for what's next" or "An update on our workforce" or "Getting ready for the agentic era"!

layer8 · 2026-05-08T08:40:59 1778229659

They should make it “Good news, everyone” like in Futurama.

keybored · 2026-05-07T21:24:56 1778189096

Two days ago: “Today I've made the difficult decision to reduce the size of Coinbase by ~14%” (layoffs) https://news.ycombinator.com/item?id=48021368

kristianp · 2026-05-07T21:47:17 1778190437

The title should be something like "Cloudflare reducing workforce by more than 1,100 employees globally".

dang · 2026-05-08T03:49:26 1778212166

Yes, and such titles (whose purpose is to not say the thing) fall under "misleading" in https://news.ycombinator.com/newsguidelines.html.

We've changed the title along with the URL - see https://news.ycombinator.com/item?id=48058224.

strongpigeon · 2026-05-08T03:59:49 1778212789

I’ll never forget how when I was at Google, every email with subject line “An update on X” meant X was getting axed. Like, just say so in the subject line…

FartyMcFarter · 2026-05-08T08:07:49 1778227669

It got to the point where people were sarcastically posting "An update on <myself>" when sending goodbye emails.

ignoramous · 2026-05-08T00:47:24 1778201244

> "Building for the future" gave me the impression that it's about some major new initiative...

If you'll believe them, it indeed is:

  ... [the Leadership at Cloudflare] have to be intentional in how we architect our company for the agentic AI era ... reimagining every internal process, team, and role across the company.

  ... [This layoff is] not a cost-cutting exercise ... [but] Cloudflare defining how a world-class, high-growth company operates.

  ... We don't want to [mass layoff] again for the foreseeable future.

  ... [Cloudflare] cannot rest on the workflows and organizational structures that worked yesterday. We're confident that [Cloudflare] will be even faster and more innovative [after layoffs] ...

dd8601fn · 2026-05-08T03:14:33 1778210073

They're architecting their company for an agentic future? They're reimagining the definition of a world-class, high-growth company? They're not resting on the workflows that worked yesterday? blegh

What the hell does any of that actually mean? Like in real life words? Because that much corporate bullshit really sounds like it is a cost-cutting exercise.

rvz · 2026-05-07T21:30:01 1778189401

This is what the true definition of "AGI" is.

doggo_mate · 2026-05-07T21:02:09 1778187729

Welcome to the corporate world

alyxya · 2026-05-02T06:27:29 1777703249

The blog post was published a couple months ago, and it looks like there hasn't been a follow-up release with the fully trained model. I'm not sure if there's much to take away from an early checkpoint besides the unique architectural choices they made in their model for faster inference.

adrian_b · 2026-05-02T09:24:00 1777713840

Some smaller models from the LFM2.5 family have been published on Huggingface by the end of March, a month ago.

It can be assumed that this larger model takes more time to complete post-training, but it will follow in the near future after those smaller LFM2.5 models.

alyxya · 2026-05-01T09:18:36 1777627116

Despite their attrition, this combined with their cursor partnership is likely going to make them competitive in coding agents soon.

senordevnyc · 2026-05-01T18:12:27 1777659147

If they buy Cursor, I’ll stop using it. I suspect I’m not alone.

hu3 · 2026-05-02T01:29:11 1777685351

If they buy Cursor I might start using it because I'll know the tool will have infinite funding and will be worth my time investment.

Especially because Grok isn't neutered when it comes to security scans.

And it is screamingly fast.