jmcodes · 2026-04-21T21:57:22 1776808642

I don't think they do but you can always use OpenCode or Pi Agent.

hannahstrawbrry · 2026-04-21T22:14:04 1776809644

It's also very easy to configure Claude Code to route to GLM.

jmcodes · 2026-04-21T21:55:08 1776808508

Same, loved them, told my team about them, got them to switch off of Cursor, and now I'm telling them to swap to Codex.

Anthropic really pissed me off with their harness crap. They're well within their rights but their communication over it was enough to get me to swap. I don't need extra hurdles when there's a perfectly valid alternative right there. They don't have the advantage they think they do.

operatingthetan · 2026-04-21T22:07:44 1776809264

I think we are inevitably heading to using the cheap Chinese models like Kimi, GLM, and Minimax for the bulk of engineering tasks. Within 3-6 months they will be at Opus 4.6 level.

robertkarl · 2026-04-21T22:11:18 1776809478

This was literally my task today, before reading this update: try out Qwen 9B locally with pi or opencode on my MacBook, which is a bit memory-constrained at 18GB.

operatingthetan · 2026-04-21T22:14:04 1776809644

Minimax coding plan is $10 a month for roughly 3x the CLI usage allowed by the $20 Claude Pro plan. That would be a good place to start. 200k context though.

jorjon · 2026-04-21T22:26:36 1776810396

MiniMax has its own issues. Server overloads, API errors, and failure to adhere to even the system prompt. It can happily work for hours and get no job done.

sincerely · 2026-04-22T08:41:49 1776847309

Just like me :)

someuser54541 · 2026-04-21T22:22:09 1776810129

Please report back, would be very interested in your findings.

sshine · 2026-04-21T23:09:51 1776812991

I ran OpenCode + GLM-5.1 for three weeks during my vacation. It’s okay. It thinks a lot more to get to a similar result as Claude. So it’s slower. It’s congested during peak hours. It has quirks as the context gets close to full.

But if you’re stuck with no better model, it’s better than local models and no models.

I have to say, OpenCode’s OpenUI has taught me what modern TUIs can be like. Claude’s TUI feels more like it’s been grown than designed. I’m playing around with TUI widgets, trying to recreate and improve that experience.

taikon · 2026-04-21T23:23:24 1776813804

To be clear, was OpenCode better, in your opinion, than Claude Code?

sshine · 2026-04-22T21:24:11 1776893051

Better UI, worse model (GLM), probably slightly worse agentic runtime.

In spite of how glitchy Claude feels, it makes decisions fast.

TacticalCoder · 2026-04-21T23:25:27 1776813927

> I have to say, OpenCode’s OpenUI has taught me what modern TUIs can be like. Claude’s TUI feels more like it’s been grown than designed.

Claude's TUI is not a TUI. It's the most WTF thing ever: the TUI is actually a GUI. It ships a headless browser that, in real time, renders the entire screen, scrolls to the bottom, and converts that to text mode. There are several serious issues and I'll mention two that utterly piss me off...

1. Insane "jumping" around where the text "scrolls back" then scrolls back down to your prompt: at this point, given the crazy hack that TUI is, if you told me the text jumps around because they're simulating mouse clicks on the scrollbar I wouldn't be surprised. If I'm not mistaken we've seen people "fixing" this by patching other programs (tmux?).

2. What you see in the TUI is not the output of the model. That is, to me, the most insane of it all. They're literally changing characters between their headlessly rendered GUI and the TUI.

> Claude’s TUI feels more like it’s been grown than designed.

"grown" or "hacked" are way too nice words for the monstrosity that Claude's TUI is.

Codex is described as a: "Lightweight coding agent that runs in your terminal". It's 95%+ Rust code. I wonder if the "lightweight" is a stab at the monstrosity that Claude's TUI is.

robertkarl · 2026-04-21T23:23:49 1776813829

For what it's worth, here's my experience in the first 10 minutes of using Qwen locally to write some code: https://github.com/robertkarl/local-qwen-first-10-minutes. It includes some token generation numbers and steps to repro.

hank2000 · 2026-04-21T22:14:53 1776809693

how was it? I'm doing this today

robertkarl · 2026-04-21T22:27:46 1776810466

I will report back... but I have to recommend this comment on a post about Qwen 3.6 https://news.ycombinator.com/item?id=47843466 by daemonologist

it goes into detail about llama-server args; quants to try; and layer/kv cache splits. I plan to try the techniques there.

try-working · 2026-04-21T22:26:48 1776810408

Kimi K3 in July-September is the big one.

muyuu · 2026-04-22T00:43:17 1776818597

Kimi 2.6 works roughly like Opus 4.6, when it used to work. Depending on the task, a bit better or a bit worse. And it's MUCH cheaper.

toasty228 · 2026-04-22T12:21:59 1776860519

From this morning: I had a single Go file with about 100 LOC. I asked it to add debug prints; it thought for 5+ minutes, generated ~1M output tokens, and did not actually update my file.

slopinthebag · 2026-04-22T16:32:26 1776875546

Which harness? Did you use OpenRouter?

maxnevermind · 2026-04-22T00:16:09 1776816969

Anthropic will kick and scream, as those are often distilled from their latest models and are cutting into their margin. Though it is not like their hands are clean either; it is just a different type of stealing, an approved one :-)

AussieWog93 · 2026-04-22T22:07:15 1776895635

This is possibly a hot take but recently I've been having about as much luck with Composer 2 in Cursor as I have with Opus 4.6 in Claude Code.

Opus is obviously the better model, but Cursor's "harness" is doing so much heavy lifting in terms of just magically supplying the broader context the model needs to understand the ramifications of its edits.

kzisme · 2026-04-21T23:20:20 1776813620

How challenging are these to set up locally and keep running?

operatingthetan · 2026-04-22T00:18:54 1776817134

Getting them running is easy (check out LM Studio or ask one for some recommendations). The real question is whether you have the hardware to make them run fast enough to be useful.

kzisme · 2026-04-22T04:13:45 1776831225

The min req is probably crazy I assume but I'll take a peek :)

robertkarl · 2026-04-21T22:19:24 1776809964

One thing I enjoy about the Cursor and Codex Mac apps is the embedded preview window. I know it's not as hardcore as the terminal/tmux but it's hella convenient. But Cursor bugs me with the opacity around what model I'm using. It seems to be deliberately routing requests based on their perceived complexity. What draws you to Codex vs Cursor?

jmcodes · 2026-04-10T12:27:20 1775824040

I don't maintain this anymore but I experimented with this a while back: https://github.com/jx-codes/lootbox

Essentially you give the agent a way to run code that calls MCP servers, then it can use them like any other API.

Nowadays small bash/bun scripts and an MCP gateway proxy get me the same exact thing.

So yeah at some level you do have to build out your own custom functionality.
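A minimal sketch of that idea, assuming nothing about how lootbox itself works: tools are registered behind a small gateway, and the agent-written script calls them like any other async API. `ToolGateway`, `issues.list`, and `countBugs` are hypothetical names for illustration, and the handler is an in-memory stand-in for something that would forward to a real MCP server.

```typescript
// Hypothetical sketch: none of these names come from lootbox or an MCP SDK.
type ToolHandler = (args: Record<string, unknown>) => Promise<unknown>;

class ToolGateway {
  private tools = new Map<string, ToolHandler>();

  // Expose a tool under a stable name.
  register(name: string, handler: ToolHandler): void {
    this.tools.set(name, handler);
  }

  // Look the tool up by name and invoke it; unknown names reject.
  async call(name: string, args: Record<string, unknown> = {}): Promise<unknown> {
    const handler = this.tools.get(name);
    if (!handler) throw new Error(`unknown tool: ${name}`);
    return handler(args);
  }
}

const gateway = new ToolGateway();

// In-memory stand-in for a tool that would be backed by an MCP server.
gateway.register("issues.list", async (args) => {
  const all = [
    { id: 1, label: "bug", title: "Crash on start" },
    { id: 2, label: "docs", title: "Fix README" },
    { id: 3, label: "bug", title: "Off-by-one in pager" },
  ];
  return all.filter((issue) => issue.label === args.label);
});

// The kind of script an agent might write: plain code calling the tool as an API.
async function countBugs(gw: ToolGateway): Promise<number> {
  const issues = (await gw.call("issues.list", { label: "bug" })) as { id: number }[];
  return issues.length;
}
```

The point of the shape is that the agent composes tool results in ordinary code (filtering, counting, looping) instead of making one model round trip per tool call.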

jmcodes · 2026-03-17T01:59:40 1773712780

It can and it does, especially combined with skills (context files). It can hit REST APIs with curl just fine. MCP is basically just another standard.

Where it comes in handy has mostly been in distribution honestly. There's something very "open apis web era" about MCP servers where because every company rushed to publish them, you can write a lot of creative integrations a bit more easily.

jmcodes · 2026-03-06T22:12:34 1772835154

Not the guy who made it but I immediately wondered if I could use the intermediate steps with some "outline" mode to help me see things in shapes and finally learn to draw a bit.

jmcodes · 2025-11-03T21:01:44 1762203704

Our entire existence and experience is nothing _but_ input.

Temperature changes, visual stimulus, auditory stimulus, body cues, random thoughts firing, etc.. Those are all going on all the time.

goatlover · 2025-11-03T21:10:14 1762204214

Random thoughts firing wouldn't be input, they're an internal process to the organism.

jmcodes · 2025-11-03T21:15:34 1762204534

It's a process that I don't have conscious control over.

I don't choose to think random thoughts; they appear.

Which is different than thoughts I consciously choose to think and engage with.

From my subjective perspective it is an input into my field of awareness.

zeroonetwothree · 2025-11-04T07:01:30 1762239690

Your subjective experience is only the tip of the iceberg of your entire brain activity. The conscious part is merely a tool your brain uses to help it achieve its goals; there's no inherent reason to favor it.

jmcodes · 2025-10-27T21:21:35 1761600095

I don't agree entirely with this. I know why the LLM wrote the code that way. Because I told it to and _I_ know why I want the code that way.

If people are letting the LLM decide how the code will be written then I think they're using them wrong and yes 100% they won't understand the code as well as if they had written it by hand.

LLMs are just good pattern matchers and can spit out text faster than humans, so that's what I use them for mostly.

Anything that requires actual brainpower and thinking is still my domain. I just type a lot less than I used to.

latchup · 2025-10-28T14:02:08 1761660128

> Anything that requires actual brainpower and thinking is still my domain. I just type a lot less than I used to.

And that's a problem. By typing out the code, your brain has time to process its implications and reflect on important implementation details, something you lose out on almost entirely when letting an LLM generate it.

Obviously, your high-level intentions and architectural planning are not tied to typing. However, I find that an entire class of nasty implementation bugs (memory and lifetime management, initialization, off-by-one errors, overflows, null handling, etc.) are easiest to spot and avoid right as you type them out. As a human capable of nonlinear cognition, I can catch many of these mid-typing and fix them immediately, saving a significant amount of time compared to if I did not. It doesn't help that LLMs are highly prone to generating these exact bugs, and no amount of agentic duct tape will make debugging these issues worthwhile.

The only two ways I see LLM code generation bring any value to you is if:

* Much of what you write is straight-up boilerplate. In this case, unless you are forced by your project or language to do this, you should stop. You are actively making the world a worse place.

* You simply want to complete your task and do not care about who else has to review, debug, or extend your code, and the massive costs in capital and human life quality your shitty code will incur downstream of you. In this case, you should also stop, as you are actively making the world a worse place.

johnisgood · 2025-10-29T11:46:59 1761738419

So what about all these huge codebases you are expected to understand but you have not written? You can definitely understand code without writing it yourself.

> The only two ways I see LLM code generation bring any value to you is if

That is just an opinion.

I have projects I wrote with some help from the LLMs, and I understand ALL parts of it. In fact, it is written the way it is because I wanted it to be that way.

latchup · 2025-11-05T20:08:06 1762373286

> So what about all these huge codebases you are expected to understand but you have not written?

You do not need to fully understand large codebases to use them; this is what APIs are for. If you are adventurous, you might hunt a bug in some part of a large codebase, which usually leads you from the manifestation to the source of the bug on a fairly narrow path. None of this requires "understanding all these huge codebases". Your statement implies a significant lack of experience on your part, which makes your use of LLMs for code generation a bit alarming, to be honest.

The only people expected to truly understand huge codebases are those who maintain them. And that is exactly why AI PRs are so insulting: you are asking a maintainer to vet code you did not properly vet yourself. Because no, you do not understand the generated code as well as if you wrote it yourself. By PRing code you have a subpar understanding of, you come across as entitled and disrespectful, even with the best of intentions.

> That is just an opinion.

As opposed to yours? If you don't want to engage meaningfully with a comment, then there is no need to reply.

> I have projects I wrote with some help from the LLMs, and I understand ALL parts of it. In fact, it is written the way it is because I wanted it to be that way.

See, I could hit you with "That is just an opinion" here, especially as your statement is entirely anecdotal. But I won't, because that would be lame and cowardly.

When you say "because I wanted it to be that way", what exactly does that mean? You told an extremely complex, probabilistic, and uninterpretable automaton what you want to write, and it wrote it not approximately, but exactly as you wanted it? I don't think this is possible from a mathematical point of view.

You further insist that you "understand ALL parts" of the output. This actually is possible, but seems way too time-inefficient to be plausible. It is very hard to exhaustively analyze all possible failure modes of code, whether you wrote it yourself or not. There is a reason why certifying safety-critical embedded code is hell, and why investigating isolated autopilot malfunctions in aircraft takes experts years. That is before we consider that those systems are carefully designed to be highly predictable, unlike an LLM.

godelski · 2025-10-28T17:34:27 1761672867

The best time to debug is when writing code.

The best time to review is when writing code.

The best time to iterate on design is when writing code.

Writing code is a lot more than typing. It's the whole chimichanga.

godelski · 2025-10-28T00:50:39 1761612639

  > I know why the LLM wrote the code that way. Because I told it to and _I_ know why I want the code that way.

That's a different "why".

  > If people are letting the LLM decide how the code will be written then I think they're using them wrong

I'm unconvinced you can have an LLM produce code and you do all the decision making. These are fundamentally at odds. I am convinced that it will tend to follow your general direction, but when you write the code you're not just writing either.

I don't actually ever feel like the LLMs help me generate code faster because when writing I am also designing. It doesn't take much brain power to make my fingers move. They are a lot slower than my brain. Hell, I can talk and type at the same time, and it isn't like this is an uncommon feat. But I also can't talk and type if I'm working on the hard part of the code because I'm not just writing.

People often tell me they use LLMs to do boilerplate. I can understand this, but at the same time it begs the question "why are you writing boilerplate?" or "why are you writing so much boilerplate?" If it is boilerplate, why not generate it through scripts or libraries? Those have a lot of additional benefits. Saves you time, saves your coworkers time, and can make the code a lot cleaner because you're now explicitly saying "this is a routine". I mean... that's what functions are for, right? I find this has more value and saves more time in the long run than getting the LLMs to keep churning out boilerplate. It also makes things easier to debug because you have far fewer things to look at.
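As a rough illustration of "that's what functions are for", here is a sketch where a repeated hand-written pattern (one validator per record type) is replaced by a single function driven by a field list. `FieldSpec`, `makeValidator`, and `validateUser` are made-up names for the example, not from any library.

```typescript
// Hypothetical example: a field list replaces N hand-written validators.
type FieldSpec = { name: string; type: "string" | "number" };

// Build a validator from a declarative spec instead of retyping the checks.
function makeValidator(fields: FieldSpec[]) {
  return (input: Record<string, unknown>): string[] => {
    const errors: string[] = [];
    for (const field of fields) {
      // A missing field has typeof "undefined", so it fails this check too.
      if (typeof input[field.name] !== field.type) {
        errors.push(`${field.name} must be a ${field.type}`);
      }
    }
    return errors;
  };
}

// Each new record type is now one short declaration, not another block of boilerplate.
const validateUser = makeValidator([
  { name: "email", type: "string" },
  { name: "age", type: "number" },
]);
```

When a check is wrong, there is one place to fix it, which is the debugging benefit described above.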

jmcodes · 2025-10-02T14:33:10 1759415590

"It sounds like it could be xyz, but let me double check and I'll let you know in a few minutes/an hour/a day."

OR

"No clue. Let me do some digging first."

Have literally never had this come up in my ten years working in tech. Big picture advice? Don't overthink small things like this.

You'll just make yourself feel and act more unconfident, not less.

jmcodes · 2025-09-30T01:01:09 1759194069

Haha yes, yes it is. I wrote out and implemented that approach in the links below. I've been playing with it for a few hours and I have to say I actually really really like it.

One thing I ran into is that since the RPC calls are independent Deno processes, you can't keep, say, DuckDB or SQLite open.

But since it's just TypeScript on Deno, I can use a regular server process instead of MCP, expose it through the TS RPC files I define, and the LLM will have access to it.

https://github.com/jx-codes/mcp-rpc https://news.ycombinator.com/item?id=45420133
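A toy sketch of why that matters, not taken from mcp-rpc: `FakeDb` is a hypothetical in-memory stand-in for a database handle, `perCallInsert` models an RPC that runs in a fresh process each time, and `DbServer` models a long-lived server process that owns one handle.

```typescript
// Hypothetical stand-in for a database handle such as DuckDB or SQLite.
class FakeDb {
  static opens = 0; // counts how many handles were ever opened
  private rows: string[] = [];

  constructor() {
    FakeDb.opens += 1;
  }

  insert(row: string): number {
    this.rows.push(row);
    return this.rows.length; // rows visible through this handle
  }
}

// Per-call model: each RPC starts from scratch, so it opens its own handle
// and any in-memory state from earlier calls is gone.
function perCallInsert(row: string): number {
  const db = new FakeDb();
  return db.insert(row);
}

// Long-lived model: one server object owns the handle for its whole lifetime,
// so state accumulates across calls.
class DbServer {
  private db = new FakeDb();

  insert(row: string): number {
    return this.db.insert(row);
  }
}

const server = new DbServer();
```

Calling `perCallInsert` twice opens two handles and each sees only its own row, while two calls to `server.insert` share one handle and see both.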

jmcodes · 2025-09-30T00:57:46 1759193866

https://github.com/jx-codes/mcp-rpc

For those of you interested, I wrote out and built a more RPC-centric TypeScript approach to avoid using other MCP servers at all. Would appreciate some thoughts!

https://news.ycombinator.com/item?id=45420133