I think that a wasteful but good solution would be to tag each token, rather than use opening/closing tags.
Whatever n-dimensional space the tokens occupy, manually add more dimensions to reflect user/agent and trusted/untrusted input.
It should be much harder for the LLM to fuck up this way if every single word it reads screams "suspicion" or "trust". With tag tokens only at the start, it can just forget them.
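A minimal sketch of the idea, assuming you can widen the embedding layer before the first transformer block; the function name and the two-channel encoding are my own invention, not an existing API:

```python
import numpy as np

def tag_tokens(token_embs: np.ndarray, trusted: list) -> np.ndarray:
    """Append per-token provenance dimensions instead of delimiter tags.

    token_embs: (seq_len, d_model) token embeddings.
    trusted:    one boolean flag per token.
    Returns (seq_len, d_model + 2) with [trust, suspicion] channels
    appended, so the signal rides on every single token rather than
    only on boundary markers the model can forget about.
    """
    flags = np.asarray(trusted, dtype=token_embs.dtype).reshape(-1, 1)
    extra = np.concatenate([flags, 1 - flags], axis=1)  # trust / suspicion
    return np.concatenate([token_embs, extra], axis=1)
```

The point of two redundant channels instead of one is that "untrusted" becomes an active signal on every token, not just the absence of a trust bit.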
I had to fall back to that to deliver anything recently - but the last two months were really comfy with me just saying "do x" and just going on a walk and coming back to a working project.
Claude is still useful, but it feels more like a replacement for bashing on a keyboard than a thinking machine now.
That's like saying there's no point in attending a lecture on "how to get the best out of your time at University" because University courses are taught in spoken language so you could just ask the professors.
The idea that AI can write code like a seasoned software developer but can't use its own tooling, which can be learned through an 11-chapter tutorial, doesn't make any sense.
On that topic, anyone here got a decent local coding AI setup for a 12GB VRAM system? I have a Radeon 6700 XT and would like to run autocomplete on it. I can fit some models in the memory and they run quick but are just a tad too dumb. I have 64GB of system ram so I can run larger models and they are at least coherent, but really slow compared to running from VRAM.
Not the answer that you are looking for, but I am a fellow AMD GPU owner, so I want to share my experience.
I have a 9070 XT, which has 16GB of VRAM.
My understanding from reading around a bunch of forums is that the smallest quant you want to go with is Q4. Below that, the compression starts hurting the results quite a lot, especially for agentic coding. The model might eventually start missing brackets, quotes, etc.
I tried various AI + VRAM calculators, but nothing was as on point as Huggingface's built-in functionality. You simply sign up and configure which GPU you have in the settings [1], so that when you visit a model page, you immediately see which of the quants fit in your card.
Of the open-source models out there, Qwen3.5 is the best right now. unsloth produces nice quants for it and even provides guidelines [2] on how to run them locally.
The 6-bit version of Qwen3.5 9B would fit nicely in your 6700 XT, but at 9B parameters, it probably isn't as smart as you'd expect it to be.
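You can back-of-envelope whether a quant fits before downloading anything: the file size is roughly parameters times bits-per-weight divided by 8. The bpw figures below are approximate averages for llama.cpp K-quants (the mixed-precision formats pad the nominal bit width a little), so treat them as estimates, not exact sizes:

```python
# Approximate effective bits-per-weight for common GGUF quants.
BPW = {"Q4_K_S": 4.6, "Q4_K_M": 4.8, "Q6_K": 6.6, "Q8_0": 8.5}

def gguf_size_gb(params_b: float, quant: str) -> float:
    """Rough GGUF file size in GB for a model with params_b billion weights."""
    return params_b * BPW[quant] / 8

# A 9B model at 6-bit: ~7.4 GB, leaving headroom for KV cache on a 12GB card.
print(gguf_size_gb(9, "Q6_K"))
# A 35B model at Q4_K_S: ~20 GB, so it needs partial CPU offload on 12GB.
print(gguf_size_gb(35, "Q4_K_S"))
```

The ~20 GB figure lines up with the 20.7GB weight size mentioned below for the 35B model.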
Which model have you tried locally? Also, out of curiosity, what is your host configuration?
For autocomplete, Qwen 3.5 9B should be enough even at Q4_k_m.
The upcoming coding/math Omnicoder-2 finetune might be useful (should be released in a few days).
Either that or just load up Qwen3.5-35B-A3B-Q4_K_S
I'm serving it at about 40-50 t/s on an RTX 4070 Super 12GB + 64GB of RAM. The weights are 20.7GB + KV cache (which should shrink soon with the upcoming addition of TurboQuant).
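To see why quantizing the KV cache matters, here's the standard size formula; the layer/head dimensions below are illustrative round numbers, not the actual config of any model in this thread:

```python
def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   ctx_len: int, bytes_per_elem: int) -> int:
    """K and V are each (n_kv_heads, head_dim) per layer per token,
    hence the factor of 2."""
    return 2 * n_layers * n_kv_heads * head_dim * ctx_len * bytes_per_elem

# Illustrative dims: 48 layers, 8 KV heads, head_dim 128, 32k context.
fp16 = kv_cache_bytes(48, 8, 128, 32768, 2)  # 16-bit cache: ~6.0 GiB
int8 = kv_cache_bytes(48, 8, 128, 32768, 1)  # 8-bit cache: half of that
print(fp16 / 2**30, int8 / 2**30)
```

With weights already spilling past 12GB of VRAM, halving a multi-GiB cache is the difference between offloading a few more layers to the GPU or not.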
I am definitely looking forward to TurboQuant. Makes me feel like my current setup is an investment that could pay over time. Imagine being able to run models like MiniMax M2.5 locally at Q4 levels. That would be swell.
Considering how my parents still refer to that area of the world as Yugoslavia, I'm pretty sure the postal system will know how to route it. Will probably be escalated to a human for labeling though.
I found the project on YouTube[1] and wanted to share it - but decided to find something text-based for HN, and in the rush to post I failed to check whether the post was even complete. I should've posted the video instead.
I think that eventually, Win32/WoW64 will be the stable common API for Linux programs - or at least games. I won't be surprised if it outlasts Windows.
It is a solution. Once you do it, your problem is solved; that makes it the solution.
If you aren't willing to go with that, you can stay with Windows and just accept the constant abuse.
As for gaming, I've been on Linux for two years now and I haven't had a single game not work.
And as for a better solution: teach kids. Once I'm an ornery PTA parent, I'm going to push for programming and some sort of *nix to be taught at the school, even if I have to volunteer to do it myself.