ashirviskas · 2026-03-17T12:44:51 1773751491

What if it is the quality of the data? The internet is full of terrible Python/JS, but probably not terrible Elixir.

deflator · 2026-03-17T12:52:16 1773751936

Seems plausible. I used to refer to StackOverflow before LLMs and a good amount of the examples there were flawed code presented as working. If the LLM had less junk in its training then it might benefit even though the volume of training on that language is lower.

ashirviskas · 2026-03-17T09:26:48 1773739608

I found it interesting that Elixir scores so high, but I'm not sure whether I can agree with the cause.

Bolwin · 2026-03-17T10:29:30 1773743370

That benchmark is useless for comparing languages because the tasks are not the same across languages

gostsamo · 2026-03-17T10:03:02 1773741782

how can you argue with so many assertive sentences in the article? they leave no space for critical thinking.

ashirviskas · 2026-03-17T12:51:33 1773751893

I'll admit, my brain was DDoSed by the article, and I thought that maybe posting it here would get us someone with a more DDoS-proof brain to dissect it.

ashirviskas · 2026-03-17T00:22:49 1773706969

> The biggest difference:
> Saturday: Build auth with Claude
> Sunday: Come back, describe next feature
> Claude reads REQUIREMENTS.md, sees existing auth schema
> Builds new feature without touching auth
> vs. the normal experience of Claude rewriting everything

What do you mean rewriting everything?

Once I started structuring my projects properly, it just follows the pattern and doesn't "rewrite everything". It finds things where it expects to find them.

Your project seems to solve a specific flaw in your flow. And it ships as an npm package, which is super suspicious.

EDIT: Oh, it's just a useless product looking for problems to solve just for some $$$ a month.

ashirviskas · 2026-03-02T02:16:58 1772417818

Author does not know what they're talking about.

> In other words, XML tags have not only a special place at inference level but also during training

Their cited source has zero proof of that. XML is in the training data just like Python/C/HTML are. That doesn't mean it's special. And no, you don't need to format your prompts as Python code just because of that.

> In truth, it does not matter that these tags are XML. Other models use ad hoc delimiters (as explained in a previous article; example: <|begin_of_text|> and <|end_of_text|>) and Claude could have done the same. What matters is what these tags represent.

Those strings are just textual representations of the models' special tokens (BOS and EOS markers). What does that have to do with anything this article pretends to know about?

Please don't post such intellectual trash on here :')

Claude analysis of the article:

The author is making an interesting philosophical argument — that XML tags in Claude function as metalinguistic delimiters analogous to quotation marks in natural language, formulaic speech markers in Homer, or recognition sequences in DNA.

The core thesis is about first-order vs. second-order expression boundaries, which is a legitimate linguistic/information-theory concept. But to your actual question — do they understand what tokens are?

No, not in the technical sense you're pointing at. The article conflates two very different things:

1. Tokenizer-level special tokens: things like <|begin_of_text|>, <|end_of_text|>, <|start_header_id|> etc. These are literal entries in the vocabulary with dedicated token IDs. They are reserved in the tokenizer rather than assembled from ordinary text pieces, and the prompt template inserts them at fixed positions. They exist at a fundamentally different layer than XML tags in prompt text.

2. XML tags as structured text within the input — these are just regular tokens (<, instructions, >) that Claude learned to attend to during RLHF/training because Anthropic's training data and system prompts heavily use them. They're effective because of training distribution, not because they occupy some special place in the tokenizer.
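The two cases above can be illustrated with a toy tokenizer. This is only a sketch: the two-entry `SPECIAL_TOKENS` table, its IDs, and the `toy_tokenize` helper are hypothetical, and the word/punctuation split stands in for the learned subword merges a real tokenizer would apply.

```python
import re

# Hypothetical reserved entries; real vocabularies and IDs differ per model.
SPECIAL_TOKENS = {"<|begin_of_text|>": 0, "<|end_of_text|>": 1}


def toy_tokenize(text):
    """Keep special tokens whole; split everything else into ordinary pieces."""
    pattern = "(" + "|".join(re.escape(t) for t in SPECIAL_TOKENS) + ")"
    pieces = []
    for chunk in re.split(pattern, text):
        if not chunk:
            continue
        if chunk in SPECIAL_TOKENS:
            # One vocabulary entry, one token.
            pieces.append(chunk)
        else:
            # Ordinary text: runs of word characters, or single punctuation marks.
            pieces.extend(re.findall(r"\w+|[^\w\s]", chunk))
    return pieces


prompt = "<|begin_of_text|><instructions>Be brief</instructions><|end_of_text|>"
print(toy_tokenize(prompt))
```

In this toy, `<|begin_of_text|>` stays a single piece while `<instructions>` falls apart into `<`, `instructions`, `>`, which is the distinction the analysis is drawing.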

The author notices that other models use <|begin_of_text|> style delimiters and says Claude "could have done the same" but chose XML instead. That's a category error. Claude also has special tokens at the tokenizer level — XML tags in prompts are a completely separate mechanism operating at a different abstraction layer.

The philosophical observation about delimiter necessity in communication systems is fine on its own. But grafting it onto a misunderstanding of how tokenization and model architecture actually work weakens the argument. They're essentially pattern-matching on surface-level similarities (both use angle brackets!) without understanding the underlying mechanics.

OutOfHere · 2026-03-02T08:00:07 1772438407

If an LLM were to struggle to closely follow instructions that weren't wrapped in XML, I would strongly consider it a sign of a poor model reflecting poor model training.

ashirviskas · 2026-02-20T13:45:49 1771595149

Smaller quant or smaller model?

AFAIK it can work with anything, but sharing a vocab solves a lot of headaches, and the better the token probs match, the more efficient it gets.

Which is why it is usually done with same-family models and most often NOT just with different quantizations of the same model.
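Assuming this exchange is about speculative decoding with a draft model (the comments do not name the technique), the point about matching token probabilities can be sketched with a toy greedy version. The `speculative_step` function and the three toy "models" are hypothetical. A real implementation verifies the whole proposal in one batched forward pass of the target model and usually accepts tokens probabilistically rather than by exact match.

```python
def speculative_step(target_next, draft_next, prefix, k=4):
    """One greedy step: the draft proposes k tokens, the target verifies them."""
    # The draft model proposes k tokens autoregressively.
    proposal = []
    ctx = list(prefix)
    for _ in range(k):
        tok = draft_next(ctx)
        proposal.append(tok)
        ctx.append(tok)

    # The target model checks each proposed token in order.
    accepted = []
    ctx = list(prefix)
    for tok in proposal:
        expected = target_next(ctx)
        if tok != expected:
            # First disagreement: keep the target's token and stop.
            accepted.append(expected)
            return accepted
        accepted.append(tok)
        ctx.append(tok)

    # Every proposal matched, so the target contributes one extra token.
    accepted.append(target_next(ctx))
    return accepted


# Toy models over a shared vocabulary of integer token IDs 0..9.
def target(ctx):
    return (ctx[-1] + 1) % 10


def good_draft(ctx):
    return (ctx[-1] + 1) % 10


def bad_draft(ctx):
    return 0 if ctx[-1] == 2 else (ctx[-1] + 1) % 10


print(speculative_step(target, good_draft, [0]))  # all proposals accepted
print(speculative_step(target, bad_draft, [0]))   # stops at the first mismatch
```

Comparing `tok != expected` only makes sense when both models emit IDs from the same vocabulary, and the more often the draft agrees with the target, the more tokens each step yields.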

Zetaphor · 2026-02-21T17:47:28 1771696048

Smaller quant of the same model. A smaller quant of a different family of model would be practically useless and there wouldn't be any point in even setting it up.

ashirviskas · 2026-02-17T22:58:11 1771369091

Someone ping me in 5 years, I want to see if this aged like milk or wine

JSR_FDED · 2026-02-18T00:25:15 1771374315

“Computer, respond to this guy in 5 years”

ashirviskas · 2026-01-27T00:32:53 1769473973

Apple made M3 models with less than 16GB? Man, can't wait till the cheapest model is at least 128GB.

jaredcwhite · 2026-01-27T01:37:17 1769477837

Yeah, M4 was the generation when the minimum got bumped up to 16GB.

ashirviskas · 2026-01-22T22:18:31 1769120311

Another European chiming in: I enjoyed OP's article.

ashirviskas · 2026-01-21T02:17:26 1768961846

Do you also write your bytecode by human hands? At which abstraction layer do we draw the line?

ashirviskas · 2026-01-21T02:15:16 1768961716

> it has, but python being single threaded (until recently) didn't make it an attractive choice for CLI tools.

You probably mean the GIL, as Python has supported multithreading for like 20 years.

Idk if ranger is slow because it is written in python. Probably it is the specific implementation.
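A minimal sketch of the threads-vs-GIL distinction, assuming CPython: threads exist and overlap fine on blocking work, because `time.sleep` (like most blocking I/O) releases the GIL. What the GIL prevents is running Python bytecode on several cores at once. The `fake_io` and `run_threads` names are made up for illustration.

```python
import threading
import time


def fake_io(results, i, delay):
    # time.sleep releases the GIL, much like a blocking read would.
    time.sleep(delay)
    results[i] = i * i


def run_threads(n=4, delay=0.2):
    """Start n threads that each block for `delay` seconds; return results and wall time."""
    results = [None] * n
    threads = [
        threading.Thread(target=fake_io, args=(results, i, delay))
        for i in range(n)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, time.perf_counter() - start


results, elapsed = run_threads()
print(results, f"{elapsed:.2f}s")
```

The four waits overlap, so the wall time should be close to one `delay` rather than four. Swapping the sleep for a CPU-bound loop would lose that overlap on a GIL build.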

embedding-shape · 2026-01-21T09:12:12 1768986732

> You probably mean GIL

They also probably mean TUIs, as CLIs don't do the whole "Draw every X" thing (and usually aren't interactive). That's basically what sets TUIs apart from CLIs.

behnamoh · 2026-01-21T03:48:34 1768967314

Even my CC status line script enjoyed a 20x speed improvement when I rewrote it from python to rust.

foltik · 2026-01-21T05:12:12 1768972332

It’s surprising how quickly the bottleneck starts to become python itself in any nontrivial application, unless you’re very careful to write a thin layer that mostly shells out to C modules.