Seems plausible. I used to refer to StackOverflow before LLMs and a good amount of the examples there were flawed code presented as working. If the LLM had less junk in its training then it might benefit even though the volume of training on that language is lower.
> The biggest difference:
> Saturday: Build auth with Claude
> Sunday: Come back, describe next feature
> Claude reads REQUIREMENTS.md, sees existing auth schema
> Builds new feature without touching auth
> vs. the normal experience of Claude rewriting everything
What do you mean rewriting everything?
Once I started properly structuring my projects, Claude just follows the existing pattern and doesn't "rewrite everything". It finds things where it expects to find them.
Your project seems to solve a specific flaw in your own flow. And it ships as an npm package, which is super suspicious.
EDIT: Oh, it's just a useless product looking for problems to solve just for some $$$ a month.
> In other words, XML tags have not only a special place at inference level but also during training
Their cited source has 0 proof of that. XML shows up in the training data just like Python/C/HTML does. That doesn't make it special. And no, you don't need to format your prompts as Python code just because of that.
> In truth, it does not matter that these tags are XML. Other models use ad hoc delimiters (as explained in a previous article; example: <|begin_of_text|> and <|end_of_text|>) and Claude could have done the same. What matters is what these tags represent.
Those strings are just the text representations of special tokens that mark the beginning and end of a sequence (BOS/EOS). What does that have to do with anything this article pretends to know about?
Please don't post such intellectual trash on here :')
Claude analysis of the article:
The author is making an interesting philosophical argument — that XML tags in Claude function as metalinguistic delimiters analogous to quotation marks in natural language, formulaic speech markers in Homer, or recognition sequences in DNA.
The core thesis is about first-order vs. second-order expression boundaries, which is a legitimate linguistic/information-theory concept.
But to your actual question — do they understand what tokens are?
No, not in the technical sense you're pointing at. The article conflates two very different things:
1. Tokenizer-level special tokens: things like <|begin_of_text|>, <|end_of_text|>, <|start_header_id|> etc. These are literal entries in the vocabulary with dedicated token IDs. They don't emerge from ordinary text the way other tokens do: they're reserved in the tokenizer and mark structural boundaries (start/end of sequence, message headers) in the training data. They exist at a fundamentally different layer than XML tags in prompt text.
2. XML tags as structured text within the input: these are just regular tokens (<, instructions, >) that Claude learned to attend to during RLHF/training because Anthropic's training data and system prompts heavily use them. They're effective because of training distribution, not because they occupy some special place in the tokenizer.
The author notices that other models use <|begin_of_text|> style delimiters and says Claude "could have done the same" but chose XML instead. That's a category error. Claude also has special tokens at the tokenizer level — XML tags in prompts are a completely separate mechanism operating at a different abstraction layer.
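To make the two layers concrete, here is a rough sketch with a toy tokenizer. Everything in it is made up for illustration: `SPECIAL_TOKENS`, the IDs, and `toy_tokenize` are hypothetical, and real tokenizers use subword schemes like BPE rather than a word/punctuation split. The only point is that a reserved special token stays one atomic unit, while an XML tag in the prompt falls apart into ordinary pieces.

```python
import re

# Hypothetical reserved vocabulary entries with dedicated IDs (illustrative values only).
SPECIAL_TOKENS = {"<|begin_of_text|>": 0, "<|end_of_text|>": 1}


def toy_tokenize(text):
    """Return (kind, piece) pairs: special tokens stay whole, everything else is split."""
    special_pattern = "|".join(re.escape(tok) for tok in SPECIAL_TOKENS)
    pieces = []
    # The capturing group makes re.split keep the special tokens in the result.
    for chunk in re.split(f"({special_pattern})", text):
        if not chunk:
            continue
        if chunk in SPECIAL_TOKENS:
            pieces.append(("special", chunk))
        else:
            # Ordinary text: split into words and single punctuation characters.
            for piece in re.findall(r"\w+|[^\w\s]", chunk):
                pieces.append(("regular", piece))
    return pieces


tokens = toy_tokenize(
    "<|begin_of_text|><instructions>Be brief</instructions><|end_of_text|>"
)
for kind, piece in tokens:
    print(f"{kind:8} {piece}")
```

In this toy setup `<|begin_of_text|>` comes out as a single "special" entry, while `<instructions>` comes out as three "regular" pieces (`<`, `instructions`, `>`), the same as any other text in the prompt.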
The philosophical observation about delimiter necessity in communication systems is fine on its own. But grafting it onto a misunderstanding of how tokenization and model architecture actually work weakens the argument. They're essentially pattern-matching on surface-level similarities (both use angle brackets!) without understanding the underlying mechanics.
If an LLM struggled to closely follow instructions that weren't wrapped in XML, I would take that as a strong sign of poor training.
Smaller quant of the same model. A smaller quant of a different family of model would be practically useless and there wouldn't be any point in even setting it up.
They also probably mean TUIs, since CLIs don't do the whole "Draw every X" thing (and usually aren't interactive); that's basically what sets TUIs apart from CLIs.
It’s surprising how quickly the bottleneck becomes Python itself in any nontrivial application, unless you’re very careful to write a thin layer that mostly calls into C modules.
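A small way to see the effect for yourself, assuming CPython (where the built-in `sum()` is implemented in C): the two functions below compute the same result, but one loops in the interpreter and the other hands the loop to C. The function names and the data size are arbitrary choices for the sketch, and the timings will vary by machine and Python version.

```python
import timeit


def loop_sum(values):
    """Sum in a pure-Python loop: every iteration is executed by the interpreter."""
    total = 0
    for v in values:
        total += v
    return total


def builtin_sum(values):
    """Same result, but the loop runs inside the built-in sum()."""
    return sum(values)


data = list(range(100_000))

# Time 20 runs of each; absolute numbers depend on the machine.
loop_time = timeit.timeit(lambda: loop_sum(data), number=20)
builtin_time = timeit.timeit(lambda: builtin_sum(data), number=20)

print(f"pure-Python loop: {loop_time:.4f}s")
print(f"built-in sum():   {builtin_time:.4f}s")
```

The built-in version usually comes out noticeably ahead, which is the "thin layer over C" idea in miniature.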