stingraycharles's comments on Hacker News

Isn’t sales tax only for consumers? I.e., companies reverse-charge sales tax or omit it entirely. Or what is this 2%–10% sales tax you’re referring to?

> Isn’t sales tax only for consumers?

It depends how you define “consumers”. If you mean “those who consume the good subject to the tax, rather than people who resell the good”, then yes, ideally.

If you mean “not businesses” or “individuals but not corporations”, then, no.

> Ie companies reverse charge sales tax or omit it entirely.

Generally, the theory of sales taxes is that people (including corporations) pay sales tax on things they consume as a final good, rather than use as an intermediate good in production or simply resell. The exact way in which that is determined varies somewhat between jurisdictions with sales taxes, but generally (assuming paper is subject to sales tax in a jurisdiction), if you are buying paper to print books that you sell, you don't pay sales tax; if you are buying paper to print internal documents that you use in running the business, you do.


Please don’t use AI to write comments on HN.

huh?

You edited your comment. It originally said something about using regexes being the most important takeaway, and whatnot.

I guess they really do eat their own dogfood and vibe code their way through it without care for technical debt? In a way, it’s a good challenge, but it’s fairly painful to watch the current state of the project (which is about a year old now, so it should be in prime shape).

> is about a year old now, so it should be in prime shape

A 1yo project may be in good shape if written by just one dev, maybe a few. But if you have many devs, I can guarantee it will be messy and buggy. If anything, at 1yo it is probably still full of bugs because not enough time has elapsed for people to run into them.


It's only 510k LoC; at ~100 lines of code a day, this code base would take 23 engineers a year to write. That's assuming 220 working days a year, somewhere civilized.

And I'm sure we all know that when working on a greenfield project you can produce a lot more LoC per day than maintaining a legacy one.

Given that vibe code is significantly more verbose, you're probably talking about ~15 engineers worth of code?

I know those are all silly numbers, but this is just an attempt to give people some context: this isn't a massive code base. I've not read a lot of it, so maybe it's better than the verbose code I see Claude put out sometimes.
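
The back-of-envelope arithmetic can be written out directly, using the figures from the comment above (510k LoC, ~100 LoC/day, 220 working days a year):

```python
# Rough engineer-years estimate for a 510k-LoC codebase,
# using the figures quoted in the comment above.
total_loc = 510_000
loc_per_day = 100              # sustained lines per engineer per day
working_days = 220             # working days per year

loc_per_engineer_year = loc_per_day * working_days   # 22,000
engineer_years = total_loc / loc_per_engineer_year
print(round(engineer_years, 1))  # 23.2
```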


When you say it’s not a massive codebase, I’m curious, what are you comparing it to?

The previous poster was suggesting that after a year the code base would be a mess if humans had written it.

This is a two-pizza-team-sized project, so it's not one where code quality would inevitably spiral out of control due to communication problems.

A single senior architect COULD have kept the code quality under control.


Boris Cherny, the creator of Claude Code, said he uses CC to build CC.

Which makes for an interesting thought / discussion; code is written to be read by humans first, executed by computers second. What would code look like if it was written to be read by LLMs? The way they work now (or, how they're trained) is on human language and code, but there might be a style that's better for LLMs. Whatever metric of "better" you may use.

Just a thought experiment; I very much doubt I'm the first one to think of it. It's probably in the same vein as "why doesn't an LLM just write assembly directly".


LLMs read and write human-code because humans have been reading and writing human-code. The sample size of assembly problems is, in my estimate, too small for LLMs to efficiently read and write it for common use cases.

I liken it to the problem of applying machine learning to hard video games (e.g. Starcraft). When trained to mimic human strategies, it can be extremely effective, but machine learning will not discover broadly effective strategies on a reasonable timescale.

If you convert "human strategies" to "human theory, programming languages, and design patterns", perhaps the point will be clear.

But: could the ouroboric cycle of LLM use decay the common strategies and design patterns we use into inexplicable blobs of assembly? Can LLMs improve at programming if humans do not advance the theory or invent new languages, patterns, etc?


But Starcraft training is not through mimicking human strategies: it was pure RL with a reward function shaped around winning, which allowed non-human and eventually super-human strategies to emerge (such as worker oversaturation).

The current training loop for coding is RL as well, so a departure from human coding patterns is not unexpected (even if a departure from human coding structure would be, as that would require the development of a new coding language).


> It's probably in the same line of "why doesn't an LLM just write assembly directly"

My suspicion is that the "language" part of LLMs means they tend to prefer languages which are closer to human languages than assembly and benefit from much of the same abstractions and tooling (hence the recent acquisition of bun and astral).


The problem with that is that assembly isn't portable, and x86 isn't as dominant as it once was, so then you've got ARM and x86(_64). But you could target LLVM IR if you wanted.

Yes but my point was that they seem to explicitly not care about code quality and/or the insane amount of bloat, and seem to just want the LLM to be able to deal with it.

I've heard somewhere that they have roughly 100% code churn every few months, so yes, they unfortunately don't care about code quality. It's a shame, because it's still the best coding agent, in my experience.

> they unfortunately don't care about code quality.

> It's a shame, because it's still the best coding agent, in my experience.

If it is the best, and if it delivers the value users are asking for, then why would they have an incentive to make further $$$ investments to make it of a "higher" quality if the value this difference could make is not substantial or hurts the ROI?

On many projects I found that this "higher quality" not only failed to deliver more substantial value, but actually hurt the project's ability to deliver the value that matters.

Maybe we are, after all, entering the era of SWE where all this bike-shedding is gone, and the only engineers who will be able to survive in it will be the ones capable of delivering actual value (IME very few per project).


Yes, but as I said, it’s in a way the ultimate form of dogfooding: ideally they’ll be able to get the LLM smart enough to keep the codebase working well long-term.

Now whether that’s actually possible is a second topic.


They explicitly boast about using claude code to write code: https://x.com/bcherny/status/2007179836704600237

That's how you get "oh this TUI API wrapper needs 68GB of RAM" https://x.com/jarredsumner/status/2026497606575398987 or "we need 16ms to lay out a few hundred characters on screen that's why it's a small game engine": https://x.com/trq212/status/2014051501786931427


Just finished looking at Ink here... the frontend world has no shame. Love the gloating about 40x less RAM, as if that amount of memory for a text REPL even approaches defensible. "CC built CC" is not the flex people seem to suggest it is.

From what I understand, the US currently produces more oil than any other country has ever produced.

Guess which nation also has this heavy type of crude oil? Venezuela, which was invaded earlier this year.

If I recall correctly, the US used to have more of this type of oil, but it has depleted, so now they still have all the refineries on the east coast and need to import it.


Because it’s very cheap to tell an LLM to write a blogpost based on a HN thread, and apparently the HN community upvotes this as well.

I don’t think that’s a good comparison. There isn’t anything preventing Anthropic from, say, detecting whether the user is using the exact same system prompt and tool definitions as Claude Code, and calling it a day. That would make developing other apps nearly impossible.

It’s a dynamic, subscription based service, not a static asset like a video.


> detecting whether the user is using the exact same system prompt and tool definition as Claude Code

Why would it be the exact same one? Now that we have the code, it's trivial to have it randomize the prompt a bit on different requests.


Because they want it to be executed quickly and cheaply without blocking the workflow? Doesn’t seem very weird to me at all.

They probably have statistics on it and saw that certain phrases happen over and over so why waste compute on inference.
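
A hypothetical sketch of what such a precompiled client-side check might look like; the phrase list here is invented for illustration and is not Claude Code's actual pattern:

```python
import re

# Invented example phrases; the real pattern is whatever Anthropic shipped.
FRUSTRATION = re.compile(
    r"\b(?:wtf|ffs|this is (?:broken|stupid)|why (?:doesn'?t|won'?t) (?:this|it) work)\b",
    re.IGNORECASE,
)

def looks_frustrated(prompt: str) -> bool:
    """Cheap, synchronous check; no inference call needed."""
    return FRUSTRATION.search(prompt) is not None

print(looks_frustrated("why doesn't this work??"))     # True
print(looks_frustrated("please refactor the parser"))  # False
```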

More likely their LLM Agent just produced that regex and they didn't even notice.

The problem with regex is multi-language support, and how much the regex will bloat if you want to support even 10 languages.

Supporting 10 different languages in regex is a drop in the ocean. The regex can be generated programmatically and you can compress regexes easily. We used to have a compressed regex that could match any placename or street name in the UK in a few MB of RAM. It was silly quick.

Woah. This is a regex use I've never heard of. I'd absolutely love to see a writeup on this approach: how it's done and when it's useful.

You can literally | together every street address or other string you want to match in a giant disjunction, and then run DFA/NFA minimization over that to get it down to a reasonable size. Maybe there are some fast regex simplification algorithms as well, but working directly with the finite automata has decades of research behind it and can probably be more fully optimized.

This was many moons ago, written in perl. From memory we used Regexp::Trie - https://metacpan.org/release/DANKOGAI/Regexp-Trie-0.02/view/...

We used it to tokenize search input and combined it with a solr backend. Worked really remarkably well.
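
For the curious, the Regexp::Trie idea translates into a few lines in most languages. A Python sketch (my own reconstruction of the technique, not the original Perl code): merge the word list into a trie, then serialize the trie as one prefix-sharing regex.

```python
import re

def trie_regex(words):
    """Merge words into a trie, then emit a single compact regex."""
    trie = {}
    for w in words:
        node = trie
        for ch in w:
            node = node.setdefault(ch, {})
        node[""] = {}          # end-of-word marker

    def emit(node):
        if set(node) == {""}:  # only an end marker: nothing more to match
            return ""
        alts, end_here = [], False
        for ch in sorted(node):
            if ch == "":
                end_here = True
            else:
                alts.append(re.escape(ch) + emit(node[ch]))
        if len(alts) == 1 and not end_here:
            return alts[0]
        body = "(?:" + "|".join(alts) + ")"
        return body + "?" if end_here else body

    return re.compile(r"\b" + emit(trie) + r"\b")

pat = trie_regex(["foo", "foobar", "food"])
print(pat.pattern)                      # \bfoo(?:bar|d)?\b
print(bool(pat.search("I ate food")))   # True
```

Because common prefixes are shared instead of repeated, a giant disjunction of placenames compresses dramatically, which is presumably why the UK dataset fit in a few MB.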


I think it will depend on the language. There are a few non-latin languages where a simple word search likely won't be enough for a regex to properly apply.

We're talking about Claude Code. If you're coding and not writing or thinking in English, the agents and people reading that code will have bigger problems than a regexp missing a swear word :).

I talk to it in non-English, but have rules that everything in code and documentation is in English; only when speaking with me should it use my native language. Why would that be a problem?

Because 90% of the training data was in English, and therefore the models perform best in that language.

In my experience these models work fine using another language, if it’s a widely spoken one. For example, sometimes I prompt in Spanish, just to practice. It doesn’t seem to affect the quality of code generation.

It’s just a subjective observation.

It just can’t be the case, simply because of how ML works. In short, the more diverse, high-quality texts with reasoning-rich examples there were in the training set, the better the model performs in a given language.

So unless the Spanish subset had much more quality-dense examples to make up for the volume, there is no way the quality of reasoning in Spanish is on par with English.

I apologise for the rambling explanation; I'm sure someone with ML expertise here can explain it better.


I saw a curious post recently that explored this idea, and showed that it isn’t really the case. The internal layers of the model aren’t really reasoning in English, or in any human language.

Translation in/out of human languages only happens at the edges of the model.

Internal layer activations for the same concept are similar regardless of language, while activations at the top/bottom layers diverge. Meanwhile the pattern is reversed for same language different content.


So we do at least agree on the fact that the quality of the human language <-> embeddings transition depends on how well the target language is represented in the training dataset?

Even if it happens at the edge, on every conversation turn, I'd assume small subtleties of meaning that aren't captured can accumulate over time into significant error.


The agent may be translating to English behind the scenes. There are all kinds of hacks layered on top of the underlying models. But however exactly it works, it’s clearly not the case that these things only make use of Spanish training data when you prompt them in Spanish.

Let's take a language with orders of magnitude less training data, like Nahuatl. The following is what happens if I ask Claude Code in Nahuatl to give me an example of a simple program that I can execute inside a web browser (avoiding any direct keyword prompts like 'Javascript' or 'Hello World'). Interestingly, it responds in Spanish. I doubt that it has any relevant training data in Nahuatl. (Apologies if the Nahuatl is wrong – I just used Google Translate.)

    ¿Uelis techmakas se neskayotl tlen se programa tlen amo ouij tlen ueliskia nijchiuas ipan se navegador web?

    -----

    Entiendo tu pregunta en náhuatl. Aquí un ejemplo sencillo — una calculadora en HTML puro que puedes abrir directo en el navegador, sin servidor:

    <!DOCTYPE html>
    <html>
    <head>
        <meta charset="UTF-8">
        <title>Calculadora</title>
        <style>
        body { font-family: sans-serif; display: flex; justify-content: center; padding: 40px; }
        .calc { border: 1px solid #ccc; padding: 20px; border-radius: 8px; width: 200px; }
        input { width: 100%; margin-bottom: 10px; font-size: 1.2em; text-align: right; }
        button { width: 22%; margin: 2px; padding: 8px; font-size: 1em; cursor: pointer; }
        .wide { width: 47%; }
        </style>
    </head>
    <body>
    <div class="calc">
        <input type="text" id="pantalla" readonly>
        <div>
        <button onclick="tecla('7')">7</button>
        <button onclick="tecla('8')">8</button>
        <button onclick="tecla('9')">9</button>
        <button onclick="tecla('/')">÷</button>
    
        <button onclick="tecla('4')">4</button>
        <button onclick="tecla('5')">5</button>
        <button onclick="tecla('6')">6</button>
        <button onclick="tecla('*')">×</button>
    
        <button onclick="tecla('1')">1</button>
        <button onclick="tecla('2')">2</button>
        <button onclick="tecla('3')">3</button>
        <button onclick="tecla('-')">−</button>
    
        <button onclick="tecla('0')">0</button>
        <button onclick="tecla('.')">.</button>
        <button onclick="tecla('+')">+</button>
        <button onclick="limpiar()">C</button>
    
        <button class="wide" onclick="calcular()">=</button>
        <button class="wide" onclick="borrar()">⌫</button>
        </div>
    </div>
    
    <script>
        const pantalla = document.getElementById('pantalla');
    
        function tecla(val) {
        pantalla.value += val;
        }
    
        function calcular() {
        try {
            pantalla.value = eval(pantalla.value);
        } catch {
            pantalla.value = 'Error';
        }
        }
    
        function limpiar() { pantalla.value = ''; }
        function borrar() { pantalla.value = pantalla.value.slice(0, -1); }
    </script>
    </body>
    </html>
    
    Guarda esto como calculadora.html y ábrelo en cualquier navegador — no necesita servidor ni dependencias. Es un buen punto de partida para aprender HTML,
    CSS y JavaScript.

> it’s clearly not the case that these things only make use of Spanish training data when you prompt them in Spanish.

It’s not! And I’ve never said that.

Anyways, I’m not even sure what we are arguing about, as it’s 100% fact that SOTA models perform better in English; the only interesting question is how much better: is it negligible, or does it actually make a difference in real-world use cases?


It’s negligible as far as I can tell. If the LLM can “speak” the language well then you can prompt it in that language and get more or less the same results as in English.

They literally just have to subtract the vector for the source language and add the vector for the target.

It’s the original use case for LLMs.
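
That's the classic word-vector arithmetic (king - man + woman ≈ queen). A toy sketch with made-up 3-d vectors, purely to illustrate the operation; these are not real model embeddings, and real models don't literally work this way:

```python
from math import sqrt

# Toy 3-d "embeddings", invented purely for illustration.
vec = {
    "dog_en":   [0.9, 0.1, 0.2],
    "perro_es": [0.9, 0.1, 0.8],
    "cat_en":   [0.1, 0.9, 0.2],
    "gato_es":  [0.1, 0.9, 0.8],
}
english = [0.0, 0.0, 0.2]   # pretend "language direction" offsets
spanish = [0.0, 0.0, 0.8]

def cos(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(x * x for x in b)))

# cat_en - english + spanish should land nearest gato_es
query = [c - e + s for c, e, s in zip(vec["cat_en"], english, spanish)]
best = max(vec, key=lambda w: cos(query, vec[w]))
print(best)  # gato_es
```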


Thank you. +1. There are obviously differences and things getting lost or slightly misaligned in the latent space, and these do cause degradation in reasoning quality, but the decline is very small in high resource languages.

Claude handles human languages other than English just fine.

In my experience agents tend to (counterintuitively) perform better when the business language is not English / does not match the code's language. I'm assuming the increased attention mitigates the higher "cognitive" load.

They only need to look at one language to get a statistically meaningful picture into common flaws with their model(s) or application.

If they want to drill down to flaws that only affect a particular language, then they could add a regex for that as well/instead.


Did you just complain about bloat, in anything using npm?

Why do you need to do it on the client side? You are leaking so much information on the client side. And considering the speed of Claude Code, if you really want to do it on the client side, a few seconds won't be a big deal.

Depends on what it's used by. If I recall, there's an `/insights` command/skill (built in, whatever you want to call it) that generates an HTML file. I believe it gives you stats on when you're frustrated with it, and (useless) suggestions on how to "use Claude better".

Additionally, after looking at the source, it looks like a lot of Anthropic's own internal test tooling/debug code (i.e. stuff stripped out at build time) is in this source mapping. There's one part that prompts their own users (or whatever) to use a report-issue command whenever frustration is detected. It's possible it's using it for this.


> a few seconds won't be a big deal

it is not that slow


It looks like it's just for logging, why does it need to block?

Better question: why would you call an LLM (expensive in compute terms) for something that a regex can do (cheap in compute terms)?

Regex is going to be something like 10,000 times quicker than the quickest LLM call; multiply that by billions of prompts.
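
The regex side of that estimate is easy to sanity-check with a micro-benchmark (the pattern here is invented, and the 10,000x figure is the commenter's estimate, not something measured against a real LLM call):

```python
import re
import timeit

# Invented stand-in pattern; a real check would be longer but comparably fast.
pat = re.compile(r"\b(?:damn|ugh|broken)\b", re.IGNORECASE)
prompt = "the build is broken again, please fix the flaky test " * 10

n = 100_000
secs = timeit.timeit(lambda: pat.search(prompt), number=n)
print(f"~{secs / n * 1e6:.1f} microseconds per check")
```

A compiled regex search lands in the microsecond range, versus hundreds of milliseconds for even the fastest network inference call.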


This is assuming the regex is doing a good job. It is not. Also, you could embed a very tiny model if you really want to flag as many negatives as possible (I don't know Anthropic's goal with this); it would be quick and free.

I think it's a very reasonable tradeoff, getting 99% of true positives at the fraction of cost (both runtime and engineering).

Besides, they probably do a separate analysis on server side either way, so they can check a true positive to false positive ratio.


As explained in the article, it’s typical margin cutting.

Yeah this seems to be a very bad idea. Seems like the author had the right idea, but the wrong way of implementing it.

There are actually a few papers that describe how to get faster results and more economical sessions by instructing the LLM to compress its thinking ("CCoT", compressed chain of thought, is one I remember). It basically tells the model to think like "a -> b". There's a loss in quality, though not too much.

https://arxiv.org/abs/2412.13171

