And that is revenue only. For the past 15 or so years, most US companies (and especially startups) have talked almost exclusively about revenue. Whereas only profit should matter.

E.g. what good is $20 billion in revenue per year when "OpenAI is targeting roughly $600 billion in total compute spending through 2030"? That is roughly $150 billion in spending per year.
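
Back-of-the-envelope (the even 2027-2030 split is my assumption):

    # Rough run-rate math; the 4-year split is an assumption
    total_compute_spend = 600e9              # reported: ~$600B through 2030
    annual_spend = total_compute_spend / 4   # ~$150B/year
    annual_revenue = 20e9                    # the ~$20B/year figure above
    print(annual_spend / annual_revenue)     # spend runs at ~7.5x revenue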


The startup game is about building assets and then cashing out on them during exit.

Assets are harder to measure. Facebook used to say something silly like every user was worth $100. That sounded ridiculous for a completely free app, but over a decade later the company is worth more than that per user. Revenue is an easier way of measuring assets than profit.

Profit doesn't really matter. It gets taxed. But it's not about dodging taxes; it's because sitting on a pile of money is inefficient. They can hire people. They can buy hardware. They can give discounts to users with high CLTV. They can acquire instead of building. It's healthy to have profit close to $0, if not slightly negative. If revenues fall or costs increase, they can make up for the difference by just firing people or cutting unprofitable projects.

Also when they're raising money, it makes absolutely no sense to be profitable. If they were profitable, why would they raise money? Just use the profits.


Give me a billion and I'll have 500M of revenue in no time by selling dollars at 50 cents.

Why are we treating OpenAI and Anthropic differently than, say, Amazon or Uber? Both of those companies invested in growth for many years before making a profit. Most tech companies in the last 2-3 decades lost money for years before making a profit.

Why are we saying that OpenAI and Anthropic can't do the same?


Two reasons: they roughly broke even while they kept getting investment, and the potential for a quasi-monopoly was obvious.

OpenAI can't claim either.


How did Uber somewhat break even? They lost $34b before making a profit.

Uber was only on a path to monopoly in the US, not worldwide. It’s lost to local competitors in most countries. And it could be disrupted by self-driving cars soon.

OpenAI’s SOTA LLM training smells like a natural monopoly or duopoly to me. The cost to train the smartest models keeps increasing. Most competitors will bow out as they do not have the revenue to keep competing. You can already see this with a few labs looking for a niche instead of competing head-on with Anthropic and OpenAI.


Copying SOTA models, though, is super cheap and doesn’t take long.
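
For context, "copying" here means black-box distillation: train a cheap student model on the big model's outputs. A minimal sketch (every name below is a placeholder, not a real API):

    # Black-box distillation sketch -- all function names are hypothetical
    def distill(prompts, query_teacher, finetune_student):
        pairs = []
        for prompt in prompts:
            answer = query_teacher(prompt)  # one-time API cost per prompt
            pairs.append((prompt, answer))
        # The student is trained to imitate the teacher's outputs
        return finetune_student(pairs)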

How do you distill when OpenAI and Anthropic inevitably move to tasks running in the cloud? E.g. "Go buy this extremely hard-to-get concert ticket for me."

Distilling might only be effective in the chatbot-dominated era. We are about to move to an agent era.

Furthermore, I’m guessing distilling will get harder and harder. The Claude Code leak shows some primitive anti-distillation methods already. There’s research showing that models know when they’re being benchmarked. Who’s to say Anthropic and OpenAI aren’t able to detect when their models are being distilled?


Worse, Google can afford to outspend them in this game and basically run them both out of money.

It's not even remotely comparable. Uber burnt some $30B over a decade or so.

It's not as much as you think. Google is spending $185b on data centers this year alone. Amazon is spending $200b this year. Total capex for big tech is ~$700b in 2026, and we're not including neoclouds, Chinese clouds, and other sovereign data centers.

Since everyone is trying to get compute from anywhere they can, including OpenAI going to Google, it's hard to tell what is used internally vs externally.

For example, it's entirely possible that Google's internal roadmap for Gemini sees it using $600b of compute through 2030 as well. In that case, OpenAI needs to match since compute is revenue.


What is the point - exactly - of profit?

Profit is money you can't find a use for to grow your business, so you give some of it to the government in the form of tax.

Also there is a big difference between operational expenses and capital expenses like building data centers.

I think OpenAI is being very aggressive on the growth vs conservative financial management spectrum, but simply saying "only profit should matter" is wrong.


why should only profits matter? if i had a killer product today that i just need to sell tomorrow, wouldn't you still invest today knowing i'll probably only start to make money tomorrow (or perhaps next week)?

the expectation is that they'll eventually make money. they can't raise forever. startups are only unprofitable for their first few years; most companies that have been around a long while have been profitable

and since they're expected to make a LOT of money, everyone wants a piece of that future pie, pushing up the valuation and amount raised to admittedly somewhat delusional levels like here


> why should only profits matter?

In this case because it's not clear that anybody has actually figured out how to sell inference for more than it costs


It's well known that everyone is making great money on inference. The cost is in training.

"Whether GPT-5 was profitable to run depends on which profit margin you’re talking about. If we subtract the cost of compute from revenue to calculate the gross margin (on an accounting basis), it seems to be about 30% — lower than the norm for software companies (where 60-80% is typical) but still higher than many industries."

(They go on to point out that there are other costs that might mean they didn't break even overall - although I suspect those costs should be partially amortized over the whole GPT 5.x series, not just 5.0)

https://epochai.substack.com/p/can-ai-companies-become-profi...

https://martinalderson.com/posts/are-openai-and-anthropic-re... (with math working backwards from GPU capacity)

"Most of what we're building out at this point is the inference [...] We're profitable on inference. If we didn't pay for training, we'd be a very profitable company"

https://simonwillison.net/2025/Aug/17/sam-altman/

"There’s a bright spot, however. OpenAI has gotten more efficient at serving paying users: Its compute margin—the revenue left after subtracting the cost of running AI models for those customers—was roughly 70% in October, an increase from about 52% at the end of last year and roughly 35% in January 2024."

https://archive.is/OqIny#selection-1279.0-1279.305 (Note this is after having to pay higher spot rates for compute because of higher than expected demand)
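
To make the margin definitions in those quotes concrete (the numbers below are illustrative, not OpenAI's actual figures):

    # Illustrative numbers only, not OpenAI's actual figures
    revenue = 100.0
    inference_cost = 30.0  # compute spent serving paying users
    compute_margin = (revenue - inference_cost) / revenue
    print(compute_margin)  # 0.70 -- the "~70%" compute margin quoted above
    # Training cost sits below this line, which is how "profitable on
    # inference" and "unprofitable overall" can both be true at once.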


not if your product is selling two dollars for one dollar, and as soon as you start to charge more I'll switch to one of your twenty competitors

profit isn't a function of having a killer product, it's a function of having no competition


And why do you think twenty competitors can stay competitive for years to come?

Industries always consolidate and winners emerge. SOTA LLMs look like a natural monopoly or duopoly to me because the cost to train the next model keeps going up such that it won't make sense for 20 competitors to compete at the very high end.

TSMC is a perfect example of this. Fab costs double every 4 years (Rock’s Law). It's almost impossible to compete against TSMC because no one else has the customer base to generate enough revenue to build the next generation of fabs - except those propped up by governments, such as Intel and Rapidus. Samsung is effectively backed by the South Korean government.
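
To put numbers on Rock's Law (the $10B starting cost is an assumption for illustration):

    # Rock's Law: fab cost doubles roughly every 4 years
    cost = 10e9  # assume a $10B leading-edge fab today (illustrative)
    for generation in range(1, 4):
        cost *= 2
        print(f"+{generation * 4} years: ${cost / 1e9:.0f}B")  # $20B, $40B, $80B
    # Three nodes out, a fab costs 8x as much; only the biggest
    # customer base can fund that, which is the TSMC moat.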

I don’t see how companies can catch OpenAI or Anthropic without strong revenue growth.


Google has completely caught OpenAI. Anthropic has a better coding model, but I'm sure Google is working on that too.

"No competition" is a bit extreme. Limited competition, yes, due to competitive advantages.

> Wheras only profit should matter

Profit is money you couldn’t figure out how to spend. During growth, you want positive operating margins with nominal profits. When the company/market matures, you want pure profits because shareholders like money. If you can find a way to invest those profits in new areas of growth, that’s better.


Not sure why you’re downvoted.

Everyone wants to treat OpenAI like a car wash business where they need to make a profit almost immediately. I don’t know why people can’t understand that the industry is in a rapid growth stage and investing the money is more important than making a profit now. The profits will come later.


"Here comes another bubble..."


Bringing the advertising to all of humanity.

> So go back about one year, and we could vote about it before it got into the standard, and some of us voted no. Now we have a much harder problem. This is part of the standard proposal.

Off-topic, but this is a problem in the web world, too. Once something is on a standards track, there are almost no mechanisms to vote "no, this is bad, we don't need this". The only way is to "champion" a proposal and add fixes to it until people are somewhat reasonably happy and a consensus is reached. (see https://x.com/Rich_Harris/status/1841605646128460111)


> Only if you assume that people who train models are stupid

Someone in the chain will be. Even the smartest people buy a lot of their training datasets. What happens when those get contaminated?


You filter them, duh. And you negotiate a contract where the seller bears some of that risk (or you pay less, if they aren't willing to make any such warranties).
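
A minimal sketch of the kind of heuristic pre-filter that implies (the patterns are my guesses; real pipelines add classifiers and provenance checks):

    import re

    # Toy contamination filter -- the patterns are illustrative guesses
    SUSPECT = [
        re.compile(r"as an ai language model", re.I),  # common LLM boilerplate
        re.compile(r"(.{30,}?)\1{2,}", re.S),          # heavy verbatim repetition
    ]

    def looks_synthetic(doc: str) -> bool:
        return any(p.search(doc) for p in SUSPECT)

    corpus = ["ordinary prose ...", "As an AI language model, I cannot ..."]
    clean = [d for d in corpus if not looks_synthetic(d)]  # keeps only the first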

> You filter them, duh.

Filters are also not infallible

> And you negotiate a contract where the seller bears some of that risk

So the training data will be polluted anyway, but "the seller will bear some risk"


> Filters are also not infallible

Why would they need to be?


You want your training data to be clean or contaminated?

A small number of samples can poison LLMs of any size https://www.anthropic.com/research/small-samples-poison

--- start quote ---

In a joint study with the UK AI Security Institute and the Alan Turing Institute, we found that as few as 250 malicious documents can produce a "backdoor" vulnerability in a large language model—regardless of model size or training data volume. Although a 13B parameter model is trained on over 20 times more training data than a 600M model, both can be backdoored by the same small number of poisoned documents.

--- end quote ---
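
What makes that striking is the fraction involved; rough scale (the corpus size is an order-of-magnitude guess):

    # Order-of-magnitude only: the document count is a guess
    training_docs = 500_000_000  # plausible corpus size for a 13B model
    poisoned = 250               # the figure from the Anthropic study
    print(poisoned / training_docs)  # 5e-07 -- half a millionth of the data
    # The finding is that the absolute count, not the fraction, is what
    # matters, so scaling the corpus up doesn't dilute the backdoor away.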


I thought we were talking about model collapse?

Poisoning is a completely different topic.


We were talking about this: https://news.ycombinator.com/item?id=47571715 and literally every single comment under this is talking about that.

It's not a different topic. It's literally the topic of this branch of discussion.



--- start quote ---

> The internet becoming majority bot content basically guarantees this becomes a real problem for the next generation of models.

Only if you assume that people who train models are stupid.

--- end quote ---

And then literally everyone who commented on this, including me, was talking about issues with training data contamination. And you are the only one dismissing it as nothing important that can be easily fixed.


Look at the whole comment, instead of selectively quoting:

> The bigger concern is what happens when AI models start training on AI-generated content at scale. We're already seeing model collapse in research papers where output quality degrades when training data is contaminated with synthetic text. The internet becoming majority bot content basically guarantees this becomes a real problem for the next generation of models.

Model collapse.


Well, and then you have Claude Code which at one point needed 68GB of RAM to run https://x.com/jarredsumner/status/2026497606575398987

:)


> If the trees were in the same space as the panels, they'd be in the middle of the parking space. What you'd have then is not a car park, but just a plain ordinary park.

Sigh. No, it's not. You can, and should, have trees in the middle of parking lots.

Examples (and these are not even good examples):

- https://maps.app.goo.gl/J4Ug8KyFcg8B481z5

- https://maps.app.goo.gl/Dm2faVYNbeWkivNK6

- https://maps.app.goo.gl/7DEYPKQFX8cNPD8n8


> I would go for a 2 or 3 hour walk with my phone using the remote control feature looking every 5 - 10 minutes

2-3 hours "walking" while having to check in every 5-10 minutes?

If I have to check in every 5-10 minutes, I won't taste coffee or hear that there's good music playing.


Just Claude code a push notification feature then

How is this better?

> Claude should be shipped by a custom implementation of

And when that fails for some reason, it will happily write and execute a Python script, bypassing all those custom tools


> So what do you think the difference is between humans and an agent in this respect?

Humans learn.

Agents regurgitate training data (and quality training data is increasingly hard to come by).

Moreover, humans learn (somewhat) intangible aspects: human expectations, contracts, business requirements, laws, user case studies etc.

> Verifiable domain performance SCALES, we have no reason to expect that this scaling will stop.

Yes, yes we have reasons to expect that. And even if growth continues, nearly flat logarithmic growth is just as useless as no growth at all.

For a year now all the amazing "breakthrough" models have been showing little progress (comparatively). To the point that all providers have been mercilessly cheating with their graphs and benchmarks.


> Where did I say that? I didn’t even mention money, just the broader resource term. A lot of business are mostly running experiments if the current set of tooling can match the marketing (or the hype). They’re not building datacenters or running AI labs. Such experiments can’t run forever.

I'm just going to ask that you read any of my other comments; this is not at all how coding agents work, and it seems to be the most common misunderstanding of HN users generally. It's tiring to refute it. RL in verifiable domains does not work like this.

> Humans learn.

Sigh, so do LLMs, in context.
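
In the narrow sense of in-context learning, the pattern is picked up from the prompt alone, with no weight update. A toy example (the prompt content is illustrative, and the generate call is hypothetical):

    # Few-shot prompt: the model continues the pattern without any training
    prompt = (
        "Translate to French:\n"
        "sea otter -> loutre de mer\n"
        "cheese -> fromage\n"
        "hat ->"
    )
    # completion = model.generate(prompt)  # hypothetical call; expect "chapeau"
    # Nothing is written back to the weights; the "learning" lives in the context.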

> Moreover, humans learn (somewhat) intangible aspects: human expectations, contracts, business requirements, laws, user case studies etc.

There are literally benchmarks on this all over the place; I'm sure you follow them.

> Yes, yes we have reasons to expect that. And even if growth continues, a nearly flat logarithmic scale is just as useless as no growth at all.

and yet it's not logarithmic? Consider the data flywheel, consistent algorithmic improvements, and synthetic data [basically: rejection sampling from a teacher model with a lot of test-time compute + high temperature].
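
A minimal sketch of that rejection-sampling recipe (teacher.sample and verify are placeholders, not a real API):

    # Synthetic data via rejection sampling -- all names are hypothetical
    def generate_synthetic(problems, teacher, verify, k=16, temperature=1.0):
        dataset = []
        for problem in problems:
            # Sample k candidates with lots of test-time compute, high temperature
            candidates = [teacher.sample(problem, temperature) for _ in range(k)]
            # Keep only candidates that pass a verifiable check (unit tests, checker)
            dataset += [(problem, c) for c in candidates if verify(problem, c)]
        return dataset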

> For a year now all the amazing "breakthrough" models have been showing little progress (comparatively). To the point that all providers have been mercilessly cheating with their graphs and benchmarks.

Benchmaxxing is for sure a real thing, not to mention even honest benchmarking is very difficult to do, but taking "all of the AI companies are just faking the performance data" to be the story is tremendously wrong. Consider performance on AIME 2025 (uncontaminated data), and the fact that companies have a _deep incentive_ to genuinely improve their models (and then, of course, market them as hard as possible; that's a given). People will experiment with different models, and no benchmaxxing is going to fool people for very long.

If you think Opus 4.6 compared to Sonnet 3.x is "little progress" I think we're beyond the point of logical argument.


Are you aware that LLMs are still the same autocomplete, just with different token-selection strategies, more data, better pre- and post-training, and different settings?

We have all the data now.

I don’t see where the huge leap should come from; as someone said earlier, they still make basic errors.

Models got better through a bunch of soft tuning. Language and abstract reasoning are not really the same thing: there are a lot of very good speakers who are terrible at logic and abstraction.

Thinking abstractly sometimes makes it necessary to leave language and draw, or some people even switch to another programming language to get it.

We’ve seen it with the compiler project: it looks nice, but if you wanted to make a competitive compiler, you’d be about as well off starting fresh.

