Hacker News | materielle's comments

There’s going to be lots of documentation. It will be AI generated and no human will ever read it. But there will be a lot of it.

I'm about to leave a shallow comment, but I am a bit skeptical of the supposed drop in inference costs. If AI labs saw a lot of potential there, they'd surely be bragging about it non-stop? So the fact that publicly available information is conflicted is probably a sign that at the very least, the numbers aren't amazing.

Yes I know there's no evidence and this is lazy reasoning. But there's probably a bit of truth to this line of thought.


Why on earth would AI labs be bragging about how little the product they sell actually costs them to make? You don't want to do anything that reduces its perceived value to the user, that might make them less willing to pay for it.

Also, inference costs are bound to go way down with more optimized architectures. GPUs are fundamentally not great at inference. No platform where the weights are streamed from a large pool of memory is. If the models ever quiet down, there will be massive step changes in cost/token, energy/token and tokens/second, as models are etched into silicon à la https://chatjimmy.ai/


A couple of years ago Altman was saying the price of AI compute is going to drop 90% year over year or something like that, so I don't think they're nervous about talking about lowering their costs. They probably just haven't been able to lower their costs.

You have to keep in mind that about 99% of their announcements are targeted towards investors (their most important revenue source..), so they're not going to be afraid to mention metrics that make the business look better.


Jevons paradox. Cheaper tokens do not mean we will spend less.

Cheaper tokens mean the company's margins increase, which would be valuable for investors to hear.

The main limit to my token spend right now is that I'm running out of hours in a day.

Ah yes, Sam “Not Consistently Candid” Altman

Oh, is that the guy that sold Loopt by claiming it had hundreds of thousands of users and it turned out to have 500 DAU after his exit?

Yep, the very same scammer. Wonder if he's lying about OpenAI too? Maybe about a person blowing a metal instrument?

He lied. He's good at that.

Why would any company brag about their margins? Yet they do, to attract investors.

The key AI labs are not public companies, they are at liberty to brag about their margins to potential investors in private.

And investors will leak such claims quickly enough that this reasoning cannot plausibly hide big secrets.

It's not a big secret. If you just do the math yourself, it's easy to compute that inference doesn't cost all that much. People just see all the capital investment going around and all the new data centers being built, see that it's spent on "AI", put two and two together and get a three, or "clearly serving AI requests costs an arm and a leg".

The 1 they were missing is that AI requires both training and inference, and training is by far the expensive part. And that in principle you can stop training at any point and keep using the models as they are. (But that means that if other companies keep improving their models, you'll be left behind...)

In contrast, inference is fairly cheap and all the providers have great margins on it. Eventually either investment in training stops having commensurate impact on model quality, and people stop doing that and instead concentrate on making inference faster and even more efficient. Or if that doesn't happen, things will get very weird very quickly.


The market already shows where it will go.

If you want a frontier model, you will pay more for inference to essentially fund the expensive training.

If you don’t need a frontier model, you will get dirt-cheap inference, which will eventually approach the cost of the electricity spent per token.


This is technically correct, but practically false.

They can't stop training as then the AI's knowledge will become out-of-date very quickly. Their knowledge stops the day you stop training.


Yes it seems that this discussion that has sparked such controversy involves an already well defined concept in business.

Net margin versus gross margin.

Net margin shows profitability after deducting all expenses, while gross margin deducts only the cost of goods sold. Booking model training costs as a one-time fixed expense rather than as cost of goods sold produces a much better gross margin.

This is known as COGS reclassification or classification shifting and is a common tactic to mislead investors.

This is why analysts look at Free Cash Flow Margin.

WorldCom and MicroStrategy did this before the Dotcom Bubble imploded.
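
To make the gross-versus-net distinction concrete, here is a minimal sketch with made-up figures. The revenue, serving cost, and training cost below are hypothetical and not any lab's actual numbers; the point is only that the same business can show a healthy gross margin and a negative net margin, depending on where training is booked.

```python
def margin(revenue: float, costs: float) -> float:
    """Return the margin as a fraction of revenue."""
    return (revenue - costs) / revenue

# Hypothetical annual figures in billions of dollars (illustrative only).
revenue = 10.0
inference_cost = 4.0  # cost of serving requests, counted as COGS
training_cost = 9.0   # model training, kept out of COGS in this scenario

# Gross margin deducts only COGS; net margin deducts every expense.
gross = margin(revenue, inference_cost)
net = margin(revenue, inference_cost + training_cost)

print(f"gross margin (training excluded from COGS): {gross:.0%}")
print(f"net margin (all expenses deducted): {net:.0%}")
```

With these assumed inputs the two margins point in opposite directions, which is why the classification of training spend matters so much to how the business looks.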


> If you just do the math yourself, it's easy to compute that inference doesn't cost all that much.

Show us your work, then. If it's so easy to do, this should be a trivial request to accommodate, no?


Just look at large open weights models being served by inference providers.

Kimi 2.6 is a 1 trillion total / 32B active parameter model that's something comparable to Sonnet. Sonnet's API pricing is $5 in, $15 out per million tokens. Deepinfra serves Kimi at $0.75 in, $3.50 out, and about the same at openrouter. So you're looking at a 4-7x multiple that Anthropic is charging compared to market rates that any plebe can get with a credit card.
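
The 4-7x range can be checked directly from the prices quoted in the comment. This is a quick sketch that uses only those four numbers; the dictionary and variable names are illustrative.

```python
# Prices in dollars per million tokens, as quoted in the comment above.
sonnet = {"input": 5.00, "output": 15.00}
kimi_deepinfra = {"input": 0.75, "output": 3.50}

# Ratio of the Sonnet price to the Kimi price for each token type.
ratios = {kind: sonnet[kind] / kimi_deepinfra[kind] for kind in sonnet}

for kind, ratio in ratios.items():
    print(f"{kind}: {ratio:.1f}x")
```

The input ratio lands near the top of the quoted range and the output ratio near the bottom. Note that this compares list prices only; it says nothing about either provider's actual serving cost.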


I'm not sure just how good that looks for Anthropic/OpenAI.

4-7x isn't a tiny markup, but how does that compare to high-margin internet businesses like AdSense? Meta and Google do hundreds of billions in ad revenue a year, and after taking out the publisher's portion (60-80% per some searching), I wonder what the ratio of the remaining tens-of-billions is against the compute cost and headcount required to run it.

And how much room for maintaining or improving that margin do they have if the cheap competitors also continue getting better? Is there a "good enough" point where the easier inference tasks are all moving to vendors massively undercutting them, and then they don't have the volume necessary to justify spending on further cutting-edge development?


> Kimi 2.6 is a 1 trillion total / 32B active parameter model that's something comparable to Sonnet.

No it's not. On some rigged paper maybe. Some such benchmarks say all models group together, which they clearly do not.

> Sonnet's API pricing is $5 in, $15 out per million tokens. Deepinfra serves Kimi at $0.75 in, $3.50 out, and about the same at openrouter. So you're looking at a 4-7x multiple that Anthropic is charging compared to market rates that any plebe can get with a credit card.

That's not saying much. You can get "cloud" at AWS or you can get a VPS, and there is likely a 10x price difference between them. They are not the "same" product. And while AWS costs more, it doesn't have 7x margins either.


I’m wary of “has not been leaked in a way that was picked up in public news” as proof or disproof of anything.

this is changing soon

Not really, how much of a public company are you when 5% of your capital is public?

That doesn't matter for the legal requirements.

The short and only kind of wrong version is:

In the US, companies are not allowed to unfairly privilege some investors over others by giving them access to secret information that would let them judge the future prospects of the company. (Except in all the ways they can, but these usually involve some kinds of insider trading rules.) Private companies can handle giving out secrets to investors by literally writing a memo and mailing it to all their investors, if they want to give out some secrets to one of them.

Public companies cannot do that, even if they knew who all their investors were, but must instead consider every member of the public a potential investor, even if they don't already own the stock. Because of this, when public companies want to reveal material information about their future prospects, they must reveal it to everyone.


The percentage is irrelevant for this discussion. As soon as you’re public, you need to report detailed financial numbers.

Plus, you have to do real GAAP accounting, not their made up metrics.

Besides the legal requirement, the reason these companies go public is often to provide liquidity for early investors or employees. So they do want to have as good a margin story as they can, at least in terms of unit margin.

That's changing with this administration though. Reduced reporting cycles reduce transparency.

It won't impact the disclosure of key business details because it doesn't reduce the level of disclosure needed in the S-1 or the 10-K.

This is an interesting anomaly in the US. In the civilised world all corporations have to file public accounts, as the price for their limited liability. The detail and audit requirements depend on the size, turnover, staff numbers etc. This is because the shareholders are not the only stakeholder. The company's creditors, for instance, who are exposed to the limited liability, have a right to see what they are lending to.

To answer the sibling comment, all of these public accounts follow local GAAP or IFRS.

The US still astounds me with its willingness to allow corporations to rip people off!


Creditors in the US can make visibility into financials a requirement for financing if they want. Protecting creditors isn’t a good argument for public reporting.

What about potential employees, can they look? The local community that consents to let the company build and operate in their town? How does that help, if they don't have to follow GAAP anyway?

Why are those things relevant to either employees or a town?

Most of the US is at-will so the financial health of the company is unlikely to be the reason you’ll suddenly lose a job.

Same for a town, if you’re structuring a deal that has counterparty risk then you mitigate the risk. If an employer is just leasing some office space in your town, why in the world would you ever even think you had the need to look at their financials?


What are the arguments against public reporting?

As a consumer you are often sending deposits or even the full cost of goods to companies some time before you receive those goods (in effect you become a creditor). You are also dependent upon some of those companies for service and repairs. It seems reasonable that you can check the finances of a company you are creating a business relationship with, I know in the past I've checked company statements.

You are unlikely to have significant enough sway to force that kind of disclosure. Small businesses as consumers have less legal protection and are similarly unlikely to be able to make disclosure a precondition of a deal.


So what. As a customer you can insist on seeing audited financial statements as a condition of purchasing, or purchase from another vendor, or do without. No problem.

Or, in the real world, running a limited liability company could come with some sensible reporting requirements?

Why? And what's sensible about it?

Isn't there a limit on the public markets where if a company has less than a certain percentage of its ownership traded publicly then it is no longer a public company and therefore de-listed?

I remember hearing about a guy trying to squeeze out short sellers of his own company but ended up effectively taking his company private because he bought out like 95% of all the shares.

I wonder how that aligns to these small releases of stock for the public.


There is no legal minimum free-float requirement before deregistration in the US; however, different exchanges have different rules.

Essentially, a stock has to stay above $1 per share and have a minimum market cap of $15m, a minimum of 400 shareholders, and "adequate" liquidity. If it meets those 4 criteria, it's essentially not at risk of deregistration.


Growing companies don't brag about their margins, they brag about their growth and revenue. Margin talk is for when you're a mature company squeezing out every bit of profitability you can - if anything it would be a negative sign to be worrying about your margins when you're supposed to still be growing and innovating.

I mean, did anyone expect them to not have margins? Why keep it secret?

> Why on earth would AI labs be bragging about how little the product they sell actually costs them to make? You don't want to do anything that reduces its perceived value to the user, that might make them less willing to pay for it.

Wouldn't they be bragging about it to investors? It feels like something that would matter a lot to them, and at least OpenAI kinda feels desperate to find them.

There's also the small question about whether a drop in inference cost would actually change anything about profitability, when training seems to get exponentially more expensive.


Because companies that want to go public need to look profitable or potentially profitable. And before they go public they have to release real, actual, legally demonstrable numbers for their costs and revenue anyway.

When they actually file to go public, their numbers will be intensely scrutinized. That's all that global headlines will be talking about for weeks on end. Why would they create forward expectations before it's necessary?

Of course they don't want to create forward expectations in a volatile macro environment, with the public listing being 6 months out.


Because the most important thing for any pure play AI company right now is to prove they are a viable company. And sure they have proved they can make billions, but also that they can lose billions more. They are going to need even more money and to prove to the next round of investors at an even higher valuation that they are a viable business they need to show not that they can generate revenue, but that they can one day turn a healthy profit. And that is the trillion dollar question.

I doubt having to replace every single chip in your data center every time you release a new model will bring down costs.

Went to that URL asked one question - "how is this different from other AI" and it took 598/6144 tokens, not sure what that means.

Not super clear from the site itself, but this LLM is running on specialized silicon implementing just it. So has super low energy use and blazing speed.

See https://taalas.com/products/

Edit: updated link


Incredible increase over Nvidia! Need to read more.. Thanks!

Because they can think more than one quarter into the future? Why on earth would someone adopt something into their core workflow that was fantastically unprofitable? Uncertainty and business don’t mix. Most people aren’t hype-eating bacteria that only care about maximizing their next paycheck.

One reason is that all the code you write with this goes in your private git. If using AI no longer is possible because of cost, you can still profit a lot from what you did with it before.

For consultants? Sure. What percentage of contractors are consultants? And is that better than going with something in your stack that’s sustainable even if it’s not totally optimal? I’d wager most would say no.

Regardless of profitability there will always be multiple good LLM vendors as well as open-source alternatives (slightly worse but still pretty good). If one vendor fails then it's easy to switch your core workflow to a competitor.

On an individual basis for coding? Sure. If you’re a significant business with agents that do more nuanced work, which is the only kind of customer that will let any of these companies pay back those trillions of dollars as quickly as they need to to stay alive, these are not fungible services.

I wonder if inference costs will go down...

or will it be like Microsoft Office, where the software bloats to use/fill current hardware?

(and in this case bloats might mean better thinking or pulling in more data)


If inference costs drop 90% or whatever, wouldn't that mean a massive write-off of hardware before it has given any return?! Chinese competitors and others are snapping at their heels and would also benefit from such a reduction in cost.

> Why on earth would AI labs be bragging about how little the product they sell actually costs them to make?

Investor confidence. They have a bit of a need for cash (also an interesting part of the profitability discussion of course).

> Also, inference costs are bound to go way down with more optimized architectures

I agree. Jimmy is incredible, I wonder what non-toy use cases they have. Surely they’ll come out with updated chips soon.

That said, I was apparently a bit over-excited for Groq and Cerebras. I thought they’d quickly dethrone Nvidia for inference, but not so far. Even the GPT spark trial isn’t seeming to go far.


Inference has traditionally been far less expensive than training. One public example is the fact that hobbyists can run StableDiffusion ($600k training costs[1]) on their personal computers.

Speaking to your point, inference being dramatically less costly than training would not be seen as a delta from the norm. The model of providing inference for anything near the operational costs (like a utility would) would be the delta from the norm if it were true.

[1] https://x.com/emostaque/status/1563870674111832066


The difference between training and inference is 1) one has to keep intermediate results for the backward pass in training and 2) computation for training doubles because of the backward pass.

Training is also done over batches, which increases memory requirements by several orders of magnitude. This is why training needs costly compute.

One of the ways out of this unfortunate situation is to use something like Stochastic Average Gradient Descent [1]. Examples there are mostly concerned with regularized logistic regression, which makes the problem more or less convex. Neural networks are inherently non-convex. Still, maybe some ideas from there can be utilized in the context of neural networks, like the use of an estimated Lipschitz constant to derive curvature and an appropriate learning step.

  [1] https://www.cs.ubc.ca/~schmidtm/Courses/540-W19/L12.pdf

So one way to think about it is roughly,

Training is inference + backwards pass (~2x inference cost) + activations (vram overhead) + optimizer (vram overhead) + gradients (vram overhead).


Multiply "inference + backwards pass (~2x inference cost) + activations (vram overhead)" by batch size (thousands) to get to the actual RAM and compute cost. An optimizer like Adam adds only two or three model-sized buffers of overhead.

And last, but not least, you need only one hidden layer kept in RAM for inference, but you need all of them (61 for Deepseek models) kept in RAM for computing gradient for one sample.
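
The decomposition in this subthread can be turned into a rough back-of-envelope estimate. The sketch below makes several assumptions that are not from the thread: everything is stored in fp32, Adam keeps two extra buffers per parameter, and the 7B parameter count is a hypothetical example. It also reads "~2x" as the backward pass costing about twice a forward pass. Activation memory is deliberately left out because it depends on architecture, batch size, and sequence length.

```python
BYTES_PER_VALUE = 4  # assumption: weights, gradients and optimizer state in fp32

def inference_weight_bytes(n_params: int) -> int:
    """Memory for the weights alone, which is the floor for inference."""
    return n_params * BYTES_PER_VALUE

def training_state_bytes(n_params: int, optimizer_buffers: int = 2) -> int:
    """Weights + gradients + optimizer buffers (two per parameter for Adam).

    Activations saved for the backward pass are NOT included here.
    """
    copies = 1 + 1 + optimizer_buffers  # weights, gradients, optimizer state
    return n_params * BYTES_PER_VALUE * copies

def relative_step_compute(backward_multiple: float = 2.0) -> float:
    """Compute for one training step, in units of one forward pass."""
    return 1.0 + backward_multiple

n_params = 7_000_000_000  # hypothetical model size
gib = 1024 ** 3
print(f"inference weights: {inference_weight_bytes(n_params) / gib:.1f} GiB")
print(f"training state:    {training_state_bytes(n_params) / gib:.1f} GiB")
print(f"training step compute: {relative_step_compute():.0f}x a forward pass")
```

Under these assumptions the persistent training state is four times the size of the weights before any activations are counted, and the per-step figure still has to be multiplied by the number of training steps to get a total cost.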


Microbatch size is a hyperparameter, it can be set to 1 and work just as effectively. With gradient accumulation it's equivalent even. Large batch sizes are used to increase parallelism, and sometimes to reduce variance in the loss signal (at the cost of increased bias).

Batch size is frequently limited by compute bottlenecks well before memory.
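
The equivalence claimed for gradient accumulation can be seen on a toy problem. This is a minimal sketch in plain Python for a one-parameter linear model with mean squared error; the data points are made up, and the equivalence shown assumes equal-sized microbatches and a loss that averages over examples.

```python
def grad_mse(w: float, xs: list[float], ys: list[float]) -> float:
    """Gradient of the mean squared error for the model y_hat = w * x."""
    n = len(xs)
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n

# Toy data, invented for illustration.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]
w = 0.5

# One large batch: a single gradient over all four examples.
full_batch_grad = grad_mse(w, xs, ys)

# Microbatches of size 1: accumulate scaled gradients, then update once.
accumulated = 0.0
for x, y in zip(xs, ys):
    accumulated += grad_mse(w, [x], [y]) / len(xs)

print(full_batch_grad, accumulated)
```

Both paths produce the same gradient up to floating-point rounding, so only one example's activations need to be held in memory at a time, at the cost of running the microbatches sequentially.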


And of course you do all of this for every object in your training set, which is going to be larger than the total number of uses for any individual user.

Does it matter what is the difference in size of needed inputs for inference vs. training?

It's all got much more complex than that in recent years. Training now involves large amounts of inference for RL rollouts and similar. You can't disentangle them computationally like that. "Inference" is just the word used to mean serving customer traffic now, and "training" means creating the model you serve.

That is an estimate of the relative cost of one training step, but you have to multiply it by the number of training steps, an unknown quantity.

I think in your StableDiffusion example, a lot more than $600k will have been spent on electricity alone for inference (on those personal computers you mention). So inference is more expensive than training.

For equal capability tokens, there has been about a 10x drop in cost every 6 months.

We are still chasing the best because the best is moving rapidly, but it’s a simple thought experiment to work out what the cost to serve an 8B model from 2 years ago is in a world of 2T models.

Note: parameter counts are illustrative. Concretely, qwen3.6 27B delivers opus 4.5 capability at 1/27th the cost on openrouter. Single chip llama3 8b performance can exceed 17k tokens/sec.
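
To see what the claimed rate implies if it held steadily, here is a small sketch of the compounding. The 10x-per-6-months figure is the commenter's estimate, not a measured value, and the function name and defaults are illustrative.

```python
def relative_cost(months: float, factor: float = 10.0, period_months: float = 6.0) -> float:
    """Cost relative to today after `months`, if cost falls by `factor` every `period_months`."""
    return factor ** (-months / period_months)

for months in (6, 12, 24):
    share = relative_cost(months)
    print(f"after {months} months: {share:.4%} of the starting cost")
```

At that pace, a capability level that costs one dollar today would cost a hundredth of a cent after two years, which is the intuition behind the 8B-versus-2T thought experiment.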


8B models would be considered obsolete in the world of 2T models, at least if we're talking about the competitiveness of OpenAI/Anthropic. The only reason why they are valued so highly is their supposed dominance at the top end.

The main story of agent use cases is in enterprise so far. An enterprise will only pay for a model capable of handling the task and no more. Most enterprises see no need to hire PhDs as factory line workers.

Coding is an interesting case as [1] the pace of progress has been absurd and [2] it's hard to put an upper bound on required capability. However, "hard to put a bound on" and "unbounded" are different things: it's quite possible that the average engineer will cease to see the benefit of rapid progress, or that their employer will be satisfied with lower-tier models.

How smart of a model do you need to build a high quality CRUD app for internal users? Or build a scalable web service?


yes, which is why the revenue growth story is not looking so great for Anthropic/OpenAI, when open-source alternatives are not far behind with much lower costs.

> For equal capability tokens, there has been about a 10x drop in cost every 6 months

Is this still happening? Opus 4.5 was six months ago, can you get its capabilities for 1/10 cost now? Are we on track to get the same for 4.6 in a couple months?


Pretty much. Kimi K2.6 is Opus 4.6 quality for coding. If you include discounts due to more efficient input caching, it is around 1/10th the cost of Opus 4.6.

https://openrouter.ai/moonshotai/kimi-k2.6

The march of cost efficiency moves on.


Why haven’t I heard of this? Is it available in IDEs like Cursor?

> I am a bit skeptical of the supposed drop in inference costs. If AI labs saw a lot of potential there, they'd surely be bragging about it non-stop?

Unless to the grandparent commenter’s point they’re using it to obscure their large prisoner’s dilemma (training) cost?


> If AI labs saw a lot of potential there, they'd surely be bragging about it non-stop?

Google seems to pretty regularly post about how their TPU and algorithm advancements have been decreasing energy costs for both inference and training.


What other companies brag about lowered costs? Isn’t that just a complicated way of asking customers to demand lower prices?

I get what you’re saying, but the housing market is actually a really subtle issue in my opinion.

Just one example, owning a home protects you against price shocks. As others have pointed out, this can sometimes be a bad thing, because when prices decrease you are also leveraged.

But it’s pretty important to a lot of middle class people that they are protected against forced relocation due to 5x housing price increases.

Of course, there’s other reasons to not own a home.

My point is that localized housing markets have all sorts of factors that are perfectly explainable by economic theory but aren’t just “Econ 101, run the supply and demand” curve.


This brings to mind two thoughts:

First, that this is challenging to scale across large orgs. Even if your plans produce high quality code, that isn’t true for everyone. I’m definitely struggling with slop code being collectively mailed to me for review by our 1,000 engineers that were told to use their AI subscription all at once.

I feel like we should be taking “prompt engineering” more seriously. And when people mail me code to review, it should also include the agentic workflow and plan. So that when code isn’t up to quality, we can have a discussion about the prompts used to generate it.

My second thought is related to your senior engineer comment. This isn’t surprising, because in most engineering orgs, seniority is completely unrelated to code quality. In fact, many orgs incentivize the opposite: “senior” devs that push out buggy code quickly and push accountability downhill to the junior devs.


I'm so curious to see how other people prompt but literally no one I work with will share it. They might share plans, but they never show the conversation, which is the most crucial part.

Judging by how they struggle to communicate generally, I can't imagine their prompts are doing much heavy lifting.


Eh, everything is challenging to scale across large orgs. Even before LLMs, the code was a huge ball of spaghetti that barely held together. Now we just get there faster.

About senior engineers, I guess that depends on the org you have experience with. My experience doesn't match yours.


He didn’t hedge at the end. Nate always writes the models before election season then doesn’t touch them apart from actual bug fixes. The model actually organically predicted 30%.

I still think that’s about accurate. Maybe it should’ve been 40%.

Everyone forgets that it was a pretty close election. Clinton could’ve won without the Comey announcement.


I think he did hedge (or "strategically bug fix"). The prediction for Trump went from IIRC around 15 to 30 in the last week or so. It was a big swing, IIRC with a lot of waffle around why it happened but not a lot of verifiable fact.

> I still think that’s about accurate. Maybe it should’ve been 40%.

It wasn't accurate. This is something people misunderstand about these predictions. If the 2016 election was held 100 times, Trump would have won 100 times. It's not the same as rolling dice.

These election predictions don't say that. They say something like "the observations I have agree with scenarios that have Clinton winning, 70% of the time". Which is fine and correct as far as his data and model goes, but none of those scenarios were the reality he was trying to predict. They are all just figments of the model though. Getting down to the brass tacks, he predicted Clinton would win, and he was wrong.

Which is fine, we just can't know anything about his process from that failure. Certainly we can't conclude that it was "accurate", since it was not. If we had a good sample of elections where he used the same process and built up a good record then sure.


That's where you're wrong, the election was very, very close. In fact, if roughly 40k voters (across three states) had switched from Trump to Hillary, she would have won, that's how close it was.

40k voters, that's really not very many. So it's hard to say whether Trump had a 30% chance of winning or 40% or whatever, but the election at most was a toss-up.

Many random events could have resulted in a different outcome.


You misunderstand my point. I am talking about the actual election that happened where these many random events that could have resulted in a different outcome did not happen. I was being a bit facetious maybe in my point. But the point is that the thing that is to be predicted is the actual real event that occurs in this universe. Silver made a prediction, and it was wrong.

"Oh but it was only a 70% prediction"

You can't 70% win an election. Silver's prediction was that Clinton would win, but he was not super confident about it. The prediction was wrong. He was right to not be super confident about it, but the prediction of who would win was still wrong.


Statistical likelihood is a measurement of the known data at the time. If you engage with the content otherwise then it's on you if you have the wrong takeaway. No one who makes a prediction based on a statistical model is going to be right every time. That doesn't mean it's not right to make a prediction. The statistical modeling can help you to be correct more often than not. And if you were going to be truly fair you would note that Nate in fact repeatedly said that it was still very much possible for Trump to win but that the current known polling data and other factors in his model pointed to a loss.

538's own post-mortems on the event highlight that Trump was a very unusual candidate running in a very unusual election and as such the model was missing a lot of important information. They learned from the experience and adjusted the model going forward. Anyone complaining about that event is really just highlighting that they don't understand how statistical modeling works and are upset about how the model misled them or others, which isn't Nate or 538's fault and is entirely on the consumer of their reporting. It's not like they didn't try to educate their consumers in their reporting.


I know what statistical likelihood is. I don't have a problem with them using a model or models and doing some statistics on it to develop these predictions, or even necessarily with the way they report their predictions as a % chance to win. I have a problem with the insinuation that "70% Clinton" is somehow a prediction of a singular real event or that Trump winning is consistent with said prediction "because if we held another 99 of those 2016 elections then Clinton probably would have won about 70 of them therefore I was right".

The prediction is for one single outcome at one point in time. The prediction cannot be that Clinton 70% wins it, or wins it 70 out of every 100 times, because there are no 100 2016 elections. Those things may apply to his mathematical models, but obviously the models are attempting to predict the real world. Try to weasel out of it as much as you like, but the prediction was that Clinton would win, and the prediction was wrong.

"Oh he was only giving the odds for his model, you don't understand it's your fault he misled you" -- no. Every analyst and pundit has a model or a system, obviously nobody thinks any of them can see the future. Nate Silver was very explicitly predicting the outcome of the election, as you can see from all his commentary articles that came out along with the numbers.

And yes, 538's vaunted models and data science fell over when encountering situations that had not been seen or anticipated or built on before, obviously. We didn't need Einstein or even Nate Silver to tell us that. That's the problem isn't it. All this hamming up of "data science" and "mathematical models" is meaningless. Your data and math can be perfect and correct, but if they fail to provide an understanding of the world, then they are perfectly useless.


You are asserting an insinuation that 538 never made. That is the disconnect here.

No I'm not.

Just want to say, I appreciate your pragmatic perspective on this. Nate Silver had one job: Predict who would win. And he failed at that. With lots of hand waving he can excuse himself but at the end of the day his visitors wanted an answer and he gave them the wrong answer.

That's the beauty of this brand of pseudoscience. Statistical predictions of singular events like a particular election are totally unfalsifiable. You can just say "I guess we live in 30% world" or whatever, every time.

> Statistical predictions of singular events like a particular election are totally unfalsifiable.

Yes. And the 2nd Law of Thermodynamics was just violated by millions of atoms within my lungs, that happened to increase in energy above the ambient average due to collisions. Clearly thermodynamics is pseudoscience, too!


To give you a trivial example: The simplest way I can put this is that turnout varies based on the weather[1], and turnout is skewed by party. So if it rains on election day you are going to get a different result, and that result can flip the outcome of the election if the election is close. So it’s kind of a nonsense to say “Trump would have won 100 times out of 100”. Are you saying Nate Silver’s model should have had a perfect meteorological model to predict the weather? Or are you saying the election wasn’t close? In which case you’re just wrong on the facts.

The 70% figure is saying “we know most of the information needed to determine what the outcome of the election will be, but we don’t know everything, so we can’t be certain”. There is no process where you can know every factor that determines the result in advance with absolute accuracy, and I don’t know why people expect there would be.

[1] https://www.sciencedirect.com/science/article/pii/S026137942...
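
A rough simulation sketch of that point, under made-up numbers: treat the final margin as a polled lead plus everything nobody can know in advance (weather, turnout, late swings). The 2-point lead and 4-point spread are assumptions picked for illustration, not anything from 538's model, and `win_probability` is a hypothetical helper.

```python
import random

def win_probability(lead, uncertainty_sd, n_sims=100_000, seed=0):
    """Fraction of simulated elections the polling leader wins.

    lead: polled lead in percentage points (assumed input).
    uncertainty_sd: spread of the factors we cannot know in advance,
        modeled here as normally distributed noise (an assumption).
    """
    rng = random.Random(seed)
    wins = sum(1 for _ in range(n_sims) if rng.gauss(lead, uncertainty_sd) > 0)
    return wins / n_sims

# Hypothetical race: 2-point lead, 4 points of unknowable noise.
p = win_probability(lead=2.0, uncertainty_sd=4.0)
print(f"leader wins in {p:.0%} of simulated elections")
```

With these assumed inputs the leader wins roughly 70% of the simulated elections and loses the rest, and nothing about the model is broken in the runs where the leader loses.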


It's not nonsense. What's nonsense is to say Nate's prediction for the election was accurate or correct. It trivially was not.

What it would be reasonable to say is if his model had correctly predicted the outcome of a significant sample of elections, then you could say his model has some accuracy or predictive power. But it still would never have been accurate or right in the specific instances it got wrong, that's just a misconception about how statistics and predictive models work. I hope this helps.


What are you even classifying as accurate or correct? Do you take every 51% prediction from FiveThirtyEight and if the result is a win you consider that forecast accurate? And every 49% prediction must result in a loss? This is just not how statistical forecasts work.

>What it would be reasonable to say is if his model had correctly predicted the outcome of a significant sample of elections, then you could say his model has some accuracy or predictive power.

I don't know why you're couching that in a hypothetical, FiveThirtyEight has repeatedly done that exercise.

>But it still would never have been accurate or right in the specific instances it got wrong

It is core to the concept of a probability that the result is going to go the opposite way from the prediction sometimes! It's meaningless to call it "wrong".
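
For anyone wondering what "that exercise" looks like mechanically, here is a minimal sketch of a calibration check. The data is synthetic (a forecaster that is calibrated by construction), not FiveThirtyEight's, and `calibration_table` is a hypothetical helper name.

```python
import random

def calibration_table(forecasts, outcomes, n_bins=5):
    """Group forecasts into probability bins and compare the mean forecast
    in each bin with how often the event actually happened."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(forecasts, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((p, y))
    table = []
    for members in bins:
        if not members:
            continue
        mean_p = sum(p for p, _ in members) / len(members)
        freq = sum(y for _, y in members) / len(members)
        table.append((mean_p, freq, len(members)))
    return table

# Synthetic forecasts: each event happens with exactly the forecast probability.
rng = random.Random(0)
forecasts = [rng.random() for _ in range(20000)]
outcomes = [1 if rng.random() < p else 0 for p in forecasts]

for mean_p, freq, n in calibration_table(forecasts, outcomes):
    print(f"forecast ~{mean_p:.2f}  observed {freq:.2f}  (n={n})")
```

A well-calibrated forecaster shows observed frequencies close to the forecast in every bin, which means the events given about a 70% chance should fail to happen about 30% of the time.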


> What are you even classifying as accurate or correct?

When somebody gives a prediction of the outcome of an election? I classify it as correct if they predicted the candidate who wins.

> Do you take every 51% prediction from FiveThirtyEight and if the result is a win you consider that forecast accurate? And every 49% prediction must result in a loss? This just not how statistical forecasts work.

No, but it is the way to map statistical forecasts to reality. He was quite explicitly predicting the outcome of the actual election. That prediction was incorrect.

The whole rating of the accuracy of these models is really snakeoil dressed up as science. There is a lot less rigorous science and a lot more feelings and adjusting numbers and twiddling formulas retrospectively than you were probably led to believe.

Would a 99-1 for Trump model have been worse or less accurate than a 51-49 for Clinton model? Despite predicting the correct outcome whereas the Clinton model predicted the incorrect outcome?

> I don't know why you're couching that in a hypothetical, FiveThirtyEight has repeatedly done that exercise.

Not really with much rigor. Where are their reproducible published papers and data sets? They made their name with a bit of luck on a fairly predictable election, but were unable to show a significant advantage in their methods across a number of elections.

> It is core to the concept of a probability that the result is going to go the opposite way from the prediction sometimes! It's meaningless to call it "wrong".

No no, that's not true. There are two different things here. Firstly, if you had a model and method of predicting elections that you applied to a sample of elections and showed that it had a good ability to correctly predict, then you can say your model is a good prediction across typical elections. The model getting one wrong does not make it a bad model over a set of elections. It absolutely is wrong for that particular election though. And secondly if you use a model to make a prediction about a particular election, when your prediction turns out to be wrong, it was not retroactively correct because it just followed the model and you claim the model is good. That's just not how statistics or predictions work.
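
The 99-1 versus 51-49 question above can be made concrete with the Brier score (squared error between the forecast probability and the 0/1 outcome). This is only a sketch: the two "models" are the hypothetical ones from the question, and the 50% true probability used in the second half is an assumption for illustration, not a claim about 2016.

```python
def brier(p_event, happened):
    """Squared error between a forecast probability and the 0/1 outcome.
    Lower is better."""
    return (p_event - (1.0 if happened else 0.0)) ** 2

# Probability each hypothetical model assigned to a Trump win.
models = {"99-1 Trump": 0.99, "51-49 Clinton": 0.49}

# Scored on the single election, where Trump won.
single = {name: brier(p, True) for name, p in models.items()}

def expected_brier(p_forecast, p_true):
    """Average Brier score over many races where the event really
    happens with probability p_true (an assumed value)."""
    return (p_true * brier(p_forecast, True)
            + (1 - p_true) * brier(p_forecast, False))

# Scored over many hypothetical coin-flip races (p_true = 0.5 is assumed).
repeated = {name: expected_brier(p, 0.5) for name, p in models.items()}

for name in models:
    print(f"{name}: single race {single[name]:.4f}, "
          f"many close races {repeated[name]:.4f}")
```

On the one election the 99-1 model scores better, which is the first point above. Over many races that really are close, it scores far worse (about 0.49 against about 0.25), which is why a single result says little about which model is better.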


It's so interesting to see how someone could be so confidently wrong and clearly show no knowledge of statistics.

Also, they don’t have any plans for the IP, and Nate would’ve paid an above-market rate just to take over and preserve the content for posterity. He estimates that they deleted 200,000 hours of human labor.

This is just some Disney suits being extraordinarily petty.


Yes, just to add to this: in the article by Nate [0] he says that he tried to buy the IP but Disney refused because they were unhappy with some of his prior comments.

"I did approach Disney a year or two ago, through my agent, about acquiring the remaining IP. ...

We were told to basically get lost: ABC was annoyed with my critical public comments about their management of FiveThirtyEight. It apparently wasn’t a long conversation, so I don’t have a lot more color to report than that."

[0] https://news.ycombinator.com/item?id=48197703


Pride comes before the fall. Sorry, Nate.

Which brings us to the next layer of modern dependency management insanity:

The fact that basically none of these multi-million dollar companies are vendoring their entire dependency tree.

At most companies, even ones worth millions of dollars, it would be impossible for them to rebuild their software if someone ripped a package off of npm’s registry or whatever.


As recently as 2015 when I attended a middling CS program, we had in-person timed exams where we had to write down DSA implementations on a blank sheet of paper in Java.

We were deducted points for trivial syntax mistakes.

If these stories I keep hearing are true, then university programs have really taken a nose dive recently. This isn’t a “back in my day” thing, but within the past 5 years.

The pace of the purported decline makes me question if some of these stories are sensationalist. But I don’t know, I keep hearing about them.


I'm not sure I understand your comment. Surely you don't think that the details of a particular programming language's syntax are appropriate criteria for grading an exam? That seems crazy.


Rethink it in written language:

> Surely you don't think that the details of a particular written language's syntax are an appropriate criteria for grading an exam?

Computer science is the science of computing. Programming languages are the language used to implement computer science. Therefore you would expect that students accurately use the programming language to answer questions about computing. Seems reasonable to me.


You don't need programming languages to implement computer science. Pseudo code suffices for exams.


You don't need programming languages to DESCRIBE computer science but to implement it you need some programming language.

Quite literally an "implementation detail."


If instructors are testing implementation details on paper exams then they're really missing the point of CS education. Completely lazy and incompetent, should be terminated.


It's a balancing act.

Some portion of computer science education needs to be practical (implementation details), while some portion needs to be pure computer science (pseudo code).

Obviously projects are a good way to measure implementation details, but they are too easily cheated. Every class I took had exams as 80% or more of the grade. Not every class expected accurate syntax on exams, but most expected code rather than pseudo code (typically C).


Sounds fair to me so long as students were aware going into the test that syntax would be graded


Fairness or lack thereof is not the point. Programming language syntax is trade school stuff. And I don't mean that as a slight against trade schools, but it's a different type of training.


One important piece of context that might make all these stories less confusing for non-googlers:

Code references are less important inside Google editors, because we have a code viewer tool inside the web browser.

Most people read, explore, follow references, and share permalinks in the view-only tool. It’s a lot better than viewing code in GitHub. It’s super fast, is connected to language servers and can actually trace references, and overall has a million little features optimized for reading code.

We also have a code reviewer tool, and a separate tool to run and view CI runs.

So what’s left for the editor? Syntax highlighting?

I would tend to view code, run tests and CI, and review in separate tools specialized for their specific use case. The code editor was just a place where I would type in my changes.

I’d imagine this workflow feels weird to people who learned in a one-stop-shop IntelliJ and GitHub world. But I can’t emphasize enough how much better these other tools were compared to GitHub. So a code editor that also lets me read, review, and test code didn’t really matter for me when I had a collection of smaller tools specialized for each individual task.


To make this more concrete, the Chromium source code browser has a subset of the functionality of the internal Code Search tool. For example, you can left click on symbols to go to reference and right click to find all references:

https://source.chromium.org/chromium/chromium/src/+/main:ipc...


In fact a lot of Google software projects have a public version of code search: https://cs.opensource.google/


How is this so much faster than browsing my tiny little repo on Github? What is Github doing so wrong??


Well, indexing and searching is kind of Google's thing. It also helps that they're using a unified build system. They can instrument the compiler to get cross references from the build instead of trying to figure it out by parsing the text themselves.

They open sourced the tool to do it- https://kythe.io - but I think it would be a pain to make it work for anything like GitHub that supports arbitrary languages and build systems with untrusted code


Wow that is a lot faster and nicer than Github.

This is a good example of how large companies wouldn't send someone across the street to pick up $1M off the ground. If Google actually released that and a repo to the public, they could take GitHub's throne. But a few $B business isn't worth it for them.


It's not that the $2B business isn't interesting, it's that competing with GitHub would be a major undertaking and the opportunity cost of doing that is probably more than $2B.


Brings back lots of memories. The only time I used that tool was back when Chrome stopped keeping history forever, around 2014-2015, and limited users to 90 days of history.

I was trying to wrap my head around it, reading the comments to figure out how to put it back in. A wrong strategy adopted without serious review does more harm than good.

After more than a decade of this fiasco, Google Chrome should bring back forever history. They no longer have an excuse: from what I hear, Chrome now downloads an offline LLM without considering data charges or space requirements on edge-deployed servers, etc. It would also help users, since most browsers other than Firefox and Safari are now Chromium-based and have inherited these shiny features with their not-so-obvious limitations.

The History bug in point - https://issues.chromium.org/issues/40358997

It's a fun read, seeing that even with all the tools in the world there is not enough soul or will to fix a very minor regression caused by a human error of judgement, and that other companies just accept it like their lives depend on it.

The good thing is that after reading the above article I learned it's possible to use https://github.com/ungoogled-software/ungoogled-chromium, which has a flag to keep old history.


This is so lovely to use.


> It’s super fast, is connected to language servers and can actually trace referenced

Nit: not connected to language servers, it's connected to Kythe. LSP doesn't have the same kind of functionality.


This...I noticed a real productivity increase when going from Cider/VSCode to JetBrains/IDEA/IntelliJ for Kotlin code editing. Having a "real" IDE was still a plus, if just for the better code completion.

AI has mostly changed the way I write code, I guess, so I rarely use JetBrains anymore, but a few years ago it was clearly a win to use a real IDE at least for Kotlin programming.


JetBrains has several niches it excels in. DataGrip is by far the most important tool in my toolbox, as it allows me to work with every database type imaginable in one place (Databricks, Postgres, MSSQL, Oracle, etc.).


What tools available to the public would you say are similar to this workflow?


Sourcegraph is the closest external thing I've found to Google's internal web tool for viewing code.


The startup I'm at (ersc.io) is working in this space (version control more than the IDE side of things), because, in my opinion, there just plain aren't any.


FWIW your website is broken in both Firefox and Chrome. Huge rounded black boxes, social media icons larger than the screen, and blog posts return a 404 with an XML NoSuchKey error.


The only thing remotely close is a monorepo checkout ... with all the problems that come with that.


To be fair, code search still sucked for navigation compared to an IDE even in 2024. Even cider-v was rather so-so if you needed to navigate complex C++ code.

But I still remember the days of "edit in IDEA, debug in Eclipse".


Every large tech company of the 80s, 90s and 2000s (Google, eBay and the rest) has a similar history when it comes to IDEs, source control, build systems, etc. This is not specific to Google.


If you want a laugh, Google a tutorial for how to read a file. You should also know that all the tutorials are wrong, because they fail to handle at least one footgun or another.

There is no “modern” alternative. If you read Reddit threads, C++ programmers actually believe that it’s a reasonable file reading API.

Most companies that I’ve worked at have just implemented our own on top of the OS syscalls. Which is annoying because it requires at least a Windows and UNIX variant.

Look, I like C++. I’ve been programming in it for years. But some of the stereotypes around C++ programmers are true. I still occasionally run into design decisions so untethered from reality that it still shocks me after all these years.


QFile is mostly fine (it's probably not the fastest, but not slow), and QString is pretty comfy for all kinds of string manipulation :)


C++ has a filesystem API? TIL, never used it.


The filesystem API (std::filesystem) was introduced in C++17.

