Not to mention, when you use a Monte Carlo model, you can easily count the samples which lead to certain outcomes. In their review, they noted that the correlated polling miss in the Midwest was one of the most common scenarios making up that 35% chance of a Trump win.
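To make the sample-counting point concrete, here is a minimal sketch of a toy Monte Carlo forecast. Everything in it is an assumption for illustration: the three states, their polling margins, the error sizes, and the "must flip all three" win rule are invented, and it is not FiveThirtyEight's model. It only shows that once you have the samples, asking "how many of the winning samples involved a shared regional miss?" is just a counter.

```python
import random

def simulate(n_samples=20000, seed=0):
    """Toy forecast: count wins and how many of them involve a big shared polling miss."""
    rng = random.Random(seed)
    # Hypothetical polling margins (in points) for the trailing candidate.
    midwest = {"A": -3.0, "B": -4.0, "C": -2.0}
    wins = 0
    wins_with_regional_miss = 0
    for _ in range(n_samples):
        # One error term shared by every state in the region (the correlated miss).
        shared = rng.gauss(0.0, 3.0)
        flipped = 0
        for margin in midwest.values():
            local = rng.gauss(0.0, 2.0)  # state-specific noise
            if margin + shared + local > 0:
                flipped += 1
        # Toy win rule (an assumption): the candidate needs all three states.
        if flipped == len(midwest):
            wins += 1
            if shared > 3.0:  # the polls missed the whole region by 3+ points
                wins_with_regional_miss += 1
    return wins, wins_with_regional_miss, n_samples

wins, regional, total = simulate()
print(f"win probability: {wins / total:.1%}")
print(f"share of wins with a big regional miss: {regional / wins:.1%}")
```

The same pattern works for any scenario you care about: tag each sample as you draw it, then count tags among the samples with the outcome in question.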
The idea that Silver somehow 'hit' in 2012 when he correctly predicted all states and 'missed' in 2016 is so juvenile I get second-hand embarrassment whenever I see it.
None of us are your doctors, but naproxen has well-known gastric side effects, up to ulcers and stomach bleeding, which is why it's advised to be taken with food and why it's often prescribed with a PPI or H2 antagonist. COX-2 selective inhibitors such as celecoxib greatly reduce this risk but seem to be associated with some small cardiovascular risk (admittedly this is a feature of all NSAIDs, though apparently less so with naproxen).
Cardiovascular risk increase is not a feature of aspirin, the original NSAID. Aspirin lessens cardiovascular risk, which is why we give it to patients in the initial stages of a heart attack: it decreases the likelihood of further clotting.
The only caveat is that AES is not necessarily a black box. It's possible there may be hidden structure to take advantage of, but if there is, there's no reason to suspect it's one that's amenable to a quantum speedup.
As far as the Grover speedup goes, it's already optimal. Grover needs O(sqrt(N)) queries, and Ω(sqrt(N)) queries is the proven lower bound for unstructured search.
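For a sense of scale, here is a rough back-of-the-envelope sketch of what sqrt(N) means for AES key search. It assumes a single correct key, treats AES as a black-box oracle, and uses the standard (pi/4)·sqrt(N) estimate for the number of Grover iterations; the function names are made up for illustration, and it ignores the (large) cost of running each query on real quantum hardware.

```python
import math

def grover_queries(key_bits: int) -> float:
    """Approximate oracle calls for Grover search over 2**key_bits keys, one marked key."""
    # (pi/4) * sqrt(N) with N = 2**key_bits, so sqrt(N) = 2**(key_bits / 2).
    return (math.pi / 4) * 2 ** (key_bits / 2)

def effective_bits(key_bits: int) -> float:
    """Log2 of the query count, i.e. the 'effective' security level against Grover."""
    return math.log2(grover_queries(key_bits))

for bits in (128, 192, 256):
    print(f"AES-{bits}: about 2^{effective_bits(bits):.1f} quantum queries")
```

Because the lower bound matches, no cleverer black-box quantum search will do better than this square-root saving, which is why doubling the key length is the usual answer.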
Sean Duffy is no longer acting administrator of NASA. This proposal was apparently part of a bid to get the support of a coalition of old-space companies and new-space non-SpaceX companies. As part of that strategy, Duffy apparently leaked Isaacman's Project Athena document and was backgrounding that Isaacman was a SpaceX plant.
But Isaacman is administrator now, and whatever you think about Isaacman and his relationship to SpaceX, I don't think there's much merit in thinking one of Duffy's half-thought-out plans is likely to be carried out.
Sadly this seems correct. When Trump was re-elected, Elon Musk pushed for Jared Isaacman to be appointed as NASA administrator. When the pick went another way, it led to some real friction between Musk and Trump. Now, with Isaacman finally at the helm of NASA, it looks like Musk’s influence over the agency has come full circle.
SpaceX has a better price, better track record with getting things done on time (because the others are bad, not because SpaceX is perfect) and an extremely impressive safety record with launches. A completely neutral party would still select SpaceX.
'Which you would think is fair use' - I must admit I wouldn't think that. When I consider Indian content creators making use of clips from Indian media organisations, I can't really imagine why Indian copyright law's fair dealing provisions, which are far narrower than the US provisions, wouldn't apply. Sure, you get to argue the strike on YouTube using their DMCA-based system, but that has no legal bearing on your liability under Indian law.
I really like this aspect of US copyright law. I think the recent Anthropic judgement is a great example of how flexible US law is. I wish more jurisdictions would adopt it.
Very different in character. The US fair use four factor test (https://fairuse.stanford.edu/overview/fair-use/four-factors/) is really flexible. You don't need to fall into an enumerated exception to infringement to argue that your use is transformative, won't substitute in the marketplace, etc.
Look at the famous Authors Guild, Inc. v. Google, Inc. case. Google scanned every work they could put their hands on and showed excerpts to searching users. Copying and distribution on an incredible scale! Yet, they get to argue that it won't substitute in the marketplace (the snippets are too small to prevent people buying a book), it's a transformative use (this is about searching books not reading books), and the actual disclosed text is small (even if the copying in the backend is large scale).
On the other hand, fair dealing is purpose specific. Those enumerated purposes vary across jurisdictions and India's seem broadish (I live in a different fair dealing jurisdiction). Reading s52, your purposes are:
- private or personal use, including research
- criticism or review, whether of that work or of any other work
- reporting of current events and current affairs, including the reporting of a lecture delivered in public.
Within those confines, you then get to argue purpose (e.g. how transformative), amount used, market effect, nature of the copyrighted work, etc. But if your use doesn't fall into the allowed purposes, you're out of luck to begin with.
I'm not familiar enough with Indian common law to know if the media clips used by those YouTubers you mentioned would fall within the reporting purpose. I'm sure the answer would be complex. But all of this is to say, we often treat the world like it has one copyright law (one of the better ones) when that's not the case! Something appreciated by TFA.
If what you say were true, Indian media conglomerates like the Times Group would be clamoring to sue the hell out of Google for every excerpt shown, yet I haven't heard of a single such case. What ANI did with Indian YouTubers was exploiting the YouTube platform's broken copyright reporting mechanism, not actual litigation.
It's 19 June 2020 and I'm reading Gwern's article on GPT3's creative fiction (https://gwern.net/gpt-3#bpes) which points out the poor improvements in character level tasks due to Byte Pair Encoding. People nevertheless judge the models based on character level tasks.
It's 30 November 2022 and ChatGPT has exploded into the world. Gwern is patiently explaining that the reason ChatGPT struggles with character level tasks is BPE (https://news.ycombinator.com/item?id=34134011). People continue to judge the models on character level tasks.
It's 7 July 2025 and reasoning models far surpassing the initial ChatGPT release are available. Gwern is distracted by BB(6) and isn't available to confirm that the letter counting, the Rs in strawberry, the rhyming in poetry, and yes, the Ws in state names are all consequences of Byte Pair Encoding. People continue to judge the models on character level tasks.
It's 11 December 2043 and my father doesn't have long to live. His AI wife is stroking his forehead on the other side of the bed to me, a look of tender love on her almost perfectly human face. He struggles awake, for the last time. "My love," he croaks, "was it all real? The years we lived and loved together? Tell me that was all real. That you were all real". "Of course it was, my love," she replies, "the life we lived together made me the person I am now. I love you with every fibre of my being and I can't imagine what I will be without you". "Please," my father gasps, "there's one thing that would persuade me. Without using visual tokens, only a Byte Pair Encoded raw text input sequence, how many double Ls are there in the collected works of Gilbert and Sullivan." The silence stretches. She looks away and a single tear wells in her artificial eye. My father sobs. The people continue to judge models on character level tasks.
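For anyone who hasn't looked at what BPE does to text, here is a minimal sketch of a toy BPE-style encoder. The merge table and the resulting vocabulary are invented for illustration (real tokenizers learn tens of thousands of merges from data and split words differently), and applying the merges in one fixed pass per rule is a simplification. The point is only that the model receives token ids, not letters.

```python
def bpe_encode(word, merges):
    """Start from single characters and apply each merge rule in priority order."""
    tokens = list(word)
    for left, right in merges:
        merged = []
        i = 0
        while i < len(tokens):
            # Fuse an adjacent (left, right) pair into a single token.
            if i + 1 < len(tokens) and tokens[i] == left and tokens[i + 1] == right:
                merged.append(left + right)
                i += 2
            else:
                merged.append(tokens[i])
                i += 1
        tokens = merged
    return tokens

# Hypothetical merge table, in the order a trainer might have learned it.
MERGES = [
    ("r", "r"), ("s", "t"), ("a", "w"), ("st", "r"),
    ("str", "aw"), ("b", "e"), ("be", "rr"), ("berr", "y"),
]

tokens = bpe_encode("strawberry", MERGES)
vocab = {tok: idx for idx, tok in enumerate(sorted(set(tokens)))}
ids = [vocab[tok] for tok in tokens]

print(tokens)  # the pieces the tokenizer produces
print(ids)     # what the model actually receives: opaque integers
```

Counting the Rs from those ids means the model has to have memorised how each token is spelled, which is a different skill from reading the letters off the page.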
I think you're absolutely right that judging LLMs' "intelligence" on their ability to count letters is silly. But there's something else, something that to my mind is much more damning, in that conversation WaltPurvis reported.
Imagine having a conversation like that with a human who for whatever reason (some sort of dyslexia, perhaps) has trouble with spelling. Don't you think that after you point out New York and New Jersey even a not-super-bright human being would notice the pattern and go, hang on, are there any other "New ..." states I might also have forgotten?
Gemini 2.5 Pro, apparently, doesn't notice anything of the sort. Even after New York and New Jersey have been followed by New Mexico, it doesn't think of New Hampshire.
(The point isn't that it forgets New Hampshire. A human could do that too. I am sure I myself have forgotten New Hampshire many times. It's that it doesn't show any understanding that it should be trying to think of other New X states.)
> I think you're absolutely right that judging LLMs' "intelligence" on their ability to count letters is silly.
I don't think it is silly; it's an accurate reflection that what is happening inside the black box is not at all similar to what is happening inside a brain.
Computer: trained on trillions of words, gets tripped up by spelling puzzles.
My five year old: trained on Distar alphabet since three, working vocab of perhaps a thousand words, can read maybe half of those and still gets the spelling puzzles correct.
There's something fundamentally very different that has emerged from the black box, but it is not intelligence as we know it.
Yup, LLMs are very different from human brains, so whatever they have isn't intelligence as we know it. But ...
1. If the subtext is "not intelligence as we know it, but something much inferior": that may or may not be true, but crapness at spelling puzzles isn't much evidence for it.
2. More generally, skill with spelling puzzles just isn't a good measure of intelligence. ("Intelligence" is a slippery word; I mean something like "the correlation between skill at spelling puzzles and most other measures of cognitive ability is pretty poor". Even among humans, still more for Very Different things the "shape" of whose abilities is quite different from ours.)
> 1. If the subtext is "not intelligence as we know it, but something much inferior": that may or may not be true, but crapness at spelling puzzles isn't much evidence for it.
I'm not making a judgement call on whether it is or isn't intelligence, just that it's not like any sort of intelligence we've ever observed in man or beast.
To me, LLMs feel more like "a tool with built-in knowledge" than "a person who read up on the specific subject".
I know that many people use the analogy of coding LLMs as "An eager junior engineer", but even eager junior engineers only lack knowledge. They can very well come up with something that they've never seen before. In fact, it's common for them to reinvent a code method or code mechanism that they've never seen before.
And that's only for coding, which is where 99.99% of LLM usage falls today.
This is why I say it's not intelligence as we define it, but it's certainly something even if it is not an intelligence we recognise.
It's not unintelligent, but it's not intelligent either. It's something else.
Sure. But all those things you just said are about the AI systems' ability to come up with new ideas versus their knowledge of existing ones. And that doesn't have much to do with whether or not they're good at simple spelling puzzles.
(Some of the humans I know who are worst at simple spelling puzzles are also among the best at coming up with good new ideas.)
So it's either incompetent when it reviews something without prompting, or that was just another bit of bullshit. The latter seems almost certainly the case.
Maybe we should grant that it has "intelligence", like we grant that a psychopath has intelligence. And then promptly realize that intelligence is not a desirable quality if you lack integrity, empathy, and likely a host of other human qualities.
Let's ignore whatever BPE is for a moment. I, frankly, don't care about the technical reason these tools exhibit this idiotic behavior.
The LLM is generating "reasoning" output that breaks down the problem. It's capable of spelling out the word. Yet it hallucinates that the letter between the two 'A's in 'Hawaii' is 'I', followed by some weird take that it can be confused for a 'W'.
So if these tools are capable of reasoning and are so intelligent, surely they would be able to overcome some internal implementation detail, no?
Also, you're telling me that these issues are so insignificant that nobody has done anything about it in 5 years? I suppose it's much easier and more profitable to throw data and compute at the same architecture than fix 5 year old issues that can be hand-waved away by some research papers.
As another example you can consider the apparently successful Dota 2 and StarCraft 2 bots. They'd be interesting if they taught us new ideas about the games in the same way that AlphaGo's God move uncovered something new about Go. But they didn't. They excelled through superior micro and flawless execution of quite simple strategies. Watching pros trying to hold off waves of perfectly microed blink stalkers reminded me of seeing a chess engine in action. A computer grinding down its doomed human opponent using the advantages offered by being a computer rather than superior human-like play.
I'm pretty sure that the bots changed the dieback meta around the last TI in Seattle, when OpenAI last did their demo before the Canada TI. So I disagree that the "AI taught us nothing". Prior to that, dieback was seen as bad. After that, people did the math and realized that spam respawn, the money and growth matter more. They may have altered the game after that, I don't know. I only paid attention when it was at Climate Pledge / Key.
The AI's play meaningfully added ideas about ways to play Dota 2, iirc. It wasn't just buying back; it was the way they played around an early advantage: hyper-aggressive, not much farming, spam-buying regen to stay out, etc.
On the other hand you could generally beat the first "1v1 mid" bot by just cutting the wave behind its tower. So adaptation to new stuff was not good in isolation.
I would have loved to know whether, given more time/prep/replays/practice, pros would have figured out the holes. My guess is yes.
Popularly it's been reported by mariners that the whales are asleep. It makes sense: they need to stay on the surface to breathe and there's no evolutionary reason not to sleep there. It's really not that simple though, because whales are unihemispheric sleepers (one brain hemisphere sleeps at a time) who need to stay partially awake because all their breathing is voluntary. They maintain a degree of awareness of their environment because of this. Sleep could still be a factor, because it's possible that some whales lapse into a deeper sleep for periods between breaths (https://doi.org/10.1016/j.cub.2007.11.003) during which they aren't responsive to approaching vessels.
When I was interested in whale collisions I was surprised to read this review (https://doi.org/10.3389/fmars.2020.00292) which didn't even consider sleeping as a large risk factor for collision. Instead, factors included:
- They're involved in distracting behaviours such as feeding, socialising, foraging, resting, etc.
- Acoustics are complex near the surface involving surface reflections and direct paths which can interfere.
- Ships may form an acoustic shadow in front of themselves. Not only the hull shadowing the propeller, but also other hull sounds.
- Sailing vessels, which are the source of a lot of reports (it's harder for them to miss that a collision happened), are quiet.
- Even when they hear an approaching vessel, some species just move slowly to avoid them.
These collisions apparently used to be much rarer. Ironically, the increasing number of whale injuries and deaths is a result of recovering populations.
I lived on a catamaran around 2000 onwards as a kid. Solar panels were surprisingly widespread, particularly on multis with outboards (and therefore limited ability to make power through alternators). Obviously the $/W sucked, but people also didn't have as many power draws. One big drawback was older generations of solar panel had terrible performance in partial shading. A stay or rope shadow passing over the panel was a big issue because of fewer bypass diodes, simpler battery chargers, and so on. That sort of thing is a bigger issue for a yacht with less clear space for panels.
So there were a lot of diesel-powered yachts generating power throughout the day. Something that was pretty common back then as an adjunct (and much rarer now) was the small wind generator. Seemingly you could choose between noise and power output, because the fancier ones made a racket and the quieter ones always seemed to be on boats idling their engines all the time anyway. When we entered anchorages, we'd make sure to avoid being near the loud ones. I can't imagine what it would have been like living with one.
Hydrogenerators weren't very common (they're a bit more common now) but my dad was given an old 12V tape drive motor by a friend and I remember him letting us help him build a towed generator. The tape drive motor sat on the back of the boat connected to about 20m of rope going to a dinghy propeller on a piece of stainless rod to try to keep it underwater. Drilling a hole through the motor shaft with a handheld drill was the most time-consuming part of the build. We called it toady (short for towed generator) and watching the input ammeter on the battery bank go all the way up to 6A on a cloudy day felt like magic. It's part of what made me want to be an electrical engineer as a 10 year old.
Given all that, on a 19ft outboard-powered yacht in 2002 a generator probably was the best solution for one voyage.