We have somewhat complicated OpenSearch reindexing logic, and we had an issue where reindexing happened more often than it should. I vibecoded a dashboard that visualizes, in a graph, exactly which index gets reindexed when and into what. The code works, a little rough around the edges, but it serves the purpose and saved me a ton of time
Another example: in an internal project we recently made a change where we need to send specific headers depending on the environment. These are mostly GET endpoints, and my workflow is checking the API through the browser. The list of headers is long, but predetermined. I vibecoded an extension that lets you pick the headers, so I can keep my regular workflow rather than switching to Postman or cURL or whatever. The UI is a little buggy, but good enough. The whole team uses it
I'm not a frontend developer, and either of these would have taken me a lot of time to do by hand
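For what it's worth, the core of that header-picker extension is small. A minimal sketch, with entirely hypothetical environment names and header values (the real extension would wire `rulesFor` into `chrome.declarativeNetRequest` so the chosen headers ride along with normal browser requests):

```javascript
// Predetermined header sets per environment (illustrative values only).
const HEADER_SETS = {
  staging: [
    { header: "X-Env", value: "staging" },
    { header: "X-Feature-Flags", value: "beta" },
  ],
  prod: [
    { header: "X-Env", value: "prod" },
  ],
};

// Translate the picked environment into declarativeNetRequest-style
// dynamic rules that set those headers on matching requests.
function rulesFor(env, urlFilter) {
  const headers = HEADER_SETS[env] || [];
  return [{
    id: 1,
    action: {
      type: "modifyHeaders",
      requestHeaders: headers.map(({ header, value }) => ({
        header,
        operation: "set",
        value,
      })),
    },
    condition: { urlFilter, resourceTypes: ["main_frame", "xmlhttprequest"] },
  }];
}

// In the extension's service worker this would become something like:
// chrome.declarativeNetRequest.updateDynamicRules({
//   removeRuleIds: [1],
//   addRules: rulesFor(selectedEnv, "*://api.example.internal/*"),
// });
```

The nice part of this approach is that the browser itself injects the headers, so every tab, refresh, and redirect keeps working exactly like the normal workflow.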
My best guess is that Nvidia is unhappy with how OpenAI is fishing for compute with its competitors (Jensen had some opinions on the AMD-OpenAI deal when it was announced). If this actually becomes a feasible reality, it gives OpenAI (and co) negotiating power - which is bad for Nvidia
Nvidia might have wanted more exclusivity/attachment. And OpenAI still seems to have no problem raising money. So maybe there was just a commitment mismatch
I would agree. I've been using VSCode Copilot for the past (nearly) year. And it has gotten significantly better. I also use CC and Antigravity privately - and got access to Cursor (on top of VSCode) at work a month ago
CC is, imo, the best. The rest are largely on par with each other. The benefit of VSCode and Antigravity is that they have the most generous limits. I ran through Cursor's $20 limit in 3 days, whereas the same-tier VSCode subscription can last me 2+ weeks
> In a science fiction story, if you invented a superintelligent robot and asked it how to make money, it might come up with cool never-before-seen ideas, or at least massive fun market manipulation. But in real life, if you train a large language model on the internet and ask it how to make money, it will say “advertising, affiliate shopping links and porn.” That’s the lesson the internet teaches!
But I think it makes a lot of sense for very popular consumer products. In my honest opinion, I much prefer having services like Google, Youtube, Gmail, Maps, ChatGPT etc exist for free, but with ads, rather than not exist at all. Preferably with an option to pay and remove ads
Nowadays I'm happy to pay for Youtube premium or LLM, but back during my student days I could not really afford it - and I'm glad there was a free tier (with ads)
>In my honest opinion, I much prefer having services like Google, Youtube, Gmail, Maps, ChatGPT
I don't use any of these except YouTube (if only I could find the content elsewhere…), and I still pay for them when I purchase anything advertised on these properties, because, of course, the companies advertising on Google make all their customers pay for the free (lol) services. All advertising expenses are included in the price of the products, even if you never saw any ads.
We could easily charge for each of these services and still have them. Advertising is not necessary at all. It's just a way to make others pay for your services: a free-rider problem that externalizes costs onto those who don't partake in the scheme.
Pay your share and don't call free what others subsidize. Unless it's a public service and we collectively agree on the split (votes and taxes, which we can debate publicly)
Right. But a good portion of the world can't afford the premium, and having access to these services is still valuable. For every broke student or person from a poor background, who probably doesn't make the company any money (due to not buying advertised stuff), there's someone from a well-off background who will more than subsidize them by virtue of clicking on a lawyer ad (or whatever)
Nowadays I'm happy to pay, but that wasn't always the case. And I personally think that having an ad tier and a paid tier is fine. It serves everyone
I much prefer to subsidize my neighborhood / friends / colleagues / family / … than have the world sink in ads. Ads enshittify everything; they turn all social media into hate machines. And the cost is only externalized, not reduced, by polluting everyone's minds with ads (same as climate change, where we only make the situation worse by procrastinating). The free part and the fake generosity are an illusion.
I’ve thought that if they banned car and truck ads, prices would go down. How much is an open question. Would they actually want to drop the cost?
My guess is that this is bigger lock-in than it might seem on paper.
Google and Apple will together post-train Gemini to Apple's specification. Google has the know-how as well as the infra, and will happily do this (for free-ish) to continue the mutually beneficial relationship, as well as lock out competitors that asked for more money (Anthropic)
Once this goes live, provided Siri improves meaningfully, it is quite an expensive experiment to then switch to a different provider.
For any single user, the switching costs to a different LLM are next to nothing. But at Apple's scale they need to be extremely careful and confident that the switch is an actual improvement
I’m not so sure. Just think about coding assistants with MCP based tools. I can use multiple different models in GitHub Copilot and get good results with similarly capable models.
Siri’s functionality and OS integration could be exposed in a similar, industry-standard way via tools provided to the model.
Then any other model can be swapped in quite easily. Of course, they may still want to do fine tuning, quantization, performance optimization for Apple’s hardware, etc.
But I don’t see why the actual software integration part needs to be difficult.
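To make the tool-based argument concrete, here's a minimal sketch of what that model-agnostic layer could look like, in the function-calling style most LLM APIs share. None of these names come from Apple; they're purely illustrative:

```javascript
// Hypothetical tool definitions the OS exposes to whichever model is plugged in.
const tools = [
  {
    name: "play_album",
    description: "Play an album from the user's music library.",
    parameters: {
      type: "object",
      properties: {
        album: { type: "string", description: "Exact album title" },
      },
      required: ["album"],
    },
  },
];

// The OS-side implementations, keyed by tool name.
const handlers = {
  play_album: ({ album }) => {
    const library = ["Abbey Road", "Kind of Blue"]; // stand-in for the real library
    if (!library.includes(album)) {
      // Surface a real error instead of playing something random.
      return { error: `No album named "${album}" found` };
    }
    return { status: `Playing ${album}` };
  },
};

// Dispatch a tool call emitted by the model. Swapping Gemini for another
// model only changes who produces `call`; this layer stays the same.
function dispatch(call) {
  const handler = handlers[call.name];
  return handler ? handler(call.arguments) : { error: "unknown tool" };
}
```

Under this design the model vendor is just the producer of tool calls, so the integration surface is the tool schema, not the model itself.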
> But I don’t see why the actual software integration part needs to be difficult.
That’s not the issue. The issue is that once Gemini is in place as the intelligence behind Siri, the bar is now much higher than today and so you have to be more careful if you consider replacing Gemini, because you’re as likely as not to make Siri worse. Maybe more likely to make it worse.
Oh well that’s a good problem to have, isn’t it? Siri being so good that they don’t want to mess it up.
That gives them plenty of runway to test and optimize new models internally before release and not feel like they need to rush them out because Siri sucks.
Doubt it. Of all the issues I run into with Siri none could be solved by throwing AI slop at it. Case in point: if I ask Siri to play an album and it can't match the album name it just plays some random shit instead of erroring out.
Um, if I ask an LLM about a fake band, it literally says "I couldn't find any songs by that band, did you type it correctly?", and it's about a million times more likely to guess correctly. Why do you say it doesn't solve loads of things? I'm more concerned about the problems it creates (prompt injection, hallucinations in important work, bad logic in code); the actual functionality will be fantastic compared to Siri right now!
Because I'm sitting here twiddling my thumbs waiting for random pages to go through their anti-LLM bot crap. LLMs create more problems than they solve.
> Um, if I ask an LLM about a fake band, it literally says "I couldn't find any songs by that band, did you type it correctly?", and it's about a million times more likely to guess correctly
Um, if Apple wrote proper error handling in the first place, the issue would be solved without LLM baggage. Apple made a conscious decision to handle "unknown" artists this way; LLMs don't change that.
Ollama! Why didn’t they just run Ollama and a public model? They’ve spent the last 10 years with a Siri that doesn’t know any contact named Chronometer, only to now require a best-in-class LLM?
The other day I was trying to navigate to a Costco in my car. So I opened google maps on Android Auto on the screen in my car and pressed the search box. My car won't allow me to type even while parked... so I have to speak to the Google Voice Assistant.
I was in the map search, so I just said "Costco" and it said "I can't help with that right now, please try again later" or something of the sort. I tried a couple more times until I changed up to saying "Navigate me to Costco" where it finally did the search in the textbox and found it for me.
Obviously this isn't the same thing as Gemini, but the Android Auto experience gets more and more garbage as time passes, and I'm concerned that now we're going to have two Google voice assistant products.
Also, tbh, Gemini was great a month ago but since then it's become total garbage. Maybe it passes benchmarks or whatever but interacting with it is awful. It takes more time to interact with than to just do stuff yourself at this point.
I tried Google Maps AI last night and, wow. The experience was about as garbage as you can imagine.
I'm genuinely curious about this too. If you really only need the language and common sense parts of an LLM -- not deep factual knowledge of every technical and cultural domain -- then aren't the public models great? Just exactly what you need? Nobody's using Siri for coding.
Are there licensing issues regarding commercial use at scale or something?
Pure speculation, but I’d guess that an arrangement with Google comes with all sorts of ancillary support that will help things go smoothly: managed fine tuning/post-training, access to updated models as they become available, safety/content-related guarantees, reliability/availability terms so the whole thing doesn’t fall flat on launch day etc.
Probably repeatability and privacy guarantees around infrastructure and training too. Google already have very defined splits for their Gemma and in house models with engineers and researchers rarely communicating directly.
That said, Apple is likely to end up training their own model, sooner or later. They are already in the process of building out a bunch of data centers, and I think they have even designed in-house servers.
Remember when iPhone maps were Google Maps? Apple Maps has been steadily improving, to the point that it's as good as, if not better than, Google Maps in many areas (like around here: I recently had a friend send me a GM link to a destination, and the phone used GM for directions. It was much worse than Apple Maps. After a few wrong turns, I pulled over, fed the destination into Apple Maps, and completed the journey).
OpenAI is (was?) extremely good at making things that go viral. The successful ones for sure boost subscriber count meaningfully
Studio Ghibli, the Sora app. Go viral, juice the numbers, then turn the knobs down on copyrighted material. Atlas, I believe, was less successful than they would've hoped.
And because of too-frequent version bumps that are sometimes released as an answer to Google's launches rather than as a meaningful improvement, I believe they're also having a harder time going viral that way
Overall, OpenAI throws stuff at the wall and sees what sticks. Most of it doesn't, and gets (semi) abandoned. But some of it does, and it makes for a better consumer product than Gemini
It seems to have worked well so far, though I'm sceptical it will be enough for long
Going viral is great when you're a small team or even a million dollar company. That can make or break your business.
Going viral as a billion dollar company spending upward of $1T is still not sustainable. You can't pay off a trillion dollars with "engagement". The entire advertising industry is "only" worth $1T as it is: https://www.investors.com/news/advertising-industry-to-hit-1...
I guess we'd have to see the graph with the evolution of paying customers: I don't see the number of potential-but-not-yet clients being that high, certainly not one order of magnitude higher. And everyone already knows OpenAI, they don't have the benefit of additional exposure when they go viral: the only benefit seems to be to hype up investors.
And there's something else about the diminishing returns of going viral... AI kind of breaks the usual assumptions in software: that building it is the hard part and that scaling is basically free. In that sense, AI looks more like regular commodities or physical products, in that you can't just Ctrl-C/Ctrl-V: resources are O(N) on the number of users, not O(log N) like regular software.
Deep research, from my experience, will always add lectures.
I'm trying to create a comprehensive list of English standup specials. Seems like a good fit! I've tried numerous times to prompt it: "provide a comprehensive list of English standup specials released between 2000 and 2005. The output needs to be a csv of verified specials with the author, release date and special name. I do not want any other lecture or anything else. Providing anything except the csv is considered a failure". Then it creates its own plan, and I go further, clarifying explicitly that I don't want lectures...
It goes on to hallucinate a bunch of specials and provide a lecture on "2000 the era of X on standup comedy" (for each year)
I've tried this in 2.5 and 3. Numerous time ranges and prompts. Same result. It gets the famous specials right (usually), hallucinates some info on less famous ones (or makes them up completely) and misses anything more obscure
I tried asking for a list of the most common Game Boy Color games not compatible with the original DMG Game Boy. ChatGPT would, over and over, list DMG-compatible games instead. I asked it to cross-reference lists of DMG games to remove them; it "reasoned" for a long time before it showed what sources it used for cross-referencing, and then gave me the same list again.
It also insisted on including "Shantae" in the list, which is expensive specifically because it is uncommon. I eventually forbade it from including the game in the list, and that actually worked, but it would continue mentioning it outside the list.
Overselling is only the tip of the iceberg. The real problem is that a lot of managers base their decision to introduce language models into business processes on cutting edge Pro edition demos, but what is, of course, actually used in production is some cheap Nano/Flash/Mini version.
Spotify, Netflix, Amazon Prime, Reddit, Twitter etc all have increasingly profitable ads
I'm sure llm providers will also figure it out in due time. Consumer products are generally a good fit for ads, even if it takes time to reach full potential
Every single one of those companies has a ridiculously low marginal cost per request compared to ChatGPT, along with much lower fixed costs and continued development costs.
I used to main a Pixelbook (1st gen) for about a year. ChromeOS really is enough for the majority of day-to-day stuff, and for development it lets you run a Linux environment (Crostini) inside ChromeOS
I can only assume the Aluminium OS would aim to do the same