You have more info about the inflated token use? I’m using Codex CLI a bunch now, but the reported token usage seems like an order of magnitude higher than, say, Claude Code with Opus.
Idk if it’s because I set Codex to xhigh reasoning, but even then it still seems way higher than Claude. The input/output ratio feels large too, e.g. I have a Codex session which says ~500M in / ~2M out.
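For scale, a quick back-of-the-envelope on those counters (the 500M/2M figures are just the rough numbers the CLI reported, not exact values):

```python
# Rough token counters reported by the codex CLI for one session
input_tokens = 500_000_000   # ~500M in
output_tokens = 2_000_000    # ~2M out

ratio = input_tokens / output_tokens
print(f"input/output ratio is roughly {ratio:.0f}:1")  # roughly 250:1
```

So every output token comes with hundreds of input tokens billed against it, which is where long agentic sessions get expensive.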
I wish I had hard evidence, but it is mostly an observation. I do use Codex a lot, and I felt a drastic change between one or two months ago and today.
It used to give me precise answers, "surgical" is how I described it to my friends. Now it generates a lot of slop and plenty of "follow ups". It doesn't give me wrong answers, which is ok, but I've found that things that used to take 3-4 prompts now take 8-10. Obviously my prompting skills haven't changed much and, if anything, they've become better.
This is something that other colleagues have observed as well. Even the same GPT5.4 model feels different and more chatty recently. Btw, I think their version numbers mean nothing: no one can be certain about the model that is actually running on the backend, and it is pretty evident that they're continuously "improving" it.
Back in business school they used to tell the story of how makers of razor blades would put a good blade as the first and the last blade in the pack. I suspect the LLM services of doing something like that.
I haven't had the time to fully hash this take out, but a big question in the back of my mind has been: is it possible that AI model improvements come partly from finding overhang in things that look hard and impressive to humans but are actually trivial consequences of the training data? If true, then the observable performance of any widely distributed model could get worse over time as it "mines out" the work that's easy for it to do.
Using Codex more for now, and there is definitely some compaction magic.
I’m keeping the same conversations going for days, some at almost 1B tokens (per the Codex CLI counters), with seemingly no coherency loss.
Oh yes, upvoting; my top annoyance with Anthropic too. Email links are a bit ridiculous as a login mechanism.
Anytime I have to log in again, it’s the ridiculous dance of figuring out what surface I’m logging into and how to get the magic link to open there, and not mistakenly somewhere else. Never a problem with OpenAI: input password and 2FA, done, logged in.
On the topic of data requests from OpenAI: this article says “Be aware that this process isn’t instant”.
I did notice this and wondered what changed. I do periodic data backups of various services, and up until recently it was impressive: ChatGPT’s email with the data zip file link arrived maybe within 1-3 min of the request, for a ~1GB file.
I have a similar amount of data now (even less, I pruned some), yet the file now takes a really long time to prepare and receive.
I started mine Monday and it never finished (I never got the email saying it’s ready). I started it again on Tuesday and it finished in two hours. Maybe they just had a surge of exports on Monday.
I tried to use it right after launch from within Claude Desktop, on a Mac VM running within UTM, and got cryptic messages about the Apple Virtualization framework.
That made me realize it wants to run an Apple Virtualization VM of its own but can’t, since it’s already inside one. Imo the error messaging here could be better, or, considering that it already is in a VM, it could perhaps bypass the VM altogether. Right now I still haven’t gotten to try cowork because of this error.
Does UTM/Apple's framework not allow nested virtualization? If I remember correctly from x86(_64) times, this is a thing that sometimes needs to be manually enabled.
You are correct on both counts. As of Tahoe 26.3 you can't nest a macOS guest under a macOS guest. However, you can nest two layers deep with any combo of layer-1 guest, so long as the machine is running Sequoia and has an M3/M4/M5.
Wow, this should be higher up and with a different title.
People who are paying $200/month for a defined service, and think they are using `gpt5.3-codex`, are getting their requests silently routed to a less capable model without any notice. Why? Because OpenAI claims gpt5.3-codex is too powerful and dangerous with regard to cybersecurity, and their system randomly flags accounts. And the way to unlock access to a model you thought you were already paying $200/month for is to upload your ID and do identity verification...
I use brew but am willing to try out MacPorts.
How come the package install instructions seem to require sudo under MacPorts? Does that not carry more risk during the install?
There’s a way somewhere deep in settings to disable those. I still have UberEats notifications for food arrival, but I was able to disable all the other ones while digging through the settings.