> For our first experiment, we used ClickBench, an analytical database benchmark. ClickBench has 43 queries that focus on aggregation and filtering operations. The operations run on a single wide table with 100M rows, which uses about 14 GB when serialized to Parquet and 75 GB when stored in CSV format.
Huge local thinking LLMs to solve math and for general assistant-style tasks. Models like Kimi-2.5-Q3, DeepSeek-XX-Q4/Q5, Qwen-3.5-Q8, MiniMax-m2.5-Q8 etc. that bring me to Claude4/GPT5 territory without any cloud. For coding I have another machine with 3x RTX Pro 6000 (mostly Qwen subvariants) and for image/video/audio generation I have 2x DGX Sparks from ASUS.
We must be twins; I've got the same three working in a cluster.
I was really excited to see where the GB300 desktops end up, with 768 GB of RAM, but now that data is leaking / popping up (Dell appears to be only 496 GB), we may be in the 60-100k range, and that's well out of my comfort zone.
If Apple came out with a 768 GB Studio at 15k, I'd bite in a heartbeat.
Yeah, I didn't want to spend more than 50k on a local inference stack. I can amortize it in my taxes so it's not a big deal, but anything beyond that would start eating into my other allocations. I might still get an M5 Ultra if it pops up and the benchmarks look good, possibly selling the M3 Ultra.
Netflix has a stream with close-up cameras, as they were the ones who arranged the whole thing. Unfortunately the commentary and color grading are both terrible: https://www.netflix.com/watch/81987107
Yes, any zero-copy format in general will have this advantage because reading a value is essentially just a pointer dereference. Most of the message data can be completely ignored, so the CPU never needs to see it. Only the actual data accessed counts towards the limit.
Btw: in my project README I have benchmarks against Cap'n Proto & Google FlatBuffers.
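To make the zero-copy point concrete, here is a minimal Python sketch. The record layout, the field names, and `read_price` are all made up for illustration; this is not the wire format of my project, Cap'n Proto, or FlatBuffers. It only shows that reading one field at a known offset touches a few bytes and ignores the rest of the message.

```python
import struct

# Hypothetical fixed layout (an assumption for illustration, not a real format):
#   offset 0:  uint32  id
#   offset 4:  float64 price
#   offset 12: 1024 bytes of payload that the reader never looks at
RECORD = struct.Struct("<Id1024s")  # "<" = little-endian, no padding

def read_price(buf: memoryview) -> float:
    # Read 8 bytes at a fixed offset; nothing else in the message is parsed.
    return struct.unpack_from("<d", buf, 4)[0]

message = RECORD.pack(7, 19.5, b"\x00" * 1024)
view = memoryview(message)  # a view over the same bytes, not a copy

print(read_price(view))  # 19.5
```

A real zero-copy format adds things like pointers, variable-length fields, and schema evolution on top, but the access pattern is the same idea: compute an offset, read the bytes you need.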
Ah, in that context, why not just give the people workerd? People using & running OSS libraries are used to the fact that there might be vulns in libraries they're using, right?
If Simon's users choose to self-host the open source version of his service, they are probably using it to run their own code, and so the sandbox security matters less, and workerd may be fine. The sandbox only matters when Simon himself offers his software as a service, which he could do using Workers for Platforms.
(But this is a self-serving argument coming from me.)
Wait, why not just actually use the Cloudflare Sandboxes product then? Is it too costly or something? Or do you need to be able to run without a connection to their cloud?
I'm building software I want other people to be able to run themselves, and I don't want to have to tell them to create a Cloudflare account as part of running that software.
We've been looking at using this at Stainless. Our evaluation isn't complete but my personal gut feeling is we'll be incredulous that we ever operated without it.