Yeah I’ve never seen it capture video before, but if you specify in your `AGENTS.md` that you want to test certain types of workflows, it will take progressive screenshots using a sleep interval or by interacting with the DOM.
I've been using playwright-cli (not MCP) for the same purpose. It lacks the video feature, I guess, but at least it's local and doesn't add a dependency on yet another third party (in your case, Vercel). Perhaps you could support a local solution as an alternative as well?
agent-browser runs locally (it’s a Rust CLI + Node daemon on your machine), so there’s no cloud dependency on Vercel, it’s just built by the Vercel Labs team. Everything stays local :)
Are you a creative professional? I see that argument quite often, as if people use Adobe CS daily, when it's mostly people who do basic stuff (that Photopea or GIMP can handle fine) but like to feel "pro" by launching their pirated copy of Photoshop.
GIMP is ass; what takes me 5 minutes in Photoshop takes an hour in GIMP. I also occasionally edit photos in Lightroom.
I actually daily Linux, but I still have to dual boot for gaming and Adobe.
Also, Linux isn't flawless either: Fedora broke sleep on my all-AMD PC about a month ago, and no agent has managed to debug it.
Also, I've run several experiments where I only cared about 5 to 10 websites with application-specific information, so this works nicely: quickly spider them, keep a local index, and then get very low search latency. Obviously this isn't a general solution, but it's nice for some use cases.
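The spider-then-index approach can be sketched in a few lines. This is a hypothetical illustration (not the commenter's code), with the fetching step omitted; it just shows the in-memory inverted index that makes query latency low:

```python
# Hypothetical sketch: a tiny in-memory inverted index over a handful of
# spidered pages, for very low-latency local search. Fetching is omitted.
from collections import defaultdict

def tokenize(text):
    # Lowercase and split on non-alphanumeric characters.
    return "".join(c.lower() if c.isalnum() else " " for c in text).split()

class LocalIndex:
    def __init__(self):
        self.postings = defaultdict(set)  # term -> set of doc ids
        self.docs = {}

    def add(self, doc_id, text):
        self.docs[doc_id] = text
        for term in tokenize(text):
            self.postings[term].add(doc_id)

    def search(self, query):
        # AND semantics: intersect posting lists for all query terms.
        terms = tokenize(query)
        if not terms:
            return set()
        result = self.postings[terms[0]].copy()
        for t in terms[1:]:
            result &= self.postings[t]
        return result

idx = LocalIndex()
idx.add("page1", "Install the agent via cargo install")
idx.add("page2", "The agent daemon exposes a local API")
print(idx.search("agent local"))  # -> {"page2"}
```

With only 5-10 sites, the whole index fits in memory, so each query is a few set intersections rather than a network round trip.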
Whoa dude, take it easy. There are no missing features; there are more features. You might just not be finding them where they were before. Remember this is still 0.x; why should the devs be stuck with past UI decisions and never be able to improve it?
I'm really glad I bought Strix Halo. It's a beast of a system, and it runs models that an RTX 6000 Pro costing almost 5x as much can't touch. It's a great addition to my existing Nvidia GPU (4080) which can't even run Qwen3-Next-80B without heavy quantization, let alone 100B+, 200B+, 300B+ models, and unlike GB10, I'm not stuck with ARM cores and the ARM software ecosystem.
To your point though, if the successors to Strix Halo, Serpent Lake (x86 intel CPU + Nvidia iGPU) and Medusa Halo (x86 AMD CPU + AMD iGPU) come in at a similar price point, I'll probably go with Serpent Lake, given the specs are otherwise similar (both are looking at 384-bit unified memory bus to LPDDR6 with 256GB unified memory options). CUDA is better than ROCm, no argument there.
That said, this has nothing to do with the (now resolved) issue I was experiencing with LM Studio not respecting existing Developer Mode settings with this latest update. There are good reasons to want to switch between different back-ends (e.g. debugging whether early model release issues, like those we saw with GLM-4.7-Flash, are specific to Vulkan - some of them were in that specific example). Bugs like that do exist, but I've had even fewer stability issues on Vulkan than I've had on CUDA on my 4080.
With KV caching, most of the MoE models are very usable in Claude Code. Active params seem to dominate TG speed, and unlike PP, TG speeds don't decay much even as context grows.
Even moderately large and capable models like gpt-oss:120b and Qwen3-Next-80B have pretty good TG speeds - think 50+ tok/s TG on gpt-oss:120b.
PP is the main thing that suffers (it's compute-bound rather than bandwidth-bound), particularly over very long prompt stretches on typical transformer models, given attention's quadratic cost. But like I said, with KV caching it's not a big deal.
Additionally, newer architectures like hybrid linear attention (Qwen3-Next) and hybrid mamba (Nemotron) exhibit much less PP degradation over longer contexts, not that I'm doing much long context processing thanks to KV caching.
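The "active params dominate TG" point has a simple back-of-envelope explanation: decode is memory-bandwidth-bound, so tokens/s is roughly bandwidth divided by the bytes of active weights read per token. A sketch with assumed figures (the ~220 GB/s real-world bandwidth is from this thread; the 5.1B active params for gpt-oss-120b and the ~4-bit effective weight size are outside assumptions):

```python
# Back-of-envelope: decode (TG) speed ~ memory bandwidth / bytes read per token.
# The figures below are assumptions for illustration, not measurements.
bandwidth_gbs = 220      # ~real-world Strix Halo memory bandwidth, GB/s
active_params_b = 5.1    # gpt-oss-120b active parameters, billions
bytes_per_param = 0.56   # ~4-bit quantized weights plus overhead

bytes_per_token_gb = active_params_b * bytes_per_param
tg_upper_bound = bandwidth_gbs / bytes_per_token_gb
print(f"~{tg_upper_bound:.0f} tok/s upper bound")
```

That lands in the high tens of tok/s, consistent with the 50+ tok/s observed once you account for KV cache reads and other overhead; swapping in a dense 120B model's full parameter count would cut the bound by more than an order of magnitude.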
My 4080 is absolutely several times faster... on the teeny tiny models that fit on it. Could I have done something like a 5090 or dual-3090 setup? Sure. Just keep in mind I spent considerably less on my entire Strix Halo rig (a Beelink GTR 9 Pro, $1980 w/ coupon + pre-order pricing) than on a single 5090 ($3k+ for just the card, easily $4k+ for a complete PCIe 5 system); it draws ~110W on Vulkan workloads, idles below 10W, and takes up about as much space as a GameCube. Comparing it to an $8500 RTX 6000 Pro is nonsensical, and that card was outside my budget in the first place.
Where I will absolutely give your argument credit: for AI outside of LLMs (think genAI, text2img, text2vid, img2img, img2vid, text2audio, etc), Nvidia just works while Strix Halo just doesn't. For ComfyUI workloads, I'm still strictly using my 4080. Those aren't really very important to me, though.
Also, as a final note, Strix Halo's theoretical MBW is 256 GB/s, and I routinely see ~220 GB/s real world, not 200 GB/s. A small difference when comparing against GDDR7 on a 512-bit bus, but the point stands.
I have my own Telegram bot that helps me and my wife: reminders, shopping list, calendar. Small and simple, gets the job done :) At the start of the day it greets us with a briefing, and it can also check the weather and stuff.
Btw, I'm in the process of training my own small model so I can run it on my CPU-only VPS and stop paying API costs.
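The core of a bot like that can be surprisingly small. A hypothetical sketch of the command-handling layer (all names and replies here are made up, not the commenter's code); the actual Telegram wiring, e.g. via a library like python-telegram-bot, would sit on top:

```python
# Hypothetical core of a household Telegram bot: a tiny command dispatcher
# plus a shared shopping list. The Telegram transport layer is omitted.
shopping_list = []

def handle(message: str) -> str:
    """Route a chat message like '/shop add milk' and return the reply text."""
    parts = message.strip().split()
    if not parts:
        return "Say something :)"
    cmd, args = parts[0], parts[1:]
    if cmd == "/shop":
        if args and args[0] == "add":
            shopping_list.append(" ".join(args[1:]))
            return f"Added. List: {', '.join(shopping_list)}"
        return "List: " + (", ".join(shopping_list) or "(empty)")
    if cmd == "/briefing":
        # In a real bot this would pull calendar entries and a weather API.
        return "Good morning! Here's your day..."
    return "Unknown command"

print(handle("/shop add milk"))  # -> Added. List: milk
print(handle("/shop"))           # -> List: milk
```

Because every handler is just a string-in, string-out function, swapping the backing "brain" from a paid API to a small local model later only changes what generates the reply, not the dispatch.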
I set $10 on fire the other day as I was running through some tests.
Like old-school arcade games: "Please insert more ${money} to keep playing...". Local, smaller, specialized (Unix philosophy?) seems like the way to go, so you don't bankrupt yourself having AGI distill Pinterest recipes down to just the recipes.
I've been using GLM 4.7 with Claude Code: best of both worlds. I canceled my Anthropic subscription over US politics as well. I'd already started my "withdrawal" in Jan 2025; Anthropic was one of the few that were left.
Are you using an API proxy to route GLM into the Claude Code CLI? Or do you mean side-by-side usage? Not sure if custom endpoints are supported natively yet.
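For what it's worth, Claude Code does read environment variables that let you point it at an Anthropic-compatible endpoint without a separate proxy; several GLM providers expose such an endpoint. A sketch (the URL and key below are placeholders, not real values; check your provider's docs for the actual endpoint):

```shell
# Point Claude Code at an Anthropic-compatible third-party endpoint.
# Placeholder values; substitute your provider's documented URL and API key.
export ANTHROPIC_BASE_URL="https://example.com/api/anthropic"
export ANTHROPIC_AUTH_TOKEN="your-provider-api-key"
claude  # requests are now routed to the endpoint above
```

Whether a given provider's endpoint is compatible enough for all of Claude Code's features (tool use, streaming) is worth verifying before canceling anything.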