solarkraft · 2026-04-25T12:20:31 1777119631

Looks very interesting, but I’m a bit surprised the most important feature isn’t mentioned: How well does clipboard sharing work?

wcrossbow · 2026-04-25T12:53:31 1777121611

I'm not a big fan of Windows, but copy-pasting a file across 3 nested RDP sessions feels magical every time.

debarshri · 2026-04-25T15:30:43 1777131043

I am not sure if you have tried the broadcasting feature in terminals; that's magical too.

ktpsns · 2026-04-25T13:33:35 1777124015

To be honest, three nested RDP sessions sound like a terrible hack. In an ideal world, this would be two port forwardings and one RDP (thinking about ssh, which is still underrepresented in the Windows world). In an even more ideal world, this would be an IPv6 direct access ;-)

everforward · 2026-04-25T14:25:37 1777127137

There are legit reasons, at least for two nested sessions. A production network that’s airgapped except for a bastion host that acts as a gateway. It’s better than port forwarding because you have to auth to the bastion host before the RDP chaining, and it often takes separate credentials for the second RDP session.

It’s a semi-common setup for higher-security environments, and for when you have a network of stuff that has known vulnerabilities you can’t patch for whatever reason. Traffic in and out is super carefully firewalled. It’s not great, but it’s better than a 25-year-old MySQL with a direct public IP.

embedding-shape · 2026-04-25T14:49:58 1777128598

> airgapped except for a bastion host that acts as a gateway

First time I've heard of an airgapped system you could access remotely. Doesn't that kind of defeat the label "airgapped"? I think I'd just call that "isolated" at that point instead.

debarshri · 2026-04-25T15:32:35 1777131155

This concept is related to PAM. You often have to do ops on infra and need some DMZ to do the ops. In a regulated industry you have to record every operation done by the person and have to follow the principle of least privilege. This is what should happen in an ideal world.

embedding-shape · 2026-04-25T15:39:45 1777131585

> You often have to do ops on infra and need some DMZ to do the ops.

This makes sense; "bastion" hosts and similar things are fairly common too. What's not common is calling those "airgapped", because they're clearly not.

SigmundA · 2026-04-25T15:09:51 1777129791

Logically air gapped :)

https://docs.aws.amazon.com/aws-backup/latest/devguide/logic...

rzzzt · 2026-04-25T15:02:13 1777129333

The moat!

orisho · 2026-04-25T14:26:39 1777127199

It's probably there not as a way to connect networks, but as a way to keep them separate, only allowing RDP between specific computers on different networks.

debarshri · 2026-04-25T15:29:47 1777130987

We have a custom RDP client [1], so I have some experience building something like this. Our implementation is similar to this one.

Clipboard sharing, and uploading and downloading via a shared drive, are FreeRDP features that should be readily available.

We also have session recording, which is non-negotiable in PAM.

[1] https://adaptive.live

d3Xt3r · 2026-04-25T12:53:02 1777121582

And desktop scaling. And multi-monitor support. And file transfers. And drive redirection. And peripheral redirection. And...

rvz · 2026-04-25T13:48:46 1777124926

...A test suite, and security audits, and most importantly benchmarks.

What it does have is a license, which is GPLv3. So if anyone adds all those changes, they have to make the source code available under the same license.

pixel_popping · 2026-04-25T14:38:34 1777127914

In this era, though, licenses are a matter of "tokens" (I don't agree with this, but it is what it is). I know for a fact that multiple relatively big companies are just gobbling up GPLv3 projects and rewriting them entirely; some publish the rewrites as well.

solarkraft · 2026-04-25T02:22:13 1777083733

Has anyone made a comprehensive overview of these? A lot of memory solutions keep springing up but I’m not even entirely sure what to evaluate them by (without hands on experience).

mentedb · 2026-04-25T12:32:34 1777120354

I'm biased since I built this, but the things I'd look at: how memories are stored (flat text vs typed), what happens when info conflicts (does it detect contradictions or just store both), and whether it runs locally or cloud only.

My take is that pure memory is just one piece. What I'm really trying to build is a cognitive engine. Typed memories, contradiction detection, pain signals when you're about to repeat a mistake, decay and reinforcement. Less "store and retrieve" and more how memory actually works.

There's an interactive demo at https://demo.mentedb.com if you want to see it in action.
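
To make the evaluation criteria above concrete, here is a minimal sketch of typed memories with contradiction detection, reinforcement and decay. It is not MenteDB's implementation or API; every name in it (`MemoryType`, `Memory`, `MemoryStore`, `half_life_s`) is hypothetical, and exponential half-life decay is just one assumed way to model forgetting.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class MemoryType(Enum):
    """Hypothetical memory types; a real system may use different ones."""
    FACT = "fact"
    PREFERENCE = "preference"
    MISTAKE = "mistake"


@dataclass
class Memory:
    kind: MemoryType
    subject: str
    value: str
    strength: float = 1.0
    last_seen: float = 0.0  # timestamp in seconds


class MemoryStore:
    def __init__(self, half_life_s: float = 86_400.0):
        self.half_life_s = half_life_s
        self._items = {}  # (kind, subject) -> Memory

    def get(self, kind: MemoryType, subject: str) -> Optional[Memory]:
        return self._items.get((kind, subject))

    def strength(self, memory: Memory, now: float) -> float:
        """Stored strength, halved once per elapsed half-life (decay)."""
        elapsed = max(0.0, now - memory.last_seen)
        return memory.strength * 0.5 ** (elapsed / self.half_life_s)

    def remember(self, kind: MemoryType, subject: str, value: str,
                 now: float) -> Optional[Memory]:
        """Store a memory; return the old entry if the new one contradicts it."""
        old = self.get(kind, subject)
        if old is not None and old.value == value:
            # Same information seen again: reinforce instead of duplicating.
            old.strength = self.strength(old, now) + 1.0
            old.last_seen = now
            return None
        self._items[(kind, subject)] = Memory(kind, subject, value, 1.0, now)
        return old  # None for a brand-new memory, else the contradicted one
```

A caller could treat a non-`None` return value from `remember` as a detected contradiction and decide whether to keep, merge or discard the older entry.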

solarkraft · 2026-04-24T10:36:39 1777026999

At home I currently use MiniMax via OpenRouter - it’s pretty good and very cheap. They have a subscription plan, but I’m not ready to commit to it yet.

Another way to keep the ability to try out new models is to buy a reseller subscription like Cursor’s.

amunozo · 2026-04-24T11:07:31 1777028851

I tried OpenRouter, but I feel the money flies even with these models. It is not comparable to a subscription, but yes, it's very good for trying things out. Maybe I should test other models alongside GPT 5.5 to see which one fits me.

elbear · 2026-04-24T13:45:37 1777038337

I'm also unemployed. So far the models that I've used the most are Kimi and GLM. I haven't done that much agentic coding though, I've mostly used them for studying math and general conversations and I'm generally happy with their performance.

solarkraft · 2026-04-23T21:06:08 1776978368

New prompt idea: “Make sure to produce 1000x as many lines as you would otherwise need”

solarkraft · 2026-04-23T20:47:57 1776977277

I’m in the process of doing this as well - hackability is such a massive moat.

Care to share what you changed, maybe even the code?

wilj · 2026-04-23T21:18:43 1776979123

I've got to do some cleanup before sharing (yay vibe coding) but the big things I've changed so far:

1) Curated a set of models I like and heavily optimized all possible settings, per agent role and even per skill (had to really replumb a lot of stuff to get it as granular as I liked)

2) Ported from sqlite to postgresql, with heavily extended schema. I generate embeddings for everything, so every aspect of my stack is a knowledge graph that can be vector searched. Integrated with a memory MCP server and auditing tools so I can trace anything that happens in the stack/cluster back to an agent action and even thinking that was related to the action. It really helps refine stuff.

3) Tight integration of Gitea server, k3s with RBAC (agents get their own permissions in the cluster), every user workspace is a pod running opencode web UI behind Gitea oauth2.

4) Codified structure of `/projects/<monorepo>/<subrepos>` with simpler browsers so non-technical family members can manage their work more easily (agents handle all the management, and sidecars handle all the gitops transparently to the user)

5) Transparent failover across providers with cooldown by making model definitions linked lists in the config, so I can use a handful of subscriptions that offer my favorite models, and fail over from one to the next as I hit quota/rate limits. This has really cut my bill down lately, along with skipping OpenRouter for my favorite models and going direct to Alibaba and Xiaomi so I can tailor caching and stuff exactly how I want.

6) Integrated filebrowser, a fork of the Milkdown Crepe markdown editor, and codemirror editor so I don't even need an IDE anymore. I just work entirely from OpenCode web UI on whatever device is nearest at the moment. I added support for using Gemma 4 local on CPU from my phone yesterday while waiting in line at a store.

Those are the big ones off the top of my head. I'm sure there are more. I've probably made a few hundred other changes; it just evolves as I go.
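
The failover idea in item 5 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code described above and not OpenCode's config format: `ModelNode`, `QuotaExceeded`, `complete` and the `call` callback are all hypothetical names, and the provider client is assumed to raise `QuotaExceeded` when a quota or rate limit is hit.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


class QuotaExceeded(Exception):
    """Assumed to be raised by a provider call on quota/rate-limit errors."""


@dataclass
class ModelNode:
    name: str
    call: Callable[[str], str]               # hypothetical provider client
    fallback: Optional["ModelNode"] = None   # next node in the linked list
    cooldown_s: float = 60.0
    cooling_until: float = 0.0               # timestamp from the `now` clock


def complete(head: ModelNode, prompt: str,
             now: Callable[[], float] = time.monotonic) -> str:
    """Walk the linked list, skipping nodes that are still cooling down."""
    node = head
    while node is not None:
        if now() >= node.cooling_until:
            try:
                return node.call(prompt)
            except QuotaExceeded:
                # Put this provider on cooldown and fall through to the next.
                node.cooling_until = now() + node.cooldown_s
        node = node.fallback
    raise RuntimeError("all providers are cooling down or out of quota")
```

Because each node only knows its own fallback, the order of preference lives entirely in how the list is linked, and a provider rejoins the rotation on its own once its cooldown expires.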

solarkraft · 2026-04-23T20:40:11 1776976811

If this claim is true (inference is priced below cost), it makes little sense that there are tens of small inference providers on OpenRouter. Where are they getting their investor money? Is the bubble that big?

Incidentally, the hardware they run is known as well. The claim should be easy to check.

parliament32 · 2026-04-23T22:50:48 1776984648

To be clear, I'm talking about subscription pricing. API pricing for Anthropic is probably at-cost.

I dare you to run CC on API pricing and see how much your usage actually costs.

(We did this internally at work, that's where my "few orders of magnitude" comment above comes from)

solarkraft · 2026-04-23T20:31:47 1776976307

I assume they are already storing the cache on flash storage instead of keeping it all in VRAM. KV caches are huge - that’s why it’s impractical to transfer to/from the client. It would also allow figuring out a lot about the underlying model, though I guess you could encrypt it.

What would be an interesting option would be to let the user pay more for longer caching, but if the base length is 1 hour I assume that would become expensive very quickly.

tonyarkles · 2026-04-23T20:47:13 1776977233

Just to contextualize this... https://lmcache.ai/kv_cache_calculator.html. They only have smaller open models, but for Qwen3-32B with 50k tokens it's coming up with 7.62GB for the KV cache. Imagining a 900k session with, say, Opus, I think it'd be pretty unreasonable to flush that to the client after being idle for an hour.
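
For a rough sense of where such numbers come from, here is a sketch of the usual back-of-the-envelope KV cache estimate: two tensors (keys and values) per layer, per KV head, per token. The function names are made up for illustration, and the architecture values in the usage line are placeholders, not Qwen3-32B's published config and not what the linked calculator uses; plug in real values from a model's config to compare against a figure like the one quoted.

```python
def kv_cache_bytes(num_layers: int, num_kv_heads: int, head_dim: int,
                   num_tokens: int, bytes_per_value: int = 2) -> int:
    """Bytes needed to hold keys and values for every layer and token."""
    # Factor 2: one key vector and one value vector per token, per layer.
    per_token = 2 * num_layers * num_kv_heads * head_dim * bytes_per_value
    return per_token * num_tokens


def to_gib(num_bytes: int) -> float:
    return num_bytes / 2**30


# Placeholder architecture (assumed values), 16-bit cache entries, 50k tokens.
size = kv_cache_bytes(num_layers=32, num_kv_heads=8, head_dim=128,
                      num_tokens=50_000)
print(f"{to_gib(size):.2f} GiB")  # prints "6.10 GiB"
```

The estimate grows linearly with context length, so under the same assumed architecture a 900k-token session would need 18 times as much as a 50k-token one.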

2001zhaozhao · 2026-04-23T23:18:07 1776986287

I wonder whether prompt caches would be the perfect use case of something like Optane.

It's kept for long enough that it's expensive to store in RAM, but short enough that the writes are frequent and will wear down SSD storage

ohcmon · 2026-04-23T20:40:33 1776976833

Yes — encryption is the solution for client side caching.

But even if it’s not — I can’t build a scenario in my head where recalculating it on real GPUs is cheaper/faster than retrieving it from some kind of slower cache tier
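
The latency half of that comparison can be sketched as simple arithmetic: time to stream the cache back from a slower tier versus time to re-run prefill. This is only an illustration; the function names are hypothetical, every number in the usage line is a made-up placeholder rather than a measurement, and it ignores the cost side (GPU time versus storage) entirely.

```python
def fetch_seconds(cache_bytes: float, tier_bytes_per_s: float) -> float:
    """Time to stream a stored KV cache back from a slower cache tier."""
    return cache_bytes / tier_bytes_per_s


def recompute_seconds(num_tokens: int, prefill_tokens_per_s: float) -> float:
    """Time to rebuild the same cache by re-running prefill on the GPU."""
    return num_tokens / prefill_tokens_per_s


def fetch_is_faster(cache_bytes: float, tier_bytes_per_s: float,
                    num_tokens: int, prefill_tokens_per_s: float) -> bool:
    return (fetch_seconds(cache_bytes, tier_bytes_per_s)
            < recompute_seconds(num_tokens, prefill_tokens_per_s))


# Placeholder inputs: 8 GB cache, 2 GB/s tier, 50k tokens, 5k tokens/s prefill.
print(fetch_is_faster(8e9, 2e9, 50_000, 5_000.0))  # prints "True"
```

With these assumed inputs the fetch takes 4 seconds against 10 seconds of recomputation; a sufficiently slow tier or a sufficiently fast prefill would flip the result.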

solarkraft · 2026-04-23T20:23:04 1776975784

I somewhat disagree that this is due diligence. Claude Code abstracts the API, so it should abstract this behavior as well, or educate the user about it.

mpyne · 2026-04-23T21:58:47 1776981527

> Claude Code abstracts the API, so it should abstract this behavior as well, or educate the user about it.

Does mmap(2) educate the developer on how disk I/O works?

At some point you have to know something about the technology you're using, or accept that you're a consumer of the ever-shifting general best practice, shifting with it as the best practice shifts.

websap · 2026-04-24T00:44:54 1776991494

Does using print() in Python mean I need to understand the kernel? This is an absurd thought.

Nevermark · 2026-04-24T07:21:22 1777015282

That might be an absurd comparison, but we can fix that.

If you were being charged per character, or running down character limits, and printing on printers that were shared and had economic costs for stalled and started print runs, then:

You wouldn’t “need” to understand. The prints would complete regardless. But you might want to. Personal preference.

Which is true of this issue too.

Barbing · 2026-04-24T07:36:01 1777016161

>If you were being charged per character, or running down character limits, and printing on printers that were shared and had economic costs for stalled and started print runs,

and the system was being run by some of the planet’s brightest people whose famous creation is well known to disseminate complex information succinctly,

>then:

You would expect to be led to understand, like… a 1997 Prius.

“This feature showed the vehicle operation regarding the interplay between gasoline engine, battery pack, and electric motors and could also show a bar-graph of fuel economy results.” https://en.wikipedia.org/wiki/Toyota_Prius_(XW10)

zem · 2026-04-23T22:50:00 1776984600

mmap(2) and all its underlying machinery are open source and well documented besides.

mpyne · 2026-04-23T23:04:45 1776985485

There are open-source and even open-weight models that operate in exactly this way (as it's based on years of public research), and even if there weren't, the way that LLMs generate responses to inputs is superbly documented.

Seems like every month someone writes up a brilliant article on how to build an LLM from scratch or similar that hits the HN page, usually with fancy animated blocks and everything.

It's not at all hard to find documentation on this topic. It could be made more prominent in the U/I but that's true of lots of things, and hammering on "AI 101" topics would clutter the U/I for actual decision points the user may want to take action upon that you can't assume the user already knows about in the way you (should) be able to assume about how LLMs eat up tokens in the first place.

computably · 2026-04-24T03:37:32 1777001852

I would say this is abstracting the behavior.

solarkraft · 2026-04-23T10:31:22 1776940282

Not being the Americans is Mistral’s moat. Cooperating with the exact people who are the reason for the USA’s loss of trust would force them to do a lot of explaining at home.

solarkraft · 2026-04-22T20:24:24 1776889464

I see a significant chance that they’ll continue to blunder on the product side. It might still not matter because of their massive distribution, but it leaves them open to disruption by a better product (think IE vs. Chrome).