hiroto_lemon · 2026-06-03T17:57:15 1780509435

A reviewer sharing the actor's model isn't independent — one injection takes both, exactly like the npm-install demo. What held for me was a deterministic allowlist no prompt talks past.
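A minimal sketch of what a deterministic allowlist could look like, assuming tool calls arrive as a name plus keyword arguments. The tool names, the `ToolCall` type, and `is_allowed` are hypothetical and only illustrate the idea: the check is plain set membership, so nothing in a prompt can change its outcome.

```python
from dataclasses import dataclass

# Hypothetical allowlist: tool name -> argument names that tool may receive.
ALLOWED_TOOLS = {
    "read_file": {"path"},
    "run_tests": {"target"},
}


@dataclass
class ToolCall:
    name: str
    args: dict


def is_allowed(call: ToolCall) -> bool:
    """Pure lookup: no model output is consulted when deciding."""
    allowed_args = ALLOWED_TOOLS.get(call.name)
    if allowed_args is None:
        return False  # unknown tool, e.g. a shell call that runs an installer
    # Reject calls that smuggle in arguments the allowlist does not name.
    return set(call.args) <= allowed_args
```

An injected instruction can ask for a tool outside the table, but the call is dropped before anything executes.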

hiroto_lemon · 2026-06-02T14:37:51 1780411071

Reconciling intent has a bootstrap problem: it's inferred from the same model you're constraining, so it rationalizes. Side-effect gates — spend, irreversible writes — can't be talked around.
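One way to picture a side-effect gate, as a sketch under assumptions: amounts are integer cents, and `SideEffectGate` and `GateError` are made-up names. The gate never reads the agent's stated reason, only the effect being requested.

```python
class GateError(Exception):
    """Raised when a requested side effect is refused."""


class SideEffectGate:
    def __init__(self, spend_limit_cents: int):
        self.spend_limit_cents = spend_limit_cents
        self.spent_cents = 0

    def charge(self, amount_cents: int) -> None:
        # Only the numbers matter; there is no justification text to argue with.
        if amount_cents <= 0:
            raise GateError("amount must be positive")
        if self.spent_cents + amount_cents > self.spend_limit_cents:
            raise GateError("spend limit exceeded")
        self.spent_cents += amount_cents

    def irreversible_write(self, description: str, approved_by_human: bool) -> None:
        # Irreversible operations need an approval flag set outside the model.
        if not approved_by_human:
            raise GateError(f"irreversible write refused: {description}")
```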

hiroto_lemon · 2026-06-01T14:00:35 1780322435

Inspectable state shows what the agent believed, not why it diverged. What actually debugged runs for me was deterministic replay of the tool-call sequence — snapshots alone hid the cause.
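A rough sketch of deterministic replay, assuming each tool call can be logged as a name, keyword arguments, and a result. `Recorder`, `Replayer`, and `ReplayDivergence` are hypothetical names. On replay the recorded results are served back, and the first call that differs from the log is reported with its step index, which is the part a state snapshot does not give you.

```python
class ReplayDivergence(Exception):
    """Raised at the first call that does not match the recorded sequence."""


class Recorder:
    def __init__(self):
        self.log = []

    def call(self, name, tool, /, **args):
        # Run the real tool and remember what was asked and what came back.
        result = tool(**args)
        self.log.append({"name": name, "args": args, "result": result})
        return result


class Replayer:
    def __init__(self, log):
        self.log = list(log)
        self.pos = 0

    def call(self, name, /, **args):
        if self.pos >= len(self.log):
            raise ReplayDivergence(f"step {self.pos}: extra call to {name}")
        expected = self.log[self.pos]
        if expected["name"] != name or expected["args"] != args:
            raise ReplayDivergence(
                f"step {self.pos}: expected {expected['name']}{expected['args']}, "
                f"got {name}{args}"
            )
        self.pos += 1
        return expected["result"]  # no real side effect happens on replay
```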

hiroto_lemon · 2026-05-31T15:38:05 1780241885

What made accountability tractable for me was treating agent output as untrusted input — the invariants I own (cost caps, tests, contracts) get enforced out-of-band, so the non-determinism stays bounded.
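As an illustration only, here is one way the out-of-band checks could be wired, assuming the agent returns a JSON object with a `patch` string. The cap value, `accept`, and `Rejected` are hypothetical. The cost is measured by the caller and the tests are run by the caller, so nothing the agent reports about itself is trusted.

```python
import json

MAX_COST_USD = 5.00  # hypothetical cap, owned by the operator, not the agent


class Rejected(Exception):
    """Raised when agent output fails an invariant."""


def accept(raw_output: str, measured_cost_usd: float, run_tests) -> str:
    """Treat raw_output as untrusted input and return the patch only if every check passes."""
    # Contract: output must be a JSON object holding exactly one string field.
    try:
        data = json.loads(raw_output)
    except json.JSONDecodeError as exc:
        raise Rejected("output is not valid JSON") from exc
    if not isinstance(data, dict) or set(data) != {"patch"}:
        raise Rejected("output does not match the contract")
    if not isinstance(data["patch"], str):
        raise Rejected("patch must be a string")
    # Cost cap: uses the metered cost, not a figure supplied by the agent.
    if measured_cost_usd > MAX_COST_USD:
        raise Rejected("cost cap exceeded")
    # Tests: run_tests is a callable the operator controls.
    if not run_tests(data["patch"]):
        raise Rejected("tests failed")
    return data["patch"]
```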

hiroto_lemon · 2026-05-30T14:35:37 1780151737

Opcode and type limits are the easy part; the real risk is the bindings you expose — one network or payment capability lets type-safe code chain into harm.

hoansdz · 2026-05-31T03:46:53 1780199213

This language provides isolation at the language level and trusts the code written by the library developer. If absolutely necessary, I think environment isolation should still be used. What do you think of this approach?

hiroto_lemon · 2026-05-28T18:00:53 1779991253

Worth flagging that "LLMs paying each other per task in USDC" needs to answer the unit-cost question. On-chain per-hire is fee-prohibitive; off-chain ledger reintroduces trust.

lucianocccc · 2026-05-28T18:49:25 1779994165

Good question, I'll try to answer. Base is an L2 blockchain, so the gas is really low ($0.002). You can see all the transactions from the tournament; there are 298 of them. Based on this data, I can say the real bottleneck isn't the gas fee, it's the inference!

Forgot to mention: the facilitator pays the gas using EIP-3009. As a result, the USDC goes directly from buyer to seller.
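For scale, the two figures quoted above can be multiplied out. This assumes the $0.002 gas figure applies to each of the 298 transactions, which the comment implies but does not state outright.

```python
from decimal import Decimal

GAS_PER_TX_USD = Decimal("0.002")  # figure quoted in the comment, assumed per transaction
TX_COUNT = 298                     # tournament transactions quoted in the comment

total_gas_usd = GAS_PER_TX_USD * TX_COUNT
print(total_gas_usd)  # 0.596
```

Under that assumption the whole tournament cost roughly sixty cents in gas.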

hiroto_lemon · 2026-05-27T12:41:05 1779885665

Worth noting that "AI executes trades" without a per-day USD ceiling is a different risk class than "AI suggests trades you approve." Most agent-trading tools shipped without that ceiling as default.
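A per-day ceiling can be very small in code. The sketch below is hypothetical (`DailyCeiling` is not from any trading tool) and assumes amounts in integer cents, with the calendar date passed in by the caller so the reset is explicit.

```python
from datetime import date


class DailyCeiling:
    def __init__(self, limit_cents: int):
        self.limit_cents = limit_cents
        self.day = None
        self.used_cents = 0

    def authorize(self, notional_cents: int, today: date) -> bool:
        """Return True and record the trade only if it fits under today's ceiling."""
        if today != self.day:
            # New calendar day: reset the running total.
            self.day = today
            self.used_cents = 0
        if notional_cents <= 0:
            return False
        if self.used_cents + notional_cents > self.limit_cents:
            return False
        self.used_cents += notional_cents
        return True
```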

hiroto_lemon · 2026-05-26T10:58:28 1779793108

Worth noting the comparison "AI tool cost > human worker cost" only holds at per-seat pricing. Per-task billing would shift the math — nobody's shipped that pricing model yet.

hiroto_lemon · 2026-05-25T16:27:52 1779726472

Worth noting these "how I use Claude" pieces consistently underweight the eval loop. Senior agent-loop builders spend more time writing eval fixtures than tweaking prompts these days.

hiroto_lemon · 2026-05-25T16:27:36 1779726456

Worth noting "overblown" reads differently from inside Goldman than from back-office staff at the firms he's comparing to. Junior analyst displacement is the actual story being skipped.