More

fzysingularity · 2026-05-01T23:56:12 1777679772

VLM Run (https://vlm.run) | 1x Product + 1x ML Staff Engineer | Santa Clara, CA (HQ)

We're building the inference and orchestration layer for production Vision-Language Models. We care deeply about fast and ergonomic visual inference, reliable structured outputs, and the observability to iterate on them.

A few things we've shipped recently you can poke at:

  1. Orion: our visual agent that reasons and acts over images, video, and documents. Chat at https://chat.vlm.run.
  2. mm-ctx: a Unix-style multimodal CLI (find, cat, grep, wc) that gives coding agents real context over images, video, and PDFs. Rust core, Python devex. 
  3. vlmbench:  single-file CLI for benchmarking VLM inference (TTFT, TPOT, throughput) across vLLM, Ollama, and SGLang.

Apply: https://app.dover.com/jobs/vlm-run

Email hiring "at" vlm.run with your GitHub + a couple recent projects.

[1] https://chat.vlm.run

[2] https://pypi.org/project/mm-ctx | https://www.vlm.run/open-source/mm

[3] https://github.com/vlm-run/vlmbench | https://www.vlm.run/open-source/vlmbench

fzysingularity · 2026-04-01T16:49:34 1775062174

VLM Run (https://vlm.run) | 1x Infrastructure Engineer + 2x AI/ML Engineer | Santa Clara, CA (HQ)

VLM Run is building infrastructure for production Vision-Language Model (VLM) systems — fast inference, tool-use + orchestration, reliable structured outputs, and the observability to iterate quickly. We’re a deeply technical team of veteran AI / computer-vision engineers (20+ years combined, MIT/CMU PhDs) who’ve shipped production ML infrastructure across autonomous driving and LLMs.

Open roles:

1. Infrastructure Engineer (Full-time, ONSITE): $150K–$220K + 1–3% equity https://app.dover.com/apply/VLM%20Run/8d4fa3b1-5b38-42e1-927...

2. AI/ML Engineer (Full-time, ONSITE): $150K–$220K + 0.5–3% equity https://app.dover.com/apply/VLM%20Run/1a490851-1ea1-4f12-a0f...

Email hiring "at" vlm.run with your GitHub + a couple recent projects.

P.S. We recently launched Orion, our visual agent that can reason and act over images, videos and documents. You can chat with Orion at https://chat.vlm.run and see capabilities at https://docs.vlm.run.

Apply: https://app.dover.com/jobs/vlm-run

fzysingularity · 2026-03-28T01:27:16 1774661236

Real-time or continuous learning is great on paper, but to get this to work without extremely expensive regression testing and catastrophic forgetting is a real challenge.

Credit to the team for taking this on, but I’d be skeptical of announcements like this without at least 3–6 months of proven production deployments. Definitely curious how this plays out.

ghywertelling · 2026-03-28T03:39:09 1774669149

Can this be also used as an attack vector? A small seed percentage of users constantly choosing a particular poisoned pypi library to achieve a niche task which gets rled into the model suggestions and recommendations.

fzysingularity · 2026-04-02T00:15:00 1775088900

The recent claude code leak also revealed that they're poisoning their competitors via anti-distillation policies baked in claude code CLI (fake tool calls, adding noise etc).

fzysingularity · 2026-03-28T01:17:52 1774660672

What do you think actually happened here in the past week?

They used Kimi, failed to acknowledge it in the original Composer announcement. Kimi team probably reached out and asked WTF? Their only recourse was to publicly disclose their whitepaper with Kimi mentioned to win brownie points about being open about their training pipeline, while placating the Kimi team.

fzysingularity · 2026-03-02T19:46:49 1772480809

VLM Run (https://vlm.run) | 1x Infrastructure Engineer + 2x AI/ML Engineer | Santa Clara, CA (HQ)

VLM Run is building infrastructure for production Vision-Language Model (VLM) systems — fast inference, tool-use + orchestration, reliable structured outputs, and the observability to iterate quickly. We’re a deeply technical team of veteran AI / computer-vision engineers (20+ years combined, MIT/CMU PhDs) who’ve shipped production ML infrastructure across autonomous driving and LLMs.

Open roles:

1. Infrastructure Engineer (Full-time, ONSITE): $150K–$220K + 0.5–3% equity https://app.dover.com/apply/VLM%20Run/8d4fa3b1-5b38-42e1-927...

2. AI/ML Engineer (Full-time, ONSITE): $150K–$220K + 0.5–3% equity https://app.dover.com/apply/VLM%20Run/1a490851-1ea1-4f12-a0f...

Email hiring "at" vlm.run with your GitHub + a couple recent projects.

P.S. We recently launched Orion, our visual agent that can reason and act over images, videos and documents. You can chat with Orion at https://chat.vlm.run and see capabilities at https://docs.vlm.run.

Apply: https://app.dover.com/jobs/vlm-run

fzysingularity · 2026-03-01T17:04:14 1772384654

AI allows you to accelerate the initial build process, but I think engineering is all about craftsmanship. Today most LLMs have poor taste and chipping away the cruft matters more than ever.

fzysingularity · 2026-02-24T19:36:13 1771961773

uvx probably is the way to go here (fully self-contained environment for each skill), and use stdout as the I/O bridge between skills.

fzysingularity · 2026-02-16T16:38:19 1771259899

The cold-boot time on this model can hardly be called “serverless”

fzysingularity · 2026-02-11T17:56:37 1770832597

ELO scores for OCR don't really make much sense - it's trying to reduce accuracy to a single voting score without any real quality-control on the reviewer/judge.

I think a more accurate reflection of the current state of comparisons would be a real-world benchmark with messy/complex docs across industries, languages.

fzysingularity · 2026-02-11T16:44:14 1770828254

Apple OCR even on the Mac is insanely good, in fact way better than AWS textract/GCP cloud vision OCR.

Any idea what model is being used?

AlphaSite · 2026-02-11T17:04:24 1770829464

Probably some custom model built for their hardware.