
mrkn1 · 2026-06-02T17:21:33 1780420893

I have not. My experience has been a few seconds for a 1024x1024 image with a medium density of text, FWIW. Feel free to try it on a few test images; the model is pretty small and fast, but yeah, no formal evals on CPU.

mrkn1 · 2026-06-01T14:06:15 1780322775

It should support 109 languages. More info here: https://huggingface.co/PaddlePaddle/PaddleOCR-VL

mrkn1 · 2026-05-31T12:22:43 1780230163

No, simply, my laptop only has a CPU.

mrkn1 · 2026-05-31T12:21:53 1780230113

I haven't. But the evals of the underlying model are published here, including on OmniDocBench: https://huggingface.co/PaddlePaddle/PaddleOCR-VL-1.5

mrkn1 · 2026-05-28T14:28:14 1779978494

For 100% local CPU fact checking, I made this: https://news.ycombinator.com/item?id=48301003

gobdovan · 2026-05-28T15:55:18 1779983718

Why should I trust this without a paper, benchmark or at least a human-written README?

mrkn1 · 2026-05-28T01:11:13 1779930673

A lot of my queries are summarize/explain/fact check, and these are covered 100% on my CPU locally [0], which reduces my reliance on frontier models.

[0] https://news.ycombinator.com/item?id=48301003

Hiteshjain118 · 2026-05-28T15:12:11 1779981131

The link in your HN comment takes me to your list of Show HN posts. I wasn't able to find your GitHub repo.

mrkn1 · 2026-05-29T01:57:45 1780019865

https://github.com/kouhxp/fftext

mrkn1 · 2026-05-27T15:14:18 1779894858

Just released a new version that implements your ideas.

atmanactive · 2026-05-27T17:30:49 1779903049

Wow, excellent news! This will definitely become a daily driver for me. Thanks!

mrkn1 · 2026-05-24T21:08:37 1779656917

Thank you for being thorough.

clipboard: right now, clipboard input is treated like any other source, so the text gets written to ./textsnaps/clipboard_ocr.txt, and stdout just prints that path. Nothing goes back to the clipboard in this version (stay tuned).

portability: agreed, and it's a small change. textsnap already looks for the checksum manifest next to the script before falling back to the cache, so extending it should be easy. I'll make a note for the next version.
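
For anyone curious what that lookup order looks like, here is a minimal sketch. It is not textsnap's actual code: the manifest file name, the function names, and the sha256sum-style "digest  filename" format are all assumptions made for illustration.

```python
import hashlib
from pathlib import Path
from typing import Dict, Optional

# Hypothetical manifest name; the real tool may use something different.
MANIFEST_NAME = "checksums.sha256"


def find_manifest(script_dir: Path, cache_dir: Path) -> Optional[Path]:
    """Return the first manifest found: next to the script, then in the cache."""
    for directory in (script_dir, cache_dir):
        candidate = directory / MANIFEST_NAME
        if candidate.is_file():
            return candidate
    return None


def parse_manifest(text: str) -> Dict[str, str]:
    """Parse 'hexdigest  filename' lines (assumed sha256sum-style format)."""
    entries: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        digest, name = line.split(maxsplit=1)
        # sha256sum marks binary-mode entries with a leading '*'.
        entries[name.lstrip("*")] = digest.lower()
    return entries


def verify(path: Path, manifest: Dict[str, str]) -> bool:
    """True only if the file is listed and its SHA-256 matches the manifest."""
    expected = manifest.get(path.name)
    if expected is None:
        return False
    return hashlib.sha256(path.read_bytes()).hexdigest() == expected
```

Under these assumptions, making it portable would mostly mean adding more directories to the tuple in find_manifest (say, a user-supplied path) ahead of the cache fallback.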

mrkn1 · 2026-05-24T20:44:33 1779655473

Great question. I'm not familiar with docling-serve, but from what I gathered they're pretty different beasts. Docling is a heavier pipeline (it actually uses a GPU). textsnap is the opposite: single-file CLI, small VLM running on plain CPU cores, one command, no server. The tradeoff is that CPU decode is sequential, so it's slower on dense pages, and it OCRs one image rather than doing full layout.

If docling-serve is already meeting your needs it's probably not an upgrade. But it installs in one command, so would love to hear how it stacks up on your images, if you end up trying it.

mrkn1 · 2026-05-24T14:50:18 1779634218

thanks! yapsnap is audio to text, and textsnap is image to text. Both have been daily use cases for me for a while. And yes, the feedback on yapsnap encouraged me to also release textsnap on github

freakynit · 2026-05-24T17:06:52 1779642412

Oh.. I didn't even notice it earlier.. you are also the author of yapsnap.. hence the similarity...

I loved the simplicity in both. They both work, without the bloat.