ismailmaj's comments

Unclear if it's the only cause, but wafer scale is great for very low latency while losing on throughput per dollar compared to classic GPUs like Nvidia's. I don't think they can close the gap: SRAM is just more expensive than HBM, and their architecture needs a lot of it.

So the price necessarily confines it to niche use cases like HFT or intelligent duplex voice assistants. I'm still semi-bullish personally.


Obsolete because of what? Because with limited hardware you’re never aiming for state of the art, and for fine-tuning, you don’t steer for too long anyway.


Because there is a new model that is better, faster, more refined, etc...

If your training time is measured in years or decades it probably won't be practical.


I don't know why people still mess with Tesseract in 2026; attention-based OCR models (and more recently VLMs) have outperformed LSTM-based approaches since at least 2020.

My guess is that it's the entry point to OCR and the internet is flooded with it, just like pandas for data processing.


Painful comparison haha

Leaving a comment so I can more easily find this

And for the people wondering about Pandas, use Polars instead


I was surprised to learn (from this article) that there are local models that can do this (though I'm not sure any run on hardware I actually have, unlike Tesseract, which works fine on the scanning hardware I set up for it ~5 years ago). For privacy reasons, cloud-based OCR is a non-starter...


Surprisingly, the OCR models don't need much VRAM; they are often around 2B parameters, so most 6GB GPUs will handle them fine.


Quite. I threw a so-so photo of an old, long receipt at Qwen 3.5 0.8B (runs in <2GB) and it nailed it, spitting out 20+ items in under a second. AI is good at many things, but picking modern dependencies not so much.


Are you running it with Ollama?


LM Studio in this case


Yup, deepseek-ocr-2 would have crushed this. Then there's glm-ocr, dots-ocr, paddle-ocr-vl, etc.

tons of options ...


Oh no, cortisol spike in my text-only forum.


You drop the memory-throughput requirements because of the packed bit representation, so the FMA can become the bottleneck, and you bypass the problem of needing to upconvert the bits to whatever FP format the FMA instruction needs.

Typically, for a 1-bit matmul you can get away with XORs and popcounts, which should have a better throughput profile than FMA when you take into account the SIMD nature of the inputs/outputs.


Yes, but this is not a 1-bit matmul; it's 1.58 bits with expensive unpacking.


The title and the repo use "1-bit" when they mean 1.58-bit ternary values; it doesn't change any of my arguments (still XORs and popcounts).


How do you do ternary matmul with popcnt on 1.58 bit packed data?


Assuming 2 bits per value (the first bit is the sign and the second bit is the value):

actv = A[_:1] & B[_:1]

sign = A[_:0] ^ B[_:0]

dot = pop_count(actv & !sign) - pop_count(actv & sign)

It can probably be made more efficient by taking a column-first format.

Since we are in CPU land, we mostly deal with dot products sized to the cache; I don't assume a tiled matmul instruction, which would be unlikely to support this weird packed format anyway.


Haven't looked closely, but on modern x86 CPUs it might be possible to do much better with the gf2p8affineqb instruction, which lets us do 8x8 bit-matrix multiplications efficiently. Not sure how you'd handle the 2-bit part, of course.


This is 11 bit ops and a subtract, which I assume is ~11 clocks, while you can just do:

l1 = dot(A[:11000000], B[:11000000])

l2 = dot(A[:00110000], B[:00110000])

l3 = dot(A[:00001100], B[:00001100])

l4 = dot(A[:00000011], B[:00000011])

result = l1 + l2 * 4 + l3 * 16 + l4 * 64

which is 8 bit ops and 4x8 bit dots, which is likely 8 clocks with less serial dependence


Any place we can find the code?


Unfortunately it hasn't been open sourced. We're debating how / when to do this right now.


Confusing, since this is specific to an architecture that no one making money will use (8B is consumer space, not enterprise). The produced code shouldn't hold much interesting IP?


I don't see a world where they become threatening and the employees don't become rich from investors flooding in.


Where have you been in the last 2 decades?


Don’t think that’s a fair interpretation of what I said.

Liquid money rich? No.

Can get pulled for big tech packages? Also no, for most of the employees.

AFAIK, big tech didn't aggressively poach OpenAI-level talent; they did spend $10M+ on pay packages, but only for a select few research scientists. Some folks left and came back, but it mostly boiled down to culture.


What???

Microsoft/OpenAI is Big Tech.

Are you ok?


Ah yes, OpenAI the puppet of Microsoft that is currently declaring war against GitHub, sounds logical.


Internal battles, or a scheme to dodge antitrust regulation. Don't let them fool you: it is Big Tech.


In my experience, tracking objective things like "nutrition" and "sleep hours" is immensely useful for reflecting on what went wrong, while tracking subjective things like "mood" or "stress" is useless: hedonic adaptation smooths them out, and heavy swings make problems obvious without any tracking.

What's key is being able to visualize metrics easily and having frictionless data entry. I've got a decent setup with iPhone Actions + Obsidian + QuickAdd scripts over Obsidian Sync (mobile + laptop). For visualization I use Obsidian Bases and notes that run Dataview code blocks and Chart.js; couldn't be happier.

I could track things that aren't interesting to reflect on, like vitamin D supplementation, for accountability, but I've never bothered, especially since it's taken ~daily.


I think it's good to track mood swings, because it makes you notice them. After a while it makes you call out your own BS.


Strongly agree with this. I’ve been using Apple’s “mood” log for about two years now, and it is extremely helpful for me to have a concrete view of the history of my general affect.

“This entire month I’ve been feeling good, I want to pinpoint why,” or “it’s clear since stressor X entered my life, my affect is lower; how can I resolve this?”

These long term trends are harder for me to track without data. It might be easy for others, but not me!


As someone with Schizoaffective Disorder, Bipolar Type: if you have not been diagnosed with a mood disorder, tracking "swings in your mood" seems like a disorder of its own.

I have had people tell me they were "manic". Then I showed them videos I took when I was manic and they see what I mean when I tell them they are not manic.

We have come to a place where we do not want even normal fluctuation in mood, and that is an illness of its own, but a cultural one.


Maybe for some it's a lot more extreme than for others, but even if it's not so dramatic as to be categorized as a mental illness wouldn't you want to know if, say, there were a direct correlation between whether you went for your morning run and your mood later in the day?


Is this something that needs to be tracked to bring into your awareness? We have a memory storage device sitting on top of our spine. When I drink I feel drunk. Easy. If the change is noticeable you will notice it and remember it.

I am just trying to save you time and escape the cycle of "optimizations" which is where all this data logging leads.


Yes? As I wrote, the mere act of writing down your feelings forces you to acknowledge them and see patterns in them. Sometimes while writing something down I realise "wait, I've been through this before" or "every time this person is around, I feel this way". It helped me be more self-aware, for my own good.

It turns out that our memory storage device uses a very lossy form of compression. Memories get simplified and distorted over time. Heck, I can't even remember when something started hurting, so how should I notice a year-long pattern of thinking around a certain topic?


The message here was "journaling is a useful form of introspection".

> Please respond to the strongest plausible interpretation of what someone says, not a weaker one that's easier to criticize. Assume good faith.


The language you use to describe this is fun. I make a self-tracking app called Reflect and would love your opinion of it, even if it doesn't suit your needs exactly.

https://apps.apple.com/us/app/reflect-track-anything/id64638...


Surprised by the pushback in the comments; getting effects in Rust would be a dream.

Could even enable some stuff like passing loggers around not by parameters but by effect.


Don’t need the EU for that, they are hitting everything in 2026 including unemployment, though nothing passed yet.

