Once one starts thinking of them as "concept models" rather than language models or fact models, "hallucinations" become something not worth fixating on. Tokens are transformed into high-dimensional embeddings (12k+ dimensions in the largest models) right at the start. They stop being language immediately.
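To make that concrete, here's a minimal PyTorch sketch of what happens at the very first layer. The sizes are illustrative: the toy vocabulary is made up, and the 12,288 width is an assumption matching the kind of dimensionality large models use (GPT-3's hidden size, for example).

```python
import torch
import torch.nn as nn

# Illustrative sizes: a tiny toy vocabulary, but an embedding width
# in the "12k+" range the post refers to (GPT-3 uses 12,288).
VOCAB_SIZE = 1_000   # hypothetical toy vocabulary
EMBED_DIM = 12_288   # assumed width, matching large-model scale

# The very first layer of the model: a lookup table mapping each
# token id to a dense vector.
embedding = nn.Embedding(VOCAB_SIZE, EMBED_DIM)

# A toy "sentence" as arbitrary token ids (made up for the example).
token_ids = torch.tensor([[5, 42, 7]])  # shape: (batch=1, seq_len=3)

# After this lookup, the model never sees words again,
# only points in a 12k-dimensional space.
vectors = embedding(token_ids)
print(vectors.shape)  # torch.Size([1, 3, 12288])
```

Everything downstream of that lookup operates on geometry, not text, which is the sense in which the model works with concepts rather than words.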
They aren't fact machines. They are concept machines.