Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

awesome! I wonder if it's possible to point AI at this problem and synthesize a bespoke compiler (per-architecture?) for postgresql expressions?
 help



Two things are holding back current LLM-style AI of being of value here:

* Latency. LLM responses are measured in order of 1000s of milliseconds, where this project targets 10s of milliseconds, that's off by almost two orders of magnitute.

* Determinism. LLMs are inherently non-deterministic. Even with temperature=0, slight variations of the input lead to major changes in output. You really don't want your DB to be non-deterministic, ever.


> LLMs are inherently non-deterministic.

This isn't true, and certainly not inherently so.

Changes to input leading to changes in output does not violate determinism.


> This isn't true

From what I understand, in practice it often is true[1]:

Matrix multiplication should be “independent” along every element in the batch — neither the other elements in the batch nor how large the batch is should affect the computation results of a specific element in the batch. However, as we can observe empirically, this isn’t true.

In other words, the primary reason nearly all LLM inference endpoints are nondeterministic is that the load (and thus batch-size) nondeterministically varies! This nondeterminism is not unique to GPUs — LLM inference endpoints served from CPUs or TPUs will also have this source of nondeterminism.

[1]: https://thinkingmachines.ai/blog/defeating-nondeterminism-in...


Yes, lots of things can create indeterminism. But nothing is inherent.

Quoting:

"But why aren’t LLM inference engines deterministic? One common hypothesis is that some combination of floating-point non-associativity and concurrent execution leads to nondeterminism based on which concurrent core finishes first."

From https://thinkingmachines.ai/blog/defeating-nondeterminism-in...


Yes, lots of things can create indeterminism. But nothing is inherent.

> 1000s of milliseconds

Better known as "seconds"...


The suggestion was not to use an LLM to compile the expression, but to use an LLM to build the compiler.



Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: