
Have you used InfiniBand? 40Gbps cards are $30 on eBay right now, and that's tech from about three generations back. Currently shipping gear is 200Gbps. Everything flows from, in, out or to the network. Cores don't matter. All future workloads will be done by GPUs.


Sorry for the snark, but what if my workload contains branches?


GPUs have excellent support for branches. Pascal has divergent thread execution, Volta added forward progress guarantees, Ampere improves on this further.

There is a lot of research about doing databases on GPUs, and Apache Spark runs on GPUs today, with much higher performance than on CPUs.


GPU databases are limited primarily by memory constraints on the card (e.g. ~32GB maximum per card for GV100 or whatever) and interconnect latency/bandwidth, not by raw parallel scan speed. If scan speed were all that mattered, we'd have had GPU-like parallel database hardware decades ago. You can crunch rows, but only as long as the data fits in memory. Once your working set exceeds the provided RAM and has to page out data to the CPU over PCIe or some other link, the numbers and utilization begin looking much worse. "Every benchmark looks amazing when your working set fits entirely in cache."
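To put rough numbers on the paging point, here's a back-of-envelope sketch. The bandwidth figures are assumed round numbers (roughly HBM2-class on-card memory and a PCIe 3.0 x16 link at peak), the function name is made up for illustration, and the model ignores any overlap of transfer and compute, so treat it as an estimate of the shape of the cliff rather than a benchmark.

```python
# Back-of-envelope: time for one full scan of a working set on a single GPU.
# All figures are assumed round numbers, not measurements.
HBM_GBPS = 900.0   # assumed on-card memory bandwidth, GB/s
PCIE_GBPS = 16.0   # assumed host link bandwidth (PCIe 3.0 x16 peak), GB/s
CARD_GB = 32.0     # on-card memory, per the ~32GB figure above


def scan_seconds(working_set_gb, card_gb=CARD_GB,
                 hbm_gbps=HBM_GBPS, pcie_gbps=PCIE_GBPS):
    """Estimate seconds for one pass over the working set.

    The part that fits on the card is read at on-card bandwidth; the
    overflow is streamed from host memory over the link. Transfer and
    compute overlap is not modelled.
    """
    if working_set_gb < 0:
        raise ValueError("working set size must be non-negative")
    resident = min(working_set_gb, card_gb)
    spilled = working_set_gb - resident
    return resident / hbm_gbps + spilled / pcie_gbps


if __name__ == "__main__":
    for gb in (16, 32, 64, 256):
        print(f"{gb:>4} GB working set: {scan_seconds(gb):7.2f} s per scan")
```

Under these assumptions, the spilled portion dominates as soon as the working set is even slightly larger than the card, because the link is well over an order of magnitude slower than on-card memory.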

But even more than that, for the price of a single high end Tesla (approx. 10k USD), you can build a high-end COTS x86 machine with a shitload of RAM and NVMe, and then install ClickHouse on it. That machine will scale to trillions of rows with ease and millisecond response times, whether or not everything fits in memory. It will cost less money, use less energy, scale out more easily, and get better utilization out of the hardware.

I'd wager that unless you have infinite money to dump on Nvidia or exceedingly specific requirements, any GPU database will get soaked by a comparable columnar OLAP store in every dimension.


> GPU-like parallel database hardware decades ago

https://en.wikipedia.org/wiki/Netezza#Technology

Furthermore, that line of reasoning is fallacious. The non-existence of a thing doesn't prove that the idea is bad.


I have, in HPC and backup scenarios.

I also ran datacenter services on Sun and IBM hardware that smoked anything offered by Intel platforms. But at the end of the day, commodity beats premium for 80%+ of the market. We run Linux on POWER exclusively as a cost-savings mechanism to reduce the Oracle bill.



