How I see SQL databases evolving over the next 10 years:
1. Integrate an off-the-shelf OLAP engine
   - forward OLAP queries to it
   - deal with continued issues keeping the two datasets in sync
2. Rebase the OLTP and OLAP engines onto a unified storage layer
   - the storage layer supports page-aligned row-oriented files, column-oriented files, and remote files
   - still have data and semantic inconsistencies due to running two engines
3. Merge the engines
   - a policy automatically archives old records to a compressed column-oriented file format
   - option to move archived record files to remote object storage and fetch them on demand
   - queries seamlessly integrate data from freshly updated records and archived records
   - the only noticeable difference is that queries for very old records take a few seconds longer to return results
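You can already approximate stage 3 by hand today. A rough sketch with DuckDB (the table name, paths, and 90-day cutoff are made up for illustration): fresh rows stay in a normal table, old rows get archived to compressed Parquet, and a single view queries both:

```python
# Hand-rolled version of stage 3: recent rows in a regular table, old rows
# archived to compressed Parquet (which could live on object storage), and
# one view reading both. All names here are hypothetical.
import os
import duckdb

os.makedirs("archive", exist_ok=True)
con = duckdb.connect("analytics.db")
con.sql("CREATE TABLE IF NOT EXISTS live_events (id BIGINT, ts TIMESTAMP, payload VARCHAR)")
con.sql("INSERT INTO live_events VALUES (1, now(), 'fresh'), (2, now() - INTERVAL 200 DAY, 'old')")

# Archive step: copy rows older than 90 days into a Parquet file, then drop them.
con.sql("""
    COPY (SELECT * FROM live_events WHERE ts < now() - INTERVAL 90 DAY)
    TO 'archive/events_old.parquet' (FORMAT PARQUET, COMPRESSION ZSTD)
""")
con.sql("DELETE FROM live_events WHERE ts < now() - INTERVAL 90 DAY")

# Query step: one view over fresh + archived records.
con.sql("""
    CREATE OR REPLACE VIEW events AS
    SELECT * FROM live_events
    UNION ALL
    SELECT * FROM read_parquet('archive/*.parquet')
""")
print(con.sql("SELECT count(*) FROM events").fetchone())
```

The archive/delete step and keeping the view consistent are still manual here, which is exactly the bookkeeping a merged engine would own.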
Send up a spacecraft with back-to-back, equal-area solar panels and radiators (you have to reject heat backwards; you can't reject it toward your neighboring sphere elements!). Push your chip temp as high as possible (90C? 100C?). Find a favorable working fluid for a heat pump / Organic Rankine Cycle (possibly dual-loop) to boost the temperature to 150C at the radiator. Cool the chip with vapor 20C below its running temp. 20-40% of the solar power goes to running the pumps, leaving 60-80% for the workload (a resistor with extra steps).
There are a lot of degrees of freedom to optimize something like this.
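A quick back-of-envelope on that pump fraction, with my own assumed numbers (chip at 100C, working fluid 20C cooler at 80C, radiator at 150C, heat pump at 40-100% of Carnot):

```python
# Sanity check of the "20-40% of solar power runs the pumps" figure.
# Assumptions are mine: all chip power becomes heat absorbed at T_cold,
# the heat pump rejects it at T_hot, and total solar power = chip + pump work.
T_cold = 273.15 + 80    # K, vapor cooling the chip (20C below a 100C chip)
T_hot = 273.15 + 150    # K, radiator/condenser temperature

carnot_cop = T_cold / (T_hot - T_cold)      # ideal cooling COP
for eta in (1.0, 0.6, 0.4):                 # fraction of Carnot actually achieved
    cop = carnot_cop * eta
    work_fraction = 1 / (1 + cop)           # pump work as a share of total solar power
    print(f"eta={eta:.1f}: pump work is {work_fraction:.0%} of total solar power")
```

The ideal case comes out around 17%, and plausible real-world fractions of Carnot land in roughly the 20-40% band quoted above.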
That's 30x faster just from switching to a zero-copy data format suitable for both in-memory use and the network. JSON services spend 20-90% of their compute on serde; a zero-copy data format essentially eliminates that cost.
Why don't we use standardized zero-copy data formats for this kind of thing? A standardized layout like Arrow means the data isn't tied to the layout/padding of a particular language, potential security problems (missing bounds checks, etc.) are handled automatically by the tooling, and it works well across multiple communication channels.
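To make "zero-copy" concrete, here's a small pyarrow sketch (file name and schema invented): the Arrow IPC file is memory-mapped, and the record batches read back reference that mapping directly, so there's no parse-and-copy step the way there is with JSON:

```python
# Write a table to an Arrow IPC file, then read it back through a memory map.
# The loaded table's buffers point into the mapped pages rather than being
# deserialized into fresh objects.
import pyarrow as pa
import pyarrow.ipc as ipc

table = pa.table({"id": [1, 2, 3], "score": [0.1, 0.2, 0.3]})

with ipc.new_file("batch.arrow", table.schema) as writer:
    writer.write_table(table)

with pa.memory_map("batch.arrow", "r") as source:
    loaded = ipc.open_file(source).read_all()   # zero-copy view over the mapping

print(loaded.num_rows)
```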
While Arrow is amazing, it is only the C Data Interface that can be FFI'ed, which is pretty low level. If you have something higher-level, like a table or a vector of RecordBatches, you have to write quite a bit of FFI glue yourself. It is still performant, because only a tiny amount of metadata crosses the boundary, but it can be a bit tedious.
And the reason is ABI compatibility. Reasoning about ABI compatibility across different C++ versions, optimization levels, and architectures can be a nightmare, let alone across different programming languages.
The reason it works at all for Arrow is that the leaf levels of the data model are large contiguous columnar arrays, so reconstructing the higher layers still gets you a lot of value. The other domains where this works are tensors/DLPack and scientific arrays (Zarr etc.). For arbitrary struct layouts across languages/architectures/versions, serde is far more reliable than a universal ABI.
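For the curious, the C Data Interface hand-off looks roughly like this in pyarrow (in-process, using its cffi helpers; _export_to_c/_import_from_c are the low-level hooks that cross-language glue ends up wrapping per array or batch):

```python
# One array crosses the "FFI boundary" as two small C structs
# (ArrowArray + ArrowSchema), not as a serialized payload.
# Requires the cffi package alongside pyarrow.
import pyarrow as pa
from pyarrow.cffi import ffi

c_schema = ffi.new("struct ArrowSchema*")
c_array = ffi.new("struct ArrowArray*")
schema_ptr = int(ffi.cast("uintptr_t", c_schema))
array_ptr = int(ffi.cast("uintptr_t", c_array))

arr = pa.array([1, 2, 3])
arr._export_to_c(array_ptr, schema_ptr)                         # producer side
roundtripped = pa.Array._import_from_c(array_ptr, schema_ptr)   # consumer side
assert roundtripped.equals(arr)
```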
Yes, this is a core use case that ZFS fits nicely. See slide 31, "Multi-Cloud Data Orchestration", in the talk.
Not only backup, but also DR site recovery.
The workflow:
1. Server A (production): zpool on local NVMe/SSD/HD
2. Server B (same data center): another zpool backed by objbacker.io → remote object storage (Wasabi, S3, GCS)
3. zfs send from A to B - data lands in object storage
Key advantage: no continuously running cloud VM. You're just paying for object storage (cheap), not compute (expensive). Server B is in your own data center - it can be a VM too.
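A minimal sketch of step 3 (pool, dataset, snapshot, and host names are placeholders; in practice it's usually a one-line zfs send piped over ssh into zfs recv):

```python
# Snapshot on server A, then stream it to server B's objbacker-backed pool.
# "tank/data", "backuppool/data", and "server-b" are hypothetical names.
import subprocess

snap = "tank/data@daily-2025-01-01"
subprocess.run(["zfs", "snapshot", snap], check=True)

send = subprocess.Popen(["zfs", "send", snap], stdout=subprocess.PIPE)
subprocess.run(["ssh", "server-b", "zfs", "recv", "-F", "backuppool/data"],
               stdin=send.stdout, check=True)
send.stdout.close()
send.wait()
```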
For DR, when you need the data in cloud:
- Spin up a MayaNAS VM only when needed
- Import the objbacker-backed pool - data is already there
- Use it, then shut down the VM
The secret is that ZFS internally implements an object storage layer (the DMU) on top of block devices, and only then implements ZVOL and ZPL (the ZFS POSIX filesystem) on top of that object layer.
A "zfs send" is essentially a serialized stream of objects sorted by dependency (objects later in the stream may refer to objects earlier in the stream, but not the other way around).
Maybe you should publish a recipe for configuring Playwright with both Chromium and Lightpanda backends, so a given project can compare them and evaluate whether Lightpanda could work with its existing test cases.
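Something like this would do it for a Python Playwright suite, assuming a Lightpanda instance already running with its CDP server on 127.0.0.1:9222 (check the Lightpanda docs for the exact serve command):

```python
# Run the same check against stock Chromium and a Lightpanda instance reached
# over CDP. The ws endpoint below is an assumption about how Lightpanda was
# started; adjust it to match your setup.
from playwright.sync_api import sync_playwright

def title_of(browser, url="https://example.com"):
    # Reuse the default context when connecting over CDP, otherwise make one.
    context = browser.contexts[0] if browser.contexts else browser.new_context()
    page = context.new_page()
    page.goto(url)
    title = page.title()
    page.close()
    return title

with sync_playwright() as p:
    chromium = p.chromium.launch()
    print("chromium: ", title_of(chromium))
    chromium.close()

    lightpanda = p.chromium.connect_over_cdp("ws://127.0.0.1:9222")
    print("lightpanda:", title_of(lightpanda))
    lightpanda.close()
```

Pointing existing tests at connect_over_cdp instead of launch is usually the smallest change needed for the comparison.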
> > which doesn’t ship you megabytes of JavaScript
> that would be a decision the app makes
OK, but as soon as some moron with a Product Manager title gets their grubby little fingers on it, the app does start shipping megabytes of JS in practice. TUIs can't; that's the advantage.