For high-speed PostgreSQL on ZFS, enable ZFS compression (lz4) on the filesystem, and restrict ZFS caching (set the ARC to cache metadata only) because Postgres has its own cache.
Have not used this in anger, but from what I've read, best practice is also to use separate datasets for the WAL and the rest of the data. This helps ZFS treat the WAL and the committed data as separate datasets that each need their own QoS handling within the ARC, rather than letting the WAL eat the whole cache. ZFS will try its best to balance them regardless, of course.
You can still snapshot them atomically by making them both children of a parent dataset and performing a recursive snapshot on the parent. So you have datasets:
myservice/pgdata
myservice/pgwal
or
myservice/pgdata
myservice/pgdata/pgwal
or whatever.
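A minimal sketch of that layout and an atomic snapshot, using the placeholder `myservice` dataset names from above:

```shell
# Separate children for data and WAL under one parent dataset
zfs create myservice/pgdata
zfs create myservice/pgwal

# One recursive snapshot on the parent captures both children
# atomically (a single consistent point in time)
zfs snapshot -r myservice@pre-upgrade
# Result: myservice/pgdata@pre-upgrade and myservice/pgwal@pre-upgrade
```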
So, a question: given that this article is about LZ4 memory compression for Postgres, would you want it enabled for both, either, or neither of those datasets? Compression-on-compression generally doesn't help at all, but lz4 is cheap enough that it barely hurts either, so if there's anything Postgres doesn't compress, maybe it would be faster in a few niche situations.
>would you want it enabled for both/either/none of those datasets?
Well, ZFS aborts compression when the data turns out to be incompressible: if a record doesn't shrink enough, it's stored uncompressed. That's why things like mp3 files don't end up compressed. So if the Postgres data is already compressed, ZFS will attempt compression and quickly give up, storing those blocks as-is.
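You can check whether compression is actually buying anything on a given dataset (the dataset name here is just a placeholder):

```shell
# Show the compression setting and the achieved ratio;
# a ratio near 1.00x means the data is effectively incompressible
zfs get compression,compressratio myservice/pgdata
```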
I've seen it debated both ways as to whether the ZFS `recordsize` tunable should be set to 8k to match Postgres's page size, or kept a little higher. Here's a talk by a FreeBSD/HashiCorp guy about performance wins with a compromise of 16K [0]: the larger record lets compression work better (the larger the block, the more compressible it is) without increasing writes too much, although it would probably depend on what use cases you put Postgres to and the nature of your reads/writes.
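If you want to try the 16K compromise from that talk, it's a one-line property change (dataset name is a placeholder; note `recordsize` only affects newly written blocks, so existing data keeps its old record size until rewritten):

```shell
# 16K records: a middle ground between Postgres's 8K pages and
# ZFS's 128K default, leaving compression more room to work
zfs set recordsize=16k myservice/pgdata
```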
The same slides mention `primarycache`; the suggestion is to use `metadata` if the DB working set fits in RAM, and the default of `all` if it doesn't.
Yes, in a virtual machine just one vdev/dataset (if it's about speed; if you want redundancy, well, that depends on your host filesystem, then maybe two virtual disks?), plus maybe some tuning of write-through etc. on the host for that virtual disk. And a raw disk image would be a good thing too.
>For high-speed pgsql (on ZFS) enable zfs-compression (lz4) just on the filesystem, and disable zfs-caching (the arc, just leave metadata on) because postgres has it's own cache.
>ZFS-Dataset:
atime=off
compression=lz4
recordsize=8k
logbias=throughput
xattr=sa
redundant_metadata=most <===That's a maybe
primarycache=metadata
>In postgres:
full_page_writes=off <===That's a maybe
>Plus maybe some pgtune:
https://pgtune.leopard.in.ua/#/
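Taken together, the dataset settings above could be applied like this (dataset name is a placeholder; `full_page_writes` lives in postgresql.conf, and turning it off is only safe when the filesystem block size matches the 8k Postgres page so torn pages can't happen, hence the "maybe"):

```shell
# Apply the suggested ZFS properties to the Postgres dataset
zfs set atime=off myservice/pgdata
zfs set compression=lz4 myservice/pgdata
zfs set recordsize=8k myservice/pgdata
zfs set logbias=throughput myservice/pgdata
zfs set xattr=sa myservice/pgdata
zfs set redundant_metadata=most myservice/pgdata   # the "maybe"
zfs set primarycache=metadata myservice/pgdata

# In postgresql.conf (the other "maybe"):
# full_page_writes = off
```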