For high-speed PostgreSQL on ZFS, enable ZFS compression (lz4) on the filesystem, and restrict ZFS caching (set the ARC to cache metadata only) because Postgres has its own cache.
Have not used this in anger, but from what I've read, best practice is also to use separate datasets for the WAL and the rest of the data. This helps ZFS treat the WAL and the committed data as separate datasets that each need their own QoS handling within the ARC, rather than letting the WAL eat the whole cache. ZFS will try its best to balance them regardless, of course.
You can still snapshot them atomically by making them both children of a parent dataset and performing a recursive snapshot on the parent. So you have datasets:
myservice/pgdata
myservice/pgwal
or
myservice/pgdata
myservice/pgdata/pgwal
or whatever.
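A minimal sketch of that layout and an atomic snapshot, using the placeholder `myservice` dataset names from above:

```shell
# Separate children for data and WAL under one parent dataset
zfs create myservice/pgdata
zfs create myservice/pgwal

# One recursive snapshot on the parent captures both children
# atomically (a single consistent point in time)
zfs snapshot -r myservice@pre-upgrade
# Result: myservice/pgdata@pre-upgrade and myservice/pgwal@pre-upgrade
```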
So, a question: given that this article is about LZ4 memory compression for Postgres, would you want it enabled for both, either, or neither of those datasets? Compression-on-compression generally doesn't help at all, but lz4 is cheap enough that it barely hurts either, so if there's anything Postgres doesn't compress, maybe it would be faster in a few niche situations.
>would you want it enabled for both/either/none of those datasets?
Well, ZFS aborts compression when the data turns out to be incompressible: if a record doesn't shrink enough, it's stored uncompressed. That's why things like mp3 files don't end up compressed. So if the Postgres data is already compressed, ZFS will attempt compression and quickly give up, storing those blocks as-is.
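You can check whether compression is actually buying anything on a given dataset (the dataset name here is just a placeholder):

```shell
# Show the compression setting and the achieved ratio;
# a ratio near 1.00x means the data is effectively incompressible
zfs get compression,compressratio myservice/pgdata
```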
I've seen it debated both ways as to whether the ZFS `recordsize` tunable should be set to 8k to match Postgres's page size, or kept a little higher. Here's a talk by a FreeBSD/HashiCorp guy about performance wins with a compromise of 16K [0]: the larger record lets compression work better (the larger the block, the more compressible it is) without increasing writes too much, although it would probably depend on what use cases you put Postgres to and the nature of your reads/writes.
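If you want to try the 16K compromise from that talk, it's a one-line property change (dataset name is a placeholder; note `recordsize` only affects newly written blocks, so existing data keeps its old record size until rewritten):

```shell
# 16K records: a middle ground between Postgres's 8K pages and
# ZFS's 128K default, leaving compression more room to work
zfs set recordsize=16k myservice/pgdata
```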
The same slides mention `primarycache`; the suggestion is to use `metadata` if the DB working set fits in RAM, and the default of `all` if it doesn't.
Yes, in a virtual machine just one vdev/dataset (if it's about speed; if you want redundancy, well, that depends on your host filesystem, then maybe two virtual disks?), plus maybe some tuning of write-through etc. on the host for that virtual disk. And a raw disk image would be a good thing too.
>For high-speed pgsql (on ZFS) enable zfs-compression (lz4) just on the filesystem, and disable zfs-caching (the arc, just leave metadata on) because postgres has it's own cache.
>ZFS-Dataset:
atime=off
compression=lz4
recordsize=8k
logbias=throughput
xattr=sa
redundant_metadata=most <===That's a maybe
primarycache=metadata
>In postgres:
full_page_writes=off <===That's a maybe
>Plus maybe some pgtune:
https://pgtune.leopard.in.ua/#/
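Taken together, the dataset settings above could be applied like this (dataset name is a placeholder; `full_page_writes` lives in postgresql.conf, and turning it off is only safe when the filesystem block size matches the 8k Postgres page so torn pages can't happen, hence the "maybe"):

```shell
# Apply the suggested ZFS properties to the Postgres dataset
zfs set atime=off myservice/pgdata
zfs set compression=lz4 myservice/pgdata
zfs set recordsize=8k myservice/pgdata
zfs set logbias=throughput myservice/pgdata
zfs set xattr=sa myservice/pgdata
zfs set redundant_metadata=most myservice/pgdata   # the "maybe"
zfs set primarycache=metadata myservice/pgdata

# In postgresql.conf (the other "maybe"):
# full_page_writes = off
```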