- Pre-zeroing a page only takes 80ns on a modern CPU. vm_fault overhead
  in general is at least ~1 microsecond.
- Pre-zeroing a page leads to a cold-cache case at the time of use, forcing
  the fault source (e.g. a userland program) to fetch the data from main
  memory when it, most likely, uses the faulted page immediately, reducing
  performance.
- Zeroing the page at fault time is actually more efficient because it does
  not require any reads from DRAM and leaves the cache hot.
- Multiple synth and build tests show that active idle-time zeroing of
  pages actually reduces performance somewhat, and incidental allocations
  of already-zeroed pages (from page-table tear-downs) do not affect
  performance in any meaningful way.
I expect OpenBSD to continue doing this anyway, though, because of the possibility that someone reboots the system into a new OS (presumably designed just for this purpose) and reads whatever was left in RAM. Of course, programs that deal with encryption zero memory before returning it (it is hard to make sure the compiler doesn't optimize this otherwise-useless work away), but most other programs that handle secrets are not so well written and will leave sensitive information lying around.
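To make that parenthetical concrete, here is a minimal C sketch (not from the thread; get_key()/use_key() are hypothetical stand-ins, stubbed out so it compiles) of why a final memset() of secrets can silently vanish, plus the usual workarounds:

    /*
     * Minimal sketch (not from the thread): why a final memset() of secrets
     * can silently disappear. get_key()/use_key() are hypothetical stand-ins
     * for real crypto code, stubbed out so this compiles on its own.
     */
    #include <string.h>

    static void get_key(char *buf, size_t len) { memset(buf, 0xAB, len); }
    static void use_key(const char *buf, size_t len) { (void)buf; (void)len; }

    void bad_wipe(void)
    {
        char key[32];
        get_key(key, sizeof key);
        use_key(key, sizeof key);
        /* key is never read again, so this is a dead store the compiler
         * may legally remove ("dead store elimination"). */
        memset(key, 0, sizeof key);
    }

    /*
     * Workarounds: explicit_bzero() (OpenBSD, glibc >= 2.25) and the C11
     * Annex K memset_s() are specified never to be optimized away. A
     * portable fallback is calling memset through a volatile function
     * pointer, so the compiler cannot prove the call is side-effect free.
     */
    static void *(*const volatile memset_v)(void *, int, size_t) = memset;

    void good_wipe(void)
    {
        char key[32];
        get_key(key, sizeof key);
        use_key(key, sizeof key);
        memset_v(key, 0, sizeof key);   /* survives optimization */
    }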
The starting point is that there is stale, useless data in RAM. Then a user-mode program requests an empty page, and usually when it does this it wants to use the page immediately.(1) Using non-polluting (non-temporal) writes, you spend main-memory bandwidth twice: once for clearing the page, and again for bringing the page back into the cache immediately afterwards when the program uses it.
Using writes that just allocate new, cleared dirty lines in cache (like AMD's CLZERO), you avoid both the write (which will happen later, when the lines are evicted from the cache, probably after the program has used them) and the read, because the lines are now all in the cache.
(1) And on Linux this is trivially true, because Linux only allocates and clears the page when it is first accessed.
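Roughly, in C on x86 (SSE2 assumed; AMD's CLZERO itself is not shown), the two kinds of zeroing being contrasted look something like this:

    /*
     * Sketch of the two zeroing strategies contrasted above, assuming x86
     * with SSE2. `page' must be page-aligned.
     *
     * zero_page_nontemporal(): the kind of non-polluting store a background
     * zeroing thread would use; it bypasses the cache, so the page is cold
     * when a program later touches it and has to come back from DRAM.
     *
     * zero_page_cached(): what fault-time zeroing effectively does; ordinary
     * stores allocate the lines in cache, leaving the page hot for the use
     * that typically follows immediately after the fault.
     */
    #include <emmintrin.h>   /* SSE2 intrinsics */
    #include <string.h>

    #define PAGE_SIZE 4096

    void zero_page_nontemporal(void *page)
    {
        __m128i zero = _mm_setzero_si128();
        char *p = page;
        for (size_t i = 0; i < PAGE_SIZE; i += 16)
            _mm_stream_si128((__m128i *)(p + i), zero);  /* cache-bypassing */
        _mm_sfence();   /* order the non-temporal stores */
    }

    void zero_page_cached(void *page)
    {
        memset(page, 0, PAGE_SIZE);   /* lines end up resident in cache */
    }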
I don't follow. Who is "you" here? The user-mode program, or the kernel-mode zero-page thread (or whatever its name is)? I'm talking about the zero-page thread here, which is zeroing pages in the background long before any thread has requested access. Those threads do not want to evict anything from the cache. This seems like exactly what we want.
The issue is that zeroing pages in the background is a pessimization that should never be done. The user-mode program that allocates some memory is typically not going to be able to use only writes that allocate new cache lines without reading memory. So, to compare the two systems:
Your system: memory is released, the kernel clears it in the background, wasting write bandwidth (which might not matter for anything except power if the system was idle at the time), and when the user-mode program starts using it, every new line it writes to will trigger a spurious read.
Modern Linux: memory is released, the kernel lets it lie, not using any power or bandwidth to do anything with it, until a user-mode program allocates it and touches the page. Then the kernel picks up the page and writes the entire page with zeros, using whatever idiom on that CPU allows it to just allocate the page in cache without reading it from RAM. This is really fast, faster than a single memory fetch. The user-mode program can then use it directly without having to fetch anything from DRAM.
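You can watch this lazy allocate-and-zero-on-first-touch behaviour from userspace. Here is a small Linux-only sketch (my own illustration, not kernel code) that mmaps anonymous memory and compares resident set size before and after touching the pages:

    /*
     * Linux-only sketch: anonymous mmap costs almost no resident memory
     * until the pages are first touched; each first touch faults in a page
     * the kernel zeroes on the spot.
     */
    #define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS */
    #include <stdio.h>
    #include <sys/mman.h>

    static long resident_pages(void)
    {
        long size = 0, resident = -1;
        FILE *f = fopen("/proc/self/statm", "r");
        if (f) {
            if (fscanf(f, "%ld %ld", &size, &resident) != 2)
                resident = -1;
            fclose(f);
        }
        return resident;
    }

    int main(void)
    {
        const size_t len = 64UL << 20;   /* 64 MiB */
        char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED) { perror("mmap"); return 1; }

        printf("RSS after mmap:  %ld pages\n", resident_pages());

        /* First touch: each write below faults in a freshly zeroed page. */
        for (size_t i = 0; i < len; i += 4096)
            p[i] = 1;

        printf("RSS after touch: %ld pages\n", resident_pages());
        munmap(p, len);
        return 0;
    }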
> This is really fast, faster than a single memory fetch.
Nit: it's faster than a page fault (so fault + zeroing is pretty much the same as just fault).
According to the recent-ish Latency Numbers, a main-memory reference is ~100ns (variable by architecture and by local vs. remote DRAM), which is about the same as zeroing a page, at least with respect to the DragonFly numbers I posted above.
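For anyone who wants to sanity-check that comparison on their own machine, here is a rough, cache-hot microbenchmark sketch (illustrative only; it measures zeroing a page that stays cache-resident, which is the favourable case, not a rigorous recreation of fault-time conditions):

    #include <stdio.h>
    #include <string.h>
    #include <time.h>

    #define PAGE_SIZE 4096
    #define ITERS     1000000

    int main(void)
    {
        static char page[PAGE_SIZE];
        struct timespec t0, t1;
        long sink = 0;

        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (int i = 0; i < ITERS; i++) {
            memset(page, i & 1, PAGE_SIZE);
            sink += page[i % PAGE_SIZE];  /* read back so the memset can't be elided */
        }
        clock_gettime(CLOCK_MONOTONIC, &t1);

        double ns = (t1.tv_sec - t0.tv_sec) * 1e9 + (t1.tv_nsec - t0.tv_nsec);
        printf("~%.0f ns per 4 KiB fill (cache-hot), sink=%ld\n", ns / ITERS, sink);
        return 0;
    }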
DragonFly actually removed it two years ago: http://lists.dragonflybsd.org/pipermail/commits/2016-August/...