The concept of what a pointer "is" is somewhat language-dependent, particularly ...

tialaramex · on June 19, 2021

> particularly as people finally attempt to mitigate the waste of moving to 64-bit by using tagged pointers

It's very fraught to try to "mitigate the waste" at the top of the pointer. Everybody involved is clear that what you're doing there is saving up pain for yourself, because periodically more of those "wasted" bits become significant, and each time that happens, if you've been using them, you've incurred a maintenance burden.

The waste at the bottom of the pointer is a far more sensible place to hide information because it won't get taken away when somebody ships a new CPU, it was already there in 32-bit architectures.

LLVM itself makes use of the latter trick extensively as I understand it, but doesn't use the former. If they've got 64-byte structures, they can align them, then steal six bits at the bottom of each pointer to those structures for bit flags, and when Intel ships a new CPU their code still works. Whereas if I steal ten bits at the top of the pointer, when Intel ships Ice Lake CPUs my compiler now crashes. Good luck debugging that!
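A minimal sketch of the low-bit trick described above, using plain Python integers as stand-ins for 64-bit pointers. The names `pack` and `unpack` and the constants are made up for illustration; this is not LLVM's actual code, only the arithmetic behind the idea.

```python
ALIGNMENT = 64                          # assumed structure alignment in bytes
TAG_BITS = ALIGNMENT.bit_length() - 1   # 64 == 2**6, so the 6 low bits are always zero
TAG_MASK = ALIGNMENT - 1                # 0b111111


def pack(address: int, flags: int) -> int:
    """Hide `flags` in the spare low bits of an aligned address."""
    if address & TAG_MASK:
        raise ValueError("address is not 64-byte aligned")
    if not 0 <= flags <= TAG_MASK:
        raise ValueError("flags do not fit in the spare low bits")
    return address | flags


def unpack(tagged: int) -> tuple[int, int]:
    """Recover (address, flags) from a tagged value."""
    return tagged & ~TAG_MASK, tagged & TAG_MASK
```

Nothing here depends on how many high bits the CPU treats as significant, which is why the same code keeps working when the virtual address space grows.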

zozbot234 · on June 19, 2021

> Whereas if I steal ten bits at the top of the pointer, when Intel ships Ice Lake CPUs my compiler now crashes. Good luck debugging that!

That's not how it works. The x86-64 ISA requires these "top" bits to be properly sign extended when actually accessing addresses, so any binary code that's using this "trick" is forward compatible wrt. future CPU's that might enable a bigger virtual address space. The only concern wrt. compatibility is purely ABI related but that can be managed at the OS and system-software layer.

tialaramex · on June 19, 2021

> That's not how it works. The x86-64 ISA requires these "top" bits to be properly sign extended when actually accessing addresses.

On Coffee Lake no more than 48 bits of the virtual address are "real". If I steal ten of the 64 bits from the top, all the actual address information is preserved and I've lost nothing. My compiler works just fine.

On Ice Lake up to 57 bits of the virtual address are "real". Since I'm stealing ten of the 64 bits from the top, I only have 54 bits left, so 3 significant bits were dropped and I don't have them any more. Sign-extending the wrong value won't magically make it into the right value.
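The arithmetic above can be sketched with Python integers standing in for 64-bit pointers. The helper names and the example addresses are hypothetical; the point is only that a 48-bit address survives a round trip through a 10-bit top tag, while an address that uses bit 55 does not.

```python
STOLEN_BITS = 10
KEPT_BITS = 64 - STOLEN_BITS            # 54 bits left for the address
ADDRESS_MASK = (1 << KEPT_BITS) - 1


def tag_top(address: int, tag: int) -> int:
    """Overwrite the top 10 bits of a 64-bit value with `tag`."""
    return (tag << KEPT_BITS) | (address & ADDRESS_MASK)


def untag_top(tagged: int) -> int:
    """Strip the tag and sign-extend from bit 53, the highest bit we kept."""
    address = tagged & ADDRESS_MASK
    if (address >> (KEPT_BITS - 1)) & 1:
        address |= ((1 << STOLEN_BITS) - 1) << KEPT_BITS
    return address


a48 = 0x0000_7FFF_DEAD_B000     # fits in a 48-bit address space
a57 = 0x0080_0000_0000_1000     # uses bit 55, only valid with 57-bit addresses

assert untag_top(tag_top(a48, 0x2A5)) == a48    # round-trips
assert untag_top(tag_top(a57, 0x2A5)) != a57    # bit 55 was overwritten by the tag
```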

klodolph · on June 19, 2021

Yes—and my concern is that old binaries will suddenly stop working on newer hardware. This has happened before!

The whole reason the top bits are sign-extended in the first place is to discourage these kinds of shenanigans. I remember the pain of switching 24-bit to 32-bit on M68K Macs.

celeritascelery · on June 19, 2021

Until the virtual address space gets so big that there are no bits left at the top that are "free". As soon as you have a 64-bit virtual address space, pointer tagging the top bits no longer works. Your program is now broken on the new CPU.

devit · on June 19, 2021

Of course competent programmers using the top K bits of the pointer would make sure that the kernel and memory allocator operate so that they don't allocate memory outside a (64 - K)-bit address space, regardless of what the CPU supports (e.g. Linux mmap as one would expect won't allocate beyond 2^48 unless explicitly requested).

And obviously any intelligent CPU architect, if they make a CPU that masks the upper bits, they would make the number of bits that gets masked configurable per-process and not dependent on the maximum virtual address space supported by the CPU.
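One way to picture the discipline described above is a tagging helper that refuses any address outside the (64 - K)-bit budget. This is only a sketch with Python integers standing in for user-space pointers; `fits_in_budget` and `checked_tag` are hypothetical names, and a real program would enforce the limit in its allocator rather than at tagging time.

```python
def fits_in_budget(address: int, stolen_bits: int) -> bool:
    """True if a user-space address leaves the top `stolen_bits` bits unused."""
    return 0 <= address < (1 << (64 - stolen_bits))


def checked_tag(address: int, tag: int, stolen_bits: int) -> int:
    """Tag the top bits, refusing addresses the scheme cannot represent."""
    if not fits_in_budget(address, stolen_bits):
        raise ValueError("address needs bits that the tag scheme has claimed")
    if not 0 <= tag < (1 << stolen_bits):
        raise ValueError("tag does not fit in the stolen bits")
    return (tag << (64 - stolen_bits)) | address
```

With this guard, an address beyond the budget produces an immediate error instead of a silently corrupted pointer.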

zozbot234 · on June 19, 2021

> And obviously any intelligent CPU architect, if they make a CPU that masks the upper bits, they would make the number of bits that gets masked configurable per-process

x86-64 does this by requiring all accesses to be correctly sign-extended, which means that valid virtual addresses are unique and there is full forward compatibility even with an extended virtual address space. The "number of bits that get masked" is a pure system/ABI concern, the CPU support only defines a maximum.
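A simplified model of the sign-extension rule, again with Python integers standing in for 64-bit values. `is_canonical` is a made-up helper, and `va_bits` is the number of virtual address bits the system has enabled (48 or 57 in this thread).

```python
def is_canonical(address: int, va_bits: int) -> bool:
    """True if bit va_bits-1 and every bit above it (up to bit 63) are equal."""
    top = address >> (va_bits - 1)                  # sign bit plus everything above it
    all_ones = (1 << (64 - va_bits + 1)) - 1
    return top == 0 or top == all_ones


user = 0x0000_7FFF_DEAD_B000
tagged = 0x2A00_7FFF_DEAD_B000      # same address with a tag in the top byte

assert is_canonical(user, 48) and is_canonical(user, 57)    # valid under both widths
assert not is_canonical(tagged, 48)                         # tag must be stripped before use
assert not is_canonical(tagged, 57)
```

An address that is canonical with 48 bits stays canonical with 57, which is the forward compatibility being claimed; it says nothing about whether a tag scheme has enough room left for the wider addresses.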

celeritascelery · on June 19, 2021

> Of course competent programmers ... And obviously any intelligent CPU architect

Interesting to see the future is so "obvious" to you. You are correct, however, that this is not a problem at present because no architecture uses more than 48 bits of address space in practice. And since this is such a common practice now, there will have to be allowances made in the future for when the address space is expanded.

vlovich123 · on June 19, 2021

As a sibling to you noted, Ice Lake has bumped this from 48 bits to 57 bits to support 5-level paging [1]. So the "in practice" appears to already be out of date. This was known publicly as early as 2017 (albeit this is the first I've heard of it).

That being said, I don't know why the increase from 256TB to 128PB is at all useful but it's enabled in my Arch Linux kernel on a 32GB laptop, so clearly adopted by the broader ecosystem even outside whatever niche use-case it's for.
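For reference, the two sizes quoted above follow directly from the address widths (using binary units, so "TB" and "PB" here mean TiB and PiB):

```python
TIB = 1 << 40
PIB = 1 << 50

four_level = 1 << 48    # bytes addressable with 48-bit virtual addresses
five_level = 1 << 57    # bytes addressable with 57-bit virtual addresses

print(four_level // TIB, "TiB")     # 256 TiB
print(five_level // PIB, "PiB")     # 128 PiB
print(five_level // four_level)     # 512, since 57 - 48 = 9 extra bits
```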

That being said, the kernel maintains backward compatibility by restricting the allocation address space it hands out by default. Userspace needs to explicitly ask for the higher bits [2], so it's probably OK to just continue as-is assuming the 48-bit limit in practice.

[1] https://en.wikipedia.org/wiki/Intel_5-level_paging

[2] https://www.kernel.org/doc/html/latest/x86/x86_64/5level-pag....

saurik · on June 19, 2021

The longer I stare at your comment, the stronger my feeling that you are hung up on a semantic issue. While the intermediate states I went through were maybe a bit more interestingly-explanatory with respect to specific techniques, the final state I landed in is simply: "if you only used the bottom single bit to tag that the entire value is a value instead of a pointer, that alone mitigates the waste of moving to 64-bits by having fewer indirect allocations".

tialaramex · on June 19, 2021

The 32-bit platform can do this same trick, so, your 64-bit pointers are still wider than theirs, and you didn't gain anything unless you also steal high bits.

This isn't entirely true: for them address space is very tight, so they daren't just align everything and risk wasting too much address space, whereas on a 64-bit platform you can afford to be very generous with address space. This is akin to how it's fine to have randomised privacy addressing in IPv6, but that'd be a problem in IPv4. But this gain is hard to give a specific value to, whereas the simple fact is that 64-bit pointers are exactly twice as big as 32-bit pointers.

This can give you the incentive to build a different data structure. It's surprising how often "Just use a vector" is correct these days; Chandler Carruth has talked about that at length (in C++). If you have at least some idea how many Foozles you need, and Foozles aren't some crazy multi-page data structure, chances are you should put all the Foozles in a vector, and not carry around actual pointers to Foozles at all.
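A rough sketch of the "put the Foozles in a vector" idea, with a Python list playing the role of the vector and integer indices playing the role of handles. `Foozle`, `add_foozle`, and the `weight` field are invented for illustration; the size comparison assumes handles are stored as 32-bit unsigned integers.

```python
import struct
from dataclasses import dataclass


@dataclass
class Foozle:
    weight: int


foozles: list[Foozle] = []      # the single "vector" that owns every Foozle


def add_foozle(weight: int) -> int:
    """Store a Foozle and return its index, which serves as the handle."""
    foozles.append(Foozle(weight))
    return len(foozles) - 1


INDEX_BYTES = struct.calcsize("<I")     # 4 bytes for a 32-bit index
POINTER_BYTES = struct.calcsize("<Q")   # 8 bytes for a 64-bit pointer

handle = add_foozle(weight=7)
assert foozles[handle].weight == 7
```

Other structures then refer to a Foozle by its index, so each reference costs half as much as a 64-bit pointer and needs no tagging at all.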

xfer · on June 19, 2021

Isn't using the bottom bit how most runtimes represent their ints, taking advantage of the fact that gc pointers are aligned? This is common practice afaiu.
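One possible convention for this, sketched with Python integers standing in for machine words: a trailing 1 bit marks an immediate integer, and a trailing 0 bit marks an aligned pointer. The function names are hypothetical, and real runtimes differ on which bit value means what.

```python
def box_int(value: int) -> int:
    """Encode a small integer as a word whose lowest bit is 1."""
    return (value << 1) | 1


def is_immediate(word: int) -> bool:
    """Aligned pointers always end in 0, so a trailing 1 marks an integer."""
    return (word & 1) == 1


def unbox_int(word: int) -> int:
    """Recover the integer by shifting the tag bit back out."""
    return word >> 1


assert unbox_int(box_int(21)) == 21
assert is_immediate(box_int(21))
assert not is_immediate(0x7F00_1234_5FC0)    # an aligned address is not an immediate
```

The cost is one bit of integer range, and the integer never needs a separate heap allocation.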

tialaramex · on June 19, 2021

Sure, but for the bottom bit, 64-bit addressing doesn't come into it. If you only want to steal bits at the bottom and have pointers with greater than single-byte alignment, this worked just fine on a 32-bit CPU already.

I was reacting to the claim that this mitigates the overhead (larger pointer size) of 64-bit addressing. You can steal bits at the top but it's probably a bad idea.