Java, for example, can use "compressed OOPs" (ordinary object pointers). This is a 32-bit stored pointer that is shifted, since pointers to objects always point to an address that is a multiple of 8 bytes. The address is stored shifted right, then shifted left again before being dereferenced, allowing 32-bit references to address 32 GB of memory.
There are variations on this method that can be applied to other memory-management schemes. For example, say you are allocating many objects for processing, need them only within a certain span of time, and have to free all of them when it's over. Rather than tracking allocations individually, you can allocate a single (perhaps growing) memory area, store its start address somewhere, and reference objects with a shorter offset relative to that base.
Do you need to address every byte in your memory? If all your Python objects are at least two words (16 bytes on a 64-bit machine) in size, then no, you only need to address every 16th byte. So shift your whole pointer right a few bits and you can address well over 4 GB of objects with a 32-bit pointer.
> If there has to be base + offset translation on every pointer access it is way too slow.
It does do this, but it's not too slow - the overhead of the translation is outweighed by the benefit of reduced memory traffic, increased effective cache space, etc. Obviously so - otherwise people wouldn't be doing it.