
I've wondered for a long time if we would have been able to make do without protected mode (or hardware protection in general) if user code was verified/compiled at load, e.g. the way the JVM or .NET do it... Could the shift in transistor budget have been used to offset any performance losses?
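As a toy illustration of the idea (not how the JVM or .NET verifiers actually work), a load-time checker for a made-up bytecode could reject any program that might touch memory outside its region, so the interpreter needs no runtime checks at all:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy bytecode: an opcode plus a 16-bit memory operand. */
enum { OP_HALT, OP_LOAD, OP_STORE };

typedef struct { uint8_t op; uint16_t addr; } Insn;

#define SANDBOX_SIZE 4096  /* bytes of memory the program may touch */

/* Load-time check: every LOAD/STORE target must fall inside the
 * sandbox, so execution afterwards needs no hardware protection. */
bool verify(const Insn *code, size_t n) {
    for (size_t i = 0; i < n; i++) {
        switch (code[i].op) {
        case OP_HALT:
            break;
        case OP_LOAD:
        case OP_STORE:
            if (code[i].addr >= SANDBOX_SIZE)
                return false;  /* would escape the sandbox */
            break;
        default:
            return false;      /* unknown opcode: refuse to run it */
        }
    }
    return true;
}
```

Real verifiers additionally have to track types and control flow, but the principle is the same: pay the cost once at load, not on every access.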


Microsoft Research had an experimental OS project at one point that did just that, with everything running in ring 0 in the same address space:

https://en.wikipedia.org/wiki/Singularity_(operating_system)

Managed code, the safety properties of their C#-derived programming language, static analysis, and verification were used rather than hardware exception handling.


Fil-C vs CHERI vs seL4 vs YOLO

I think hardware protection is usually easier to sell, but not when it's slower or more expensive than the alternative.


"Operating System Principles" (1973) by Per Brinch Hansen. A full microkernel OS (remake of RC-4000 from 1967) written in a concurrent dialect of Pascal, that also manages to make do without hardware protection support.


I think TempleOS also worked like this, though it's certainly better known for its "other" features.

edit: I missed it was linked on the above page


In TempleOS, everything runs in ring 0, but that's not the same as doing protection in software (which would require disallowing any native code not produced by some trusted translator). It simply means there's no protection at all.
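The "trusted translator" approach is classic software fault isolation: the translator rewrites every memory operand so that whatever address the untrusted code computes, the access lands inside its own region. A minimal sketch, with invented region constants:

```c
#include <stdint.h>

/* Hypothetical sandbox layout: 1 MiB region at a fixed base. */
#define REGION_BASE 0x10000000u
#define REGION_MASK 0x000FFFFFu

/* The check the translator would emit before each load/store:
 * mask the offset into range, then OR in the region base. The
 * untrusted code can compute any address it likes; after this,
 * the access cannot escape the region. */
static inline uint32_t sandbox(uint32_t addr) {
    return REGION_BASE | (addr & REGION_MASK);
}
```

Two ALU ops per memory access, no traps, no page tables; the cost is paid inline instead of in hardware.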


Very fitting if that was intended to be protection by faith.


I think the interesting thing about having protection in software is that you can do things differently, and possibly better. Computers of yesteryear had protection at the individual object level (e.g. https://en.wikipedia.org/wiki/Burroughs_Large_Systems). This was too expensive to do in 1970s hardware, so performance suffered. Maybe it could be done better in software with modern optimizing compilers, and perhaps a few bits of hardware acceleration here and there? There's definitely an interesting research project to be done.
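Object-level protection means every access is checked against the object's own bounds, not a page. A rough sketch (names invented) of the check a compiler could emit for descriptor-based access, which is roughly what the Burroughs machines did in hardware:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* A descriptor carries the object's base and length, so protection
 * is per-object rather than per-page. */
typedef struct {
    uint8_t *base;
    size_t   length;
} Descriptor;

/* Indexed load through a descriptor; returns false on a bounds fault.
 * An optimizing compiler could hoist or eliminate many such checks. */
bool desc_load(const Descriptor *d, size_t index, uint8_t *out) {
    if (index >= d->length)
        return false;          /* object-granular protection fault */
    *out = d->base[index];
    return true;
}
```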


Sadly, even software-filled TLBs look to be a thing of the past. Apparently a hardware page-table walker is just that much faster? I’m not sure.


Why is that surprising? The trap into kernel mode alone would already take more cycles than dedicated hardware needs for the full page table walk.
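For scale, here's a minimal sketch (field names invented) of what a refill has to do for a two-level page table, whether a MIPS-style software handler or a hardware walker does it: two dependent memory loads before the faulting access can even be retried, on top of any trap overhead.

```c
#include <stdint.h>

#define PAGE_SHIFT 12
#define INDEX_BITS 10
#define INDEX_MASK ((1u << INDEX_BITS) - 1)

/* Top-level directory: 1024 pointers to second-level tables. */
typedef struct { uint32_t *tables[1u << INDEX_BITS]; } PageDir;

/* Returns the physical frame for a virtual address, or 0 if unmapped. */
uint32_t walk(const PageDir *dir, uint32_t vaddr) {
    uint32_t top = (vaddr >> (PAGE_SHIFT + INDEX_BITS)) & INDEX_MASK;
    uint32_t *table = dir->tables[top];     /* first dependent load */
    if (!table)
        return 0;
    uint32_t mid = (vaddr >> PAGE_SHIFT) & INDEX_MASK;
    return table[mid];                      /* second dependent load */
}
```

A hardware walker overlaps this with other work and caches intermediate levels; a software handler eats the trap entry/exit on top.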


Since we're talking about defining our own processor, that means we need to define one with cheaper traps.

Expanding on what I wrote above about "bits of hardware acceleration", maybe adding a few primitives to the instruction set that make page table walking easier would help.

And with a trusted compiler architecture you don't need to keep the ISA stable between iterations, since it's assumed that all code gets compiled at the last minute for the current ISA.

Lots of fun things to experiment with.


Taking this to an extreme, the whole idea of a TLB sounds like hardware protection too?

As a thought experiment, imagine an extremely simple ISA and memory interface where you would do address translation or even cache management in software if you needed it... the different cache tiers could just be different NUMA zones that you manage yourself.
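Under that model, "caching" becomes an explicit copy into fast near memory, which is how DSP scratchpads and the Cell's SPE local stores already work. A hypothetical sketch (all names and sizes invented):

```c
#include <stdint.h>
#include <string.h>

#define LINE 64

/* The "near" NUMA zone: a small software-managed scratchpad. */
static uint8_t scratchpad[16 * LINE];

/* Pull one line from far memory into a chosen scratchpad slot and
 * return a pointer to it; replacement policy is entirely up to the
 * software, since there is no hardware cache to second-guess. */
uint8_t *cache_fill(const uint8_t *far_mem, size_t line_no, size_t slot) {
    uint8_t *dst = &scratchpad[slot * LINE];
    memcpy(dst, far_mem + line_no * LINE, LINE);
    return dst;
}
```

On real scratchpad machines the copy is an asynchronous DMA you overlap with compute, which is exactly the latency-masking the comment above is gesturing at.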

You might end up with something that looks more like a GPU or super-ultra-hyper-threading to get throughput masking the latency of software-defined memory addressing and caching?


I looked into that, and concluded the spoiler is Spectre.

Basically, you have to have out-of-order/speculative execution if you ultimately want the best performance on general/integer workloads. And once you have that, timing information is going to leak from one process into another, and that timing information can be used to infer the contents of memory. As far as I can see, there is no way to block this in software. There's no substitute for the CPU knowing "that page should not be accessible to this process, activate timing leak mitigation".


OTOH, out-of-order/speculative execution only amounts to information disclosure. And general-purpose OSes (without mandatory access control or multilevel security, which are of mere academic interest) were never designed to protect against that.

A far greater problem is that until very recently, practical memory safety required inefficient GC. Even a largely memory-safe language like Rust actually requires runtime memory protection, such as guard pages to catch stack overflow, unless stack depth requirements can be fully determined at compile time (which they generally can't, especially if separately provided program modules are involved).




