Add a static analyzer pass to each C/C++ compiler which is switched on by default. Add clang-address-sanitizer-style code when compiling in debug mode, and also offer runtime checks as an option for release-compiled code. Both combined would have caught 99% of the memory safety issues that pop up now and then.
If necessary, extend the language with optional ownership keywords à la Rust, and allow me to switch off features I don't require to reduce complexity.
Most important of all: provide options so that I don't have to pay for memory safety at runtime with precious CPU cycles.
I don't understand this type of reaction every time the problems of C/C++ are highlighted. Yes, of course the problems can be mitigated using tools, but clearly this is not being done, or it is not as thorough or practical a solution as making it the language's responsibility. There are hoops you must jump through to make C/C++ safe, and clearly this just doesn't happen in practice. The same hoops just don't exist for Go and Rust (and that's just the system(ish) programming languages), and therefore I disagree that you can handwave this away as bullshit.
Rust is the right approach since it figures everything out at compile time. But it still has to prove itself. And what is the timeline for a browser completely re-implemented in Rust? 5 years? 10 years? Suggesting that we throw away and rewrite our entire software foundation, written over the past 50 years, is just silly. To be realistically achievable, we need an incremental approach that gradually analyzes and fixes old software instead of throw-away-and-rewrite.
Luckily, with Rust's great FFI, it's easy to peel off a component and rewrite it; you don't need to do the whole thing. It's not like Firefox is getting transpiled to Rust tomorrow, we're just replacing bits and pieces, slowly.
Well the time will come when you will realize that
1) every language has hoops to jump through to make them safe (e.g. garbage collection does not protect you from memory leaks) [1]
2) every 4-5 years or so someone comes out with a programming language/system* [2] (Rust and Go at the moment) that fixes it all and "encourages good practices" - and then the good practices change. (E.g. microservices are making their second round. They're good for sysadmins/SREs and very bad for programmers, because you have to serialize/deserialize everything, and changes to the system propagate into potentially dozens of separate programs - and God help you if they're maintained by different departments.) I wonder what's next? Data-oriented programming is the perfect solution for the problems of microservices, so that could be next, I guess.
* they may call it a programming language, they may call it a system. Of course, it's always both.
[1] https://www.youtube.com/watch?v=ydWFpcoYraU
[2] Forth, C, COBOL (/mainframes), Pascal, C++, Oberon vs Modula-2, dBase (and its competitors), Object Pascal/VB (the built-in database approach), Perl, Java, .Net, and now here Go/Rust
No need to panic, nobody is coming to take your precious pointer arithmetic away. If you want to write unsafe code, go ahead. But we also have to acknowledge that we as a profession are not able to write secure code in C/C++. We're talking about classes of security bugs that just don't exist in managed languages. Bugs that can't happen in languages like Rust either, because it forces programmers to indicate ownership and lifetime of memory.
There's always a trade-off between performance and security, but with better tooling support the price for secure code will be very low. Right now security is bad because we look at security as a feature. As something that is added to a project, instead of as something that guides the entire design.
Your suggestion that static analysis can fix this is wildly optimistic, because C/C++ code for a modern browser uses memory pools, pointer arithmetic, just-in-time compilation, and shared memory between object instances, threads and processes and so on. It's hugely complex, and correctness can't be verified with some sort of static analyzer or with added runtime checks.
> But we also have to acknowledge that we as a profession are not able to write secure code in C/C++.
Wrong. You can write perfectly safe code in C++. You seem to have this ridiculous idea that pointer arithmetic is mandatory in C++. "Pointer arithmetic" is not a feature of C++; it's a feature of how computers work. C++ allows you to do it if you want. Memory pools are not a C++ feature; memory pools are part of how a modern computer and OS work. You know, memory doesn't appear magically in your program.
> Rust forces programmers to indicate ownership and lifetime of memory
So like C++ with unique_ptr, shared_ptr, and RAII?
Also I wanted to say that some of you keep mixing C and C++, calling them "C/C++". They're two different languages.
shared_ptr uses an atomic counter. In Rust you can choose to use an atomic ref count or not. The type system is sufficiently smart to ensure that you do not use the non-atomic one in a thread-unsafe context. C++ has no ability to do so, so it does not provide a thread-unsafe version.
Additionally, in C++ there's nothing stopping me from wrapping a unique_ptr around a pointer that's aliased elsewhere. This violates the guarantees of unique_ptr and forces the user to make sure its invariants are held. In Rust, the compiler ensures that your Box (Rust's unique_ptr) can never be owned more than once.
>C++ has no ability to do so, so it does not provide a thread unsafe version.
In which case would you want to enforce a thread-unsafe version?
>Additionally, in C++ there's nothing stopping me from wrapping a unique_ptr around a pointer that's aliased else where. This violates the guarantees of a unique_ptr and forces the user to make sure its invariants are held. In Rust the compiler ensures that your Box (Rust's unique_ptr) can never be owned more than once.
True, nothing stops you from doing that, and it is nice to catch it at compile time. But I fail to see how you would do this by accident. If you deliberately want to do this kind of thing, then you are asking for trouble, and it doesn't matter how awesome your language is; you would find a way to shoot your foot.
> In which case would you like to enforce a thread unsafe version?
Any time you don't use shared_ptr in a multithreaded context, you're paying extra for the overhead of atomics. In Rust, if you need shared ownership, but you're not doing anything with threads, you can remove that overhead. If you then later try to use it across threads, the compiler won't let you, until you switch to using Arc.
> In Rust, if you need shared ownership, but you're not doing anything with threads, you can remove that overhead.
If I am following you correctly, what you are saying is that if you are in a single-threaded application you don't need to use a shared_ptr equivalent. Wouldn't that mean that you are using a garbage-collected pointer (which should have a greater overhead than a shared pointer)? Because otherwise you would be using plain old raw pointers (with all their problems).
Well, it's the same thing, but without the atomics. Rc and Arc only differ in the instructions used to update the count: Rc uses a regular increment, Arc uses an atomic increment. The code is otherwise the same.
Arc<T> vs Rc<T> compared to shared_ptr was what I was going to say for overhead, as well as things like the lack of move constructors, which (should) make Vec<T> faster than std::vector.
For stronger guarantees, I was going to point out that if you std::move a unique_ptr, it becomes null, but Rust prevents using a Box<T> that's been moved at compile time. Your example is good here too.
> the lack of move constructors, which (should) make Vec<T> faster than std::vector.
So you are saying that the lack of move constructors (hence copying the structure) will make Vec<T> faster than std::vector? I don't follow that logic; wouldn't it actually be slower?
> I was going to point out that if you std::move a unique_ptr, it becomes null
The original becomes null because the semantics are actually transferring ownership, which might be desirable depending on the case. Also, you need it in case you are using custom deleters.
Specifically, I mean that when the vector changes size, the vector needs to reallocate. In Rust, since there are no move constructors, it's just a memcpy of the whole chunk of memory, one big one. Very straightforward and fast. In C++, you need to call all those move constructors.
> The original becomes null because the semantics are actually transferring ownership
Absolutely. In Rust, this is tracked at compile time, and attempting to use the value after the move is a compile-time error. In C++, you'll get a null pointer dereference. (For unique_ptr specifically, the standard guarantees the moved-from pointer is null; moved-from objects in general are only left in a valid but unspecified state.)
> mean that when the vector changes size, the vector needs to reallocate
You can control how often the vector changes the size and a lot of times you can even prevent that.
> In Rust, since there are no move constructors, it's just a memcpy.
memcpy is actually slower than moving the ownership of a buffer; that's why you avoid copying as much as you can (std::vector also uses memcpy internally).
> In Rust, this is tracked at compile time, and attempting to use after the move is a compile-time error.
It is nice to catch a misuse at compile time, but my point is that once you transfer the ownership of the pointer you shouldn't be using that unique_ptr anymore. I'm probably biased, but I don't see this "accidental" misuse happening often (tbh I haven't ever seen it, and I have seen some horrible things in C++ before).
I don't think I am communicating myself clearly, so I'll just bow out about this issue, until I have an example ready. I should write one anyway, I just don't have it at the moment.
> I'm probably biased but I don't see this "accidental" misuse happening often
While that may be true, as the article points out, all it takes is for someone, anywhere, to slip up once, and then you possibly have a serious security violation. Having the compiler double-check your work is nice.
>But we also have to acknowledge that we as a profession are not able to write secure code in C/C++.
You can't write safe code in any Turing-complete language. That's the whole point of Turing completeness. The only reason memory attacks are as common as they are is that C-like languages are the most popular ones.
If we replaced everything written in C with a "safe" language like, I don't know, Haskell, we'd have just as many zero-day exploits of the monads within programs.
You cannot write safe code in any Turing-complete language? That's a bold assertion that I don't believe is true. Surely you can use formal methods to develop software and maybe even prove its correctness and security; it's costly in multiple ways, but the language of implementation doesn't prevent you from doing this. It is possible to write secure code.
If we replaced everything in C with Haskell, we'd have an entirely different problem. The attack surface wouldn't involve buffer overflows and stack smashing, it would involve various DoS attacks. Those might be easier to address though.
There is a huge difference between a program that computes the wrong answer and one that corrupts memory, hijacks a shell, and installs a rootkit.
JavaScript is Turing complete, but it's trivial to write a (slow) JavaScript interpreter in Python that allows anybody to run any JavaScript program without any risk to their machine. No memory attacks possible. No privilege escalation possible. No unchecked stack overflows. The system would be sandboxed and secure. It will just work or fail gracefully.
It's only when we increase the complexity of our runtimes a thousand fold and when we cut corners to squeeze out higher performance that all the nasty vulnerabilities start to creep in.
> If we replaced everything written in C with a "safe" language like, I don't know, Haskell, we'd have just as many zero-day exploits of the monads within programs.
I disagree. There will always be security issues, due to (ab)uses that developers didn't consider, but there can be different probabilities. Requiring developers to learn about language gotchas, remember those gotchas when coding, and jump through hoops to avoid them, just stacks the odds against us.
> You can't write safe code in any Turing complete language.
Correct. But, a lot of code is written in Turing complete languages which does not really require any Turing completeness at all. And some code should be implemented in non-Turing-complete, verifiable languages.
Of course you can. You can, for instance, include an automatically verified proof that your code terminates for finite inputs.
This is in fact where the newest academic languages are going. In dependently typed languages, bounds checks are only done in theory: you have to include a proof that your indexes are within bounds, and the compiler verifies that proof. If it checks out, it compiles. If not, back you go. Far safer than Java/Go and the like, and far faster than C++. These languages tend to allow pointer arithmetic too, so beating them in speed is almost impossible.
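A hedged Lean 4 sketch of the idea (Lean standing in for such dependently typed languages): the index argument carries its own in-bounds proof, so the lookup compiles without emitting a runtime check.

```lean
-- The hypothesis `h` is a compile-time proof that `i` is in bounds.
-- The indexing notation below requires such a proof to typecheck,
-- so no runtime bounds check is needed: out-of-range indexes are
-- rejected by the compiler, not caught at runtime.
def safeGet (xs : Array Nat) (i : Nat) (h : i < xs.size) : Nat :=
  xs[i]
```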
> But we also have to acknowledge that we as a profession are not able to write secure code in C/C++.
Then you've picked the wrong profession. You write that like it was a fact, while it's obviously nothing more than a lack of skill (and maybe knowledge) on your side.
If I chose a language primarily because it's 'safe', that would mean that either I'm pretty bad at writing safe code, or that I'm just unnecessarily lazy.
Maybe a single developer personally can write C/C++ code that is probably pretty safe (keeping in mind that most personal/small projects never get a proper security analysis/penetration test, so one may have a biased perception), but as soon as you have several people with non-trivial components that interact, it is quite easy to introduce serious problems accidentally. The evidence is unfortunately against C/C++: even the biggest tech companies in the world, developing applications for which security is a critical feature (operating systems, web browsers), still regularly have bugs caused by memory safety issues, and they're almost certainly throwing every tool and a lot of CPU time at the problem.
Assuming you work alone maybe yes. However most of us work in teams.
C and C++ have long proven that with average-skill teams, the end result is lurking buffer exploits, dangling pointers, and corrupted memory somewhere in the code base, which usually take days to sort out.
To a degree this amuses me greatly. Rust allows you to opt out of safety, while your proposal (which I agree with, to be clear) allows you to opt in to safety in C/C++.
This really cuts to the core of the ideological differences between the C/C++ and Rust camps. C/C++ programmers tell the compiler "trust me", while Rust programmers say "check me". I can't help but feel that to a degree the Rust ideology is superior. To make my bias clear: I work in C, and write Rust in my spare time.
To be clear, I love Rust's approach to safety. I just don't think it is a realistic expectation that we can re-implement all security-critical C code in Rust before the sun goes supernova ;)
It's probably not going to be a rewrite of existing systems, but a switch to new systems written in safer languages while phasing out the old C based systems.
On the server: unikernels. On the client: new mobile platforms, like OSes based mostly on web technologies (which while not quite being there yet are getting more and more powerful).
C/C++ is not only a poor choice with regard to security, but also for developing for hardware with hundreds or thousands of cores, and all hardware platforms are moving in that direction. Yes, there are of course ways to deal with that in C/C++, but other languages are much better at this.
Within just a decade or two I can see a fundamental shift away from the old computing paradigm, at least for the majority of systems in mainstream use.
While tooling may help, I think that a completely new approach is necessary.
The correctness invariants of complex C++ programs (such as browsers and JITs) cannot be 'discovered' by static analysis - they must be, at least in part, supplied by the programmer.
C and C++ were not designed to allow programmers to specify such invariants (and have them automatically checked). I am not convinced that introducing them can be done in a clean way.
Hmm, true. We could have a new 'safe' keyword (or even #pragma) which would switch off 'unsafe' language features (basically the opposite of Rust's 'unsafe'): enforce a stricter, more restrictive code style which is easier for the static analyzer to reason about. The majority of even high-performance C/C++ apps only need to twiddle bits in very small areas of the code. That's still a lot better than trying to rewrite basically all software that has been written in the last 50 years ;)
It's not the language, it's the tools.