I think most bugs are the kind that the compiler can catch. But yes, absolutely, a program can compile and be dead wrong. I’m writing a regex engine, and lol at types saving me from all the mistakes I can make while building and transforming finite automata.
I’d still much rather have a slightly slow compilation than verify types in my head every time I’m in that area of code, or writing unit tests for every configuration of code which basically just validate I haven’t returned any nulls.
Typed compilation may take time, but it results in faster-running programs. And unit tests also take time. I find the errors generated by a type checker come quicker and give more useful diagnostics for deep areas of the code.
My experience has been the opposite. Compilers are not just a little slow, but introduce a significant slowdown for every read-edit-test loop, which adds up to such huge productivity losses that any small gains from the compiler catching typo-style / wrong-argument sorts of bugs are totally erased.
I’m not arguing here but honestly don’t know: how can interpretation be faster than the checking step of compilation? Maybe code generation takes more time, but Rust has ‘cargo check’ which only typechecks.
Parsing is basically a wash between compilation and interpretation. Dynamic types still need to be resolved for the tests to run. So why would interpretation be faster?
I’m working through a compiler book and would love to know.
I think mlthoughts2018 may be saying that he finds the advantages of a good "read-edit-test loop" to be more valuable than a compiler that catches type errors?
It is certainly valuable. A good REPL is completely fantastic for prototyping and debugging. Being able to change how your program works while it's still running and has all its data loaded is great compared to a classic edit-compile-run cycle where you've got to get your program's data re-loaded each time.
But I don't see that this has much to do with compilers versus interpreters, or even really dynamic versus static type checking...
- There are REPL-based environments with integrated compilers, and there are compiled languages with REPLs bolted on top.
- Good REPL support is surely more challenging for languages with strong static typing. After all: what does it mean to redefine a type? what happens to the functions that are using the old definitions? what happens to the instances of that type that are already alive in your program? But these problems all exist in dynamic languages too, it's just easy there to sweep them under the rug by treating them as yet more run-time errors.
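A quick Python sketch of that rug-sweeping (class name invented for illustration): redefining a class in a live session leaves already-alive instances pointing at the old definition, and the mismatch only ever surfaces as run-time behavior, never as an error at redefinition time.

```python
class Point:
    def describe(self):
        return "old behaviour"

p = Point()  # an instance that is "already alive"

class Point:  # redefine the type, as a REPL session might
    def describe(self):
        return "new behaviour"

q = Point()

# The old instance silently keeps the old definition:
print(p.describe())        # old behaviour
print(q.describe())        # new behaviour
print(type(p) is type(q))  # False: two distinct classes now coexist
```

A statically typed REPL has to answer these questions explicitly; a dynamic one just lets both versions of the type live side by side.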
You are close to describing what I meant, except what I was saying is not related to a REPL.
For example, I find I am much more productive writing Python code instead of Scala or Haskell, after many years of experience in all three. By “productive” I mean writing fewer defects, completing programs more quickly, and validating that programs are sufficiently correct & efficient for deployment.
A typical work cycle in all 3 languages would be to read code, edit code, invoke a test command from a shell prompt (which triggers incremental re-compilation with Scala and Haskell). In business settings, even for very minor code changes, the incremental re-compilation would take on the order of a few minutes every time, and this was in large companies with sophisticated monorepo tooling and dedicated teams of tooling engineers who worked on performance for incremental compilation. For Python, I just run tests and get immediate feedback without waiting ~3 minutes every time.
The types of correctness verification offered by using the compiled languages and waiting ~3 minutes every cycle were just not useful. I got the same verification in Python by writing some low-effort extra tests one time, and then saved ~3 minutes on every edit-test cycle.
This is way off topic but how far did you get with Haskell? I was a Python programmer, contributor, and speaker for ~10 years and felt I was way more productive in it than in Haskell -- but after spending the last few years with Haskell I now find the opposite to be true and hardly write Python anymore.
Haskell lets me encode more of the business logic at the type level. I've finally been able to experience what people mean when they say a type system as good as Haskell's allows you to make sure invalid states are not representable.
I can do the same in Python, Javascript, etc., but it's much more work: writing thousands of lines of test code and still not being certain where the edge cases are.
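As a rough sketch of what "the same in Python" looks like (all names invented for illustration): a tagged union modeled as a set of dataclasses, where exhaustive case handling is only a convention the programmer maintains, not something the language enforces.

```python
from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class Loading: ...

@dataclass(frozen=True)
class Success:
    body: str

@dataclass(frozen=True)
class Failed:
    error: str

# In Haskell this would be a closed sum type; here it's just a convention.
RequestState = Union[Loading, Success, Failed]

def render(state: RequestState) -> str:
    if isinstance(state, Loading):
        return "spinner"
    if isinstance(state, Success):
        return state.body
    if isinstance(state, Failed):
        return "error: " + state.error
    # A forgotten case is a run-time error in production,
    # not a compile-time error at your desk.
    raise TypeError(f"unhandled state: {state!r}")

print(render(Success("hello")))  # hello
```

External checkers like mypy can recover some of the exhaustiveness checking, but only as an opt-in layer on top of the language.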
That's one area a rich type system helps with...
but I think more to the spirit of your comment: it's the behaviors that are most important. No language is expressive enough to define what those behaviors should be and, more importantly, which behaviors are not allowed. I totally agree that whether Haskell or Python, when it comes to this problem, neither is effective! There are no compile-time or run-time errors, and yet we observe incorrect behavior! That's something I've only seen formal methods able to tackle.
I spent about 4 years total working professionally in Haskell, commonly using things like multi-parameter type classes, liquidhaskell, compiler extensions for fully dependent types, higher kinded types.
I’ve heard the claim, “Haskell let me encode business logic into the type system” so many times, but I think it’s totally a false promise. Usually people mean design patterns like phantom types and things, and it just leads to the same spaghetti code messes as in any other paradigm.
I’ve never found any cases where encoding this stuff into the type system actually resulted in verifiably more correct code as compared to doing the same thing with analogous patterns in dynamic typing languages and adding lightweight tests. You still end up needing approximately the same amount of test code either way.
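For what it's worth, the phantom-type pattern under discussion does have a rough dynamic-language analogue. A sketch in Python using `typing.NewType` (names invented for illustration): the tags exist only for a type checker like mypy and are erased at run time, which is exactly the "same pattern plus lightweight tests" trade the parent comment describes.

```python
from typing import NewType

# Two "tags" on plain strings; mypy distinguishes them, the runtime doesn't.
RawInput = NewType("RawInput", str)
SafeHtml = NewType("SafeHtml", str)

def escape(s: RawInput) -> SafeHtml:
    # The only blessed way to turn a RawInput into a SafeHtml.
    return SafeHtml(
        s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")
    )

def render(s: SafeHtml) -> str:
    return "<p>" + s + "</p>"

# Passing a RawInput straight to render() is flagged by mypy,
# but runs without complaint -- a test has to catch it instead.
page = render(escape(RawInput("<script>alert(1)</script>")))
print(page)
```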
Yet in the statically typed case, you often pay a big constant penalty of compile time overhead delaying the work cycle, even for the best incremental compilers.
> I’ve heard the claim, “Haskell let me encode business logic into the type system” so many times, but I think it’s totally a false promise. Usually people mean design patterns like phantom types and things, and it just leads to the same spaghetti code messes as in any other paradigm.
Well it's working for my code base so it's not a false promise.
I find it hard going back to dynamic languages now because of how little visibility they give you into the types.
> I’ve never found any cases where encoding this stuff into the type system actually resulted in verifiably more correct code
I think there is still much debate about this. Dependent type theory is the only kind of type system I know of in which it's possible to write proofs about a program and therefore verify it... but even dependent type theory alone is a bit bothersome for the task.
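A tiny Lean 4 sketch of that idea (definitions invented for illustration): with a length index in the type, calling `head` on an empty vector is a type error, so the "non-empty" proof obligation is discharged by the type checker rather than by a test.

```lean
-- A length-indexed vector: the Nat index is part of the type.
inductive Vec (α : Type) : Nat → Type
  | nil  : Vec α 0
  | cons : α → Vec α n → Vec α (n + 1)

-- `head` only accepts vectors whose length is provably n + 1,
-- so there is no empty-vector case to handle (or to get wrong).
def Vec.head : Vec α (n + 1) → α
  | .cons x _ => x
```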
In my experience it's certainly possible to write a program in Python that is reasonably correct but it takes more effort.
Firstly, I didn’t mean to suggest you cannot write effective code in statically typed languages; only that it is no easier and no more productive to do than with dynamically typed languages.
> “In my experience it's certainly possible to write a program in Python that is reasonably correct but it takes more effort.”
I think there is a pivotal trade-off though. I’d argue that in many cases it can be less effort to write correct code in Python, for three reasons:
- the simpler kinds of compiler checks are either inapplicable or handled by ignorably low-effort unit tests in Python; they don’t matter 99.9999% of the time and have low cost in the 0.0001% of the time they do matter.
- the extra type-system code, and the constraints it imposes, introduce more code than an analogous set of unit tests would require for functionally the same level of correctness guarantee in Python.
- the extra time cost of including a compilation step in many edit-cycle round trips adds up to far more effort than writing and maintaining extra tests, and it prevents flexibly deciding when to get rapid feedback when that feedback matters more than the whole-program correctness required by a compiler (and these points are magnified if you ever work in a monorepo).
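The "ignorably low-effort unit test" in the first bullet amounts to something like this one-off smoke test (function and names invented for illustration): a single execution of the code path stands in for the compiler's name and arity checks.

```python
class Item:
    def __init__(self, price, qty):
        self.price = price
        self.qty = qty

def total_price(items):
    # A typo like `item.pricee` would raise AttributeError the first
    # time this line runs -- which is exactly what the smoke test is for.
    return sum(item.price * item.qty for item in items)

def test_total_price_smoke():
    # One cheap execution catches the typo-style / wrong-argument bugs
    # a compiler would have flagged.
    assert total_price([Item(2.0, 3), Item(1.0, 1)]) == 7.0

test_total_price_smoke()
```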
Thanks for the clarifications -- this is an interesting perspective.
For the last few years I think I've managed to land in the worst of both worlds with SystemVerilog. It manages to combine essentially no type system and no type checking (everything is implicitly just a bunch of bits) with abysmal compile times, so you get neither safety nor a quick turnaround.
All of this makes me really long for both a good type system, and a better edit-test cycle. Which is more valuable? You're probably right. There's nothing more frustrating than waiting 15 minutes to discover a missing comma...
Just curious. I’m learning about compilers and would be interested in which compilers most make the trade-off in favor of optimal code generation over compilation speed.
I'm not sure about C, but I know of a few for other languages.
Some compilers perform whole-program optimisation rather than separate compilation of components/modules. This takes longer but can find extra optimisations (e.g. if a module provides some functionality that isn't used in the resulting program, it can be deleted as dead code). MLton does this for Standard ML ( http://mlton.org ) and Stalin does this for Scheme ( https://en.wikipedia.org/wiki/Stalin_(Scheme_implementation) )
At the very extreme end is the idea of superoptimisation ( https://en.wikipedia.org/wiki/Superoptimization ). Rather than translating the given code in some way, like a compiler, a superoptimiser treats the given code like a specification or test suite. It performs a brute-force search through the space of all possible programs, looking for any that match the specification. This can find truly optimal code, but it's ridiculously slow. So far its only real-world uses are to find small "peephole" (find/replace) optimisations that can be added to real compilers like LLVM.
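The brute-force idea can be sketched in a few lines of Python (the instruction set and specification here are invented for illustration): enumerate all op sequences up to some length and return the first one that agrees with the specification on every test input.

```python
from itertools import product

# A tiny straight-line language: each op transforms an accumulator x.
OPS = {
    "neg": lambda x: -x,
    "inc": lambda x: x + 1,
    "dbl": lambda x: 2 * x,
}

def run(program, x):
    for op in program:
        x = OPS[op](x)
    return x

def superoptimise(spec_fn, test_inputs, max_len=4):
    """Search shortest-first for an op sequence matching spec_fn on every
    test input -- the 'specification as test suite' idea."""
    for length in range(1, max_len + 1):
        for program in product(OPS, repeat=length):
            if all(run(program, x) == spec_fn(x) for x in test_inputs):
                return program
    return None

# Shortest program computing f(x) = 2x + 2:
best = superoptimise(lambda x: 2 * x + 2, range(-5, 6))
print(best)  # ('inc', 'dbl'), i.e. 2 * (x + 1)
```

Even with three instructions the search space grows exponentially with program length, which is why real superoptimisers are confined to tiny peephole-sized fragments.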
A supercompiler can be thought of as running the given code at compile-time. If we know the values of all the functions and variables in a given piece of code, we can just run it to find out what the answer is. Interestingly, we can also "run" code involving values we don't know: we just pass those values around as opaque black-boxes. This is really useful for collapsing layers of indirection, resolving dynamically dispatched targets, baking-in configurable parameters, etc. The downside is that it can lead to bloated executables. Roughly speaking, supercompilation replaces a few long paths containing many conditionals with many short paths containing fewer conditionals. It also specialises general-purpose functions into many single-use functions which have particular arguments baked-in.