The `bisect-binary` project listed in the introduction is almost the _exact_ program I made while I was a devious high school student trying to circumvent weak (i.e. the most widely used) antivirus software by locating a few "key" bytes in the binary that the AV used to fingerprint a virus.
I would zero out half the file, scan it with the AV, then recursively repeat the same for any zeroed region that the AV no longer considered a virus. Essentially, a search of the file for the smallest set of bytes that needed to change for the AV to fail detection.
Usually this would result in only a small number of bytes that needed to be slightly tweaked.
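In case the recursion isn't clear, here is a minimal sketch of the idea in Haskell. Everything here is a hypothetical reconstruction: `scanClean` stands in for invoking the AV scanner, which in reality meant shelling out to an actual AV product.

    import qualified Data.ByteString as BS

    -- Hypothetical oracle: True if the AV no longer flags the (partly zeroed) file.
    type ScanClean = BS.ByteString -> IO Bool

    -- Zero out the byte range [lo, hi) of the file contents.
    zeroRange :: Int -> Int -> BS.ByteString -> BS.ByteString
    zeroRange lo hi bs =
      BS.concat [BS.take lo bs, BS.replicate (hi - lo) 0, BS.drop hi bs]

    -- If zeroing a region makes the scan come back clean, the fingerprint
    -- bytes live inside it; bisect that region until it can't shrink further.
    narrow :: ScanClean -> BS.ByteString -> (Int, Int) -> IO [(Int, Int)]
    narrow scanClean file (lo, hi)
      | hi - lo <= 1 = pure [(lo, hi)]
      | otherwise = do
          let mid = (lo + hi) `div` 2
          leftClean  <- scanClean (zeroRange lo mid file)
          rightClean <- scanClean (zeroRange mid hi file)
          case (leftClean, rightClean) of
            (True, _) -> narrow scanClean file (lo, mid)
            (_, True) -> narrow scanClean file (mid, hi)
            _         -> (++) <$> narrow scanClean file (lo, mid)
                              <*> narrow scanClean file (mid, hi)

The entry point would be something like `narrow scanClean file (0, BS.length file)`.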
Pretty unproductive "hobby", but I must admit it was really fun. :)
This was pretty rare for the most popular AVs, but when it didn't work I would try manually zeroing different sections until I found something. I was surprised how easy it was to make a virus go undetected, and that vendors didn't routinely change their fingerprinting as a defense against this attack.
I've never worked with Coq myself, and really, the process described here still looks much too convoluted for practically proving nontrivial amounts of code correct. But it looks like lots of fun too! And it is sort of the holy grail of computer science: writing code and being able to prove it correct instead of just testing it with maybe-or-maybe-not-enough test cases.
In practice I find that Haskell itself already prevents a lot of bugs, but of course in functions like the one dissected here, logical errors are still easily made.
If you're considering Coq but haven't started I'd also consider Lean[0].
The documentation isn't as great but the main tutorial is decent. It also doubles as a practical programming language. And the community is active and helpful.
I've always found Coq to be a little impenetrable. I don't know what makes Lean more accessible to me but I've found it much easier to get into.
I have a stock recommendation for this if you want to learn formal methods to hopefully prove some useful programs. It might also apply if you want to do it for fun but don't want to get weighed down too much.
I have trouble understanding the usefulness of this. Haskell is barely used in the first place, and even less for highly safety critical software.
It's also not clear to me why I would use this over writing the program in the constructive logic of Coq, Agda, or Idris and then extracting ("compiling") the program from the proof itself. Of course, one use case would be to formally verify "legacy" Haskell code, but then my first point applies -- there is not that much important Haskell code out there.
The last reason for the existence of this program I could see is that this is just experimental and the authors want to explore whether it's possible to formally prove things about haskell programs. But isn't that obviously true, given enough time put into the project?
> Standard Chartered has over 3 million lines of Haskell code
Not exactly: they have their own closed-source fork of GHC, which is strictly evaluated. Given that the codebase is proprietary, it's unknown what other differences there are beyond having dumped lazy evaluation. It's a Haskell-like language, but quite different from the Haskell the public has access to.
> I have trouble understanding the usefulness of this.
Well for one thing, sometimes projects are just for fun, or to explore uncharted territory, without thinking about a specific use case. For another, there are definitely use cases for proofs in code, as verified software appears in a variety of domains. It’s not likely to make its way into the average CRUD app any time soon, but that doesn’t mean it’s not useful.
> Haskell is barely used in the first place, and even less for highly safety critical software.
“Barely used” is relative. Compared to Java or Python, sure. But there are untold millions of lines of Haskell code running in production right now. Besides, in my humble opinion Haskell should be used more, whether or not it is (right now). And in any case Haskell has had a big influence on many other, more popular languages and libraries, and this could one day be another example of that.
This is not (just) for fun: "But things have changed since then, as a group at UPenn (mostly Antal Spector-Zabusky, Stephanie Weirich and myself) has created hs-to-coq: a translator from Haskell to the theorem prover Coq."
I agree that Haskell should be used more, at least if the alternative is one of the mainstream languages.
> there is not that much important Haskell code out there.
In other words, there's at least some important Haskell code, and there's Haskell code important to someone. That's a good enough reason. Or are you saying we can't have nice things?
Well, I assume this is at least partially funded by the public. Most side projects are not. While I appreciate that the engineering that went into this is non-trivial, I fail to see the scientific relevance.
> I have trouble understanding the usefulness of this. Haskell is barely used in the first place, and even less for highly safety critical software.
Every new approach is barely used in the wild at first, and the initial form of a new approach usually differs quite a bit in its details from how things eventually end up.

In this case the value lies in working out what proving software could look like, and proving a program after writing it is a different direction from writing the program to match a proof. We can't tell which of these methods is more useful or convenient before somebody tries them out.
There are a few major issues with my idea currently (it really requires dependent types to be ergonomic, and allocation and pointers could get strange) which have prevented me from making a quick demo compiler or language spec. But I would be very excited to discuss it.
Have you looked into research regarding refinement types? Because they describe almost exactly what you're reaching for (types with associated provable assertions). Specifically, Liquid Haskell[0] extends Haskell with (dependent) refinement types, but keeps the type checking decidable and the proof burden relatively light by using a restricted predicate language and an SMT solver.
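For anyone who hasn't seen it, here is a rough flavor of the style (a sketch from memory, so treat the exact annotation syntax with a grain of salt): the `{-@ ... @-}` comments attach predicates to ordinary Haskell types, and the SMT solver checks every call site.

    -- A refinement type: an ordinary Int carrying a predicate.
    {-@ type Pos = {v:Int | 0 < v} @-}

    -- Callers must statically satisfy the refinement on the divisor,
    -- so division by zero is ruled out at compile time.
    {-@ safeDiv :: Int -> Pos -> Int @-}
    safeDiv :: Int -> Int -> Int
    safeDiv n d = n `div` d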
Yeah, I am aware of refinement types. You can do a lot of what I am envisioning with just refinement types. Expressing loop invariants with them alone is really awkward, though.
The main reason the language I am envisioning is not a functional language is that Idris, Liquid Haskell, ... have already explored many of the related ideas. I want to take ideas from them and bring them into an imperative language, as that hasn't been explored as well.
To throw another one onto the _have you looked at X_ pile, Microsoft Research has put a lot of effort into the Why3 theorem proving platform, which is the backend for the verifier built into their (experimental?) verifier-aware, imperative language, Dafny[1]. It feels very much like writing C#/Java but with verified pre/post conditions, loop invariants etc.
I took a formal verification course in college that involved writing several verified sorting, search, etc. algorithms in Dafny [2]. I remember it being somewhat cumbersome to write your assertions in a way that the checker can check them, but it looks very much like what you're suggesting.
(At the time I didn't realise that the checker not only checks the program but also compiles it into a CLR-compatible binary, hence you'll see equivalent C# code for comparison.)
> To throw another one onto the _have you looked at X_ pile,
I am pretty sure I have seen Dafny before, but it wasn't top of mind. Thanks for mentioning it.
> I remember it being somewhat cumbersome to write your assertions in a way that the checker can check them, but it looks very much like what you're suggesting.
I agree with both of those statements. I am not thinking of something revolutionary. I have been thinking of a few ways to make it more ergonomic than Dafny though. Refinement types would be one of the key ways to do that.
I think all of the key capabilities would be the same though. It would probably make sense to make a transpiler to Dafny to try it out.
Just to give credit where it's due, I believe Why3 is developed purely by LRI, a public French research lab (see http://why3.lri.fr/). It's indeed used as a proof backend in several tools, including Frama-C and Dafny (Microsoft Research), but otherwise it's from academia.
Once you've done enough theorem proving, you'll really start to appreciate the core benefit of functional programming and understand why academics abundantly use it over imperative coding: it makes programs an order of magnitude easier to reason about, for both you (when you have to write manual proofs) and the machine (if you want proof automation).
To make it practical to verify imperative programs, you're likely going to end up imitating a functional and stateless style to get anything done anyway.
> Once you've done enough theorem proving, you'll really start to appreciate the core benefit of functional programming and understand why academics abundantly use it over imperative coding
I do already appreciate functional languages. And I have struggled to keep myself from adding functional elements to the language. A large part of why functional programming is easy to reason about is that functions are pure. This would be included in the language: by default everything would be pass by value, with references needing an environment to be passed as well (to keep functions pure). It still reads like an imperative language, though.
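To illustrate the pass-by-value-plus-explicit-environment idea in Haskell terms (the names here are hypothetical; the envisioned language would look imperative but desugar to something like this):

    import qualified Data.Map as Map

    type Ref   = Int
    type Store = Map.Map Ref Int

    -- "Mutation" is a pure function from store to store: a reference is
    -- only meaningful together with the environment passed along with it.
    writeRef :: Ref -> Int -> Store -> Store
    writeRef = Map.insert

    readRef :: Ref -> Store -> Maybe Int
    readRef = Map.lookup

    -- An "imperative" increment, still a pure function of the store.
    incr :: Ref -> Store -> Store
    incr r s = maybe s (\v -> writeRef r (v + 1) s) (readRef r s)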
Either way, you're not going to get around the fact that proving general program properties beyond very basic ones is very difficult to automate and very challenging to do manually. Even with decades of work and teams of researchers, we still can't make this easy for pure functional programs. Have you tried existing theorem provers?
> Either way, you're not going to get around the fact that proving general program properties beyond very basic ones is very difficult to automate and very challenging to do manually.
What are you considering basic? Would proving that matrices of the form vv^T/(v^Tv) are idempotent (for a fixed v) be basic? That is about the maximum I would be shooting for this language to automatically prove.
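For reference, on paper that proof is just associativity plus the fact that v^Tv is a scalar, which is why it seems like a reasonable target for automation:

    Let P = vv^T/(v^Tv).  Then

      P^2 = (vv^T)(vv^T) / (v^Tv)^2
          = v (v^Tv) v^T / (v^Tv)^2    -- v^Tv is a scalar, so it factors out
          = vv^T / (v^Tv)
          = P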
Additionally, for many assertions it would make sense to be able to fall back to a runtime check if it can't be proved easily. If you want to prove something more advanced, writing out all the steps of the proof as assertions should make the automation trivial.
> Have you tried existing theorem provers?
I have used Coq. I have actually written a first-order logic prover (which only worked with single-variable predicates).
The problem is it's really difficult to constrain how hard the proof goals are going to be. Even if you're only interested in properties to do with the length of lists/arrays, which are about as simple as you can get, you quickly get into undecidable areas (i.e. there cannot exist an algorithm that can always generate the proof or tell you it's unprovable) and you have to resort to manual proofs.
More advanced properties, like whether a data structure is sorted, are even more unpredictable. Look up verifying sorting algorithms in theorem provers like Coq and Isabelle, which have been around for decades; you'd assume basic properties like these would be easily automated by now, but that's not the case.
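To make the contrast concrete: in Liquid Haskell, a length property like the one below stays within decidable linear arithmetic, so the SMT solver discharges it with no manual proof (again a sketch from memory; `size` is a user-defined measure). A sortedness refinement on a real sorting algorithm is a different story.

    -- A measure: a terminating function the solver may use in predicates.
    {-@ measure size @-}
    {-@ size :: [a] -> {n:Int | 0 <= n} @-}
    size :: [a] -> Int
    size []     = 0
    size (_:xs) = 1 + size xs

    -- The obligation "size zs == size xs + size ys" is linear arithmetic,
    -- which SMT solvers decide automatically.
    {-@ app :: xs:[a] -> ys:[a] -> {zs:[a] | size zs == size xs + size ys} @-}
    app :: [a] -> [a] -> [a]
    app []     ys = ys
    app (x:xs) ys = x : app xs ys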
> Additionally, for many assertions it would make sense to be able to fall back to a runtime check if it can't be proved easily.
> Even if you're only interested in properties to do with the length of lists/arrays, which are about as simple as you can get, you quickly get into undecidable areas.
AFAIK, most of the undecidable areas in current solvers are not truly undecidable but rather practical limits. A full search of all proofs up to some arbitrary depth could solve pretty much any real problem, but would be impractical. I don't know much about avoiding practical undecidability though, as the prover I wrote was complete (or at least I believe it is, and it seems to be true in practice).
Thanks for mentioning Sage. I am not familiar with it.
> Did you find the manual proof challenging when you tried to verify basic algorithms?
I have been using insertion sort to benchmark the hypothetical language with, and I found proving insertion sort to be correct in Coq to be fairly easy. It definitely wasn't trivial, but coming from a background of formal proofs on paper, it was easy.
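For context, this is the kind of definition the proof follows; the induction mirrors the shape of `insert`, which is what keeps the Coq proof manageable (a Haskell rendering of the usual textbook version):

    insert :: Ord a => a -> [a] -> [a]
    insert x []     = [x]
    insert x (y:ys)
      | x <= y    = x : y : ys
      | otherwise = y : insert x ys

    insertionSort :: Ord a => [a] -> [a]
    insertionSort = foldr insert []

    -- The correctness statement comes in two parts:
    --   1. sorted (insertionSort xs)
    --   2. insertionSort xs is a permutation of xs
    sorted :: Ord a => [a] -> Bool
    sorted (x:y:zs) = x <= y && sorted (y:zs)
    sorted _        = True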
> AFAIK, most of the undecidable areas in current solvers are not truly undecidable but rather practical limits. A full search of all proofs up to some arbitrary depth could solve pretty much any real problem, but would be impractical
I wouldn't agree with this. Look up inductive proof automation, which you need when working with user-defined data structures and properties like those in your examples. It's still very hard to fully automate these, and researchers have been working on this problem for decades. The branching factor in the proof search is essentially infinite, so you cannot just do an exhaustive search. Have a read about inductive proof automation in ACL2 and Isabelle, for example, to see what is currently possible.
> I have been using insertion sort to benchmark the hypothetical language with, and I found proving insertion sort to be correct in Coq to be fairly easy. It definitely wasn't trivial, but coming from a background of formal proofs on paper, it was easy.
That proof will be fairly easy because the algorithm, data structure and the program property definitions will all have a similar shape which won't be the case for e.g. quicksort or heapsort.
Keep going with your idea; I just want you to be aware that you're hitting against problems that are well known to be very difficult, have lots of existing related work, and are still current research areas. Coq etc. are a long way from being mainstream because of this.
> The branching factor in the proof search is essentially infinite so you cannot just do an exhaustive search.
That was the impractical part. Exhaustive search scales so poorly it can't be seriously considered.
> That proof will be fairly easy because the algorithm, data structure and the program property definitions will all have a similar shape which won't be the case for e.g. quicksort or heapsort.
I would agree that heapsort (using an array/list for the heap) would be much harder to prove, but I would expect quicksort to be fairly similar (probably a bit harder).
> Keep going with your idea; I just want you to be aware that you're hitting against problems that are well known to be very difficult, have lots of existing related work, and are still current research areas. Coq etc. are a long way from being mainstream because of this.
Yeah, I never planned to push research forward with the idea. Primarily, I would like to explore it just to improve my own knowledge; it's much easier to know why something is hard once you've failed at it yourself. And perhaps make an interesting toy language.
I think tools that enable formal reasoning about code are incredibly valuable.
Just going through the process the author was able to find and fix three bugs. Could you imagine a pipeline where people submitted code that had already been proven to be correct?