The `bisect-binary` project listed in the introduction is almost the _exact_ program I made while I was a devious high school student trying to circumvent weak (i.e. the most widely used) antivirus software by locating a few "key" bytes in the binary that the AV used to fingerprint a virus.
I would zero out half the file, scan it with the AV, then recursively repeat the same for any zeroed region that the AV no longer considered a virus. Essentially, a search of the file for the smallest set of bytes that needed to change for the AV to fail detection.
Usually this would result in only a small number of bytes that needed to be slightly tweaked.
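In case the recursion isn't clear, here is a minimal sketch of the idea in Haskell. Everything here is a hypothetical reconstruction: `scanClean` stands in for invoking the AV scanner, which in reality meant shelling out to an actual AV product.

    import qualified Data.ByteString as BS

    -- Hypothetical oracle: True if the AV no longer flags the (partly zeroed) file.
    type ScanClean = BS.ByteString -> IO Bool

    -- Zero out the byte range [lo, hi) of the file contents.
    zeroRange :: Int -> Int -> BS.ByteString -> BS.ByteString
    zeroRange lo hi bs =
      BS.concat [BS.take lo bs, BS.replicate (hi - lo) 0, BS.drop hi bs]

    -- If zeroing a region makes the scan come back clean, the fingerprint
    -- bytes live inside it; bisect that region until it can't shrink further.
    narrow :: ScanClean -> BS.ByteString -> (Int, Int) -> IO [(Int, Int)]
    narrow scanClean file (lo, hi)
      | hi - lo <= 1 = pure [(lo, hi)]
      | otherwise = do
          let mid = (lo + hi) `div` 2
          leftClean  <- scanClean (zeroRange lo mid file)
          rightClean <- scanClean (zeroRange mid hi file)
          case (leftClean, rightClean) of
            (True, _) -> narrow scanClean file (lo, mid)
            (_, True) -> narrow scanClean file (mid, hi)
            _         -> (++) <$> narrow scanClean file (lo, mid)
                              <*> narrow scanClean file (mid, hi)

The entry point would be something like `narrow scanClean file (0, BS.length file)`.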
Pretty unproductive "hobby", but I must admit it was really fun. :)
This was pretty rare for the most popular AVs, but when it didn't work I would try manually zeroing different sections until I found something. I was surprised how easy it was to make a virus go undetected, and that vendors didn't routinely change their fingerprinting as a defense against this attack.
I've never worked with Coq myself, and really, the process described here still looks much too convoluted for practically proving nontrivial amounts of code correct. But it looks like lots of fun too! And it is sort of the holy grail of computer science: writing code and being able to prove it correct instead of just testing it with maybe-or-maybe-not-enough test cases.
In practice I find that Haskell itself already prevents a lot of bugs, but of course in functions like the one dissected here, logical errors are still easily made.
If you're considering Coq but haven't started I'd also consider Lean[0].
The documentation isn't as great but the main tutorial is decent. It also doubles as a practical programming language. And the community is active and helpful.
I've always found Coq to be a little impenetrable. I don't know what makes Lean more accessible to me but I've found it much easier to get into.
I have a stock recommendation for this if you want to learn formal methods to hopefully prove some useful programs. It might also apply if you want to do it for fun but don't want to get weighed down too much.
I have trouble understanding the usefulness of this. Haskell is barely used in the first place, and even less for highly safety critical software.
It's also not clear to me why I would use this over writing the program in the constructive logic of Coq, Agda, or Idris and then extracting ("compiling") the program from the proof itself. Of course, one use case would be to formally verify "legacy" Haskell code, but then my first point applies -- there is not that much important Haskell code out there.
The last reason for the existence of this program I could see is that this is just experimental and the authors want to explore whether it's possible to formally prove things about haskell programs. But isn't that obviously true, given enough time put into the project?
> Standard Chartered has over 3 million lines of Haskell code
Not exactly: they have their own closed-source fork of GHC, which is strictly evaluated. Given that the codebase is proprietary, it's unknown what other differences there are beyond having dumped lazy evaluation. It's a Haskell-like language, but quite different from the Haskell the public has access to.
> I have trouble understanding the usefulness of this.
Well for one thing, sometimes projects are just for fun, or to explore uncharted territory, without thinking about a specific use case. For another, there are definitely use cases for proofs in code, as verified software appears in a variety of domains. It’s not likely to make its way into the average CRUD app any time soon, but that doesn’t mean it’s not useful.
> Haskell is barely used in the first place, and even less for highly safety critical software.
“Barely used” is relative. Compared to Java or Python, sure. But there are untold millions of lines of Haskell code running in production right now. Besides, in my humble opinion Haskell should be used more, whether or not it is (right now). And in any case Haskell has had a big influence on many other, more popular languages and libraries, and this could one day be another example of that.
This is not (just) for fun: "But things have changed since then, as a group at UPenn (mostly Antal Spector-Zabusky, Stephanie Weirich and myself) has created hs-to-coq: a translator from Haskell to the theorem prover Coq."
I agree that Haskell should be used more, at least if the alternative is one of the mainstream languages.
> there is not that much important Haskell code out there.
In other words, there's at least some important Haskell code, and there's Haskell code important to someone. That's a good enough reason. Or are you saying we can't have nice things?
Well, I assume this is at least partially funded by the public. Most side projects are not. While I appreciate that the engineering that went into this is non-trivial, I fail to see the scientific relevance.
> I have trouble understanding the usefulness of this. Haskell is barely used in the first place, and even less for highly safety critical software.
Every new approach is barely used in the wild at first, and the initial form of a new approach usually differs quite a bit in its details from how things eventually end up.

In this case the value lies in working out what proving software could look like, and proving a program after writing it is a different direction from writing the program to match a proof. We can't tell which of these methods is more useful or convenient before somebody tries them out.
There are a few major issues with my idea currently (it really requires dependent types to be ergonomic, and allocation and pointers could get strange) which have prevented me from making a quick demo compiler or language spec. But I would be very excited to discuss it.
Have you looked into research regarding refinement types? Because they describe almost exactly what you're reaching for (types with associated provable assertions). Specifically, Liquid Haskell[0] extends Haskell with (dependent) refinement types, but keeps the type checking decidable and the proof burden relatively light by using a restricted predicate language and an SMT solver.
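For anyone who hasn't seen it, here is a rough flavor of the style (a sketch from memory, so treat the exact annotation syntax with a grain of salt): the `{-@ ... @-}` comments attach predicates to ordinary Haskell types, and the SMT solver checks every call site.

    -- A refinement type: an ordinary Int carrying a predicate.
    {-@ type Pos = {v:Int | 0 < v} @-}

    -- Callers must statically satisfy the refinement on the divisor,
    -- so division by zero is ruled out at compile time.
    {-@ safeDiv :: Int -> Pos -> Int @-}
    safeDiv :: Int -> Int -> Int
    safeDiv n d = n `div` d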
Yeah, I am aware of refinement types. You can do a lot of what I am envisioning with just refinement types. Expressing loop invariants with them alone is really awkward, though.
The main reason the language I am envisioning is not a functional language is that Idris, Liquid Haskell, ... have already explored many of the related ideas. I want to take ideas from them and bring them into an imperative language, as that hasn't been explored as well.
To throw another one onto the _have you looked at X_ pile, Microsoft Research has put a lot of effort into the Why3 theorem proving platform, which is the backend for the verifier built into their (experimental?) verifier-aware, imperative language, Dafny[1]. It feels very much like writing C#/Java but with verified pre/post conditions, loop invariants etc.
I took a formal verification course in college that involved writing several verified sorting, search, etc. algorithms in Dafny [2]. I remember it being somewhat cumbersome to write your assertions in a way that the checker can check them, but it looks very much like what you're suggesting.
(At the time I didn't realise that the checker not only checks the program but also compiles it into a CLR-compatible binary, hence you'll see equivalent C# code for comparison.)
> To throw another one onto the _have you looked at X_ pile,
I am pretty sure I have seen Dafny before, but it wasn't top of mind. Thanks for mentioning it.
> I remember it being somewhat cumbersome to write your assertions in a way that the checker can check them, but it looks very much like what you're suggesting.
I agree with both of those statements. I am not thinking of something revolutionary. I have been thinking of a few ways to make it more ergonomic than Dafny though. Refinement types would be one of the key ways to do that.
I think all of the key capabilities would be the same though. It would probably make sense to make a transpiler to Dafny to try it out.
Just to give credit where it's due, I believe Why3 is developed purely by LRI, a public French research lab (see http://why3.lri.fr/). It's indeed used as a proof backend in several tools, including Frama-C and Dafny (Microsoft Research), but otherwise it's from academia.
Once you've done enough theorem proving, you'll really start to appreciate the core benefit of functional programming and understand why academics abundantly use it over imperative coding: it makes programs an order of magnitude easier to reason about, for both you (when you have to write manual proofs) and the machine (if you want proof automation).
To make it practical to verify imperative programs, you're likely going to end up imitating a functional and stateless style to get anything done anyway.
> Once you've done enough theorem proving, you'll really start to appreciate the core benefit of functional programming and understand why academics abundantly use it over imperative coding
I do already appreciate functional languages. And I have struggled to keep myself from adding functional elements to the language. A large part of why functional programming is easy to reason about is that functions are pure. This would be included in the language: by default everything would be pass by value, with references needing an environment to be passed as well (to keep functions pure). It still reads like an imperative language, though.
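To illustrate the pass-by-value-plus-explicit-environment idea in Haskell terms (the names here are hypothetical; the envisioned language would look imperative but desugar to something like this):

    import qualified Data.Map as Map

    type Ref   = Int
    type Store = Map.Map Ref Int

    -- "Mutation" is a pure function from store to store: a reference is
    -- only meaningful together with the environment passed along with it.
    writeRef :: Ref -> Int -> Store -> Store
    writeRef = Map.insert

    readRef :: Ref -> Store -> Maybe Int
    readRef = Map.lookup

    -- An "imperative" increment, still a pure function of the store.
    incr :: Ref -> Store -> Store
    incr r s = maybe s (\v -> writeRef r (v + 1) s) (readRef r s)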
Either way, you're not going to get around the fact that proving general program properties beyond very basic ones is very difficult to automate and very challenging to do manually. Even with decades of work and teams of researchers, we still can't make this easy for pure functional programs. Have you tried existing theorem provers?
> Either way, you're not going to get around the fact that proving general program properties beyond very basic ones is very difficult to automate and very challenging to do manually.
What are you considering basic? Would proving that matrices of the form vv^T/(v^Tv) are idempotent (for a fixed v) be basic? That is about the maximum I would be shooting for this language to automatically prove.
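For reference, on paper that proof is just associativity plus the fact that v^Tv is a scalar, which is why it seems like a reasonable target for automation:

    Let P = vv^T/(v^Tv).  Then

      P^2 = (vv^T)(vv^T) / (v^Tv)^2
          = v (v^Tv) v^T / (v^Tv)^2    -- v^Tv is a scalar, so it factors out
          = vv^T / (v^Tv)
          = P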
Additionally, for many assertions it would make sense to be able to fall back to a runtime check if it can't be proved easily. If you want to prove something more advanced, writing out all the steps of the proof as assertions should make the automation trivial.
> Have you tried existing theorem provers?
I have used Coq. I have actually written a first-order logic prover (which only worked with single-variable predicates).
The problem is it's really difficult to constrain how hard the proof goals are going to be. Even if you're only interested in properties to do with the length of lists/arrays, which are about as simple as you can get, you quickly get into undecidable areas (i.e. there cannot exist an algorithm that can always generate the proof or tell you it's unprovable) and you have to resort to manual proofs.
More advanced properties, like whether a data structure is sorted, are even more unpredictable. Look up verifying sorting algorithms in theorem provers like Coq and Isabelle, which have been around for decades; you'd assume basic properties like these would be easily automated by now, but that's not the case.
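To make the contrast concrete: in Liquid Haskell, a length property like the one below stays within decidable linear arithmetic, so the SMT solver discharges it with no manual proof (again a sketch from memory; `size` is a user-defined measure). A sortedness refinement on a real sorting algorithm is a different story.

    -- A measure: a terminating function the solver may use in predicates.
    {-@ measure size @-}
    {-@ size :: [a] -> {n:Int | 0 <= n} @-}
    size :: [a] -> Int
    size []     = 0
    size (_:xs) = 1 + size xs

    -- The obligation "size zs == size xs + size ys" is linear arithmetic,
    -- which SMT solvers decide automatically.
    {-@ app :: xs:[a] -> ys:[a] -> {zs:[a] | size zs == size xs + size ys} @-}
    app :: [a] -> [a] -> [a]
    app []     ys = ys
    app (x:xs) ys = x : app xs ys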
> Additionally, for many assertions it would make sense to be able to fall back to a runtime check if it can't be proved easily.
> Even if you're only interested in properties to do with the length of lists/arrays, which are about as simple as you can get, you quickly get into undecidable areas.
AFAIK, most of the undecidable areas in current solvers are not truly undecidable but rather practical limits. A full search of all proofs up to some arbitrary depth could solve pretty much any real problem, but would be impractical. I don't know much about avoiding practical undecidability though, as the prover I wrote was complete (or at least I believe it is, and it seems to be true in practice).
Thanks for mentioning Sage. I am not familiar with it.
> Did you find the manual proof challenging when you tried to verify basic algorithms?
I have been using insertion sort to benchmark the hypothetical language with, and I found proving insertion sort to be correct in Coq to be fairly easy. It definitely wasn't trivial, but coming from a background of formal proofs on paper, it was easy.
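For context, this is the kind of definition the proof follows; the induction mirrors the shape of `insert`, which is what keeps the Coq proof manageable (a Haskell rendering of the usual textbook version):

    insert :: Ord a => a -> [a] -> [a]
    insert x []     = [x]
    insert x (y:ys)
      | x <= y    = x : y : ys
      | otherwise = y : insert x ys

    insertionSort :: Ord a => [a] -> [a]
    insertionSort = foldr insert []

    -- The correctness statement comes in two parts:
    --   1. sorted (insertionSort xs)
    --   2. insertionSort xs is a permutation of xs
    sorted :: Ord a => [a] -> Bool
    sorted (x:y:zs) = x <= y && sorted (y:zs)
    sorted _        = True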
> AFAIK, most of the undecidable areas in current solvers are not truly undecidable but rather practical limits. A full search of all proofs up to some arbitrary depth could solve pretty much any real problem, but would be impractical
I wouldn't agree with this. Look up inductive proof automation, which you need when working with user-defined data structures and properties like those in your examples. It's still very hard to fully automate these, and researchers have been working on this problem for decades. The branching factor in the proof search is essentially infinite, so you cannot just do an exhaustive search. Have a read about inductive proof automation in ACL2 and Isabelle, for example, to see what is currently possible.
> I have been using insertion sort to benchmark the hypothetical language with, and I found proving insertion sort to be correct in Coq to be fairly easy. It definitely wasn't trivial, but coming from a background of formal proofs on paper, it was easy.
That proof will be fairly easy because the algorithm, data structure and the program property definitions will all have a similar shape which won't be the case for e.g. quicksort or heapsort.
Keep going with your idea; I just want you to be aware that you're hitting against problems that are well known to be very difficult, have lots of existing related work, and are still current research areas. Coq etc. are a long way from being mainstream because of this.
> The branching factor in the proof search is essentially infinite so you cannot just do an exhaustive search.
That was the impractical part. Exhaustive search scales so poorly it can't be seriously considered.
> That proof will be fairly easy because the algorithm, data structure and the program property definitions will all have a similar shape which won't be the case for e.g. quicksort or heapsort.
I would agree that heapsort (using an array/list for the heap) would be much harder to prove, but I would expect quicksort to be fairly similar (probably a bit harder).
> Keep going with your idea; I just want you to be aware that you're hitting against problems that are well known to be very difficult, have lots of existing related work, and are still current research areas. Coq etc. are a long way from being mainstream because of this.
Yeah, I never planned to push research forward with the idea. Primarily, I would like to explore it just to improve my own knowledge; it's much easier to know why something is hard once you've failed at it yourself. And perhaps make an interesting toy language.
I think tools that enable formal reasoning about code are incredibly valuable.
Just going through the process the author was able to find and fix three bugs. Could you imagine a pipeline where people submitted code that had already been proven to be correct?