I've used CLaSH over most of the past year to do all my Verilog assignments (CS major doing EE electives), and also contributed a bit to the project, so I figured I'd offer some insight.
For all you saying "but sequential is the hard part": yes, functional programming is most clearly a smash hit with combinatorial circuits, but CLaSH shines with sequential circuits too. Basically, time-varying nets are represented as infinite streams where the nth element is the value at the nth cycle. Registers are basically `cons`: they just "delay" the stream by one cycle, tacking the initial value on in front. To make complex sequential code you just "tie the knot" around the streams with `letrec`s -- which corresponds exactly to what the feedback loop looks like on the schematic. [Anyone who's done FRP should recognize this idiom.] In this way CLaSH is both more high-level and more low-level than Verilog/VHDL: clock nets are derived automatically, but feedback loops are explicit.
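To make the stream picture concrete, here is a minimal sketch in plain Haskell. It is not the CLaSH API: lazy lists stand in for CLaSH's signal type, and `Stream`, `register` and `counter` are names I made up for illustration. Haskell's `where`/`let` bindings are recursive, so they play the role of `letrec`.

```haskell
-- Plain-Haskell model of the idea; a lazy list stands in for a signal,
-- with element n being the value of the net at clock cycle n.
type Stream a = [a]

-- A register delays its input by one cycle and puts the reset value in front.
-- In this model it really is just cons.
register :: a -> Stream a -> Stream a
register initial input = initial : input

-- A free-running counter. The two bindings refer to each other, which is
-- the feedback loop you would draw on the schematic.
counter :: Stream Int
counter = count
  where
    count = register 0 next   -- state held in the register
    next  = map (+ 1) count   -- combinational incrementer in the feedback path
```

In GHCi, `take 5 counter` evaluates to `[0,1,2,3,4]`.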
Now if you are an electrical engineer, substituting one tedious task (routing clocks) for another (programming without "blocking assignment") might seem like no net gain. But we functional programmers are fluent at working with such fixpoints, and at abstracting both what we tie together and the knot-tying itself. The Moore and Mealy combinators are the tip of the iceberg -- examples that we hope will be more accessible to electrical engineers unfamiliar with functional programming.
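As a rough illustration of "abstracting the knot-tying itself", here is what a Mealy combinator can look like in a plain-list model. This is a sketch, not CLaSH's own definition: lists stand in for signals, and `accumulate` is a made-up example machine. The point is that the feedback loop is written once, inside `mealy`, and the user only supplies a pure transition function.

```haskell
type Stream a = [a]

-- Turn a pure transition function (state -> input -> (next state, output))
-- into a stream transformer. The knot is tied here, once, for every machine.
mealy :: (s -> i -> (s, o)) -> s -> Stream i -> Stream o
mealy step s0 inputs = outputs
  where
    states                = s0 : nextStates   -- the state register
    (nextStates, outputs) = unzip (zipWith step states inputs)

-- Example machine: a running sum. The new state is also the output.
accumulate :: Int -> Int -> (Int, Int)
accumulate acc x = (acc + x, acc + x)
```

With this, `mealy accumulate 0 [1,2,3,4]` evaluates to `[1,3,6,10]`.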
For all you saying that "the hard part isn't working with the HDL at all, but lower level concerns like timing, layout, etc", I have two things to say.
First, you are acting like HDL writing is not on the "critical path" of your development process, and thus of no concern. Well, that's just not true -- you can't have one engineer do HDL writing, one do layout, and one do testing completely independently, because there are some basic data dependencies here that linearize the development workflow. It may not be the component with the "most delay", but it's still on that critical path, and thus improving it will yield at least some speedup. Automatic layout and timing analysis are great too, but unless you have a massive amount of computing power at your disposal, AFAIK you can't get very far, so improving the HDL side of things might be the /best/ you can do.
Second, there is the development cost of finding all your bugs with low-level tools. Yes, timing analysis is essential, but it's not great at diagnosing the underlying problem. If you have lots of code that, well, isn't very aesthetically pleasing, and you do all your debugging on the FPGA or with timing analysis, I suppose just about all bugs look like timing issues. With CLaSH:
- You have far less code, and it's more high-level, so just reading it looking for errors is more productive.
- You can try out your code in the REPL, providing a stream of inputs and getting back a stream of outputs. High-level state machine errors (do you really nail this the first time with Verilog?) are easily caught this way.
- Because you have more opportunities to modularize your code, you have more opportunities to test components in isolation. Unit tests vs. system tests -- y'all know the deal. The former is no panacea, but obviously it makes complete code/path coverage way more tractable computationally.
- QuickCheck. I generate programs and run my single-cycle and pipelined processors for n cycles; if they both halted, I compare register/memory state, otherwise I throw out the test. I /suppose/ you could do this with C-augmented testbenches, but it would be way, way, way more code and effort. QuickCheck worked so well that I never wrote a testbench.
- EVENTUALLY, with Idris or [faking it with] dependent Haskell, prove your circuit correct up to the synchronous model CLaSH is built around.
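Here is a toy version of the "compare two implementations" workflow from the list above, in plain Haskell with lists standing in for signals. Everything in it is hypothetical (a rising-edge detector instead of a processor, and made-up names). To keep it dependency-free it checks every input sequence up to a given length exhaustively, where QuickCheck would sample random ones.

```haskell
import Control.Monad (replicateM)

type Stream a = [a]

-- List-model Mealy combinator (a sketch, not CLaSH's definition).
mealy :: (s -> i -> (s, o)) -> s -> Stream i -> Stream o
mealy step s0 inputs = outputs
  where
    states                = s0 : nextStates
    (nextStates, outputs) = unzip (zipWith step states inputs)

-- Implementation under test: a rising-edge detector as a state machine.
-- The state is the previous input sample.
edgeDetector :: Stream Bool -> Stream Bool
edgeDetector = mealy step False
  where
    step prev cur = (cur, cur && not prev)

-- Reference model: compare each sample with the one before it.
edgeSpec :: Stream Bool -> Stream Bool
edgeSpec xs = zipWith (\prev cur -> cur && not prev) (False : xs) xs

-- The property: both descriptions agree on a given input sequence.
prop_edgeMatchesSpec :: [Bool] -> Bool
prop_edgeMatchesSpec xs = edgeDetector xs == edgeSpec xs

-- Exhaustive stand-in for QuickCheck: all sequences of length 0..maxLen.
checkUpTo :: Int -> Bool
checkUpTo maxLen =
  all prop_edgeMatchesSpec
      (concatMap (\n -> replicateM n [False, True]) [0 .. maxLen])
```

Trying it at the REPL, `edgeDetector [False,True,True,False,True]` gives `[False,True,False,False,True]`. With QuickCheck installed, the same `prop_edgeMatchesSpec` can be handed to `quickCheck` unchanged.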
In practice I can honestly say I wrote and debugged programs entirely from GHCi (the Haskell REPL, so very much in software land), and saw them work first time on the FPGA. Where this didn't happen, it was usually due to black boxes, like megafunctions and other components on the dev board. Obviously my Haskell testing is of no use if I model them wrong in CLaSH.
Finally, it would be dishonest and misleading not to mention CLaSH's downsides. CLaSH is designed assuming your circuit is totally synchronous (or purely combinatorial, but that's the trivial case). I don't know how often this comes up in the real world, but in interfacing with the components on the dev board, I often had to do things that violated rigidly synchronous circuit design --- inverting my clock to get a second, 180-degree-off clock domain, asynchronous communication with SRAM. [CLaSH supports multiple clock domains, but only knows about their relative frequency, not phase.] You can often still describe these circuits in CLaSH, but because they violate its synchronous model, it won't understand them and neither will your Haskell-land testing infrastructure. Basically you lose the benefits that made CLaSH great in the first place. Fundamentally, I think truly fixing these cases means designing a lower-level "asynchronous CLaSH" that both normal CLaSH and these cases can elaborate to. Trying to tack them on as special cases to CLaSH and its synchronous model won't fly.
But all is not lost: if you can contain the model violation to one bit of code and give it a kosher synchronous interface, you are all good. Write some Haskell to simulate what it does (it need not even be in the subset CLaSH can understand), and make a Verilog/VHDL black box. CLaSH doesn't help you with that module, but that module doesn't pollute the rest of your program either. Most real-world designs are by and large synchronous, unless the world has been lying to me. So the quarantined modules would never form a significant part of your program.
That about wraps it up, ...hope somebody's still reading this thread after writing all that.
1. So I've actually never used multiple clock domains with CLaSH. [The inverted clock I mentioned went to the RAM megafunction, which was instantiated in Verilog. For testing purposes my RAM (in CLaSH) had zero-cycle-delay reads, which is what the RAM with the phase-shifted clock was supposed to simulate. Also, the circuit topology is the same either way (but for the inverter on the clock); it's just that the circuit "works" for different reasons, and thus the timing is different.]
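For a feel of what such a simulation-only RAM model can look like, here is a small sketch in plain Haskell with lists standing in for signals. It is not the code I actually used. The names (`RamIn`, `ramModel`) are made up, and the behaviour is an assumption: reads return in the same cycle, writes show up from the next cycle on, and unwritten locations read as 0.

```haskell
type Stream a = [a]
type Addr     = Int
type Value    = Int

-- Memory contents as a plain function: fine for simulation, not synthesizable.
type Memory = Addr -> Value

-- Inputs for one clock cycle: an address to read and an optional write.
data RamIn = RamIn
  { readAddr :: Addr
  , writeReq :: Maybe (Addr, Value)
  }

-- Assumed behaviour: a read returns the contents during the same cycle
-- (zero-cycle delay), a write becomes visible from the next cycle on,
-- and never-written locations read as 0.
ramModel :: Stream RamIn -> Stream Value
ramModel = go (const 0)
  where
    go _   []                   = []
    go mem (RamIn ra wr : rest) = mem ra : go (update wr mem) rest

    update Nothing       mem = mem
    update (Just (a, v)) mem = \a' -> if a' == a then v else mem a'
```

Feeding it a write to address 0 followed by a read of address 0 shows the written value one cycle later, which is the synchronous interface the rest of the design sees.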
I ask about clock domains because you stated real-world designs are by and large synchronous. The issue is that when data crosses clock domains, we have a potential area for bugs that depend on the possible combinations of clock speeds.
It becomes effectively asynchronous because we need to determine when data from one clock domain arrives relative to the edge of the other clock.
I can't understand the Haskell stuff you are linking to. I don't know whether it is capable of finding the issues I am talking about.
The second question was just trying to figure out whether CLaSH can be used with Verilog/VHDL in some way. I was hoping against hope that there was an aspect of it that is usable in industry.
I can't figure out whether there is though.
The Haskell aspect is not a sweetener. We don't usually study that and it doesn't look good. You think VHDL looks bad, but I think Haskell looks bad. It's like saying you've been real keen on a new beer made from Brussels sprouts.
10 years ago, people were going on about SystemC. It didn't really catch on and it was a lot more normal looking.
Ok, yeah sorry the docs other than the tutorial assume some familiarity with Haskell.
2. What do you mean by "verification IP"? It was that phrase that made me mention testbenches.
Basically, while CLaSH is hard-coded to understand certain types such as the Signal type, almost all primitive functions are just defined with Verilog/VHDL templates which it instantiates. One can write their own templates that work just the same way. So for any piece of CLaSH-compilable Haskell, you get VHDL/Verilog for free, and for any piece of VHDL/Verilog, you can use it in CLaSH by writing some Haskell (with the same ports, and that hopefully does the same thing) and then telling CLaSH the Haskell is to be replaced with your Verilog/VHDL.
This is about as good as bidirectional compatibility can get. Automatic Verilog/VHDL -> CLaSH compilation would be an improvement, but I don't think it is possible: I'm not sure to what degree the semantics of Verilog/VHDL are formalized, and even if they are, there's no way the implementations all respect those semantics.
The testbench functions are just templated like any other primitive function.
1. UnsafeSynchronizer "casts" one signal to another -- it's compiled to a plain net in Verilog/VHDL. At each output cycle n, it looks at input cycle round(n*fin/fout) and takes that value.
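The index arithmetic is easy to play with in a plain-list model. This is a sketch of the simulation behaviour described above, not CLaSH's implementation: `sourceCycle` and `resample` are made-up names, lists stand in for signals, the input is assumed to be an infinite stream, and rounding ties upward is my assumption.

```haskell
type Stream a = [a]

-- Which input cycle does output cycle n look at?
-- Computes round(n * fIn / fOut) in integer arithmetic, with ties rounded up
-- (the tie-breaking rule is an assumption of this sketch).
sourceCycle :: Int -> Int -> Int -> Int
sourceCycle fIn fOut n = (2 * n * fIn + fOut) `div` (2 * fOut)

-- Resample a stream clocked at fIn into one clocked at fOut.
-- Assumes an infinite input stream; (!!) makes this quadratic, which is
-- fine for poking at it in the REPL.
resample :: Int -> Int -> Stream a -> Stream a
resample fIn fOut input = map (\n -> input !! sourceCycle fIn fOut n) [0 ..]
```

For example, `take 6 (resample 1 2 [0 ..])` is `[0,1,1,2,2,3]` (output clock twice as fast, so input samples get repeated), and `take 4 (resample 2 1 [0 ..])` is `[0,2,4,6]` (output clock half as fast, so every other sample is dropped). None of this models metastability or phase, which is exactly why it is "unsafe".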
Obviously this is unsafe because, as you say, in the real world the problem is asynchronous. You don't know the exact frequency ratios and phase differences, nor are they constant, and even if you did you'd get subtle timing errors with an incoming value that doesn't change on the clock edge.
The trick is that it is a pretty basic "black box" to augment CLaSH with, so proper synchronizers can be written in pure CLaSH. If that's not enough for some super-asynchronous synchronizer design, one can always fall back on writing their own black box as described above.
I don't think anyone imagines that CLaSH will be immediately understandable to someone who has never used Haskell. So no way does anyone expect the benefits will be immediately clear. So are you saying the restrictions I mention sound too onerous, or are you saying "I dunno, it looks weird"?
If the former, that's perfectly acceptable, thank you for reading.
If the latter, I'm sorry but this is a pet peeve of mine--we get this a lot in the functional programming community. Understand that we are claiming the benefits are worth the non-trivial learning curve. If it was so damn obvious, it couldn't offer much benefit over the status quo---people would have already switched en masse and it would be the status quo.
While C-esque cuteness looks nice, I agree such things are doomed to failure. The C model is easy enough to understand, but its linearity, implicit state, and notion of control flow have nothing to do with the hardware---you can understand both models, but the compilation process is necessarily non-trivial and sufficiently "far from surjective" that many designs cannot be expressed at all, and many more must be expressed through very roundabout means.
Functional HDLs like CLaSH have a dead-simple structural compilation model, so while they may not understand every circuit, they can express it---the compiler is near surjective, but not homomorphic once you count things like this SR flip-flop:
    \ r s -> let q  = nor r q'
                 q' = nor s q
             in (q, q')
This compiles to exactly what it looks like, but diverges (i.e. infinite loops) under Haskell's semantics.
Verification IP is reusable code created by verification engineers. E.g. Say the designers are developing a networking module. The verification engineers would build the verification IP to generate the network packets. They also build the monitors to check the network protocols. For any design under verification, there is a corresponding amount of verification providing stimulus and checking.
The reason I bring this up is: verification is the hard part of the HW workflow. The other similarly tough part is synthesis. On every single project I have been on, verification and synthesis were the toughest tasks and consumed the most team effort. Not design. When we plan projects, it all revolves around the verification task.
For every bus, every module and at various stages of SoC integration, we are writing verification code using SystemVerilog.
If you want to improve our tools, I would look at the verification/simulation or the synthesis side.