Of course you can, but that's not going to work when you run out of stack space. The point of my stupid benchmark was to write the equivalent ruby code, and then the equivalent C code, forgetting the fact that there's a much more optimal way to do it (e.g. write it in scheme with tail calls).
Writing it tail-recursively forces you to switch from the O(2^n) algorithm (which has two recursive calls; you can't put both into tail position) to a more efficient algorithm. You could also write the more-efficient algorithm without tail recursion, of course.
No, you're not being a n00b. It'll help because the stack usage will grow linearly as opposed to quadratically, as in the case of the naive solution. You'll still overflow the stack without the optimization.
I just rewrote it in C though:
Which makes the python version something like 35x slower in wall clock time.