You know, I think it's a fine argument to say, "Yes, Lisp has a lot of parenthes...

zeveb · on June 28, 2017

Interesting analysis, but what do you get for Lisp? The Lisp version is available at https://rosettacode.org/wiki/Conway%27s_Game_of_Life#Common_...

I suspect that yorwba has a good point re. identifiers (note too that *, + and - are allowed & quite common in Lisp).

Also note that Racket is not Scheme, so using '.scm' as an identifier for it is a bit misleading:-)

yorwba · on June 28, 2017

I think most of the non-bracket "punctuation" in Racket is caused by the kebab-case identifiers and to a lesser degree by ? and ! in function names.

So in fact "Other languages have more punctuation characters than Lisp." is not an objective statement at all, since it relies on the definition of punctuation, which is subjective or at least language-specific.

munificent · on June 28, 2017

Good point. I updated my little script to (roughly) take each language's identifier rules into account. Now "?" is considered an identifier character in Ruby, and "-", "!", and "?" are identifier characters in Racket. It gets a little fuzzy, of course, because ">" works like punctuation in "(> 1 2)" but less so in "(number->string 123)".

              bracket   punctuation
    life.java  13.25%        27.52%
    life.py    13.27%        21.95%
    life.rb     9.56%        23.29%
    life.scm   22.29%        24.81%

This gets Racket down to having similar levels of punctuation, but note that, of course, brackets are still significantly more common than in the other languages.
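
For readers who want to try a similar measurement, here is a minimal sketch of such a counter. It is not the script used for the table above: the table of extra identifier characters, the choice of non-whitespace characters as the denominator, and the function name `punctuation_stats` are all assumptions made for illustration.

```python
import string

# Hypothetical per-language rules: extra characters (beyond letters, digits
# and "_") that count as part of identifiers rather than as punctuation.
EXTRA_IDENTIFIER_CHARS = {
    "java": "",
    "python": "",
    "ruby": "?",
    "racket": "-!?",
}

BRACKETS = set("()[]{}")


def punctuation_stats(source, language):
    """Return (bracket %, other punctuation %) over non-whitespace characters."""
    extra = set(EXTRA_IDENTIFIER_CHARS[language])
    chars = [c for c in source if not c.isspace()]
    if not chars:
        return 0.0, 0.0
    brackets = sum(1 for c in chars if c in BRACKETS)
    other = sum(
        1
        for c in chars
        if c in string.punctuation
        and c not in BRACKETS
        and c != "_"
        and c not in extra
    )
    total = len(chars)
    return 100.0 * brackets / total, 100.0 * other / total
```

A character-level counter like this cannot tell the ">" in "(> 1 2)" from the ">" in "(number->string 123)"; separating the two would need a real tokenizer for each language.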

white-flame · on June 29, 2017

Personally, I would remove =+-*/ from punctuation. They're generally meaningful, common, and fundamental operators, not just groupers/separators/terminators. And as mentioned, in Lisp, symbols commonly contain dashes (and less commonly other punctuation), including standard-function-names, which would skew the results. I'm not super familiar with Racket, but presumably it retains this syntax style.

But generally, Lisps aren't intended to be compact in the small, but semantically regular, which can be a definition of "simpler", and give smaller code in larger systems. Source code compression for syntactically-atomic operations is where all the other varied punctuation comes from, which addresses snippets of code, not large abstraction gains. So yes, there are cases where Lisps will have more parens than other languages have total punctuation. Even in such cases, the other languages are more complex to read, both for humans and for machines.

Note that I didn't only mention quantity, but also more complex interspersal. Most other languages have commas for field separators, while Lisps tend only to use naturally-occurring whitespace. For instance, f(a,b,); is commonly an error, while (f a b ) is valid Lisp syntax, as a low baseline example. Prolog-derived languages go even more nuts with ; and . than C-style terminators. And the brackets vs braces vs parens all nest together. The }]);}]); style closing mess is very common in JavaScript, where objects and function objects nest heavily due to its callback-centric style.
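
The trailing-separator point can be sketched with two toy argument parsers. Both functions are hypothetical and far simpler than any real language's grammar; they only show why a stray comma must be handled explicitly while stray whitespace costs nothing.

```python
def parse_comma_args(text):
    """Strict comma-separated list: every comma must sit between two items."""
    items = [item.strip() for item in text.split(",")]
    if any(item == "" for item in items):
        raise SyntaxError("empty item next to a comma in %r" % text)
    return items


def parse_space_args(text):
    """Whitespace-separated list: extra or trailing whitespace is harmless."""
    return text.split()


print(parse_space_args("a b "))  # ['a', 'b']
try:
    parse_comma_args("a,b,")
except SyntaxError as err:
    print("rejected:", err)
```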

Other languages also have a mix of infix, postfix, and prefix ordering:

  fun5((fun1() op3 fun2()).fun4()).fun6();

The function calls & operators are numbered by their final execution ordering (assuming op3 isn't short-circuiting). This is just messy to read and edit, again both for humans and machines.
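
The ordering can be checked with a small Python sketch. The `Value` class and the logging list are hypothetical, and "+" stands in for op3; the point is only that the execution order differs from the left-to-right reading order of the source text.

```python
calls = []  # records the order in which calls actually execute


class Value:
    """Hypothetical value type whose methods log when they run."""

    def __add__(self, other):  # stands in for the infix operator "op3"
        calls.append("op3")
        return Value()

    def fun4(self):
        calls.append("fun4")
        return Value()

    def fun6(self):
        calls.append("fun6")
        return Value()


def fun1():
    calls.append("fun1")
    return Value()


def fun2():
    calls.append("fun2")
    return Value()


def fun5(arg):
    calls.append("fun5")
    return Value()


# Same shape as the expression above, with "+" playing the role of op3.
fun5((fun1() + fun2()).fun4()).fun6()
print(calls)  # ['fun1', 'fun2', 'op3', 'fun4', 'fun5', 'fun6']
```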

And the biggest reason why things are so much more regular in Lisp is metaprogramming, code generation, and code transformation. This is such a pain in the butt in other languages' direct source code representations, unless separate ASTs are involved, and ASTs aren't editable as plain source code for template-style use. Separate libraries could tackle this, but they would have to be maintained in lockstep with the language itself, and manually replicate every subtle nuance of the language to avoid bugs (including replicating reference implementation bugs).
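
As a rough illustration of the idea, here is a sketch that models Lisp forms as nested Python lists and applies a macro-like rewrite. It is not how any Lisp implements macros; `expand_unless` and `to_source` are hypothetical helpers, and the rewrite assumes a two-argument `unless` form.

```python
def expand_unless(form):
    """Rewrite every (unless test body) into (if (not test) body), recursively."""
    if not isinstance(form, list):
        return form  # atoms (symbols, numbers) are left untouched
    form = [expand_unless(sub) for sub in form]
    if len(form) == 3 and form[0] == "unless":
        _, test, body = form
        return ["if", ["not", test], body]
    return form


def to_source(form):
    """Print a nested list back out as parenthesized source text."""
    if isinstance(form, list):
        return "(" + " ".join(to_source(sub) for sub in form) + ")"
    return str(form)


program = ["define", ["f", "x"], ["unless", ["zero?", "x"], ["/", 1, "x"]]]
print(to_source(expand_unless(program)))
# (define (f x) (if (not (zero? x)) (/ 1 x)))
```

Because the data structure and the printed source have the same shape, the transformed tree can be written straight back out as editable code, with no separate pretty-printer for each syntactic construct.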

So this whole thing is about technical complexity of representation, not just personal preference.

And for a bit of levity: http://www.loper-os.org/wp-content/parphobia.png :)

zmonx · on June 29, 2017

> Prolog-derived languages go even more nuts with ; and . than C-style terminators.

What do you mean by this? In Prolog, (;)/2 is simply a binary infix operator, so that you can write a term of the form:

    ;(A, B)

i.e., a binary term whose primary functor is a semicolon, equivalently as:

    ( A ; B )

where the semicolon now occurs as infix operator instead of at its canonical position at the beginning of the term.

In Prolog, you can always use the predicate write_canonical/1 to obtain the canonical representation of any term, which is completely regular. In the case above, we get:

    ?- write_canonical( (a ; b) ).
    ;(a,b)

This confirms that (a;b) and ;(a,b) denote the exact same Prolog term which cannot be distinguished at the AST level.

We also have several other ways to inspect the term, such as functor/3:

    ?- functor((a;b), Functor, Arity).
    Functor =  (;),
    Arity = 2.

As expected, this again confirms that the primary functor is the semicolon in this case.

Further, the period '.' has a dual role as the primary functor for non-empty lists, and also to mark the end of a term.

For example, in a Prolog fact of the form:

    hello(world).

The period states that the term is complete. If it were not used, the term could have continued for example like this:

    hello(world) + universe.

In that case, (+)/2 is the primary functor, as can be verified with:

    ?- write_canonical(hello(world) + universe).
    +(hello(world),universe)

Thus, '.' is used to mark the end of clauses, since each clause is a valid Prolog term.

If you want, you can always use the canonical representation of Prolog terms in your code. However, this would make writing and reading Prolog at least as inconvenient as Lisp, because you then have to write everything in prefix notation instead of being able to benefit from infix operators too.

For example, using prefix syntax, you would define a Prolog rule as:

    :-(Head, Body).

whereas with infix syntax, you can write the exact same term equivalently as:

    Head :- Body.

Similarly, you would have to write:

    #=(X,+(5,3))

instead of using the typical infix notation for arithmetic predicates and expressions:

    X #= 5+3

Note that since the respective abstract syntax trees are completely the same, Prolog remains at least as amenable to metaprogramming as Lisp even though Prolog supports more flexible syntax!
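
The "one tree, two notations" point can be sketched outside Prolog as well. The Python model below is a toy: the `Term` type, the operator table, and both writer functions are hypothetical, and it ignores operator precedence and quoting, which a real Prolog system handles.

```python
from collections import namedtuple

# A compound term: a functor name plus a tuple of argument terms.
# Atoms and variables are modelled as plain strings for simplicity.
Term = namedtuple("Term", ["functor", "args"])

# Hypothetical operator table: functors that may be written infix.
INFIX_OPERATORS = {";", ":-", "+", "#="}


def write_canonical(term):
    """Render a term in prefix form, e.g. ;(a,b)."""
    if isinstance(term, str):
        return term
    return "%s(%s)" % (term.functor, ",".join(write_canonical(a) for a in term.args))


def write_infix(term):
    """Render binary operator terms infix; everything else in prefix form."""
    if isinstance(term, str):
        return term
    if term.functor in INFIX_OPERATORS and len(term.args) == 2:
        left, right = (write_infix(a) for a in term.args)
        return "(%s %s %s)" % (left, term.functor, right)
    return "%s(%s)" % (term.functor, ",".join(write_infix(a) for a in term.args))


disjunction = Term(";", ("a", "b"))
print(write_canonical(disjunction))  # ;(a,b)
print(write_infix(disjunction))      # (a ; b)
```

Any code that transforms `Term` values works the same regardless of which notation the source was written in, since the notation is only a property of the reader and the writer.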

white-flame · on June 30, 2017

Yes, I do like the fact that Prolog infix operators are just optional sugar for standard prefix definitions.

In talking against separators, in "Prolog-derived" languages like Erlang (and other custom derivations I've dealt with), separators are used even between clauses. Most other languages use separators just for fields, and use straight terminators for statements. Prolog itself gets a pass because these separators are literally ANDs and ORs which flow as logical expressions through the backtracking search.

Even toplevel expressions defining different match parameters for the same head require differentiating separators from a final terminator for that head in Erlang, whereas Prolog deals with independently terminated head matches just fine. The Erlang code is much more removed from a logical AND/OR flow into more standard programming models, yet is still separator-based. Instead of thinking of the logical, partial evaluation possibilities of

  x AND y AND z

it's a more traditional

  statement 1,
  statement 2,
  statement 3.

with inconsistent characters at the end of each statement, when looking at it from a statement-centric point of view. Editing such code always requires a scan of the context of inter-statement relationship (especially when nesting) instead of just treating & terminating statements independently.
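
The editing cost can be made concrete with a small sketch. The two renderers below are hypothetical stand-ins for a separator-based body (Erlang-like) and a terminator-based body (C-like); the helper counts how many rendered lines are added or changed when one statement is appended.

```python
def render_separated(statements):
    """Erlang-like body: ',' between statements, '.' after the last one."""
    if not statements:
        return []
    lines = [stmt + "," for stmt in statements[:-1]]
    lines.append(statements[-1] + ".")
    return lines


def render_terminated(statements):
    """C-like body: every statement carries its own ';' terminator."""
    return [stmt + ";" for stmt in statements]


def lines_touched_by_append(render, statements, new_statement):
    """Count rendered lines that are added or changed by appending one statement."""
    before = render(statements)
    after = render(statements + [new_statement])
    changed = sum(1 for old, new in zip(before, after) if old != new)
    return changed + (len(after) - len(before))


body = ["statement 1", "statement 2"]
print(lines_touched_by_append(render_separated, body, "statement 3"))   # 2
print(lines_touched_by_append(render_terminated, body, "statement 3"))  # 1
```

In the separator style the previous last line has to change from "." to ",", so an append touches two lines; in the terminator style only the new line is touched.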

zmonx · on June 30, 2017

I agree, thank you for the clarification and additional examples!