You know, I think it's a fine argument to say, "Yes, Lisp has a lot of parentheses but we like how it looks." I don't think it makes sense to say, "No, other languages have more punctuation characters than Lisp." The latter is an objective statement that can actually be tested.
I went to Rosetta Code and grabbed the implementations of Conway's Game of Life in Racket, Ruby, Python, and Java. For each one, I manually removed comments and string literals. Then I wrote a little script. It looks at each character in the file and groups them into categories:
Then I looked at each file to see what fraction of non-whitespace characters are brackets or punctuation (including brackets). The Racket implementation has roughly twice as many bracketing characters as the other languages, and more of all kinds of punctuation than any of them. It may be that the languages and example programs I chose aren't representative, but I'd be surprised if another corpus were significantly different.
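For anyone who wants to repeat the experiment, a minimal sketch of that kind of tally might look like the following. It is not the original script: the category definitions (ASCII punctuation minus underscore, with brackets counted as a subset of punctuation) and the names `tally` and `fractions` are assumptions made for illustration.

```python
import string
from collections import Counter

BRACKETS = set("()[]{}")
# Assumption: "punctuation" means ASCII punctuation, brackets included,
# with "_" excluded because it is an identifier character almost everywhere.
PUNCTUATION = set(string.punctuation) - {"_"}


def tally(source: str) -> Counter:
    """Count brackets, punctuation, and total non-whitespace characters."""
    counts = Counter()
    for ch in source:
        if ch.isspace():
            continue
        counts["total"] += 1
        if ch in BRACKETS:
            counts["brackets"] += 1
        if ch in PUNCTUATION:
            counts["punctuation"] += 1
    return counts


def fractions(source: str) -> dict:
    """Return bracket and punctuation shares of non-whitespace characters."""
    counts = tally(source)
    total = counts["total"] or 1  # avoid dividing by zero on empty input
    return {
        "brackets": counts["brackets"] / total,
        "punctuation": counts["punctuation"] / total,
    }


print(fractions("(f a b)"))
print(fractions("f(a, b);"))
```

Comments and string literals would still have to be stripped beforehand, as described above.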
Most languages have multiple levels of precedence. The entire point of that is to allow eliding explicit bracketing characters for commonly nested subexpressions. You may dislike that feature, which is reasonable. But I don't think you can argue that it doesn't exist and doesn't accomplish what it intends to do. Consider:
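As a rough illustration of the precedence point, here is a sketch that parses an infix arithmetic expression with Python's `ast` module and prints the fully bracketed prefix form the same tree would need. The sample expression and the helper name `to_prefix` are made up for this example, and only the four basic operators are handled.

```python
import ast

# Map the AST operator node types we handle to their prefix spelling.
OPS = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*", ast.Div: "/"}


def to_prefix(node: ast.AST) -> str:
    """Render a parsed arithmetic expression as a fully bracketed prefix form."""
    if isinstance(node, ast.Expression):
        return to_prefix(node.body)
    if isinstance(node, ast.BinOp):
        op = OPS[type(node.op)]
        return f"({op} {to_prefix(node.left)} {to_prefix(node.right)})"
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Constant):
        return repr(node.value)
    raise ValueError(f"unsupported node: {type(node).__name__}")


infix = "a + b * c - d / e"
prefix = to_prefix(ast.parse(infix, mode="eval"))

print(prefix)                             # (- (+ a (* b c)) (/ d e))
print(sum(ch in "()" for ch in infix))    # 0
print(sum(ch in "()" for ch in prefix))   # 8
```

The infix spelling needs no brackets at all because precedence supplies the grouping; the prefix spelling of the same tree needs eight.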
I think most of the non-bracket "punctuation" in Racket is caused by the kebab-case identifiers and to a lesser degree by ? and ! in function names.
So in fact "Other languages have more punctuation characters than Lisp" is not an objective statement at all, since it relies on the definition of punctuation, which is subjective or at least language-specific.
Good point. I updated my little script to (roughly) take each language's identifier rules into account. Now "?" is considered an identifier character in Ruby, and "-", "!", and "?" are identifier characters in Racket. It gets a little fuzzy, of course, because ">" works like punctuation in "(> 1 2)" but less so in "(number->string 123)".
This gets Racket down to similar levels of punctuation, but note that, of course, brackets are still significantly more common than in the other languages.
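One way to handle the fuzzy ">" case, sketched under assumptions: strip identifier tokens first, using a per-language pattern, and count only the punctuation that is left. The patterns below are hypothetical simplifications, not real lexer rules and not necessarily what the updated script did. The Racket pattern treats a run of symbol characters as an identifier only if it contains at least one letter, so "number->string" disappears but a bare ">" stays.

```python
import re
import string

# Hypothetical, simplified identifier patterns; real lexers are more involved.
RACKET_CHARS = r"[A-Za-z0-9_\-!?<>=*/+]"
IDENT = {
    "ruby": re.compile(r"[A-Za-z_][A-Za-z0-9_]*[?!]?"),
    "racket": re.compile(RACKET_CHARS + r"*[A-Za-z]" + RACKET_CHARS + r"*"),
    "default": re.compile(r"[A-Za-z_][A-Za-z0-9_]*"),
}


def punctuation_count(source: str, language: str) -> int:
    """Count punctuation characters that are not part of an identifier."""
    pattern = IDENT.get(language, IDENT["default"])
    without_identifiers = pattern.sub("", source)
    return sum(ch in string.punctuation for ch in without_identifiers)


print(punctuation_count("(number->string 123)", "racket"))  # 2
print(punctuation_count("(> 1 2)", "racket"))               # 3
print(punctuation_count("x.empty?", "ruby"))                # 1
```

This still misclassifies some things (a Ruby ternary written without spaces, for instance), which is the fuzziness mentioned above.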
Personally, I would remove =+-*/ from punctuation. They're generally meaningful, common, and fundamental operators, not just groupers/separators/terminators. And as mentioned, in Lisp, symbols commonly contain dashes (and less commonly other punctuation), including standard-function-names, which would skew the results. I'm not super familiar with Racket, but presumably it retains this syntax style.
But generally, Lisps aren't intended to be compact in the small. They're intended to be semantically regular, which can be a definition of "simpler", and to give smaller code in larger systems. Source code compression for syntactically-atomic operations is where all the other varied punctuation comes from, and that addresses snippets of code, not large abstraction gains. So yes, there are cases where Lisps will have more parens than other languages have total punctuation. Even in such cases, the other languages are more complex to read, both for humans and for machines.
Note that I didn't only mention quantity, but also more complex interspersal. Most other languages have commas for field separators, while Lisps tend only to use naturally-occurring whitespace. For instance, f(a,b,); is commonly an error, while (f a b ) is valid Lisp syntax, as a low baseline example. Prolog-derived languages go even more nuts with ; and . than C-style terminators. And the brackets vs braces vs parens all nest together. The }]);}]); style closing mess is very common in JavaScript, where objects and function objects nest heavily due to its callback-centric style.
Other languages also have a mix of infix, postfix, and prefix ordering:
fun5((fun1() op3 fun2()).fun4()).fun6();
The function calls & operators are numbered by their final execution ordering (assuming op3 isn't short-circuiting). This is just messy to read and edit, again both for humans and machines.
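The numbering can be checked with a small sketch. Here Python's `+` stands in for op3 (it is not short-circuiting), and every function simply records its name when it runs. The `Node` class and the `calls` log are scaffolding invented for this example.

```python
calls = []  # records the order in which things actually execute


class Node:
    def __add__(self, other):  # stands in for the infix operator op3
        calls.append("op3")
        return Node()

    def fun4(self):  # postfix-style method call
        calls.append("fun4")
        return Node()

    def fun6(self):  # postfix-style method call
        calls.append("fun6")
        return Node()


def fun1():
    calls.append("fun1")
    return Node()


def fun2():
    calls.append("fun2")
    return Node()


def fun5(arg):  # prefix-style function call
    calls.append("fun5")
    return Node()


fun5((fun1() + fun2()).fun4()).fun6()
print(calls)  # ['fun1', 'fun2', 'op3', 'fun4', 'fun5', 'fun6']
```

The leftmost name in the source, fun5, runs fifth, which is the reading-order problem being described.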
And the biggest reason why things are so much more regular in Lisp is metaprogramming, code generation, and code transformation. This is such a pain in the butt in other languages' direct source code representations, unless separate ASTs are involved, and ASTs aren't editable as plain source code for template-style use. Separate libraries could tackle this, but they would have to be maintained in lockstep with the language itself, and manually replicate every subtle nuance of the language to avoid bugs (including replicating reference implementation bugs).
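To give a feel for why a regular representation helps, here is a toy sketch that models Lisp forms as nested Python lists and rewrites every `unless` form into `if`/`not`. It is only an analogy: the `expand` function and the sample program are invented for illustration, and a real macro system does much more (hygiene, for one).

```python
def expand(form):
    """Recursively rewrite (unless c body...) into (if (not c) (begin body...))."""
    if not isinstance(form, list):
        return form  # atoms (symbols, numbers) pass through unchanged
    form = [expand(sub) for sub in form]  # expand nested forms first
    if form and form[0] == "unless":
        _, condition, *body = form
        return ["if", ["not", condition], ["begin", *body]]
    return form


program = ["define", ["f", "x"],
           ["unless", ["zero?", "x"], ["display", "x"]]]

print(expand(program))
```

Because code and data share one shape, the transformation is a plain tree walk with no parser or separate AST library involved.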
So this whole thing is about technical complexity of representation, not just personal preference.
> Prolog-derived languages go even more nuts with ; and . than C-style terminators.
What do you mean by this? In Prolog, (;)/2 is simply a binary infix operator, so that you can write a term of the form:
;(A, B)
i.e., a binary term whose primary functor is a semicolon, equivalently as:
( A ; B )
where the semicolon now occurs as an infix operator instead of at its canonical position at the beginning of the term.
In Prolog, you can always use the predicate write_canonical/1 to obtain the canonical representation of any term, which is completely regular. In the case above, we get:
?- write_canonical( (a ; b) ).
;(a,b)
This confirms that (a;b) and ;(a,b) denote the exact same Prolog term: the two spellings cannot be distinguished at the AST level.
We also have several other ways to inspect the term, such as functor/3:
Thus, '.' is used to mark the end of clauses, since each clause is a valid Prolog term.
If you want, you can always use the canonical representation of Prolog terms in your code. However, this would make writing and reading Prolog at least as inconvenient as Lisp, because you then have to write everything in prefix notation instead of being able to benefit from infix operators too.
For example, using prefix syntax, you would define a Prolog rule as:
:-(Head, Body).
whereas with infix syntax, you can write the exact same term equivalently as:
Head :- Body.
Similarly, you would have to write:
#=(X,+(5,3))
instead of using the typical infix notation for arithmetic predicates and expressions:
X #= 5+3
Note that since the respective abstract syntax trees are completely the same, Prolog remains at least as amenable to metaprogramming as Lisp even though Prolog supports more flexible syntax!
Yes, I do like the fact that Prolog infix operators are just optional sugar for standard prefix definitions.
When I argued against separators, I had in mind "Prolog-derived" languages like Erlang (and other custom derivations I've dealt with), where separators are used even between clauses. Most other languages use separators just for fields, and use straight terminators for statements. Prolog itself gets a pass because these separators are literally ANDs and ORs which flow as logical expressions through the backtracking search.
Even top-level expressions that define different match parameters for the same head require distinguishing separators from a final terminator for that head in Erlang, whereas Prolog deals with independently terminated head matches just fine. Erlang code has moved much further from a logical AND/OR flow toward more standard programming models, yet it is still separator-based. Instead of thinking of the logical, partial evaluation possibilities of
x AND y AND z
it's a more traditional
statement 1,
statement 2,
statement 3.
with inconsistent characters at the end of each statement, when looking at it from a statement-centric point of view. Editing such code always requires a scan of the context of inter-statement relationship (especially when nesting) instead of just treating & terminating statements independently.
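The editing cost can be made concrete with a small sketch. It renders the same statement list two ways, separator-style (comma between statements, period after the last) and terminator-style (a semicolon on every statement), then counts how many lines change when one statement is appended. The function names and the sample statements are invented for this example.

```python
def render_separated(statements):
    """Erlang-style body: ',' between statements, '.' after the last one."""
    last = len(statements) - 1
    return [s + ("." if i == last else ",") for i, s in enumerate(statements)]


def render_terminated(statements):
    """C-style body: every statement carries its own ';'."""
    return [s + ";" for s in statements]


def lines_touched(before, after):
    """Count lines that were changed or added going from before to after."""
    changed = sum(a != b for a, b in zip(before, after))
    return changed + abs(len(after) - len(before))


old = ["A = f()", "B = g(A)"]
new = old + ["h(B)"]

print(lines_touched(render_separated(old), render_separated(new)))    # 2
print(lines_touched(render_terminated(old), render_terminated(new)))  # 1
```

With separators, appending a statement also rewrites the previous last line, because its "." has to become ",".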