Re: Lisp syntax, was A simpler way to tokenize and parse?

Spiros Bousbouras <spibou@gmail.com>
Sat, 25 Mar 2023 11:55:35 -0000 (UTC)

          From comp.compilers

Related articles
A simpler way to tokenize and parse? costello@mitre.org (Roger L Costello) (2023-03-24)
Re: Lisp syntax, was A simpler way to tokenize and parse? spibou@gmail.com (Spiros Bousbouras) (2023-03-25)
Re: Lisp syntax, was A simpler way to tokenize and parse? anton@mips.complang.tuwien.ac.at (2023-03-25)
Re: Lisp syntax, was A simpler way to tokenize and parse? gah4@u.washington.edu (gah4) (2023-03-25)
Re: Lisp syntax, was A simpler way to tokenize and parse? 864-117-4973@kylheku.com (Kaz Kylheku) (2023-03-26)

From: Spiros Bousbouras <spibou@gmail.com>
Newsgroups: comp.compilers
Date: Sat, 25 Mar 2023 11:55:35 -0000 (UTC)
Organization: Cyber23 news
References: 23-03-011
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="26179"; mail-complaints-to="abuse@iecc.com"
Keywords: Lisp, syntax
Posted-Date: 25 Mar 2023 10:48:42 EDT
In-Reply-To: 23-03-011

On Fri, 24 Mar 2023 14:45:40 +0000
Roger L Costello <costello@mitre.org> wrote:
> Hello Compiler Experts!
>
> I am reading the book, "Programming Languages, Application and Interpretation"
> by Shriram Krishnamurthi.
>
> The book says that Lisp and Scheme have a primitive called "read".
>
> The book says, "The read primitive is a crown jewel of Lisp and Scheme."
>
> Some of my notes from reading the book:
>
> - Read does tokenizing and reading.
> - Read returns a value known as an s-expression.
> - The s-expression is an intermediate representation.
> - The output of read is either a number or a list. That's it!


For educational examples perhaps it's only a number or a list. But for real
world usage it has to be more. In Common Lisp it can be a string or a symbol
(an identifier more or less) or a vector or a number of other Common Lisp
types. If the programmer defines his own structure types with defstruct then
the standard #S(...) syntax automatically lets read return such objects too.
Instances of classes defined with defclass get no read syntax by default.
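
To make the point concrete, here is a minimal sketch of a Lisp-style reader written in Python. It is not Common Lisp's actual reader algorithm; the names `Symbol`, `tokenize`, `read_from` and `read` are chosen for illustration only, and string escapes, quote characters and dispatch macros such as #S are left out. It does show that even a toy reader returns more than numbers and lists: symbols and strings come back as their own kinds of object.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Symbol:
    """Hypothetical stand-in for a Lisp symbol."""
    name: str


def tokenize(text):
    """Split text into parentheses, string literals, and bare atoms."""
    tokens, i = [], 0
    while i < len(text):
        ch = text[i]
        if ch.isspace():
            i += 1
        elif ch in "()":
            tokens.append(ch)
            i += 1
        elif ch == '"':
            j = text.index('"', i + 1)  # no escape handling in this sketch
            tokens.append(text[i:j + 1])
            i = j + 1
        else:
            j = i
            while j < len(text) and not text[j].isspace() and text[j] not in '()"':
                j += 1
            tokens.append(text[i:j])
            i = j
    return tokens


def read_from(tokens):
    """Consume one object from the front of the token list and return it."""
    if not tokens:
        raise SyntaxError("unexpected end of input")
    tok = tokens.pop(0)
    if tok == "(":
        items = []
        while tokens and tokens[0] != ")":
            items.append(read_from(tokens))
        if not tokens:
            raise SyntaxError("missing )")
        tokens.pop(0)  # discard the closing parenthesis
        return items
    if tok == ")":
        raise SyntaxError("unexpected )")
    if tok.startswith('"'):
        return tok[1:-1]  # string literal without its quotes
    if re.fullmatch(r"[+-]?\d+", tok):
        return int(tok)
    if re.fullmatch(r"[+-]?(\d+\.\d*|\.\d+)", tok):
        return float(tok)
    return Symbol(tok)  # anything else is treated as a symbol


def read(text):
    """Read the first object found in text."""
    return read_from(tokenize(text))
```

Under these assumptions, `read('(+ a (* b 2.5) "hi")')` yields a nested Python list holding `Symbol` objects, a float and a string. Tokenizing and tree building together fit in a few dozen lines, which is part of why there is so little to say about it.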


[...]


> I've read several compiler books and none of them talked about this. They talk
> about creating a lexer to generate a stream of tokens and a parser that
> receives the tokens and arranges them into a tree data structure. Why no
> mention of the "crown jewel" of tokenizing/parsing? Why no mention of "one of
> the great ideas of computer science"?
>
> I have done some work with Flex and Bison and recently I've done some work
> with building parsers using read. My experience is the latter is much easier.
> Why isn't read more widely discussed and used in the compiler community?


Probably because there really isn't much to say. It's straightforward to parse
so if it works for your needs then you don't need to read any compiler books
about it.


> Surely the concept that read embodies is not specific to Lisp and Scheme,
> right?


It is specific to languages with a very simple and uniform syntax, and
experience suggests that this isn't to most people's taste. Whether that is a
result of "mental wiring" or of tradition (including mathematical tradition) to
which one gets exposed from a young age, I don't know. What I mean by this is
that most people seem to find it easier to read
      a + b * c
as opposed to
      (+ a (* b c))


and I don't know if this is just the result of early exposure or an inherent
part of how most humans' brains function.


Another issue is that sometimes people have to turn mathematical notation into
computer programmes, and it is an extra step to transform
a + b * c to (+ a (* b c)) regardless of which one finds easier to read
in isolation.
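
The extra step can be mechanised. The following is a rough Python sketch using precedence climbing, one of several standard techniques for the job. The `PRECEDENCE` table, the assumption that every operator is left-associative, and all function names are illustrative choices, not something taken from any Lisp implementation.

```python
import re

# Assumed operator table: higher number binds tighter, all left-associative.
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}


def tokenize(text):
    """Identifiers, integers, operators, parentheses; other characters are skipped."""
    return re.findall(r"[A-Za-z_]\w*|\d+|[-+*/()]", text)


def parse_operand(tokens):
    """Parse a name, a number, or a parenthesized subexpression."""
    if not tokens:
        raise SyntaxError("unexpected end of input")
    tok = tokens.pop(0)
    if tok == "(":
        inner = parse_expression(tokens)
        if not tokens or tokens.pop(0) != ")":
            raise SyntaxError("missing )")
        return inner
    if tok in PRECEDENCE or tok == ")":
        raise SyntaxError(f"unexpected {tok!r}")
    return tok


def parse_expression(tokens, min_prec=1):
    """Precedence climbing: build [op, left, right] nodes."""
    left = parse_operand(tokens)
    while tokens and tokens[0] in PRECEDENCE and PRECEDENCE[tokens[0]] >= min_prec:
        op = tokens.pop(0)
        # The right operand may only contain operators that bind tighter than op,
        # which makes operators of equal precedence group to the left.
        right = parse_expression(tokens, PRECEDENCE[op] + 1)
        left = [op, left, right]
    return left


def to_sexp(tree):
    """Render the nested lists as a prefix (s-expression) string."""
    if isinstance(tree, list):
        return "(" + " ".join(to_sexp(t) for t in tree) + ")"
    return tree


def infix_to_prefix(text):
    tokens = tokenize(text)
    tree = parse_expression(tokens)
    if tokens:
        raise SyntaxError(f"unexpected {tokens[0]!r}")
    return to_sexp(tree)


print(infix_to_prefix("a + b * c"))  # (+ a (* b c))
```

The precedence table is where all the non-uniformity of the infix notation lives; the prefix form needs no such table because the grouping is already explicit.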




In mathematical logic the formal syntax also specifies a uniform and simple
notation, usually based on parentheses, but even there authors immediately
introduce conventions about operator precedence so that you don't have to
read (and they don't have to type) so many parentheses. So what in formal
syntax would be for example
((A ∧ B) → C) becomes A ∧ B → C where ∧ is specified to have higher
precedence than →.
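
The abbreviation convention can itself be stated as a small procedure. Below is a Python sketch that prints the same formula tree both ways. The tuple representation, the function names, and the choice to treat → as right-associative and ∧ as left-associative are assumptions made for the example; individual logic textbooks differ on the associativity conventions.

```python
# Assumed convention: ∧ binds tighter than →; → groups to the right.
PRECEDENCE = {"∧": 2, "→": 1}
RIGHT_ASSOC = {"→"}


def fully_parenthesized(formula):
    """Formal syntax: every binary connective gets its own pair of parentheses."""
    if isinstance(formula, str):
        return formula
    op, left, right = formula
    return f"({fully_parenthesized(left)} {op} {fully_parenthesized(right)})"


def abbreviated(formula, min_prec=0):
    """Drop every pair of parentheses that the precedence table makes redundant."""
    if isinstance(formula, str):
        return formula
    op, left, right = formula
    prec = PRECEDENCE[op]
    # A subformula needs parentheses when its connective binds more loosely
    # than its position allows.
    if op in RIGHT_ASSOC:
        left_min, right_min = prec + 1, prec
    else:
        left_min, right_min = prec, prec + 1
    text = f"{abbreviated(left, left_min)} {op} {abbreviated(right, right_min)}"
    return f"({text})" if prec < min_prec else text


formula = ("→", ("∧", "A", "B"), "C")
print(fully_parenthesized(formula))  # ((A ∧ B) → C)
print(abbreviated(formula))          # A ∧ B → C
```

Parentheses survive only where the convention would otherwise regroup the formula, for instance in A ∧ (B → C).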




I note that Forth also has a very simple and uniform syntax and again Forth
isn't very popular.


Moderator wrote:
> /Roger
> [Yes, it's specific to Lisp and Scheme. They have an extremely simple
> syntax called S expressions of nested parenthesized lists of space
> separated tokens with some quoting. The original plan was that Lisp 2
> would have M expressions that looked more like a normal language but
> it's over 50 years later and they still haven't gotten around to it.


Actually Common Lisp has gotten around to it. I have seen Common Lisp
libraries which create a more conventional syntax for Common Lisp and even
claim to retain the power of writing macros. But I've never paid much
attention because the usual Common Lisp syntax works fine for me. So I can't
provide links; perhaps someone on comp.lang.lisp will know. I don't think
that such libraries have seen much if any use. From what I recall, even
their authors did not claim that they prefer the different syntax; they
were simply hoping that with a more conventional syntactic wrapper Common
Lisp (or some Lisp) would become more popular, or perhaps they saw it as an
interesting intellectual exercise.


--
vlaho.ninja/prog

