Wed, 22 Jan 2020 17:12:22 GMT

Related articles |
---|

[9 earlier articles] |

Re: A minimal LL(1) parser generator ? gneuner2@comcast.net (George Neuner) (2020-01-02) |

Re: A minimal LL(1) parser generator ? rockbrentwood@gmail.com (2020-01-04) |

Re: A minimal LL(1) parser generator ? gaztoast@gmail.com (honey crisis) (2020-01-05) |

Re: A minimal LL(1) parser generator ? carlglassberg@gmail.com (2020-01-05) |

Re: A minimal LL(1) parser generator ? carlglassberg@gmail.com (2020-01-05) |

Re: A minimal LL(1) parser generator ? carlglassberg@gmail.com (2020-01-05) |

Re: A minimal LL(1) parser generator ? anton@mips.complang.tuwien.ac.at (2020-01-22) |

Re: A minimal LL(1) parser generator ? anton@mips.complang.tuwien.ac.at (2020-01-22) |

Re: A minimal LL(1) parser generator ? carlglassberg@gmail.com (2020-01-23) |

Re: A minimal LL(1) parser generator ? anton@mips.complang.tuwien.ac.at (2020-01-25) |

Re: A minimal LL(1) parser generator ? FredJScipione@alum.RPI.edu (Fred J. Scipione) (2020-01-25) |

From: | anton@mips.complang.tuwien.ac.at (Anton Ertl) |

Newsgroups: | comp.compilers |

Date: | Wed, 22 Jan 2020 17:12:22 GMT |

Organization: | Institut fuer Computersprachen, Technische Universitaet Wien |

References: | 19-12-016 19-12-030 19-12-032 19-12-040 20-01-001 20-01-003 20-01-008 |

Injection-Info: | gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="27414"; mail-complaints-to="abuse@iecc.com" |

Keywords: | parse, syntax |

Posted-Date: | 22 Jan 2020 16:10:13 EST |

carlglassberg@gmail.com writes:

[Applied the corrections from later postings]

*>On Thursday, January 2, 2020 at 10:21:53 AM UTC-8, Anton Ertl wrote:*

*>...*

*>> (( a b && c d && ))*

*>*

*>1. For concatenation of these, I would write, in Gray:*

*> ((a b &&))((c d &&)) for EBNF: a { b a } c { d c }*

*>because wouldn't*

*> ((a b && c d &&))*

*>be the same as?*

*> a { b a } c { d a { b a } c }*

In Gray, you cannot concatenate x and y by writing them adjacent to

one another, so

a b && c

are just two unrelated grammar expressions: "a b &&" and "c". If you continue that with

a b && c d

you now have three unrelated grammar expressions. Now apply the &&

operator:

a b && c d &&

and you have two unrelated grammar expressions: "a b &&" and "c d &&".

In order to combine them into a concatenatio, you normally surround

them with (( ... )):

(( a b && c d && ))

(or alternatively, you use the binary postfix operator "concat"). Likewise,

(( a b && )) (( c d && ))

would be two unrelated grammar expressions.

Also, note that each word (including grammar operators) has to be

separated from adjacent words with white space.

*>2. Let us compare separator-list notation.*

*>*

*>In Waite and Goos, "Compiler Construction", there is an infix operator, "||", for separator list, not to be confused with Gray's meta-symbol for "or".*

*>And Eiffel has a special notation for separator list.*

*>*

*>In comparison, Anton's Gray has a postfix operator, "&&", for separator-list.*

*>*

*>/Comparison of separator-list notation in*

*>Waite/Goos EBNF, Eiffel EBNF (BNF-E), and Gray EBNF/*

*>*

*> Wirth EBNF Gray EBNF:*

*>-------------------------------------------------------------------*

*>Waite/Goos: x || y ==========> x { y x } x y &&*

*>*

*>In Eiffel: { x y ...}* ===> [ x { y x } ] (( x y && )) ??*

*> { x y ...}+ ===> x { y x } x y &&*

*>-------------------------------------------------------------------*

(( x y && )) ??

is equivalent to

x y && ??

*>I do wish Gray had a production rule terminator, say semicolon ";" or stop "."*

*>*

*>The end of 1 rule would be clearly distinguishable from the beginning of the next without special processing of end-of-line which makes scanning and*

*>parsing space sensitive. We would avoid the need for LL(2) parsing or other*

*>parsing and scanning techniques.*

Gray just uses Forth's parser (which just scans for white space, i.e.,

as far as the parser is concerned, it's a regular, not a context-free

language). The rest works by putting grammar expressions on the

stack, and having grammar operators that take grammar expressions from

the stack and put the resulting grammar expression on the stack;

that's why the operators are postfix. The

(( ... || ... || ... ))

syntax is syntactic sugar for a more readable syntax than using the

binary postfix operators "concat" and "alt". If && had been part of

the original concept instead of an afterthought that demonstrates the

extensibility, maybe I would have had syntactic sugar for that, too.

Even as a Forth programmer, I admit that binary postfix operators are

not very readable when expressions with three or more binary operators

are involved (which is rare in most programming, but pretty common in

grammars).

Back to the question of production rule terminators: As a result of

the above, a rule typically reads

(( ... )) <- nonterminal

so it's pretty clear to the system and the programmer where a rule

starts and where it ends. However, there is little error checking

involved, and if you write

(( ... (( ... )) <- nonterminal

(i.e., if you forget to write all the necessary closing "))"), you

will get stuff left over on the stack and the nonterminal will not do

what you want. While the compiler does not flag that as error, this

error is easy to notice and find in testing on typical Forth systems

(and it's something you could also get with other Forth code).

- anton

--

M. Anton Ertl

anton@mips.complang.tuwien.ac.at

http://www.complang.tuwien.ac.at/anton/

Post a followup to this message

Return to the
comp.compilers page.

Search the
comp.compilers archives again.