|Can Coco/R do multiple tokenizations email@example.com (2005-08-13)|
|Re: Can Coco/R do multiple tokenizations firstname.lastname@example.org (George Neuner) (2005-08-16)|
|Re: Can Coco/R do multiple tokenizations DrDiettrich@compuserve.de (Hans-Peter Diettrich) (2005-08-16)|
|Re: Can Coco/R do multiple tokenizations email@example.com (Gene Wirchenko) (2005-08-16)|
|Re: Can Coco/R do multiple tokenizations cfc@shell01.TheWorld.com (Chris F Clark) (2005-08-21)|
|Re: Can Coco/R do multiple tokenizations firstname.lastname@example.org (Darius Blasband) (2005-08-21)|
|From:||Hans-Peter Diettrich <DrDiettrich@compuserve.de>|
|Date:||16 Aug 2005 11:17:35 -0400|
|Posted-Date:||16 Aug 2005 11:17:35 EDT|
> Consider a language which allows ! and = in its identifiers.
> Of course, the usual C operators such as !, = etc. are also allowed.
Do you realize that your grammar is ambiguous at the *lexer* level?
> Consider this string (note: no whitespace):
> a!=b
> Valid tokenization/parsing can yield several possibilities:
> 1. 'a!=b' .. a single token.
> 2. 'a!' '=' 'b' .. an assignment.
> 3. 'a' '!=' 'b' .. a comparison.
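The lexer-level ambiguity can be made visible by enumerating every way the string splits into valid tokens. The following is only a sketch: the identifier pattern and the operator set are assumptions inferred from the question, and `tokenizations` is a hypothetical helper, not part of any generator mentioned here.

```python
import re

# Assumed token set: identifiers that may contain '!' and '=' after the
# first character, plus the operators '!=', '!' and '='.
TOKEN = re.compile(r"[A-Za-z_][A-Za-z0-9_!=]*|!=|!|=")

def tokenizations(text):
    """Return every split of `text` into a sequence of valid tokens."""
    if not text:
        return [[]]  # exactly one way to tokenize the empty string
    results = []
    for end in range(1, len(text) + 1):
        head = text[:end]
        if TOKEN.fullmatch(head):
            # Combine this first token with every tokenization of the rest.
            for rest in tokenizations(text[end:]):
                results.append([head] + rest)
    return results

for candidate in tokenizations("a!=b"):
    print(candidate)
```

Under these assumptions the string has five purely lexical splits. The three listed in the question are among them; the other two are 'a' '!' '=' 'b' and 'a!=' 'b'.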
Most lexers will return the longest match, i.e. (1).
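A minimal sketch of the longest-match rule, assuming the same token classes as in the question. The names `TOKEN_SPECS` and `longest_match_tokens` are made up for this illustration and do not reflect how Coco/R or any other generator is implemented.

```python
import re

# Assumed token classes: identifiers may contain '!' and '=', and the
# C operators '!=', '!' and '=' exist as well.
TOKEN_SPECS = [
    ("IDENT", re.compile(r"[A-Za-z_][A-Za-z0-9_!=]*")),
    ("NEQ", re.compile(r"!=")),
    ("NOT", re.compile(r"!")),
    ("ASSIGN", re.compile(r"=")),
]

def longest_match_tokens(text):
    """Tokenize `text`, always taking the longest match at each position."""
    tokens = []
    pos = 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        best = None  # (kind, end offset) of the longest match so far
        for kind, pattern in TOKEN_SPECS:
            m = pattern.match(text, pos)
            if m and (best is None or m.end() > best[1]):
                best = (kind, m.end())
        if best is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        kind, end = best
        tokens.append((kind, text[pos:end]))
        pos = end
    return tokens

print(longest_match_tokens("a!=b"))    # [('IDENT', 'a!=b')]
print(longest_match_tokens("a != b"))  # [('IDENT', 'a'), ('NEQ', '!='), ('IDENT', 'b')]
```

Without whitespace the identifier pattern swallows the whole string, which is reading (1). With whitespace the same rule yields the comparison.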
> The acceptance of a particular choice is influenced by
> 1. whether a token has already been defined (if 'a' and 'b' have been
> defined, then (3) gets priority)
> 2. which tokens follow or precede this string.
> .. if preceded by 'z =', then (3) can be omitted,
> .. if followed by '= z', then (1) is more probable.
> In case of ambiguity I'd ideally like to generate an error and abort.
Preceding tokens can be used to instruct a lexer how to scan the
following input. Other conditions, in particular those that depend on
following tokens, require very specialized (scannerless) parser generators.
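As a rough illustration of a lexer steered by preceding tokens, the sketch below switches the identifier pattern depending on what it has already seen. The switching rule is purely hypothetical and not taken from the language in question: after an `if` token, identifiers are scanned without '!' and '=' until the next ';'.

```python
import re

WIDE_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_!=]*")  # '!' and '=' allowed
PLAIN_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")   # C-style identifier
OPERATOR = re.compile(r"!=|!|=|;")

def scan(text):
    """Tokenize `text`, letting earlier tokens choose the identifier rule."""
    tokens = []
    pos = 0
    plain_mode = False  # state set by tokens that were already scanned
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        ident_re = PLAIN_IDENT if plain_mode else WIDE_IDENT
        m = ident_re.match(text, pos)
        kind = "IDENT"
        if m is None:
            m = OPERATOR.match(text, pos)
            kind = "OP"
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        lexeme = m.group()
        tokens.append((kind, lexeme))
        pos = m.end()
        # Hypothetical mode switch, driven only by preceding tokens.
        if lexeme == "if":
            plain_mode = True
        elif lexeme == ";":
            plain_mode = False
    return tokens

print(scan("if a!=b; a!=b"))
```

With this rule the first `a!=b` is scanned as 'a' '!=' 'b' and the second as the single identifier 'a!=b'. A decision that depends on tokens that follow the string cannot be expressed this way, since the lexer has not seen them yet.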
> Now, can Coco/R (or any other parser/lexer generator) do
> multiple tokenizations and parse-tree generations, so that I can
> give a priority to each of these three, and have it call my function
> to accept one over the others?
Coco/R cannot do that, unless you transform it into something very
different. MetaS most probably can handle your language, provided you
supply the corresponding grammar.
All in all I think that your language is b*llshit. The user cannot know
how his input is parsed, even if no error is reported - is this really
what you and your users want?
I'd suggest that your language should *require* whitespace between
identifiers and (at least) the ambiguous operators.