Re: is lex useful? (Scott Nicol)
27 Jun 1996 11:34:47 -0400

          From comp.compilers


From: (Scott Nicol)
Newsgroups: comp.compilers
Date: 27 Jun 1996 11:34:47 -0400
Organization: Information Advantage
References: 96-06-073 96-06-105 96-06-112
Keywords: lex, i18n

>] - No support for wide (>8 bit) character sets. Even 8-bit support is
>] fairly recent. The obvious implementation for wide characters (expand
>] tables to 16 bits) isn't practical, because you would increase the table
>] sizes (which are already huge) 256x.
>The other obvious option is to treat a 16bit char as two 8bit chars.
>It might be less readable, but it works great.

This won't work if you want to use POSIX-style character classes, e.g.

[[:alpha:]]+ return(IDENTIFIER);

Getting Lex to do this would be very messy if the language you wanted to
support were, say, Japanese.
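
To make the problem concrete, here is a minimal sketch of what a
byte-oriented scanner effectively does with [[:alpha:]]+ once each 16-bit
character has been split into two 8-bit bytes. The function names are
hypothetical, and the high-byte-first 16-bit Unicode encoding is my own
assumption (the discussion only says "16bit char"); this is not how any
particular Lex is implemented.

```c
#include <ctype.h>
#include <stddef.h>

/* Emulate a byte-level [[:alpha:]]+ rule: count the leading bytes
 * of buf for which isalpha() holds. */
size_t alpha_run_bytes(const unsigned char *buf, size_t len)
{
    size_t n = 0;
    while (n < len && isalpha(buf[n]))
        n++;
    return n;
}

/* Split one 16-bit character into two bytes (high byte first, an
 * assumption made for illustration) and report how many of those
 * bytes the byte-level rule would accept. */
size_t alpha_run_for_char16(unsigned short ch)
{
    unsigned char bytes[2];
    bytes[0] = (unsigned char)(ch >> 8);
    bytes[1] = (unsigned char)(ch & 0xFF);
    return alpha_run_bytes(bytes, 2);
}
```

In the default "C" locale, alpha_run_for_char16(0x3042) returns 0: the
hiragana letter U+3042 splits into the bytes 0x30 ('0') and 0x42 ('B'), so
the rule rejects a character that is a letter. With 0x4130 it returns 1,
meaning the match would stop in the middle of a single character. The
byte-level class says nothing useful about the 16-bit character itself.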

Also, what is isalpha() in one locale is not necessarily isalpha() in
another. If you change your language (e.g. via the LANG, LC_CTYPE,
LC_COLLATE, or LC_ALL environment variables on *nix), a Lex-generated
scanner won't notice, because the REs are compiled into fixed tables when
the scanner is generated. At best you can make Lex work for one locale.
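
The locale dependence is easy to observe from C. The sketch below counts
how many of the byte values 0..255 isalpha() accepts under a given locale.
The helper name count_alpha_bytes is hypothetical, and which locale names
are installed varies from system to system.

```c
#include <ctype.h>
#include <locale.h>

/* Count the byte values 0..255 that isalpha() accepts under the
 * named locale. Returns -1 if setlocale() rejects the name.
 * Note: this changes the process-wide LC_CTYPE setting. */
int count_alpha_bytes(const char *locale_name)
{
    int c, count = 0;

    if (setlocale(LC_CTYPE, locale_name) == NULL)
        return -1;
    for (c = 0; c < 256; c++)
        if (isalpha(c))
            count++;
    return count;
}
```

count_alpha_bytes("C") gives 52, the unaccented upper- and lower-case
letters. Under an 8-bit locale such as an ISO 8859-1 one (the exact name is
system-dependent), the count would typically be larger because accented
letters qualify too. A scanner table built from one of these answers cannot
follow the other at run time.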

Scott Nicol
Information Advantage, Inc.
