Re: A question about lexer portability in C ?

Henry Spencer <>
28 Sep 1997 23:17:03 -0400

          From comp.compilers

Related articles
A question about lexer portability in C ? (Frederic Guerin) (1997-09-23)
Re: A question about lexer portability in C ? (1997-09-24)
Re: A question about lexer portability in C ? (Henry Spencer) (1997-09-28)

From: Henry Spencer <>
Newsgroups: comp.compilers
Date: 28 Sep 1997 23:17:03 -0400
Organization: SP Systems, Toronto
References: 97-09-090
Keywords: lex, i18n

Frederic Guerin <> wrote:
>The question is : Can I fix this table at compile time or do I need to
>build it at run time so as to make sure that the correct codes will be
>assigned to the correct characters ?

In general, you must build it at run time. Different users, even on a
single system, may be using different character sets, with different
ideas about what constitutes (say) an alphabetic character. Except in
unusually favorable environments, there's just no way to pre-build a
single copy of the code and have it always get things right.
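
As a minimal sketch of the run-time approach (not anyone's actual lexer), the table can be filled from the <ctype.h> predicates once the locale is known. The names char_class, build_char_class and the CC_* codes are hypothetical, and the sketch assumes a single-byte character set:

```c
#include <ctype.h>
#include <limits.h>

/* Hypothetical class codes for a hand-written lexer. */
enum { CC_OTHER, CC_ALPHA, CC_DIGIT, CC_SPACE };

/* One entry per possible unsigned char value. */
static unsigned char char_class[UCHAR_MAX + 1];

/* Fill the table from the <ctype.h> predicates, which consult the
 * current locale, instead of hard-coding character codes. */
void build_char_class(void)
{
    int c;

    for (c = 0; c <= UCHAR_MAX; c++) {
        if (isalpha(c) || c == '_')
            char_class[c] = CC_ALPHA;
        else if (isdigit(c))
            char_class[c] = CC_DIGIT;
        else if (isspace(c))
            char_class[c] = CC_SPACE;
        else
            char_class[c] = CC_OTHER;
    }
}
```

A program would typically call setlocale(LC_CTYPE, "") from <locale.h> at startup, so that the user's environment is honored, and then call build_char_class() before lexing anything. Without the setlocale call the table reflects the default "C" locale only.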

>...May I assume that all character sets used
>over the world are superset of the ANSI one ( with identical character
>code ) ?

Unfortunately, no. First, as our moderator mentioned, there is still
substantial use of totally non-ASCII character sets like EBCDIC. Second,
there is still substantial use of other ISO 646-derived character sets
which resemble ASCII but are not supersets of it -- for example, some of
them have extra alphabetic characters where ASCII puts characters like "`"
and "[" and "|". Third, even when character sets are exact supersets of
ASCII, that doesn't mean you can just ignore the non-ASCII part, because
non-English users in particular will want to put non-ASCII alphabetics
into identifiers etc.
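
One way to avoid baking in ASCII assumptions, sketched here with hypothetical helper names, is to write character constants rather than numeric codes and to ask isalpha() instead of testing a range like 'a'..'z' (which is not contiguous in EBCDIC). This sketch assumes a single-byte character set; multibyte encodings would need the wide-character functions instead:

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: may c begin an identifier?  isalpha() follows
 * the current locale, so non-ASCII alphabetics are accepted when the
 * locale defines them. */
int is_ident_start(int c)
{
    return c == '_' || isalpha((unsigned char)c);
}

/* Hypothetical helper: may c continue an identifier? */
int is_ident_part(int c)
{
    return c == '_' || isalnum((unsigned char)c);
}

/* The C standard guarantees that '0'..'9' are consecutive, so this
 * subtraction is portable even though letter ranges are not. */
int digit_value(int c)
{
    return c - '0';
}

/* Length of the identifier at the start of s, or 0 if there is none. */
size_t ident_length(const char *s)
{
    size_t n = 0;

    if (!is_ident_start((unsigned char)s[0]))
        return 0;
    do {
        n++;
    } while (is_ident_part((unsigned char)s[n]));
    return n;
}
```

For instance, ident_length("foo_1+bar") stops at the '+' and yields 5, while ident_length("9abc") yields 0 because a digit cannot start an identifier.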

Henry Spencer
