Is there a better flex ?

thuerman@ibr.cs.tu-bs.de (Urs Thuermann)
Mon, 8 Jul 91 16:25:26 +0200

          From comp.compilers

Related articles
Is there a better flex ? thuerman@ibr.cs.tu-bs.de (1991-07-08)
Re: Is there a better flex ? vern@horse.ee.lbl.gov (1991-07-26)
| List of all articles for this month |

Newsgroups: comp.compilers
From: thuerman@ibr.cs.tu-bs.de (Urs Thuermann)
Keywords: flex, question
Organization: Compilers Central
Date: Mon, 8 Jul 91 16:25:26 +0200

I have been working with the scanner generator 'flex' (version 04b of
30sep87) for some weeks and examined some of the automatons flex
constructed. I found out that flex, after converting the nfa to a dfa,
doesn't reduce the dfa, i.e. the dfa which flex outputs has equivalent
states which could be replaced by one state. For example, given the
regular expression for a C comment


                "/*"\/*([^*/]\/*|\*)*"*/"


flex generates an automaton with 8 states while the reduced automaton
would have only 5 states.
Another optimization I would like flex to do is generating ONE
accepting state for several rules with the same action. For the input


                digit [0-9]
                hex [0-9a-fA-F]
                oct [0-7]
                exp [eE][+-]?{digit}+


                %%


                [1-9]{digit}*[lL]? |
                0[xX]{hex}*[lL]? |
                0{oct}*[lL]? return (INT_CONSTANT);


                {digit}+\.{digit}*{exp}? |
                {digit}*\.{digit}+{exp}? |
                {digit}+{exp} return (FLOAT_CONSTANT);


flex could make an accepting state for the first three rules and one
for the last three rules instead of six accepting states. Combined
with elimination of equivalent states this would reduce the number of
states from 29 to 15 for the above example of int and float constants.
Maybe I also found a bug in flex: If I write


                [!%&()*+,-./:;<>?[\]^{}|~] return (yytext[0]);


one equivalence class for these 23 characters will be built, but if I
write


                !|%|&|\(|\)|\*|\+|,|-|\.|\/|:|;|\<|\>|\?|\[|\]|^|\{|\||\}|~
return (yytext[0]);


there is an equivalence class for each character.


Does anybody know of a version of flex which performs the optimizations
mentioned above and if so, where can I get it.


Please reply to thuerman@ibr.cs.tu-bs.de
[Vern Paxton, who wrote flex, often reads compilers. Perhaps he can say
why he did things the way he did. It is my recollection that there were
several cases where putative optimizations did not in fact make things
significantly faster or smaller. -John]
--


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.