Lex and Yacc successor tools

"Margaret Crowe" <Malcolm_Crowe@msn.com>
Sun, 3 Sep 1995 09:31:09 GMT

          From comp.compilers


Newsgroups: comp.compilers
From: "Margaret Crowe" <Malcolm_Crowe@msn.com>
Keywords: lex, yacc, books
Organization: Compilers Central
Date: Sun, 3 Sep 1995 09:31:09 GMT

Compiler Writing Tools Using C++


Malcolm Crowe, August 1995


mkc@paisley.ac.uk
100127.116@compuserve.com


I have completed implementation of tools called LexerGenerator and
ParserGenerator in the tradition of lex and yacc. I would like to
hear of interest in this software. I am writing a book describing the
software and algorithms (20,000 words so far).


Please let me know if you are interested and how the results should
be published. I am happy to send copies "as is" free of charge to
anyone interested. The zip file is 144K in size including the book in
Word 6 format.


This book presents compiler writing tools in the tradition of lex and
yacc, but taking full advantage of the facilities offered by C++ and
the 32-bit Microsoft Foundation Classes Library (MFC). The tools are
themselves written using the same object-oriented techniques and are
provided in source form to assist an understanding of the standard
algorithms used.


Lex and yacc have been with us for a long time, and continue to hold
their appeal despite the arrival of other approaches and toolkits for
compiler writing. It seems now to be time for a complete redesign of
these tools, rather than an attempt to supplant them. The objectives
of such a redesign should be that they and the compilers that result
from their use can again be seen as examples of good programming
practice. This is the aim of this little book. The tools are renamed
LexerGenerator and ParserGenerator to avoid confusion with their
predecessors. Their implementation is presented here for the 32-bit
versions of the Windows operating system (Windows 95, Windows NT, or
later). The tools are enabled for integration in Microsoft's Visual
Tools workbench.


The approach taken to the compiler writing tools is to leave untouched
the core notations used by lex and yacc: regular expressions to define
the lexical elements, and BNF-style productions to define the syntax,
of the proposed compiler's source language. As in lex and yacc, both
of these specifications can contain actions, here coded in C++. For
compatibility, it is still possible to write these actions in the lex
and yacc form, although this still results in the generation of some
ugly code. In this version, however, the principal
way to implement the other stages of compilation is to define a set
(or hierarchy) of C++ classes for the different symbols in the
language being compiled, and the different nodes in the tree
structures used in the internal working of the compiler being
written. The resulting code is much more elegant and easier to
maintain.
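
As a rough illustration of this class-per-symbol style, the sketch
below defines a small hierarchy in which later compilation stages are
virtual member functions of the tree nodes. It is not taken from
LexerGenerator or ParserGenerator: the names Symbol, Number, Sum,
Evaluate and Emit are assumptions made for the example, and it uses
standard C++ smart pointers rather than MFC.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical common base for all grammar symbols (illustrative only).
struct Symbol {
    virtual ~Symbol() = default;
};

// One class per language symbol; each later stage is a virtual function.
struct Expression : Symbol {
    virtual int Evaluate() const = 0;       // e.g. constant folding
    virtual std::string Emit() const = 0;   // e.g. stack-machine code
};

// Leaf node: an integer literal.
struct Number : Expression {
    int value;
    explicit Number(int v) : value(v) {}
    int Evaluate() const override { return value; }
    std::string Emit() const override {
        return "push " + std::to_string(value) + "\n";
    }
};

// Interior node: the sum of two subexpressions, which it owns.
struct Sum : Expression {
    std::unique_ptr<Expression> left, right;
    Sum(std::unique_ptr<Expression> l, std::unique_ptr<Expression> r)
        : left(std::move(l)), right(std::move(r)) {}
    int Evaluate() const override {
        return left->Evaluate() + right->Evaluate();
    }
    std::string Emit() const override {
        return left->Emit() + right->Emit() + "add\n";
    }
};
```

A parser action for a production such as Expression : Expression '+'
Expression would then only need to construct a Sum from its two
operands; evaluation and code generation follow from the node classes
themselves rather than from code embedded in the grammar file.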


This approach has motivated some compromises with the notational
conventions of the Microsoft Foundation Classes (MFC): it seemed
natural to use the name of the language symbol (e.g. Expression) for
the corresponding C++ classes, whereas the MFC convention is that all
class names begin with the letter C.


It is also convenient to make all parts of these classes public, so
that C++ classes are actually defined using the keyword struct rather
than class. Although many will argue that this defeats the whole
purpose of object-orientation, what should be public and what private
depends entirely on who the users of the objects are. In this case, it
is usual in attribute grammar definitions to allow direct access to
attributes of the symbols involved in a production when a rule
reduces: this means that all such attributes must be visible to the
constructors of all other symbol classes. While it is theoretically
possible to arrange access to these attributes by generating extra
functions and friend declarations, doing so would complicate the task
unduly in a project where being able to explain the principles
involved is as important as developing robust compiler writing tools.
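
To make the point about public attributes concrete, here is a minimal
sketch, assuming a toy arithmetic grammar. Each production is modelled
as a constructor that reads the attributes of its right-hand-side
symbols directly. The class names, the single value attribute and the
productions are hypothetical and are not the code the tools generate.

```cpp
// Symbol classes declared with struct, so every attribute is public.
struct Factor {
    int value;                          // synthesized attribute
    explicit Factor(int v) : value(v) {}
};

struct Term {
    int value;
    // Term : Factor
    explicit Term(const Factor& f) : value(f.value) {}
    // Term : Term '*' Factor
    Term(const Term& t, const Factor& f) : value(t.value * f.value) {}
};

struct Expression {
    int value;
    // Expression : Term
    explicit Expression(const Term& t) : value(t.value) {}
    // Expression : Expression '+' Term
    Expression(const Expression& e, const Term& t)
        : value(e.value + t.value) {}
};
```

Had value been private, Term would need an accessor on Factor or a
friend declaration in it, Expression would need the same for Term, and
so on for every pair of symbols that meet in some production.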


Full user documentation and a number of examples are provided, making
this book suitable for regular use by compiler writers. A full
technical account of the operation of the tools, and the algorithms
used, is also provided, to enable advanced users to add their own
extensions. It is not, however, intended as a self-contained text for
a Language Design and Implementation course. A full discussion of
compilation techniques and the different algorithms that are used
here can be found in Wilhelm, R, Maurer, D: Compiler Design
(Addison-Wesley, 1995) and Aho, A, Sethi, R, Ullman, J D: Compilers:
Principles, Techniques and Tools (Addison-Wesley, 1986). The reader
is expected to be familiar with the general approach to compilation
outlined in these texts, and with the principles of programming in
C++ and using MFC, so that the explanations of the algorithms and
coding techniques used can be provided in a reasonable space.

