ANN: Wave - A Standard conformant C++ preprocessor

hkaiser@users.sourceforge.net (hkaiser)
9 Mar 2003 17:47:34 -0500

          From comp.compilers


From: hkaiser@users.sourceforge.net (hkaiser)
Newsgroups: comp.compilers
Date: 9 Mar 2003 17:47:34 -0500
Organization: http://groups.google.com/
Keywords: C++, available
Posted-Date: 09 Mar 2003 17:47:34 EST

Hi all,


Sorry for the somewhat lengthy post, but I hope it will be helpful to
some of you.


The Boost.Spirit based C++ preprocessor iterator (the project name is
'Wave') is functionally complete now. All pp operators and pp
statements are in place, and the macro expansion engine works as
expected. So I've released a first version: Wave V0.9.0 (please
consider it a beta).


Conceptually, the Wave library is a conformant (to the C++ Standard)
preprocessing C++ lexer, which exposes a (forward) iterator
interface for iteration over the preprocessed C++ tokens.


The main goals for this project are:
- full conformance with the C++ standard (INCITS/ISO/IEC 14882/1998)
- usage of Spirit for the parsing parts of the game (certainly :-)
- maximal usage of STL and/or Boost libraries (for compactness and
maintainability)
- straightforward extensibility for the implementation of additional
features (such as variadics and placemarkers)
- building a flexible library for different C++ lexing and
preprocessing needs.


Initially, the goal is not to build a very high performance or very
small C++ preprocessor. If those are your objectives, you will
probably have to look elsewhere. However, the C++ preprocessor should
work as expected and will be usable as a reference implementation, for
instance for testing other preprocessor oriented libraries such as
Boost.Preprocessor et al., or for developing new pp functionality.
Tests done by Paul Mensonides showed that the Wave library conforms
closely to the C++ Standard: it processes several strictly conforming
modules written by him which cannot even be handled by EDG based
preprocessors (e.g. Comeau or Intel).


The C++ preprocessor is not built as a monolithic application; rather,
it is a modular library which exposes a context object and an
iterator interface. The context object helps to configure the actual
pp process (search paths, predefined macros, etc.). The exposed
iterators are generated by this context object too. Iterating over the
sequence defined by the two iterators will return the preprocessed
tokens, which are generated on the fly from the underlying input
stream. The overall preprocessing is a two stage process:


          input stream
          (characters)
                |
                v
          +-----------+
          | C++ lexer |  (tokenizer)
          +-----------+
                |
                v
            pp tokens
                |
                v
          +-----------+
          |preprocess.|  (macro expansion etc.)
          +-----------+
                |
                v
          preprocessed
           C++ tokens


As you can see, the input stream feeds a full C++ lexer module (the
generated C++ tokens here are exposed through an iteration interface
too). This C++ lexer allows the preprocessing module to work on
tokens, not directly on the character stream (performance!).
Additionally, this helps to resolve language ambiguities such as

      'some_class<include<some_term> >'


or similar (see C++ standard 2.1.1.3), which is difficult to do in a
one step process. During token generation the C++ lexer splices
physical source lines into logical source lines (removal of a '\\'
followed by a '\n'), recognizes trigraphs and alternative tokens, etc.
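
The two character-level steps mentioned above can be sketched in a few
lines. This is only an illustration of the technique and not Wave's
actual code; the function names 'replace_trigraphs' and 'splice_lines'
are made up for this example, and the functions work on a whole string
rather than on a stream.

```cpp
#include <cstddef>
#include <string>

// Sketch of trigraph replacement: two question marks followed by one of
// nine specific characters are replaced by a single character.
// (Hypothetical helper, not part of Wave.)
std::string replace_trigraphs(const std::string& in) {
    std::string out;
    out.reserve(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] == '?' && i + 2 < in.size() && in[i + 1] == '?') {
            char repl = 0;
            switch (in[i + 2]) {
                case '=':  repl = '#';  break;
                case '/':  repl = '\\'; break;
                case '\'': repl = '^';  break;
                case '(':  repl = '[';  break;
                case ')':  repl = ']';  break;
                case '!':  repl = '|';  break;
                case '<':  repl = '{';  break;
                case '>':  repl = '}';  break;
                case '-':  repl = '~';  break;
                default:   break;  // not a trigraph, copy the '?' as is
            }
            if (repl != 0) {
                out += repl;
                i += 2;  // skip the remaining two trigraph characters
                continue;
            }
        }
        out += in[i];
    }
    return out;
}

// Sketch of line splicing: every backslash that is immediately followed
// by a newline is deleted together with that newline.
// (Hypothetical helper, not part of Wave.)
std::string splice_lines(const std::string& in) {
    std::string out;
    out.reserve(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] == '\\' && i + 1 < in.size() && in[i + 1] == '\n') {
            ++i;  // skip the newline as well
            continue;
        }
        out += in[i];
    }
    return out;
}
```

Trigraphs have to be replaced first, because one of them produces the
backslash that line splicing then looks for.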


The exposed C++ lexer iteration interface generates the preprocessing
tokens consumed by the preprocessing module, which does the actual
work, the preprocessing :-). After this, the resulting tokens are
converted to C++ tokens exposed by the preprocessor iterator.
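
To give a feeling for the "context object plus iterator pair" design,
here is a heavily simplified sketch. It is not Wave's real interface:
the class 'toy_context', its member 'add_macro' and the string based
tokens are all assumptions made for this example. Tokens are just
whitespace separated words, and only object-like macros with a single
token body are expanded.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <iterator>
#include <map>
#include <string>
#include <utility>

// Hypothetical toy, NOT Wave's real interface.
class toy_context {
public:
    explicit toy_context(std::string input) : input_(std::move(input)) {}

    // Configuration: predefine an object-like macro.
    void add_macro(const std::string& name, const std::string& body) {
        macros_[name] = body;
    }

    class iterator {
    public:
        typedef std::forward_iterator_tag iterator_category;
        typedef std::string value_type;
        typedef std::ptrdiff_t difference_type;
        typedef const std::string* pointer;
        typedef const std::string& reference;

        iterator() : ctx_(nullptr), pos_(0) {}  // end-of-sequence iterator
        explicit iterator(const toy_context* ctx) : ctx_(ctx), pos_(0) {
            advance();  // produce the first token
        }

        reference operator*() const { return current_; }
        pointer operator->() const { return &current_; }
        iterator& operator++() { advance(); return *this; }
        iterator operator++(int) { iterator old(*this); advance(); return old; }

        friend bool operator==(const iterator& a, const iterator& b) {
            return a.ctx_ == b.ctx_ && a.pos_ == b.pos_;
        }
        friend bool operator!=(const iterator& a, const iterator& b) {
            return !(a == b);
        }

    private:
        // Lex the next token on demand and replace it if it names a macro.
        void advance() {
            assert(ctx_ != nullptr && "cannot advance the end iterator");
            const std::string& s = ctx_->input_;
            while (pos_ < s.size() && is_space(s[pos_])) ++pos_;
            if (pos_ == s.size()) {  // input exhausted: become the end iterator
                ctx_ = nullptr;
                pos_ = 0;
                current_.clear();
                return;
            }
            const std::size_t start = pos_;
            while (pos_ < s.size() && !is_space(s[pos_])) ++pos_;
            current_ = s.substr(start, pos_ - start);
            std::map<std::string, std::string>::const_iterator m =
                ctx_->macros_.find(current_);
            if (m != ctx_->macros_.end()) current_ = m->second;
        }
        static bool is_space(char c) {
            return std::isspace(static_cast<unsigned char>(c)) != 0;
        }

        const toy_context* ctx_;
        std::size_t pos_;
        std::string current_;
    };

    // The two iterators delimiting the preprocessed token sequence.
    iterator begin() const { return iterator(this); }
    iterator end() const { return iterator(); }

private:
    std::string input_;
    std::map<std::string, std::string> macros_;
};
```

A caller would construct a 'toy_context' from the input text, call
'add_macro("ANSWER", "42")' and then loop from 'begin()' to 'end()'.
No token is produced before the iterator is created or incremented,
which is the "generated on the fly" behaviour in miniature.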


To make the C++ preprocessing library modular, the C++ lexer is held
completely separate and independent from the preprocessor (it is
actually a template parameter). To prove this concept I've implemented
two different full-blown C++ lexers (one based on a re2c based C++
lexer written by Dan Nuffer some time ago [VERY fast], the other based
on the Spirit based Slex dynamic lexing engine - a table driven DFA
[quite compact]). Both lexers are pluggable into the preprocessor
through a unified iterator interface and are completely
interchangeable.
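
The idea of passing the lexer as a template parameter can be shown
with a small sketch. Everything here is hypothetical and much simpler
than the real thing: 'loop_lexer' and 'stream_lexer' are two toy
tokenizers standing in for the re2c and Slex based lexers, the
"unified interface" is reduced to a static 'tokenize' function instead
of iterators, and 'preprocess' only replaces object-like macros.

```cpp
#include <cctype>
#include <cstddef>
#include <map>
#include <sstream>
#include <string>
#include <vector>

typedef std::vector<std::string> token_list;

// Hypothetical lexer A: hand-written character loop.
struct loop_lexer {
    static token_list tokenize(const std::string& src) {
        token_list out;
        std::string cur;
        for (std::size_t i = 0; i < src.size(); ++i) {
            if (std::isspace(static_cast<unsigned char>(src[i]))) {
                if (!cur.empty()) { out.push_back(cur); cur.clear(); }
            } else {
                cur += src[i];
            }
        }
        if (!cur.empty()) out.push_back(cur);
        return out;
    }
};

// Hypothetical lexer B: same token stream, produced via std::istringstream.
struct stream_lexer {
    static token_list tokenize(const std::string& src) {
        std::istringstream in(src);
        token_list out;
        std::string tok;
        while (in >> tok) out.push_back(tok);
        return out;
    }
};

// The preprocessing stage relies only on Lexer::tokenize, so any type
// providing that function can be plugged in.
template <typename Lexer>
token_list preprocess(const std::string& src,
                      const std::map<std::string, std::string>& macros) {
    token_list tokens = Lexer::tokenize(src);
    for (std::size_t i = 0; i < tokens.size(); ++i) {
        std::map<std::string, std::string>::const_iterator it =
            macros.find(tokens[i]);
        if (it != macros.end()) tokens[i] = it->second;
    }
    return tokens;
}
```

Calling 'preprocess<loop_lexer>(...)' and 'preprocess<stream_lexer>(...)'
on the same input gives the same result, so the choice of lexer can be
made purely on speed or size.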


BTW the C++ lexers are usable standalone, without using the
preprocessing part of the library. It would be very interesting to
see how the other existing and ongoing C++ lexers (see the Spirit
examples) fit into the picture. So the user of the final library will
be able to decide which C++ lexer best fits his/her needs.


There are a couple of things left to do:
- report the concatenation of unrelated tokens as an error
- write more complete documentation (for now please see the samples)
- test the Wave pp iterator more thoroughly
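
For readers wondering what the first item is about: the '##' operator
must produce a single valid preprocessing token, otherwise the
behaviour is undefined according to the C++ Standard (16.3.3). The
macro 'CAT' below is just an example name.

```cpp
#define CAT(a, b) a##b

// Valid: 'foo' and 'bar' are pasted into the single identifier 'foobar'.
inline int CAT(foo, bar)() { return 42; }

// Invalid (left commented out): pasting '+' and '-' would give "+-",
// which is not a single preprocessing token. This is the kind of
// concatenation a preprocessor can report as an error.
// int bad = 1 CAT(+, -) 2;
```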


There is already some documentation in place, which you may use as a
starting point. If this isn't enough, there is a sample driver program
for the Wave library (source: cpp.cpp etc.), which fully utilizes the
capabilities of the library, so you may look at the source for further
information (for now).


You can find the Wave library in the Spirit CVS
(cvs.spirit.sourceforge.net:/cvsroot/spirit): 'spirit/wave'.
Additionally, there are zip and tar.gz files that can be
downloaded here: http://sourceforge.net/projects/spirit/


Eventually there will be separate releases of binary packages, built
for different platforms.


Please note that to build the enclosed sample driver (essentially a
full-blown text stream --> text stream preprocessor) you will need to
have a correctly installed Boost (http://www.boost.org) distribution
in place, because it uses several different Boost libraries (such as
Boost.Filesystem, Boost.inreview.program_options etc.).


It is planned to bundle the Wave library later on with a strict
version of the pp-lib from Paul Mensonides (Boost.Preprocessor) and
put it into the Boost CVS.


The Wave library compiles and works so far with
- VC7.1 (final beta)
- gcc 3.2 (Cygwin and Linux)
- IntelV7/Dinkumware STL (from VC6sp5)
(other compilers have not been tested yet).


Regards Hartmut

