Re: Approaches to code formatters

"Nils M Holm" <nmh@t3x.org>
3 Dec 2002 00:40:24 -0500

          From comp.compilers

From: "Nils M Holm" <nmh@t3x.org>
Newsgroups: comp.compilers
Date: 3 Dec 2002 00:40:24 -0500
Organization: Compilers Central
References: 02-12-024
Keywords: tools
Posted-Date: 03 Dec 2002 00:40:22 EST

Piotr Zgorecki <pioter@terramail.cutthis.pl> wrote:
> I decided not to have any parser, [...]


A parser is not necessary, nor is a full scanner. It is sufficient to
recognize classes of tokens, like operators and literals, as well as
a few classes of keywords which are used to control the formatting of
the code.
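
To make the idea concrete, here is a minimal sketch in Python of recognizing
token classes without a full scanner. The class names, the patterns, and the
function classify() are my own assumptions for illustration; they are not
taken from any existing formatter, and the patterns cover only a small part
of C.

```python
import re

# Hypothetical token classes; names and patterns are illustrative only.
# Order matters: keywords must be tried before general names.
TOKEN_CLASSES = [
    ("literal",  r'\d+|"(?:\\.|[^"\\])*"'),
    ("keyword",  r'\b(?:if|else|while|for|return)\b'),
    ("name",     r'[A-Za-z_]\w*'),
    ("lbrace",   r'\{'),
    ("rbrace",   r'\}'),
    ("operator", r'[-+*/<>=!&|]+'),
    ("punct",    r'[();,]'),
]

# One alternation with a named group per class.
PATTERN = re.compile("|".join(f"(?P<{n}>{p})" for n, p in TOKEN_CLASSES))

def classify(source):
    """Return (class, text) pairs; white space and unknown input are skipped."""
    return [(m.lastgroup, m.group()) for m in PATTERN.finditer(source)]

print(classify("if(x>y){x=y}"))
```

The formatter then only has to look at the class of each token, not at its
exact spelling.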


White space and indentation can be handled by assigning properties to
classes of tokens. For example, the left brace in C may be assigned
the properties


pad (emit a space before this token, except at the beginning of a line)
rbrk (emit a newline after this token)
sid (start indentation)


and the right brace may be assigned


lbrk (emit a newline before this token)
rbrk (as above)
eid (end indentation)


Using these properties, the statement


if(x>y){x=y}


would be formatted this way:


if(x>y) {
    x=y
}
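
The brace properties above could be applied roughly as in the following
Python sketch. The property names are the ones listed above; the PROPS
table, the function format_tokens(), and the indentation step of four
spaces are assumptions made for this example.

```python
# Property names follow the post; the table layout is an assumption.
PROPS = {
    "{": {"pad", "rbrk", "sid"},
    "}": {"lbrk", "rbrk", "eid"},
}

def format_tokens(tokens, step="    "):
    """Format a list of token strings using the properties in PROPS."""
    out = []      # finished output lines
    line = ""     # text of the current line, without indentation
    depth = 0     # current indentation level

    def newline():
        # Flush the current line (if any) at the current indentation.
        nonlocal line
        if line:
            out.append(step * depth + line)
        line = ""

    for tok in tokens:
        props = PROPS.get(tok, set())
        if "lbrk" in props:          # newline before this token
            newline()
        if "eid" in props:           # end indentation
            depth = max(depth - 1, 0)
        if "pad" in props and line:  # space, except at beginning of line
            line += " "
        line += tok
        if "rbrk" in props:          # newline after this token
            newline()
        if "sid" in props:           # start indentation
            depth += 1
    newline()
    return "\n".join(out)

print(format_tokens(["if", "(", "x", ">", "y", ")", "{", "x", "=", "y", "}"]))
```

Note that the order of the checks matters: the line break for 'lbrk' is
emitted before 'eid' reduces the indentation, so the last statement of a
block is still printed at the inner level.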


Similar properties can be assigned to other classes of tokens. Of
course, many tokens influence each other, so some properties have to
depend on the neighboring token. For example, a space character should
be emitted between 'if' and '(', but no space should be emitted between
'f' and '(' in 'f()'.
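
One way to handle such a dependency is to look at the class of the previous
token. The following Python sketch assumes that tokens arrive as
(class, text) pairs and that keywords have a class of their own; the names
SPACE_BEFORE_PAREN and join_with_paren_rule() are hypothetical.

```python
# Classes after which a space is emitted before '('. This is an assumption
# for the sketch; a real rule set would probably be more detailed.
SPACE_BEFORE_PAREN = {"keyword"}

def join_with_paren_rule(tokens):
    """Join (class, text) pairs, padding '(' only after a keyword."""
    out = ""
    prev_class = None
    for cls, text in tokens:
        if text == "(" and prev_class in SPACE_BEFORE_PAREN:
            out += " "
        out += text
        prev_class = cls
    return out

print(join_with_paren_rule([("keyword", "if"), ("punct", "("),
                            ("name", "x"), ("punct", ")")]))
print(join_with_paren_rule([("name", "f"), ("punct", "("), ("punct", ")")]))
```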


The T3X pretty printer (tools/txprint.t in the T3X source code
distribution at www.t3x.org/T3X/compilers.html) implements a complete
set of properties for formatting T3X code. Re-writing the formatter
to handle C should not be too hard.


An additional degree of flexibility could be added by reading
properties from a rule file rather than including them in the
source code.
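
A rule file could be as simple as one line per token. The format shown in
this Python sketch ('token: property property ...', with '#' starting a
comment line) is invented for the example, as is the function parse_rules();
it is not the format of any existing tool.

```python
def parse_rules(text):
    """Parse lines of the form 'TOKEN: prop prop ...' into a dict of sets.

    Limitation of this sketch: the tokens ':' and '#' cannot be described
    themselves, because they are used by the file format.
    """
    rules = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue                      # skip blank and comment lines
        token, _, props = line.partition(":")
        rules[token.strip()] = set(props.split())
    return rules

# Example rule text using the property names from the post.
RULES_TEXT = """\
# token: properties
{: pad rbrk sid
}: lbrk rbrk eid
"""

print(parse_rules(RULES_TEXT))
```

With the rules kept outside the program, a different brace style is just a
different rule file.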


Good luck,


Nils.
--
Nils M Holm <nmh@t3x.org> -- http://www.not-compatible.org/nmh/

