Re: Use of XML in the compilers domain. (Was: sgml parser)

zalman@netcom.com (Zalman Stern)
24 Feb 1999 12:30:00 -0500

          From comp.compilers

Related articles
sgml parser taoudi@iname.com (1999-02-15)
Re: sgml parser jacob@jacob.remcomp.fr (1999-02-18)
Re: Use of XML in the compilers domain. (Was: sgml parser) zalman@netcom.com (1999-02-24)

From: zalman@netcom.com (Zalman Stern)
Newsgroups: comp.compilers
Date: 24 Feb 1999 12:30:00 -0500
Organization: ICGNetcom
References: 99-02-069 99-02-095
Keywords: parse

Jacob Navia (jacob@jacob.remcomp.fr) wrote:
: I wonder if the BASIC LANGUAGE DESIGN of the 1990's is not centered
: around the basic need of justifying a new release every few months.
: Just look at C++ for instance.


XML is pretty much a subset of SGML. It attempts to remove many of the
features that make it difficult both to implement processors for the
language and to write documents and document definitions that
interoperate between different SGML processors. (E.g. XML has *no*
optional features.) At the same time it works forward from HTML in an
attempt to provide a much more powerful format for representing
information. It is a refinement of existing practice more than a
fad. (Whether XML will be widely accepted is another story. It looks
like it will be.)


XML also came up recently in a discussion of grammar
representation. The world would be a better place if a well-designed
XML DTD for language grammars were agreed upon. Think how many times
someone has had to manually extract a grammar from a language
specification, massage it into the proper format for their parser
generator, and then finally get down to the actual work of building
the parser. A standard XML DTD could eliminate that problem once and
for all by having the productions in the language specification
represented with standard tags. One would either feed the electronic
version of the spec to a tool to extract just the grammar, or have the
grammar referenced externally within the specification in the first
place. Either way, you would have an XML document representing just
the grammar. From there, one can either feed it to a parser generator
that understands XML, or feed it to a tool that converts the grammar
into a different parser generator language.
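
As a rough illustration of the "convert to a different parser generator
language" route, here is a minimal Python sketch. No such DTD has been
agreed upon, so the element and attribute names (grammar, production,
alt, nonterminal, terminal, literal, name, text, start) are invented for
this example, as is the helper grammar_to_yacc. It reads a small grammar
marked up in XML and prints yacc-style productions.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: these tag and attribute names are assumptions made
# for illustration only; they do not come from any standard grammar DTD.
GRAMMAR_XML = """
<grammar start="expr">
  <production name="expr">
    <alt><nonterminal name="expr"/><literal text="+"/><nonterminal name="term"/></alt>
    <alt><nonterminal name="term"/></alt>
  </production>
  <production name="term">
    <alt><terminal name="NUMBER"/></alt>
  </production>
</grammar>
"""


def symbol_to_yacc(elem):
    """Render one grammar symbol element in yacc syntax."""
    if elem.tag in ("nonterminal", "terminal"):
        return elem.get("name")
    if elem.tag == "literal":
        # Literal tokens are written as quoted characters in yacc.
        return "'" + elem.get("text") + "'"
    raise ValueError("unexpected element: " + elem.tag)


def grammar_to_yacc(xml_text):
    """Convert the hypothetical XML grammar into yacc-style rules."""
    root = ET.fromstring(xml_text.strip())
    lines = ["%start " + root.get("start"), "%%"]
    for prod in root.findall("production"):
        # Each <alt> child is one alternative; its children are the symbols.
        alts = [" ".join(symbol_to_yacc(sym) for sym in alt)
                for alt in prod.findall("alt")]
        lines.append(prod.get("name") + "\n    : "
                     + "\n    | ".join(alts) + "\n    ;")
    return "\n".join(lines)


if __name__ == "__main__":
    print(grammar_to_yacc(GRAMMAR_XML))
```

The XML parsing itself is a single library call; the only code specific
to the task is the part that walks the productions, which is the point
of using an off-the-shelf format.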


Complaints that XML is an "overkill representation" from an
implementation point of view are somewhat dubious, because there
promises to be a boatload of reusable XML parser technology available
for the asking. I'm pretty sure grabbing a free XML parser will be faster than
writing your own parser from scratch for a slightly simpler
language. One can raise performance issues and perhaps one could say
"Why don't we all just use yacc?" But that's a different set of
arguments.


The above assumes language specifications will be done in XML at some
point in the near future. We can only hope... (Just try using a damned
PDF document as input to any other tool...)


-Z-

