Re: parsing html? (Ralph Corderoy)
27 Dec 2001 00:11:08 -0500

          From comp.compilers

From: (Ralph Corderoy)
Newsgroups: comp.compilers
Date: 27 Dec 2001 00:11:08 -0500
Organization: InputPlus Ltd.
References: 01-12-140
Keywords: parse
Posted-Date: 27 Dec 2001 00:11:08 EST

Hi Ian,

> [There is an official grammar for HTML, but it bears remarkably little
> relationship to the actual sloppy error-filled HTML that most web
> browsers manage to interpret. -John]

You could consider passing the HTML through Raggett's tidy first so you
have an easier job of parsing. Depends if that's allowed for your
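
As a rough sketch of that tidy-first approach, the Python below pipes the HTML through the tidy command-line program and hands the cleaned result to an ordinary XML parser. It assumes a `tidy` executable is on the PATH; the function names are made up for illustration, and the option set is just one plausible choice, not something from the original question.

```python
import shutil
import subprocess
import xml.etree.ElementTree as ET


def build_tidy_command(tidy_path="tidy"):
    """Command line for a quiet XHTML clean-up run (hypothetical helper)."""
    # -q: quiet; -asxml: emit well-formed XHTML;
    # --numeric-entities yes: write entities such as &nbsp; as numeric
    # references so a plain XML parser accepts them;
    # --show-warnings no: keep stderr short.
    return [tidy_path, "-q", "-asxml",
            "--numeric-entities", "yes",
            "--show-warnings", "no"]


def tidy_html(html, tidy_path="tidy"):
    """Return cleaned-up XHTML text, or None if tidy is not installed."""
    if shutil.which(tidy_path) is None:
        return None
    result = subprocess.run(build_tidy_command(tidy_path), input=html,
                            capture_output=True, text=True)
    # tidy exits with 0 (clean), 1 (warnings) or 2 (errors).
    if result.returncode > 1:
        raise RuntimeError(result.stderr)
    return result.stdout


def parse_sloppy_html(html):
    """Tidy first, then parse the result as XML (assumption: XHTML output)."""
    cleaned = tidy_html(html)
    if cleaned is None:
        return None
    return ET.fromstring(cleaned)
```

Calling `parse_sloppy_html("<p>unclosed <b>tags")` would then give an element tree to walk, provided tidy is installed and can repair the input; when tidy is missing the sketch returns None rather than guessing.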

