Re: Regular Expressions

"Randall Hyde" <>
12 Oct 2004 00:51:07 -0400

          From comp.compilers

From: "Randall Hyde" <>
Newsgroups: comp.compilers
Date: 12 Oct 2004 00:51:07 -0400
Organization: EarthLink Inc. --
References: 04-10-069
Keywords: lex
Posted-Date: 12 Oct 2004 00:51:07 EDT

"Mark" <> wrote in message

> I just can't seem to figure out how to invent a regular expression
> that will strip all HTML tags (except TABLE tags) out of a string and
> leave the rest of the text. When a TABLE tag is encountered I need to
> strip everything under it.
> This will strip all HTML out <[^>]*>
> But how do I make it also strip entire TABLE elements?
> Perhaps something like <table[^</table>]*</table>|<[^>]*>
> Thanks,
> Mark
> [That seems awfully complex for a single regex. -John]

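Indeed. In general this requires a context-free grammar: TABLE elements
can nest, and matching arbitrarily nested open/close pairs is beyond
what a regular expression can describe. I don't understand the OP's
exact problem well enough to say whether a regex alone would suffice
here.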
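
For what it's worth, here is a rough Python sketch of one way to handle
the nested case: use a regex only to find tags, and keep a depth counter
for TABLE elements outside the regex. The function name
strip_html_and_tables and the tag pattern are my own assumptions, not
anything from the original question, and the code presumes reasonably
well-formed markup (no ">" inside attribute values or comments). Note
also that [^</table>] in the proposed pattern is a character class that
excludes the single characters <, /, t, a, b, l, e and >, not the
string "</table>".

```python
import re

# Hypothetical tag pattern: group 1 is "/" for a closing tag,
# group 2 is the tag name (empty for things like comments).
TAG = re.compile(r"<(/?)(\w*)[^>]*>")

def strip_html_and_tables(html):
    """Drop all tags, plus everything inside TABLE elements (nested or not)."""
    out = []
    depth = 0  # how many TABLE elements are currently open
    pos = 0    # end of the previous tag
    for m in TAG.finditer(html):
        if depth == 0:
            # Text between tags is kept only outside any table.
            out.append(html[pos:m.start()])
        pos = m.end()
        if m.group(2).lower() == "table":
            if m.group(1):
                depth = max(depth - 1, 0)  # tolerate a stray </table>
            else:
                depth += 1
    if depth == 0:
        out.append(html[pos:])
    return "".join(out)

print(strip_html_and_tables("a<b>x</b><table><tr><td>t</td></tr></table>z"))
```

The counter is what the regex cannot express on its own: it is a tiny
stack standing in for the context-free part of the problem. If tables
are known never to nest, a non-greedy pattern along the lines of
<table.*?</table> would probably be enough.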

Randy Hyde
