Regular Expressions::Removing HTML
ASPN : Rx Cookbook
From: http://aspn.activestate.com/ASPN/Cookbook/Rx/Recipe/59820 - a site containing excellent examples of useful regular expressions
When writing CGI scripts that take in text from users (discussion threads, for example), it's often useful to be able to detect or remove HTML tags in the submitted content. This regular expression, documented in perlfaq9, is relatively effective at getting rid of HTML tags, because it skips over quoted attribute values so that a > inside quotes does not end the tag early:
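# Read each input file whole, so tags that span several lines still match
local $/;
while (<>) {
    s/<(?:[^>'"]*|(['"]).*?\1)*>//gs;
    print;    # emit the stripped text; without this the result is discarded
}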
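To see what each part of the pattern does, here is a minimal sketch that wraps the same expression in a subroutine and spells it out with the /x modifier. The name strip_tags and the sample strings are made up for illustration; they are not part of the original recipe.

```perl
use strict;
use warnings;

# Hypothetical helper: same pattern as above, written with /x so that
# each piece can carry a comment.
sub strip_tags {
    my ($html) = @_;
    $html =~ s{
        <                     # start of a tag
        (?:
            [^>'"]*           # a run of characters that are neither
                              # a closing bracket nor a quote mark
          |
            (['"]) .*? \1     # or a quoted string, ended by the same
                              # kind of quote that started it
        )*
        >                     # end of the tag
    }{}gsx;
    return $html;
}

# A quoted > inside an attribute value does not end the tag early.
print strip_tags(qq{<p>Hello, <a href="x.html" title="a > b">world</a>!</p>\n});

# Tags that span several lines are handled when the whole string is passed in.
print strip_tags(qq{<img\n  src="pic.png"\n  alt='1 > 0'>Caption\n});
```

This should print "Hello, world!" and "Caption", each on its own line. The pattern knows nothing about HTML comments, so input such as <!-- a > b --> is cut at the first >, leaving stray text behind. For anything beyond simple cleanup, a real parser such as HTML::Parser from CPAN is the safer choice.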