Thursday 24 July 2008

REGEX to remove HTML formatting

I've seen a few ways of using Regex to remove HTML formatting, but the following seems to work fine for me:

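public string Strip(string text)
{
    // Remove tags first; the lazy (.|\n)*? also matches tags that span several lines.
    string sReturn = System.Text.RegularExpressions.Regex.Replace(text, @"<(.|\n)*?>", string.Empty);
    // Then remove named entities such as &nbsp; from the tag-free result (not from the original text).
    sReturn = System.Text.RegularExpressions.Regex.Replace(sReturn, @"(&\w*;)", string.Empty);
    return sReturn;
}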
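
To see what the two replacements do together, here is a small self-contained sketch. The class name StripDemo and the sample input are made up for illustration, and the method is written so that the entity pass runs on the output of the tag pass; treat it as a quick check rather than a complete HTML sanitizer.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class; only the two regex patterns come from the post.
public static class StripDemo
{
    public static string Strip(string text)
    {
        // Pass 1: drop anything that looks like a tag, including multi-line tags.
        string result = Regex.Replace(text, @"<(.|\n)*?>", string.Empty);
        // Pass 2: drop named entities (&nbsp;, &amp;, ...) from the tag-free text.
        return Regex.Replace(result, @"(&\w*;)", string.Empty);
    }

    public static void Main()
    {
        // Made-up sample input.
        string html = "<p>Hello <b>world</b>&nbsp;&amp; friends</p>";
        Console.WriteLine(Strip(html)); // prints "Hello world friends"
    }
}
```

Note that entities are deleted rather than decoded, so the ampersand in the sample disappears instead of becoming "&". Numeric entities such as &#160; are also left in place, because \w does not match the # character.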
