I'm using ASP.NET 4 and C#.
I have a string that can contain special characters, accented letters and spaces.
Example string:
#Hi this is rèally/ special strìng!!!
I would like to:
a) Remove all Special Characters, like:
Hi this is rèally special strìng
b) Convert all Accented letters to NON Accented letters, like:
Hi this is really special string
c) Replace all empty spaces with a dash (-), like:
Hi-this-is-really-special-string
My aim is to create a string suitable for a URL path, for better SEO.
Any idea how to do this with a regular expression or another technique?
Thanks for your help on this!
Similar to mathieu's answer, but tailored more closely to your requirements. This solution first strips special characters and diacritics from the input string, and then replaces whitespace with dashes:
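using System.Globalization;
using System.Text;
using System.Text.RegularExpressions;

string s = "#Hi this is rèally/ special strìng!!!";
// FormD splits each accented letter into a base letter plus a combining mark.
string normalized = s.Normalize(NormalizationForm.FormD);
StringBuilder resultBuilder = new StringBuilder();
foreach (var character in normalized)
{
    UnicodeCategory category = CharUnicodeInfo.GetUnicodeCategory(character);
    // Keep only letters and spaces; combining marks, punctuation and digits are dropped.
    if (category == UnicodeCategory.LowercaseLetter
        || category == UnicodeCategory.UppercaseLetter
        || category == UnicodeCategory.SpaceSeparator)
        resultBuilder.Append(character);
}
// Collapse each run of whitespace into a single dash.
string result = Regex.Replace(resultBuilder.ToString(), @"\s+", "-");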
See it in action at ideone.com.
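If you want to reuse this, the same steps can be wrapped in a method. The following is only a sketch: the class name SlugSketch and the method name ToSlug are made up for illustration, and two behaviours are my own assumptions rather than part of the snippet above: digits are kept (the original filter drops them), and the string is trimmed so that leading or trailing spaces do not turn into dashes.

```csharp
using System;
using System.Globalization;
using System.Text;
using System.Text.RegularExpressions;

public static class SlugSketch
{
    // Hypothetical helper; the name and the choices below are illustrative only.
    public static string ToSlug(string input)
    {
        // Decompose accented letters into base letter + combining mark.
        string decomposed = input.Normalize(NormalizationForm.FormD);
        var sb = new StringBuilder(decomposed.Length);

        foreach (char c in decomposed)
        {
            UnicodeCategory category = CharUnicodeInfo.GetUnicodeCategory(c);
            if (category == UnicodeCategory.LowercaseLetter
                || category == UnicodeCategory.UppercaseLetter
                || category == UnicodeCategory.DecimalDigitNumber // assumption: digits are wanted in URLs
                || category == UnicodeCategory.SpaceSeparator)
            {
                sb.Append(c);
            }
        }

        // Trim first so leading/trailing spaces do not become dashes,
        // then collapse each run of whitespace into one dash.
        return Regex.Replace(sb.ToString().Trim(), @"\s+", "-");
    }

    public static void Main()
    {
        // Prints: Hi-this-is-really-special-string
        Console.WriteLine(ToSlug("#Hi this is rèally/ special strìng!!!"));
    }
}
```

Note that letters without a canonical decomposition (for example ø or ß) are still classified as letters, so they pass through unchanged. If the URL must be pure ASCII, they would need separate handling.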
You should have a look at this answer: Ignoring accented letters in string comparison
Code here:
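static string RemoveDiacritics(string sIn)
{
    // FormD splits each accented letter into a base letter plus combining marks.
    string sFormD = sIn.Normalize(NormalizationForm.FormD);
    StringBuilder sb = new StringBuilder();
    foreach (char ch in sFormD)
    {
        UnicodeCategory uc = CharUnicodeInfo.GetUnicodeCategory(ch);
        // Skip the combining marks (the accents); keep everything else.
        if (uc != UnicodeCategory.NonSpacingMark)
        {
            sb.Append(ch);
        }
    }
    // Recompose whatever is left into the usual composed form.
    return (sb.ToString().Normalize(NormalizationForm.FormC));
}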
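RemoveDiacritics only covers step (b) of the question. Since the question also asks about regular expressions, here is a rough sketch of how the remaining steps might be added with Regex.Replace. The class name RegexSlugSketch and the method name ToUrlPath are hypothetical, and the character class is an assumption on my part: it keeps ASCII letters, digits and whitespace, and drops everything else.

```csharp
using System;
using System.Globalization;
using System.Text;
using System.Text.RegularExpressions;

public static class RegexSlugSketch
{
    // Same method as in the answer above, repeated so the sample compiles on its own.
    static string RemoveDiacritics(string sIn)
    {
        string sFormD = sIn.Normalize(NormalizationForm.FormD);
        StringBuilder sb = new StringBuilder();
        foreach (char ch in sFormD)
        {
            if (CharUnicodeInfo.GetUnicodeCategory(ch) != UnicodeCategory.NonSpacingMark)
            {
                sb.Append(ch);
            }
        }
        return sb.ToString().Normalize(NormalizationForm.FormC);
    }

    // Hypothetical helper combining the three steps from the question.
    public static string ToUrlPath(string input)
    {
        // (b) accented letters -> plain letters
        string s = RemoveDiacritics(input);
        // (a) assumption: anything that is not an ASCII letter, digit or whitespace is "special"
        s = Regex.Replace(s, @"[^A-Za-z0-9\s]", "");
        // (c) runs of whitespace -> a single dash, ignoring leading/trailing spaces
        return Regex.Replace(s.Trim(), @"\s+", "-");
    }

    public static void Main()
    {
        // Prints: Hi-this-is-really-special-string
        Console.WriteLine(ToUrlPath("#Hi this is rèally/ special strìng!!!"));
    }
}
```

The diacritics are removed before the regex runs on purpose: applied to the raw input, the ASCII-only character class would delete è and ì outright instead of turning them into e and i.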