RegExp, Remove dots in tags

Tags: c#, regex

I've been searching for a while but can't find a solution.
I need to remove the dots inside the tags of an XML document, using a regular expression in C#.

For example:

test <12.34.56>test.test<12.34>

should be:

test <123456>test.test<1234>

So basically: remove the dots, but only inside the tags. Any ideas?

NicolajB asked Feb 15 '12 11:02


2 Answers

resultString = Regex.Replace(subjectString, @"\.(?=[^<>]*>)", "");

This replaces a dot with the empty string only if the next angle bracket after it is a closing one (>).

This is of course brittle since closing angle brackets might occur inside the text between tags, but if you're sure that won't be the case, you should be OK.

Explanation:

\.      # Match a dot
(?=     # only if the following regex can be matched at the current position:
 [^<>]* #  - zero or more characters except < or >
 >      #  - followed by a >
)       # End of lookahead assertion
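
To make the caveat about brittleness concrete, here is a minimal sketch that wraps the pattern above in a method and puts a variant next to it. The class and method names are made up for illustration, and the second method is not part of the answer: it is a swapped-in technique that matches each <...> span as a whole and strips the dots inside the match.

```csharp
using System.Text.RegularExpressions;

public static class TagDots
{
    // The pattern from the answer: a dot is removed only when the next
    // angle bracket after it is '>'.
    public static string RemoveWithLookahead(string input) =>
        Regex.Replace(input, @"\.(?=[^<>]*>)", "");

    // Hypothetical variant (not from the answer): match each <...> span
    // as a whole, then strip the dots inside the matched text.
    public static string RemoveInWholeTags(string input) =>
        Regex.Replace(input, @"<[^<>]*>", m => m.Value.Replace(".", ""));
}
```

On the sample from the question, both methods return "test <123456>test.test<1234>". They differ when a stray > appears in the text: for "a.b > c <1.2>" the lookahead version returns "ab > c <12>", because the dot in "a.b" is followed by a >, while the whole-tag version returns "a.b > c <12>". The variant is still fooled by a stray < in the text, and both versions also strip dots from attribute values inside a tag.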
Tim Pietzcker answered Oct 05 '22 22:10


I would use an XML parser for this:

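using System;
using System.IO;
using System.Xml.Linq;
XDocument xdoc = XDocument.Load(new StringReader("<root><s123.45><s678.9>aaaa</s678.9></s123.45></root>"));
// Strip the dots from every element name, keeping the element's namespace.
foreach (var elem in xdoc.Descendants())
    elem.Name = elem.Name.Namespace + elem.Name.LocalName.Replace(".", "");
Console.WriteLine(xdoc);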
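
As a rough sketch, the same idea can be packaged as a string-in, string-out helper. The class and method names are hypothetical, and the choices to snapshot the elements with ToList() before renaming and to return unindented output are mine rather than the answer's.

```csharp
using System.Linq;
using System.Xml.Linq;

public static class XmlTagDots
{
    // Hypothetical helper: parses the XML, removes dots from every
    // element name, and returns the document as a single-line string.
    public static string RemoveDotsFromElementNames(string xml)
    {
        // Throws System.Xml.XmlException if the input is not well-formed.
        XDocument doc = XDocument.Parse(xml);

        // Snapshot the elements first, then rename each one.
        foreach (XElement elem in doc.Descendants().ToList())
        {
            // XNamespace + string yields an XName in the same namespace.
            elem.Name = elem.Name.Namespace + elem.Name.LocalName.Replace(".", "");
        }

        // DisableFormatting avoids adding indentation and line breaks.
        return doc.ToString(SaveOptions.DisableFormatting);
    }
}
```

For the input used above, this returns "<root><s12345><s6789>aaaa</s6789></s12345></root>". Note that this route only works on well-formed XML: an element name cannot begin with a digit, so a tag written literally as <12.34.56>, as in the question's sample, makes XDocument.Parse throw an XmlException, and for input like that the regex approach is the one that applies.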
L.B answered Oct 06 '22 00:10