I have a string of XML data that I pulled from a web service. The data is ugly and has some invalid characters in the tag names. For example, I may see something like:
<Author>Scott the Coder</Author><Address#>My address</Address#>
The # in the Address tag name is invalid. I am looking for a regular expression that will remove all the invalid characters from the tag names BUT leave all the characters in the value section of the XML untouched. In other words, I want to use a regex to remove characters only from the opening and closing tag names. Everything else should remain the same.
I don't have the full list of invalid characters yet, but this will get me started: #{}&()
Is it possible to do what I am trying to do?
If your intention is only to check the validity of a name for an XML node, I suggest you take a look at the XmlConvert class, especially the VerifyName and VerifyNCName methods.
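As a minimal sketch of that check, assuming a console program: XmlConvert.VerifyName returns the name when it is valid and throws an XmlException otherwise, so it can be wrapped in a small helper. The names NameCheck and IsValidXmlName are hypothetical, chosen only for illustration.

```csharp
using System;
using System.Xml;

static class NameCheck
{
    // Hypothetical helper: true when candidate is a legal XML name.
    static bool IsValidXmlName(string candidate)
    {
        try
        {
            // Throws XmlException on an invalid name,
            // ArgumentNullException on null or empty input.
            XmlConvert.VerifyName(candidate);
            return true;
        }
        catch (XmlException)
        {
            return false;
        }
        catch (ArgumentException)
        {
            return false;
        }
    }

    static void Main()
    {
        Console.WriteLine(IsValidXmlName("Author"));   // True
        Console.WriteLine(IsValidXmlName("Address#")); // False
    }
}
```

VerifyNCName can be swapped in the same way if the name must not contain a colon.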
Also note that with that class, you can accept any text as a node name by using the EncodeName and EncodeLocalName methods.
Using those methods will be far easier, safer, and faster than using a regular expression.
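Since the input in the question is a whole string rather than a single name, something still has to locate the tag names. The following is only a rough sketch under loud assumptions: a simple pattern finds the name after each "<" or "</", and XmlConvert.EncodeName does the actual fixing. It assumes tag names contain no whitespace, "/", "<" or ">", and that the values contain no raw "<". TagNameFixer and FixTagNames are hypothetical names, not part of any library.

```csharp
using System;
using System.Text.RegularExpressions;
using System.Xml;

static class TagNameFixer
{
    // Group 1: "<" or "</". Group 2: the tag name that follows.
    // Names starting with "?" or "!" are skipped so that declarations,
    // comments and CDATA sections are left alone.
    static readonly Regex TagName =
        new Regex(@"(</?)([^\s<>/?!][^\s<>/]*)");

    // Hypothetical helper: encodes invalid characters in tag names only.
    static string FixTagNames(string xml)
    {
        return TagName.Replace(
            xml,
            m => m.Groups[1].Value + XmlConvert.EncodeName(m.Groups[2].Value));
    }

    static void Main()
    {
        string input =
            "<Author>Scott the Coder</Author><Address#>My address</Address#>";

        // Prints:
        // <Author>Scott the Coder</Author><Address_x0023_>My address</Address_x0023_>
        Console.WriteLine(FixTagNames(input));
    }
}
```

The element values pass through unchanged because the pattern only matches text that directly follows "<" or "</". EncodeName escapes each invalid character as _xHHHH_ instead of removing it, so the original name can be recovered later with XmlConvert.DecodeName.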