I have an XML document from which I want to remove whitespace and carriage returns. How can I get the modified XML using C#?
White space between tags is used in XML for readability and usually has no business meaning. Input XML messages can include line breaks, blank lines, and spaces between tags. If you process XML messages that contain any of these spaces with a parser that preserves them, they are represented as separate nodes in the document tree.
The XmlDocument class is an in-memory representation of an XML document. It implements the W3C XML Document Object Model (DOM) Level 1 Core and the Core DOM Level 2. To read more about it, see XML Document Object Model (DOM).
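To see how whitespace ends up in the tree, here is a minimal sketch that counts the child nodes of the root element with and without PreserveWhitespace. The class and method names (WhitespaceNodes, CountRootChildren) are made up for this illustration; only XmlDocument and its members come from System.Xml.

```csharp
using System.Xml;

public static class WhitespaceNodes
{
    // Hypothetical helper: parses the XML string and returns the number
    // of child nodes directly under the root element.
    public static int CountRootChildren(string xml, bool preserveWhitespace)
    {
        var doc = new XmlDocument { PreserveWhitespace = preserveWhitespace };
        doc.LoadXml(xml); // LoadXml parses a string; Load reads a file or stream
        return doc.DocumentElement.ChildNodes.Count;
    }
}
```

For the input "<root>\n<a/>\n</root>", calling it with preserveWhitespace set to true should report 3 children (whitespace, the a element, whitespace), and with false it should report 1.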
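Set the PreserveWhitespace property to false (this is also the default) before loading the document:
using System.Xml;

XmlDocument doc = new XmlDocument();
doc.PreserveWhitespace = false;
doc.Load("foo.xml");
// doc.InnerXml now has no whitespace-only nodes between elements;
// whitespace inside text content (e.g. " 56543 ") is kept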
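If the XML is already in a string rather than in a file, the same idea can be sketched with LoadXml. The names XmlWhitespace and Minify are hypothetical and only used for this example; it assumes the input is well-formed, otherwise LoadXml throws an XmlException.

```csharp
using System.Xml;

public static class XmlWhitespace
{
    // Hypothetical helper: round-trips the XML through XmlDocument so that
    // whitespace-only nodes between elements are dropped while parsing.
    public static string Minify(string xml)
    {
        var doc = new XmlDocument();
        doc.PreserveWhitespace = false; // the default, set explicitly for clarity
        doc.LoadXml(xml);
        return doc.OuterXml;
    }
}
```

Calling XmlWhitespace.Minify("<root>\n  <userID>12345</userID>\n  <classID> 56543 </classID>\n</root>") should give <root><userID>12345</userID><classID> 56543 </classID></root>. The spaces around 56543 stay because they are part of the element's text.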
To remove whitespace between the tags:
using System.Text.RegularExpressions;
// dirtyXml holds the XML text to clean
Regex regex = new Regex(@">\s*<");
string cleanedXml = regex.Replace(dirtyXml, "><");
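Wrapped up so it can be called directly, the regex approach might look like the sketch below. TagGapStripper and Strip are hypothetical names. Note that the pattern works on raw text and knows nothing about XML structure, so it would also rewrite a "> <" sequence inside a CDATA section or a comment.

```csharp
using System.Text.RegularExpressions;

public static class TagGapStripper
{
    // Matches a '>' and the following '<' with any whitespace in between.
    private static readonly Regex BetweenTags = new Regex(@">\s*<");

    // Hypothetical helper: collapses every such gap to "><".
    public static string Strip(string dirtyXml)
    {
        return BetweenTags.Replace(dirtyXml, "><");
    }
}
```

For example, TagGapStripper.Strip("<root>\n  <a>1</a>\n  <b> 2 </b>\n</root>") should return <root><a>1</a><b> 2 </b></root>.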
I solved this just using a more complete regex:
var regex = new Regex(@"[\s]+(?![^><]*(?:>|<\/))");
var cleanedXml = regex.Replace(xml, "");
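This regex removes whitespace that comes directly before an opening tag. Whitespace inside a tag (between attributes) and in text that runs up to a closing tag is left alone, which is why the line break before </root> survives in the output below.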
Input example:
<root>
<result success="1"/>
<userID>12345</userID>
<classID> 56543 </classID>
</root>
Output for the input:
<root><result success="1"/><userID>12345</userID><classID> 56543 </classID>
</root>
A more complete explanation of this regex can be found on this post: https://stackoverflow.com/a/25771445/6846888
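For anyone who wants to try it, here is a small sketch that wraps the pattern from this answer in a helper. LookaheadStripper and Strip are hypothetical names; the pattern itself is copied unchanged.

```csharp
using System.Text.RegularExpressions;

public static class LookaheadStripper
{
    // Same pattern as above: a whitespace run that is NOT followed by
    // the rest of a tag ('>') or by text leading up to a closing tag ('</').
    private static readonly Regex BeforeOpeningTag =
        new Regex(@"[\s]+(?![^><]*(?:>|<\/))");

    // Hypothetical helper: deletes every whitespace run the pattern matches.
    public static string Strip(string xml)
    {
        return BeforeOpeningTag.Replace(xml, "");
    }
}
```

One thing to watch out for: in mixed content the pattern also eats whitespace that belongs to the text, so <p>hello <b>world</b></p> comes back as <p>hello<b>world</b></p>. If that matters, parsing with XmlDocument is probably the safer route.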