
How to remove whitespace from XElement object created from XElement.ReadFrom(XmlReader)

I am parsing a large XML file, so I am using an XmlReader in combination with XElement instead of XElement.Load().

I create the XElement objects from the XmlReader as shown below.

static IEnumerable<XElement> StreamRootChildDoc(string uri)
{
    using (XmlReader reader = XmlReader.Create(uri, xmlReaderSettings))
    {
        reader.MoveToContent();
        // Parse the file and display each of the nodes.
        while (reader.Read())
        {
            switch (reader.NodeType)
            {
                case XmlNodeType.Element:
                    if (reader.Name == "Child")
                    {
                        XElement el = XElement.ReadFrom(reader) as XElement;
                        if (el != null)
                            yield return el;
                    }
                    break;
            }
        }
    }
}

I want to save the content of each XElement in the database as a string without the whitespace, but none of the three calls below does that. Note that if I load the XML into memory using XElement.Load(), ToString(SaveOptions.DisableFormatting) works.

<root>  <child></child>  </root> //xml saved in db with whitespace
<root><child></child></root> //want to save as this

XElement.ToString(SaveOptions.DisableFormatting)
XElement.ToString(SaveOptions.None)
XElement.ToString()

The XmlReaderSettings I am using for the XmlReader are below. I tried IgnoreWhitespace = true and IgnoreWhitespace = false with no luck, and I cannot leave it set to true because some elements are then skipped (for the reason, see "Why does XmlReader skip every other element if there is no whitespace separator?").

    XmlReaderSettings xmlReaderSettings = new XmlReaderSettings();
    xmlReaderSettings.ProhibitDtd = false;
    //xmlReaderSettings.IgnoreWhitespace = true;//cannot use this setting

It works if I re-parse the XElement object, but that defeats the purpose of using XmlReader, because XElement.Parse() loads the XML into memory.

XElement el = XElement.ReadFrom(reader) as XElement;
el = XElement.Parse(el.ToString(), LoadOptions.None);

How can I remove the whitespace?

Edit: This is what I had to do:

  1. The skipped elements were caused by two reads in the same iteration, reader.Read() and XElement.ReadFrom(reader), which skipped every other element. Fixing the loop as described in the question linked above solves that. The issue has nothing to do with XmlReaderSettings.
  2. xelement.ToString(SaveOptions.DisableFormatting) removes the pretty formatting.
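
The two steps above can be sketched roughly as follows. This is a minimal sketch, not the original code: the names StreamingExample and StreamChildren, the TextReader parameter, and the sample XML are hypothetical, and setting IgnoreWhitespace = true is an assumption (the edit only states that the skipping was unrelated to XmlReaderSettings). The key point is that XNode.ReadFrom already leaves the reader on the node after the element it consumed, so the loop calls reader.Read() only when it did not call ReadFrom.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Linq;

static class StreamingExample
{
    // Streams each <Child> element without loading the whole document.
    public static IEnumerable<XElement> StreamChildren(TextReader input)
    {
        // Assumption: with the loop fixed, insignificant whitespace can be
        // dropped by the reader itself.
        var settings = new XmlReaderSettings { IgnoreWhitespace = true };
        using (XmlReader reader = XmlReader.Create(input, settings))
        {
            reader.MoveToContent();
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "Child")
                {
                    // ReadFrom consumes the element and leaves the reader on the
                    // node that follows it, so no extra Read() is needed here.
                    yield return (XElement)XNode.ReadFrom(reader);
                }
                else
                {
                    // Only advance manually when nothing was consumed.
                    reader.Read();
                }
            }
        }
    }

    static void Main()
    {
        // Hypothetical sample: adjacent <Child> elements with no separator.
        string xml = "<root>  <Child><a>1</a>  <b>2</b></Child><Child><a>3</a></Child>  </root>";
        foreach (XElement el in StreamChildren(new StringReader(xml)))
        {
            // DisableFormatting stops ToString() from adding indentation.
            Console.WriteLine(el.ToString(SaveOptions.DisableFormatting));
        }
    }
}
```

Because the reader never reports the insignificant whitespace, the XElement contains no whitespace text nodes, and SaveOptions.DisableFormatting then keeps ToString() from adding indentation of its own.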
hIpPy asked Feb 17 '10 at 01:02



1 Answer

Try using the example from the XmlTextReader class documentation. XmlTextReader has a WhitespaceHandling property, which you can set to WhitespaceHandling.None. It would have been helpful if you had provided a test XML file, so that one could check whether XmlTextReader works here.
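
A minimal sketch of that suggestion, assuming the XML is available as a string. The names TextReaderExample and ReadCompact and the sample input are hypothetical, and this has not been checked against the asker's real file. Note that XmlTextReader is an older API and Microsoft recommends XmlReader.Create for new code.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Linq;

static class TextReaderExample
{
    // Reads the root element with whitespace nodes suppressed by the reader.
    public static string ReadCompact(string xml)
    {
        using (var reader = new XmlTextReader(new StringReader(xml)))
        {
            // Do not report Whitespace or SignificantWhitespace nodes.
            reader.WhitespaceHandling = WhitespaceHandling.None;
            reader.MoveToContent();
            var el = (XElement)XNode.ReadFrom(reader);
            return el.ToString(SaveOptions.DisableFormatting);
        }
    }

    static void Main()
    {
        // Hypothetical input modeled on the question's example.
        Console.WriteLine(ReadCompact("<root>  <child>x</child>  </root>"));
    }
}
```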

Ali Khalid answered Nov 18 '22 at 20:11
