
Remove all XML Attributes with a Given Name

Tags: c#, replace, xml

I am editing a series of XML files, and I need to remove all attributes with the name "foo". This attribute appears in more than one type of element. An example snippet from the XML might be:

<bodymatter id="######">
  <level1 id="######">
    <pagenum page="#####" id="######" foo="######" />
    <h1 id="#####" foo="#####">Header</h1>
    <imggroup id="#######">
               .
               .
              etc.

The best solution I have uses Regex:

Regex regex = new Regex("foo=\"" + ".*?" + "\"", RegexOptions.Singleline);
content = regex.Replace(content, "");

I know built-in XML parsers could help, but ideally I want to make simple XML replacements/removals without having to deal with the baggage of an entire XML parser. Is Regex the best solution in this case?

Edit:

After some research into the XmlDocument class, here is one possible solution I came up with (it removes every attribute whose name is stored in the array "ids"):

// Requires: using System.Xml;
// "path" is assumed to be a field holding the path of the XML file being edited.
private void removeAttributesbyName(string[] ids)
{
    XmlDocument doc = new XmlDocument();
    doc.Load(path);

    // GetElementsByTagName("*") returns every element in the document,
    // including nested ones, so no separate pass over child nodes is needed.
    XmlNodeList xnlNodes = doc.GetElementsByTagName("*");
    foreach (XmlElement el in xnlNodes)
    {
        foreach (string id in ids)
        {
            if (el.HasAttribute(id))
            {
                el.RemoveAttribute(id);
            }
        }
    }

    // Write the modified document back to disk; otherwise the changes are lost.
    doc.Save(path);
}

I don't know if this is as efficient as it possibly could be, but I've tested it and it seems to work fine.

CW_20161 asked Jul 26 '13 20:07


2 Answers

Do not use regex for XML manipulation. You can use LINQ to XML (namespaces System.Linq and System.Xml.Linq):

XDocument xdoc = XDocument.Parse(xml);
foreach (var node in xdoc.Descendants().Where(e => e.Attribute("foo")!=null))
{
    node.Attribute("foo").Remove();
}

string result = xdoc.ToString();
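
Since the question mentions a series of files and more than one attribute name, the same idea could be wrapped in a small helper. This is only a sketch: the class name AttributeStripper and the method name RemoveAttributes are made up for illustration, and it matches on the attribute's local name, which assumes the attributes are not distinguished by namespace.

```csharp
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, not part of any library.
public static class AttributeStripper
{
    // Removes every attribute whose local name appears in `names`,
    // on the root element and on all elements below it.
    public static void RemoveAttributes(XDocument doc, params string[] names)
    {
        doc.Descendants()                                   // all elements, root included
           .Attributes()                                    // all of their attributes
           .Where(a => names.Contains(a.Name.LocalName))    // keep only the unwanted ones
           .Remove();                                       // Extensions.Remove snapshots the sequence, then removes
    }
}
```

For a file on disk, one possible usage is XDocument.Load(path), then AttributeStripper.RemoveAttributes(doc, "foo", "bar"), then doc.Save(path), where path is whatever file is being edited.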
fcuesta answered Oct 02 '22 21:10


Is Regex the best solution in this case?

No.

You'll want to use something that works on XML at the object level (as an XmlElement, for example) and not at the string level.
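
To illustrate the difference, here is a small sketch that runs the regex from the question and an object-level removal against the same input. The sample string and the names RegexVsXml, WithRegex and WithLinq are invented for this example; the input deliberately contains an attribute called "barfoo", whose name merely ends in "foo".

```csharp
using System.Text.RegularExpressions;
using System.Xml.Linq;

// Hypothetical comparison, for illustration only.
public static class RegexVsXml
{
    // Made-up input: "barfoo" should survive, "foo" should go.
    public const string Sample = "<p barfoo=\"keep\" foo=\"drop\">text</p>";

    // String-level: the pattern from the question, unchanged.
    public static string WithRegex(string xml) =>
        new Regex("foo=\"" + ".*?" + "\"", RegexOptions.Singleline).Replace(xml, "");

    // Object-level: only an attribute actually named "foo" is touched.
    public static string WithLinq(string xml)
    {
        XElement root = XElement.Parse(xml);
        foreach (XElement el in root.DescendantsAndSelf())
        {
            el.Attribute("foo")?.Remove();
        }
        return root.ToString(SaveOptions.DisableFormatting);
    }
}

// RegexVsXml.WithRegex(RegexVsXml.Sample) returns: <p bar >text</p>   (no longer well-formed)
// RegexVsXml.WithLinq(RegexVsXml.Sample)  returns: <p barfoo="keep">text</p>
```

The regex cannot tell where an attribute name starts, so it also eats part of "barfoo"; a stricter pattern would fix this particular case, but text content or comments containing foo="..." would still be at risk.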

Andrew Coonce answered Oct 02 '22 21:10