I am editing a series of XML files, and I need to remove all attributes with the name "foo". This attribute appears in more than one type of element. An example snippet from the XML might be:
<bodymatter id="######">
<level1 id="######">
<pagenum page="#####" id="######" foo="######" />
<h1 id="#####" foo="#####">Header</h1>
<imggroup id="#######">
.
.
etc.
The best solution I have uses Regex:
Regex regex = new Regex("foo=\"" + ".*?" + "\"", RegexOptions.Singleline);
content = regex.Replace(content, "");
I know built-in XML parsers could help, but ideally I want to make simple XML replacements/removals without having to deal with the baggage of an entire XML parser. Is Regex the best solution in this case?
Edit:
After some research into the XmlDocument class, here is one possible solution I came up with (it removes every attribute whose name is stored in the array "ids"):
private void removeAttributesbyName(string[] ids)
{
    // "path" is a field defined elsewhere in the class.
    XmlDocument doc = new XmlDocument();
    doc.Load(path);
    // GetElementsByTagName("*") returns every element in the document,
    // at any depth, so a separate pass over child nodes is not needed.
    XmlNodeList xnlNodes = doc.GetElementsByTagName("*");
    foreach (XmlElement el in xnlNodes)
    {
        foreach (string id in ids)
        {
            if (el.HasAttribute(id))
            {
                el.RemoveAttribute(id);
            }
        }
    }
    // Write the modified document back to the same file.
    doc.Save(path);
}
I don't know if this is as efficient as it possibly could be, but I've tested it and it seems to work fine.
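For comparison, here is a minimal sketch of the same idea using an XPath query instead of visiting every element. It is not the code from the post: the class name AttributeStripper, the string-in/string-out signature, and the assumption that the attribute names are plain, unprefixed names are all illustrative choices.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml;

public static class AttributeStripper
{
    // Hypothetical helper: removes every attribute whose name appears in
    // `names` and returns the resulting XML as a string.
    // Assumes each name is a simple, unprefixed attribute name.
    public static string RemoveAttributesByName(string xml, params string[] names)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);

        foreach (string name in names)
        {
            // "//@name" selects that attribute on every element in the document.
            // Copy the matches into a list first, because the node list returned
            // by SelectNodes should not be relied on once the document changes.
            List<XmlAttribute> matches =
                doc.SelectNodes("//@" + name).Cast<XmlAttribute>().ToList();

            foreach (XmlAttribute attribute in matches)
            {
                attribute.OwnerElement.RemoveAttributeNode(attribute);
            }
        }

        return doc.OuterXml;
    }
}
```

Calling AttributeStripper.RemoveAttributesByName(content, "foo") would then strip "foo" from every element while leaving attributes such as "id" and "page" in place.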
Do not use regex for XML manipulation. You can use LINQ to XML:
XDocument xdoc = XDocument.Parse(xml);
foreach (var node in xdoc.Descendants().Where(e => e.Attribute("foo") != null))
{
    // Detach the "foo" attribute from this element.
    node.Attribute("foo").Remove();
}
string result = xdoc.ToString();
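The loop above can probably be shortened, since LINQ to XML ships extension methods that work on whole sequences. The following is a sketch rather than part of the original answer; the class name FooRemover and the method signature are made up for illustration.

```csharp
using System.Xml.Linq;

public static class FooRemover
{
    // Hypothetical helper: removes the named attribute from every element
    // and returns the resulting XML as a string.
    public static string RemoveAttribute(string xml, string attributeName)
    {
        XDocument xdoc = XDocument.Parse(xml);

        // Attributes(name) collects the matching attribute from each element;
        // Remove() then detaches all of them from their parents.
        xdoc.Descendants().Attributes(attributeName).Remove();

        return xdoc.ToString();
    }
}
```

Elements that do not carry the attribute are simply skipped, so no null check is needed.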
Is Regex the best solution in this case?
No.
You'll want to use something that works on XML at the object level (as an XmlElement, for example) and not at the string level.
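To make the difference concrete, here is a small sketch that runs the regex from the question and a parser-based removal over the same input. The sample string, the class name RegexVersusParser, and the attribute name "xfoo" are invented for the demonstration.

```csharp
using System.Text.RegularExpressions;
using System.Xml.Linq;

public static class RegexVersusParser
{
    // Made-up input: "xfoo" is a different attribute that merely ends in "foo",
    // and the element text happens to contain foo="text".
    public const string Sample = "<h1 xfoo=\"keep\" foo=\"drop\">foo=\"text\"</h1>";

    // String-level removal, using the same pattern as the question.
    public static string WithRegex(string xml)
    {
        var regex = new Regex("foo=\".*?\"", RegexOptions.Singleline);
        return regex.Replace(xml, "");
    }

    // Object-level removal: only attributes actually named "foo" are touched.
    public static string WithParser(string xml)
    {
        XElement root = XElement.Parse(xml);
        root.DescendantsAndSelf().Attributes("foo").Remove();
        return root.ToString();
    }
}
```

On this input, WithRegex returns <h1 x ></h1>: it cuts "xfoo" in half, deletes the element text, and leaves markup that is no longer well-formed. WithParser removes only the real "foo" attribute and keeps both "xfoo" and the text.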