I'm trying to search for a string in an XML file, increment the number that immediately follows it by 1, and then save the changes back to that same file. There is only one instance of this string.
My file looks like this:
<attribute>
<name>test</name>
<type>java.lang.String</type>
<value>node1-3</value>
</attribute>
I'm trying to change the 3 (after node1-) and increment it by 1 every time I run a command. I've tried the following sed, separating that line into 4 pieces and replacing it with those 4 pieces plus an increment. Unfortunately, it doesn't seem to do anything:
sed -i -r -e 's/(.*)(\node1-)([0-9]+)(.*)/echo "\1\2$((\3+1))\4"/g' filepath
I've also tried awk, which seems to be getting me somewhere, but I'm not sure how to append the second half of the line back in:
awk '{FS=OFS="-" }/node1/{$2+=1}1' filepath
Finally, I tried perl, but it's incrementing the wrong number, from node1 to node2, rather than the number after the dash:
perl -i -pe '/node1-/ && s/(\d+)(.*)/$1+1 . $2/e' filepath
I'm new to these commands, and am not so solid on my regex. I'm trying to get this command working so that I can use it in a bash script I'm writing. What is the best approach to take? Does one command have an advantage over the others? I'd like to have a one-line command to simplify things for later.
Process the file using an XML parser. This is just better in every way than hacking it with a regex.
use warnings;
use strict;
use XML::LibXML;
my $file = shift // die "Usage: $0 file\n";
my $doc = XML::LibXML->load_xml(location => $file);
my ($node) = $doc->findnodes('//value');
my $new_value = $node->to_literal =~ s/node1\-\K([0-9]+)/1+$1/er;
$node->removeChildNodes();
$node->appendText($new_value);
$doc->toFile('new_' . $file); # or just $file to overwrite
Change the output filename to the input name ($file) to overwrite, once tested fully.
Removing and adding a node like above is one way to change an XML object.
Or, call setData on the first child:
$node->firstChild->setData($new_value);
where setData can be used on a node of type text, cdata, or comment.
Or, search for the text and then work with the text node directly:
my ($tnode) = $doc->findnodes('//value/text()');
my $new_value = $tnode =~ s/node1\-\K([0-9]+)/1+$1/er;
$tnode->setData($new_value);
print $doc->toString;
There's more. Which method to use depends on everything that needs to be done. If the sole job is indeed just to edit that text, then the simplest way is probably to get a text node.
I don't like using line-oriented text processing for modifying XML. You lose context and position and you can't tell if you are actually modifying what you think you are (inside comments, CDATA, etc).
But, ignoring that, here's your one-liner that has an easy fix. Basically, you aren't anchoring correctly. You match the first group of digits when you want the second:
$ perl -i -pe '/node1-/ && s/(\d+)(.*)/$1+1 . $2/e' filepath
Instead, match a group of digits immediately before a <. The (?=...) is a positive lookahead: it checks the condition without consuming any characters, so the < is not part of what you substitute:
$ perl -i -pe '/node1-/ && s/(\d+)(?=<)/$1+1/e' filepath
However, I'd combine the first match. The \K allows you to ignore part of a substitution's match. You have to match the stuff before \K, but you won't replace that part:
$ perl -i -pe 's/node1-\K(\d+)/$1+1/e' filepath
Again, these might work, but eventually you (more likely the next guy) will be burned by it. I don't know your situation, but as I often advise people: it's not the rarity, it's the calamity.
Can you just hard-code the final part of the node line?
$ awk '{FS=OFS="-" }/node1/{$2+=1; print $1 "-" $2 "</value>"} $0 !~ /node1/ {print}' file
<attribute>
<name>test</name>
<type>java.lang.String</type>
<value>node1-4</value>
</attribute>
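If hard-coding the closing tag feels brittle, the rest of the line can be kept as it is. A minimal sketch using awk's match() with RSTART and RLENGTH, assuming the number always directly follows "node1-". The temp-file names are illustrative, and since awk has no portable in-place option, the result is written to a new file and moved over the original.

```shell
#!/bin/sh
# Scratch copy of the snippet from the question; "$tmp" is a stand-in path.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
<attribute>
<name>test</name>
<type>java.lang.String</type>
<value>node1-3</value>
</attribute>
EOF

awk '
match($0, /node1-[0-9]+/) {
    prefix = substr($0, 1, RSTART + 5)          # everything through "node1-"
    num    = substr($0, RSTART + 6, RLENGTH - 6) # the digits after the dash
    rest   = substr($0, RSTART + RLENGTH)        # whatever follows the digits
    $0 = sprintf("%s%d%s", prefix, num + 1, rest)
}
{ print }
' "$tmp" > "$tmp.new" && mv "$tmp.new" "$tmp"

result=$(grep 'node1-' "$tmp")
echo "$result"    # <value>node1-4</value>
```

The offsets 5 and 6 come from the length of "node1-" (six characters), so they would need adjusting for a different prefix.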