Find and Increment a Number in an XML File

Tags: xml, awk, perl
I'm trying to search for a string in an XML file, increment the number that immediately follows it by 1, and save the change back to the same file. The string occurs only once in the file.

My file looks like this:

        <attribute>
                <name>test</name>
                <type>java.lang.String</type>
                <value>node1-3</value>
        </attribute>

I'm trying to change the 3 (after node1-) and increment it by 1 every time I run a command. I've tried the following sed, separating that line into 4 pieces, and replacing it with those 4 pieces, plus an increment. Unfortunately, it doesn't seem to do anything:

 sed -i -r -e 's/(.*)(\node1-)([0-9]+)(.*)/echo "\1\2$((\3+1))\4"/g' filepath

I've also tried awk, which seems to be getting me somewhere, but I'm not sure how to append the second half of the line back in:

awk '{FS=OFS="-" }/node1/{$2+=1}1' filepath

Finally, I tried perl, but it's incrementing the wrong number, turning node1 into node2 rather than changing the number after the dash:

perl -i -pe '/node1-/ && s/(\d+)(.*)/$1+1 . $2/e' filepath

I'm new to these commands and not so solid on my regex. I'm trying to get this working so that I can use it in a bash script I'm writing. What is the best approach to take? Does one of these commands have an advantage over the others? I'd like a one-line command to keep things simple later.

tarekeldarwiche asked Apr 27 '20 at 20:04


3 Answers

Process the file using an XML parser. This is just better in every way than hacking it with a regex.

use warnings;
use strict;

use XML::LibXML;

my $file = shift // die "Usage: $0 file\n";

my $doc = XML::LibXML->load_xml(location => $file);

my ($node) = $doc->findnodes('//value');

my $new_value = $node->to_literal =~ s/node1\-\K([0-9]+)/1+$1/er;

$node->removeChildNodes();
$node->appendText($new_value);

$doc->toFile('new_' . $file);   # or just $file to overwrite

Change the output filename to the input name ($file) to overwrite, once tested fully.

Removing and adding a node like above is one way to change an XML object.

Or call setData on the first child:

$node->firstChild->setData($new_value);

where setData can be used on a node of type text, cdata or comment.

Or search for the text node and work with it directly:

my ($tnode) = $doc->findnodes('//value/text()');

my $new_value = $tnode =~ s/node1\-\K([0-9]+)/1+$1/er;

$tnode->setData($new_value);

print $doc->toString;

There's more. Which method to use depends on everything else that needs to be done. If the sole job is indeed just to edit that text, then the simplest way is probably to get the text node.

zdim answered Oct 24 '22 at 08:10


I don't like using line-oriented text processing for modifying XML. You lose context and position and you can't tell if you are actually modifying what you think you are (inside comments, CDATA, etc).

But, ignoring that, your one-liner has an easy fix. Basically, you aren't anchoring correctly. You match the first group of digits on the line (the 1 in node1) when you want the second:

$ perl -i -pe '/node1-/ && s/(\d+)(.*)/$1+1 . $2/e' filepath

Instead, match a group of digits immediately before a <. The (?=...) is a positive lookahead that doesn't match characters (just the condition), so you don't substitute those:

$ perl -i -pe '/node1-/ && s/(\d+)(?=<)/$1+1/e' filepath

However, I'd fold the first match into the substitution. The \K lets you keep part of a substitution's match out of the replacement. You still have to match the stuff before \K, but you won't replace that part:

$ perl -i -pe 's/node1-\K(\d+)/$1+1/e' filepath

Again, these might work, but eventually you (more likely the next guy) will be burned by it. I don't know your situation, but as I often advise people: it's not the rarity, it's the calamity.

brian d foy answered Oct 24 '22 at 09:10


Can you just hard-code the final part of the node line?

$ awk 'BEGIN{FS=OFS="-"} /node1/{$2+=1; print $1 "-" $2 "</value>"; next} {print}' file
  <attribute>
          <name>test</name>
          <type>java.lang.String</type>
          <value>node1-4</value>
  </attribute>
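
That command only prints to standard output, while the question asks to save the result back into the same file. One possible way to do that, sketched here on the assumption that only plain POSIX awk is available, is to write to a temporary file and move it over the original. The file names are placeholders; the awk program follows the one above, with FS set in a BEGIN block and the </value> suffix still hard-coded.

```shell
#!/bin/sh
# Scratch copy of the snippet from the question (hypothetical names).
tmpdir=$(mktemp -d)
file="$tmpdir/sample.xml"

cat > "$file" <<'EOF'
        <attribute>
                <name>test</name>
                <type>java.lang.String</type>
                <value>node1-3</value>
        </attribute>
EOF

# awk cannot portably edit in place, so write a copy and replace the
# original only if awk succeeded.
awk 'BEGIN { FS = OFS = "-" }
     /node1/ { $2 += 1; print $1 "-" $2 "</value>"; next }
     { print }' "$file" > "$file.tmp" && mv "$file.tmp" "$file"

cat "$file"
```

If GNU awk 4.1 or later is installed, its `-i inplace` extension is another option, though that is specific to gawk.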

OpenSauce answered Oct 24 '22 at 09:10