I want to convert the output from cppclean into cppcheck-like xml sections, such that:
./bit_limits.cpp:25: static data 'bit_limits::max_name_length'
becomes:
<error id="static data" msg="bit_limits::max_name_length">
<location file="./bit_limits.cpp" line="25"/>
</error>
I started with some awk:
test code:
echo "./bit_limits.cpp:25: static data 'bit_limits::max_name_length'" > test
cat test | awk -F ":" '{print "<error id=\""$3"\""}
{print "msg=\""}{for(i=4;i<=NF;++i)print ":"$i}{print "\">"}
{print "<location file=\""$1"\" line=\""$2"\"/>"}
{print "</error>"}'
Note: to run this you need to put the cat command back into one line - I printed it over multi-lines for ease of reading.
Explanation:
I am using awk and delimiting by colon ":" - which splits the line into useful chunks which I try to construct into the XML:
{print "<error id=\""$3"\""}
- extract the error ID part
{print "msg=\""}{for(i=4;i<=NF;++i)print ":"$i}{print "\">"}
- extract the message (replacing the missing colons, this is all the remaining sections)
{print "<location file=\""$1"\" line=\""$2"\"/>"}
- extract the file and line, this part is easy since the colons line up nicely
{print "</error>"}
- finally print the end tag
This is close, but not quite right, it produces:
<error id=" static data 'bit_limits"
msg="
:
:max_name_length'
">
<location file="./bit_limits.cpp" line="25"/>
</error>
The id field should just be "static data" and the msg field should be "'bit_limits::max_name_length'", but other than that it is OK. (I don't mind it being split over multiple lines at the moment, though I would prefer that awk did not print a new line each time.)
Update: as @charlesduffy pointed out, some context: I want to do this in bash because I want to embed this code in a makefile (or just a normal bash script) for maximum portability (i.e. no need for Python or other tools).
With bash and a regex:
x="./bit_limits.cpp:25: static data 'bit_limits::max_name_length'"
[[ $x =~ (.+):([0-9]+):\ (.+)\ \'(.+)\' ]]
declare -p BASH_REMATCH
Output:
declare -ar BASH_REMATCH='([0]="./bit_limits.cpp:25: static data '\''bit_limits::max_name_length'\''" [1]="./bit_limits.cpp" [2]="25" [3]="static data" [4]="bit_limits::max_name_length")'
Elements 1 to 4 of the array BASH_REMATCH contain the captured substrings: file, line number, id and message.
From man bash:
BASH_REMATCH: An array variable whose members are assigned by the =~ binary operator to the [[ conditional command. The element with index 0 is the portion of the string matching the entire regular expression. The element with index n is the portion of the string matching the nth parenthesized subexpression. This variable is read-only.
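To go from the captured groups to the XML asked for in the question, something along these lines might work. It is only a sketch: the function name cppclean_to_xml is made up for illustration, it assumes every input line has the same "file:line: id 'msg'" shape as the example, and it does not escape XML special characters such as & or < in the message.

```shell
#!/usr/bin/env bash
# Sketch: read cppclean-style lines on stdin, print cppcheck-like <error> elements.
# cppclean_to_xml is a hypothetical helper name, not part of cppclean or cppcheck.
cppclean_to_xml() {
    local line
    # Same pattern as above, stored in a variable so no backslash quoting is needed.
    local re="(.+):([0-9]+): (.+) '(.+)'"
    while IFS= read -r line; do
        if [[ $line =~ $re ]]; then
            # [1]=file, [2]=line number, [3]=id, [4]=message
            printf '<error id="%s" msg="%s">\n' "${BASH_REMATCH[3]}" "${BASH_REMATCH[4]}"
            printf '<location file="%s" line="%s"/>\n' "${BASH_REMATCH[1]}" "${BASH_REMATCH[2]}"
            printf '</error>\n'
        else
            # Lines that do not fit the assumed shape are reported on stderr and skipped.
            printf 'skipped: %s\n' "$line" >&2
        fi
    done
}
```

With the test file from the question, this would be called as cppclean_to_xml < test. Inside a makefile recipe, each $ would have to be written as $$, and SHELL would need to point at bash, since [[ and BASH_REMATCH are not available in plain sh.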