I have a list of numbers, comma-separated:
123711184642,02,3583090366663629,639f02012437d4
123715942138,01,3538710295145500,639f02afd6c643
123711616258,02,3548370476972758,639f0200485732
I need to split the 4th column into three, as below:
123711184642,02,3583090366663629,639f02,0124,37d4
123715942138,01,3538710295145500,639f02,afd6,c643
123711616258,02,3548370476972758,639f02,0048,5732
And convert the hexadecimal digits in the last two columns into decimal:
123711184642,02,3583090366663629,639f02,292,14292
123715942138,01,3538710295145500,639f02,45014,50755
123711616258,02,3548370476972758,639f02,72,22322
The format "%d.. %d\n" specifies that we want decimal numbers in the output separated by two periods. strtonum is GNU awk's function for converting a string to a number. To tell awk that the number is hexadecimal, we put 0x in front of it, as in "0x"$1 or "0x"$2.
The conversion of hexadecimal to decimal uses the base 16. Each hexadecimal digit is multiplied by a power of 16. The power starts at 0 for the rightmost digit and increases by 1 for each position to the left. To complete the conversion, the products are added together.
To convert a hexadecimal number to decimal manually, take the decimal value of each digit (a to f stand for 10 to 15) and multiply it by 16 raised to the power of that digit's position, counting from 0 at the rightmost digit and increasing by 1 for each digit to the left.
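The manual procedure can be sketched in Bash as follows. This is only an illustration of the positional expansion, not something taken from the answers here: the function name hex_to_dec and its variable names are made up for the example, and it assumes the input contains only valid hexadecimal digits.

```shell
#!/usr/bin/env bash
# hex_to_dec (hypothetical helper): add up digit * 16^position,
# where the rightmost digit has position 0.
# Assumes the argument contains only valid hex digits (optionally prefixed with 0x).
hex_to_dec() {
    local hex=${1#0x} digits=0123456789abcdef
    local total=0 power=1 i ch prefix
    hex=$(printf '%s' "$hex" | tr 'A-F' 'a-f')   # normalise to lowercase
    for (( i = ${#hex} - 1; i >= 0; i-- )); do
        ch=${hex:i:1}
        prefix=${digits%%"$ch"*}    # everything before the digit; its length is the digit's value
        total=$(( total + ${#prefix} * power ))
        power=$(( power * 16 ))     # next position to the left is worth 16 times more
    done
    printf '%d\n' "$total"
}

hex_to_dec 37d4   # 3*4096 + 7*256 + 13*16 + 4 = 14292
```

In practice printf '%d' 0x37d4 does the same job in one step; the loop is only there to make the arithmetic visible.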
Here's a variation on Jonathan's answer:
awk $([[ $(awk --version) = GNU* ]] && echo --non-decimal-data) -F, '
    BEGIN {OFS = FS}
    {
        $6 = sprintf("%d", "0x" substr($4, 11, 4))
        $5 = sprintf("%d", "0x" substr($4, 7, 4))
        $4 = substr($4, 1, 6)
        print
    }'
I included a rather contorted way of adding the --non-decimal-data option if it's needed.
Edit
Just for the heck of it, here's the pure-Bash equivalent:
saveIFS=$IFS
IFS=,
while read -r -a line
do
    printf '%s,%s,%d,%d\n' "${line[*]:0:3}" "${line[3]:0:6}" "0x${line[3]:6:4}" "0x${line[3]:10:4}"
done
IFS=$saveIFS
The "${line[*]:0:3}" (quoted *) works similarly to AWK's OFS in that it causes Bash's IFS (here a comma) to be inserted between array elements on output. We can take further advantage of that feature by inserting array elements as follows, which more closely parallels my AWK version above.
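saveIFS=$IFS
IFS=,
while read -r -a line
do
    # Bash arrays are zero-based, so the 4th field is line[3].
    # It is overwritten last because the other two assignments read from it.
    line[5]=$(printf '%d' "0x${line[3]:10:4}")
    line[4]=$(printf '%d' "0x${line[3]:6:4}")
    line[3]=$(printf '%s' "${line[3]:0:6}")
    printf '%s\n' "${line[*]}"
done
IFS=$saveIFS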
Unfortunately, Bash doesn't allow printf -v (which is similar to sprintf()) to make assignments to array elements, so printf -v "line[5]" ... doesn't work.
Edit: As of Bash 4.1, printf -v can make assignments to array elements. Example:
printf -v 'line[5]' '%d' "0x${line[3]:10:4}"
The quotes around the array reference are needed to prevent possible filename matching. If a file named "line5" existed in the current directory and the reference wasn't quoted, then a variable named line5 would be created (or updated) containing the printf output. Nothing else about the file, such as its contents, would come into play. Only the name - and only tangentially.
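For illustration, here is a sketch of how printf -v with quoted array references could replace the command substitutions in the loop. It assumes Bash 4.1 or later. The wrapper function split_hex_field is a made-up name for the example, and it takes one record as an argument instead of reading a file.

```shell
#!/usr/bin/env bash
# split_hex_field (hypothetical helper): split the 4th comma-separated field of
# one record into a 6-character prefix plus two 4-digit hex values, and print
# the record with those two values converted to decimal.
split_hex_field() {
    local IFS=,                      # used both for splitting and for joining
    local -a line
    read -r -a line <<< "$1"
    # Quoted array references stop the shell from treating [5] as a glob pattern.
    printf -v 'line[5]' '%d' "0x${line[3]:10:4}"
    printf -v 'line[4]' '%d' "0x${line[3]:6:4}"
    line[3]=${line[3]:0:6}           # overwritten last; the lines above read from it
    printf '%s\n' "${line[*]}"
}

split_hex_field '123711184642,02,3583090366663629,639f02012437d4'
# prints 123711184642,02,3583090366663629,639f02,292,14292
```

Because IFS is declared local inside the function, there is no need to save and restore it by hand.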
This answer concentrates on showing how to do the conversion with awk portably.
Using --non-decimal-data with gawk is not recommended according to the GNU Awk User's Guide, and strtonum() is not portable.
In the following examples the first word of each record is converted.
The most portable way of doing conversion is by a user-defined awk function [reference]:
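# OUT is an extra parameter used as a local variable; it starts out empty (0 in arithmetic).
function parsehex(V,OUT)
{
    # Drop an optional 0x prefix.
    if(V ~ /^0x/) V=substr(V,3);

    # Shift the running total one hex digit to the left and add the next digit's value.
    for(N=1; N<=length(V); N++)
        OUT=(OUT*16) + H[substr(V, N, 1)]

    return(OUT)
}

# Build a lookup table from each hex digit (lower and upper case) to its value.
BEGIN {
    for(N=0; N<16; N++) {
        H[sprintf("%x",N)]=N; H[sprintf("%X",N)]=N
    }
}

{ print parsehex($1) }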
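As a sketch of how such a user-defined function could be applied to the data in the question (rather than to the first word of each record), the following combines it with the substr() splitting used in the earlier answer. The shell wrapper name split_and_convert and the lowercase variable names are made up for this example, and the input is assumed to have the 14-character fourth field shown in the question.

```shell
#!/usr/bin/env bash
# split_and_convert (hypothetical wrapper): reads comma-separated records on
# stdin, splits field 4 into three parts and converts the last two from hex
# to decimal using only features common to awk implementations.
split_and_convert() {
    awk -F, '
    # out and n are extra parameters used as local variables
    function parsehex(v,    out, n) {
        out = 0
        for (n = 1; n <= length(v); n++)
            out = out * 16 + H[substr(v, n, 1)]
        return out
    }
    BEGIN {
        OFS = FS
        # lookup table: hex digit (either case) -> numeric value
        for (i = 0; i < 16; i++) {
            H[sprintf("%x", i)] = i
            H[sprintf("%X", i)] = i
        }
    }
    {
        $6 = parsehex(substr($4, 11, 4))
        $5 = parsehex(substr($4, 7, 4))
        $4 = substr($4, 1, 6)    # overwritten last; the lines above read from it
        print
    }'
}

printf '%s\n' '123711616258,02,3548370476972758,639f0200485732' | split_and_convert
# prints 123711616258,02,3548370476972758,639f02,72,22322
```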
You could use this
awk '{cmd="printf %d 0x" $1; cmd | getline decimal; close(cmd); print decimal}'
but it is relatively slow. The following one is faster, if you have many newline-separated hexadecimal numbers to convert:
awk 'BEGIN{cmd="printf \"%d\n\""}{cmd=cmd " 0x" $1}END{while ((cmd | getline dec) > 0) { print dec }; close(cmd)}'
There might be a problem if very many arguments are added for the single printf command.
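One possible way around the argument-count concern, offered as a sketch and not as part of the answer above, is to let xargs do the batching, since it splits a long input into several printf invocations when needed. The function name hex_lines_to_dec is made up for the example. It assumes the system's external printf utility accepts 0x-prefixed arguments for %d, as the GNU coreutils and BSD versions do.

```shell
#!/usr/bin/env bash
# hex_lines_to_dec (hypothetical helper): reads one hex number per line on
# stdin and prints the decimal values, one per line.
# xargs runs the external printf utility, not the shell builtin, and starts a
# new invocation whenever the argument list would become too long.
hex_lines_to_dec() {
    awk '{ print "0x" $1 }' | xargs printf '%d\n'
}

printf '%s\n' 0124 37d4 | hex_lines_to_dec
# prints 292 and 14292 on separate lines
```

With empty input, some xargs implementations still run the command once, so a stray 0 may be printed in that case.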
In my experience the following works in Linux:
awk -Wposix '{printf("%d\n","0x" $1)}'
I tested it with gawk, mawk and original-awk on Ubuntu Linux 14.04. With original-awk the command displays a warning message, but you can hide it with the shell redirection 2>/dev/null. If you don't want to do that, you can drop -Wposix when running original-awk, like this:
awk $(awk -Wversion >/dev/null 2>&1 && printf -- "-Wposix") '{printf("%d\n","0x" $1)}'
(In Bash 4 you could replace >/dev/null 2>&1 by &>/dev/null)
Note: The -Wposix trick probably doesn't work with nawk which is used in OS X and some BSD OS variants, though.