I'm working on an AWK script that parses millions of lines of text. Each line contains (among other things) a date and time in the form:
16-FEB-2008 14:17:59.994669
I need to convert this into the following form
20080216141759994669000
I would like to avoid translating the month from text into a numerical value manually, if possible. In bash I can simply run the following command to get the desired result:
date -d "16-FEB-2008 14:17:59.994669" +"%Y%m%d%H%M%S%N"
I have tried invoking this command from AWK, but I cannot figure out how. I would like to know how this can be done.
Thanks in advance
Converting month names to numbers in awk is easy, and so is the reformatting, as long as you don't need the (additional) validation that date does 'for free':
$ echo this 16-FEB-2008 14:17:59.994669 that \
> | awk '{ split($2,d,"-"); split($3,t,"[:.]");
m=sprintf("%02d",index("JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC",d[2])/4+1);
print $1,d[3] m d[1] t[1] t[2] t[3] t[4] "000",$4 }'
this 20080216141759994669000 that
$ # or you can put the script in a file and run it with awk -f
$ # or put the whole thing in a shebang file starting with #!/bin/awk -f
This is not much longer than the code to run date, and it is much more efficient for 'millions of lines' because it does not start a new process for every line.
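If you want a little of that validation back, the month lookup can be moved into a function that rejects unknown names. This is only a minimal sketch under the same assumptions as the one-liner above (date in field 2, time in field 3, two-digit day, six-digit fraction); the names convert_ts and month_num are made up for illustration and are not part of the question.

```shell
# Hypothetical wrapper: reads lines on stdin, rewrites fields 2 and 3 as one timestamp.
convert_ts() {
    awk '
    # month_num is an illustrative helper: returns 1..12, or 0 for an unknown name
    function month_num(name,    pos) {
        pos = index("JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC", toupper(name))
        # a valid name is 3 characters long and starts at position 1, 5, 9, ...
        return (length(name) == 3 && pos % 4 == 1) ? (pos + 3) / 4 : 0
    }
    {
        split($2, d, "-"); split($3, t, "[:.]")
        m = month_num(d[2])
        if (m == 0) { print "bad month on line " NR ": " d[2] > "/dev/stderr"; next }
        # pad the 6-digit microseconds with "000" to reach nanosecond width
        print $1, d[3] sprintf("%02d", m) d[1] t[1] t[2] t[3] t[4] "000", $4
    }'
}
```

Usage would be something like `echo this 16-FEB-2008 14:17:59.994669 that | convert_ts`. Lines with an unrecognised month are reported on stderr and skipped; days, hours and so on are still not range-checked.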
You can call an external command like this:
awk '{
cmd="date -d \""$0"\" +%Y%m%d%H%M%S%N"
cmd | getline ts
print $0, ts
# awk opened a pipe for the communication with
# the command. close that pipe to avoid running
# out of file descriptors
close(cmd)
}' <<< '16-FEB-2008 14:17:59.994669'
Output:
16-FEB-2008 14:17:59.994669 20080216141759994669000
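One thing to watch with this approach: if date rejects a line, it prints nothing on stdout, getline returns 0, and ts silently keeps the value from the previous line. A hedged sketch of how that could be guarded against, assuming GNU date as above (the wrapper name stamp_lines and the INVALID marker are arbitrary choices for this example):

```shell
# Hypothetical wrapper around the getline approach; requires GNU date.
stamp_lines() {
    awk '{
        cmd = "date -d \"" $0 "\" +%Y%m%d%H%M%S%N 2>/dev/null"
        ts = ""                      # reset so a failed call cannot reuse the previous value
        if ((cmd | getline ts) > 0)
            print $0, ts
        else
            print $0, "INVALID"      # placeholder marker chosen for this sketch
        close(cmd)                   # still close the pipe on every line
    }'
}
```

Also keep in mind that $0 is pasted into a shell command line, so this should only be used on input you trust; a line containing double quotes or $(...) would be interpreted by the shell.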
Thanks to dave_thompson_085's comment: you can significantly improve the performance if you have date from GNU coreutils and gawk. GNU date supports reading dates from stdin, and gawk supports co-processes, which makes it possible to start a single instance of date in the background, write to its stdin, and read from its stdout:
{
cmd = "stdbuf -oL date -f /dev/stdin +%Y%m%d%H%M%S%N"
print $0 |& cmd
cmd |& getline ts
print $0, ts
}
Note that you also need the stdbuf command to force date to output its results line by line.
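For completeness, here is how the co-process rule might look as a full invocation. Treat it as a sketch: the wrapper name stamp_lines_coproc is invented for this example, it needs gawk, GNU date and stdbuf, and it assumes every input line is a date that date can parse. A line that date rejects produces no output on stdout, so the getline would wait indefinitely.

```shell
# Hypothetical wrapper; needs gawk plus GNU date and stdbuf (coreutils).
stamp_lines_coproc() {
    gawk '
    BEGIN { cmd = "stdbuf -oL date -f /dev/stdin +%Y%m%d%H%M%S%N" }
    {
        print $0 |& cmd      # send the raw date text to the single date process
        cmd |& getline ts    # read back exactly one formatted line
        print $0, ts
    }
    END { close(cmd) }       # shut down the co-process once input is exhausted
    '
}
```

Called as `stamp_lines_coproc < dates.txt` (dates.txt being a placeholder file name), this starts date once for the whole input instead of once per line.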