
How to change the date format in a log file using bash, avoiding a while loop

Similar questions have been asked before (here and here), but the details of mine differ.

My input log file looks like:

TEMP MON -=- Sat Aug 15 02:20:24 EEST 2020 -=- 48.6
TEMP MON -=- Sat Aug 15 02:20:50 EEST 2020 -=- 49.1
TEMP MON -=- Sat Aug 15 02:21:13 EEST 2020 -=- 49.1
TEMP MON -=- Sat Aug 15 02:21:44 EEST 2020 -=- 49.1
TEMP MON -=- Sat Aug 15 02:21:45 EEST 2020 -=- 48.6
TEMP MON -=- Sat Aug 15 02:21:52 EEST 2020 -=- 49.1
TEMP MON -=- Sat Aug 15 02:21:53 EEST 2020 -=- 48.6
TEMP MON -=- Sat Aug 15 02:21:54 EEST 2020 -=- 49.6
TEMP MON -=- Sat Aug 15 02:21:56 EEST 2020 -=- 49.1
TEMP MON -=- Sat Aug 15 02:21:57 EEST 2020 -=- 49.1

and the output should look like:

TEMP MON -=- 2020-08-15_02:20:24 EEST -=- 48.6
...

So it is simple enough to change the format of a date in bash using

date -d "${date_in_current_format}" "+DATE_IN_NEW_FORMAT"

It is also possible (albeit inefficient) to iterate over the log file using a while loop and change the dates line by line (see the 1st link again).

However, I am looking for a bash solution that uses sed or perl (or awk or anything else for that matter) to carry out the same task.

The closest I have come, which still does not work, is the following pair of search-and-replace commands:

perl -pe "s/(.*) -=- (.*) -=- (.*)/\1 -=- $( date \2 "+%Z %Y-%m-%d_%H:%M:%S" ) -=- \3/" <file>

and with sed something similar:

sed "s:\(.*\) -=- \(.*\) -=- \(.*\):\1 -=- $( date -d \2 "+%Z %Y-%m-%d_%H:%M:%S" ) -=- \3:" <file>

In both cases the problem is that I cannot get the backreference "\2" expanded inside the date command substitution.
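
A likely reason, shown as a minimal sketch: because the sed or perl program is written inside double quotes, bash runs the $( ... ) command substitution first and pastes its output into the program text. The inner command therefore runs once, before any line has been matched, and never sees the captured text. The variable name program and the printf stand-in for date are only for illustration.

```bash
# bash expands $( ... ) inside double quotes before sed or perl is started.
# The inner command only receives what is typed in the script: here the
# shell strips the backslash from \2 and passes a plain "2" to printf.
program="s/\(.*\)/$(printf '%s' 'expanded-by-shell:' \2)/"

# This is the program text that sed would actually be given.
printf '%s\n' "$program"
```

This prints s/\(.*\)/expanded-by-shell:2/, so the replacement text is already fixed by the time sed or perl reads the first line.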

nass asked Dec 30 '22 21:12


2 Answers

With awk you can do this using only string functions, with no need for the GNU awk time functions or the external date command, because the task only requires converting the month name to a number and re-ordering the fields.

> cat tst.awk
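BEGIN { OFS=FS="-=-" }    # split and re-join fields on the "-=-" separator
{
    # $2 looks like " Sat Aug 15 02:20:24 EEST 2020 ": weekday, month, day, time, zone, year
    split($2, arr, " ")
    # the position of the month name in this string gives its number (Jan=1 ... Dec=12)
    m=(index("JanFebMarAprMayJunJulAugSepOctNovDec", arr[2])+2)/3
    $2=sprintf(" %04d-%02d-%02d_%s %s ", arr[6], m, arr[3], arr[4], arr[5])
    print
}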

Usage:

> awk -f tst.awk file
TEMP MON -=- 2020-08-15_02:20:24 EEST -=- 48.6
TEMP MON -=- 2020-08-15_02:20:50 EEST -=- 49.1
TEMP MON -=- 2020-08-15_02:21:13 EEST -=- 49.1
TEMP MON -=- 2020-08-15_02:21:44 EEST -=- 49.1
TEMP MON -=- 2020-08-15_02:21:45 EEST -=- 48.6
TEMP MON -=- 2020-08-15_02:21:52 EEST -=- 49.1
TEMP MON -=- 2020-08-15_02:21:53 EEST -=- 48.6
TEMP MON -=- 2020-08-15_02:21:54 EEST -=- 49.6
TEMP MON -=- 2020-08-15_02:21:56 EEST -=- 49.1
TEMP MON -=- 2020-08-15_02:21:57 EEST -=- 49.1
thanasisp answered Jan 13 '23 15:01


You may use this awk solution, which calls the external date command once per line:

awk 'BEGIN {
   FS=OFS=" -=- "
}
{
   cmd = sprintf("TZ=EET date -d \"%s\" +\"%%Y-%%m-%%d_%%T %%Z\"", $2);
   if ((cmd | getline output) > 0)
      $2 = output
   close(cmd)
} 1' file
TEMP MON -=- 2020-08-15_02:20:24 EEST -=- 48.6
TEMP MON -=- 2020-08-15_02:20:50 EEST -=- 49.1
TEMP MON -=- 2020-08-15_02:21:13 EEST -=- 49.1
TEMP MON -=- 2020-08-15_02:21:44 EEST -=- 49.1
TEMP MON -=- 2020-08-15_02:21:45 EEST -=- 48.6
TEMP MON -=- 2020-08-15_02:21:52 EEST -=- 49.1
TEMP MON -=- 2020-08-15_02:21:53 EEST -=- 48.6
TEMP MON -=- 2020-08-15_02:21:54 EEST -=- 49.6
TEMP MON -=- 2020-08-15_02:21:56 EEST -=- 49.1
TEMP MON -=- 2020-08-15_02:21:57 EEST -=- 49.1
anubhava answered Jan 13 '23 15:01