I have a huge file, and it has around 200 lines like this:
started at Wed Jun 5 08:45:01 PM +0330 2024 -- ended at Wed Jun 5 10:35:34 PM +0330 2024.
started at Thu Jun 6 01:30:01 AM +0330 2024 -- ended at Thu Jun 6 03:17:18 AM +0330 2024.
started at Thu Jun 6 07:30:01 AM +0330 2024 -- ended at Thu Jun 6 09:19:19 AM +0330 2024.
started at Thu Jun 6 01:30:01 PM +0330 2024 -- ended at Thu Jun 6 03:19:16 PM +0330 2024.
I want to convert the lines in the old format, which is the output of $(date), to the new format, $(date '+%Y-%m-%d %H:%M').
How can I do that? Is it even possible?
Expected output is:
started at 2024-06-05 20:45:01 -- ended at 2024-06-05 22:35:34.
started at 2024-06-06 01:30:01 -- ended at 2024-06-06 03:17:18.
started at 2024-06-06 07:30:01 -- ended at 2024-06-06 09:19:19.
started at 2024-06-06 13:30:01 -- ended at 2024-06-06 15:19:16.
Since the whole log file uses the +0330 offset, I use TZ=Asia/Tehran, as this seems to match your time zone. Better still, use your own locale settings.
If each line of your log file contains exactly two dates to convert, you could try something like:
sed < datedlogs.txt 's/^started at \(.*\) +0330 \(.*\) -- ended at \(.*\) +0330 \(.*\)\./TZ="Asia\/Tehran" \1 \2\nTZ="Asia\/Tehran" \3 \4/' |
TZ="Asia/Tehran" date -f - +'%F %T' |
paste -d + - - |
sed 's/^\(.*\)+\(.*\)$/started at \1 -- ended at \2/'
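Based on your sample, this produces the following (the trailing period is dropped; add a . at the end of the last sed replacement if you want to keep it):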
started at 2024-06-05 20:45:01 -- ended at 2024-06-05 22:35:34
started at 2024-06-06 01:30:01 -- ended at 2024-06-06 03:17:18
started at 2024-06-06 07:30:01 -- ended at 2024-06-06 09:19:19
started at 2024-06-06 13:30:01 -- ended at 2024-06-06 15:19:16
started at 2024-06-06 19:30:01 -- ended at 2024-06-06 21:16:15
started at 2024-06-07 01:30:01 -- ended at 2024-06-07 03:17:47
started at 2024-06-07 07:30:01 -- ended at 2024-06-07 09:03:05
started at 2024-06-07 13:30:01 -- ended at 2024-06-07 15:19:55
started at 2024-06-07 19:30:01 -- ended at 2024-06-07 21:17:41
started at 2024-06-08 01:30:01 -- ended at 2024-06-08 03:18:12
started at 2024-06-08 07:30:01 -- ended at 2024-06-08 09:20:31
started at 2024-06-08 13:30:01 -- ended at 2024-06-08 15:19:16
started at 2024-06-08 19:30:01 -- ended at 2024-06-08 21:20:01
started at 2024-06-09 01:30:01 -- ended at 2024-06-09 03:15:19
started at 2024-06-09 07:30:01 -- ended at 2024-06-09 09:19:07
started at 2024-06-09 13:30:01 -- ended at 2024-06-09 15:16:44
started at 2024-06-09 19:30:01 -- ended at 2024-06-09 21:15:16
started at 2024-06-10 01:30:01 -- ended at 2024-06-10 03:17:37
started at 2024-06-10 07:30:01 -- ended at 2024-06-10 09:16:38
started at 2024-06-10 13:30:01 -- ended at 2024-06-10 15:17:45
This is fast, because the date command runs only once for the whole file.
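To see why a single date process is enough, it can help to look at the stages separately. The following is only a minimal sketch, assuming GNU sed (for \n in the replacement) and a POSIX paste. split_dates and rejoin are hypothetical helper names, and the date stage is deliberately left out so that the intermediate text stays visible.

```shell
# Stage 1: emit two lines per log entry, each prefixed with TZ="Asia/Tehran",
# so that a single `date -f -` can read them all from one stream.
# (split_dates is a hypothetical name used only for this illustration.)
split_dates() {
    sed 's/^started at \(.*\) +0330 \(.*\) -- ended at \(.*\) +0330 \(.*\)\./TZ="Asia\/Tehran" \1 \2\nTZ="Asia\/Tehran" \3 \4/'
}

# Stages 3 and 4: glue every two converted lines back into one entry.
# (rejoin is likewise a hypothetical name.)
rejoin() {
    paste -d + - - |
    sed 's/^\(.*\)+\(.*\)$/started at \1 -- ended at \2/'
}

# What sed hands to date for one log line:
printf '%s\n' 'started at Wed Jun 5 08:45:01 PM +0330 2024 -- ended at Wed Jun 5 10:35:34 PM +0330 2024.' | split_dates

# What paste and the last sed do with two already-converted timestamps:
printf '%s\n' '2024-06-05 20:45:01' '2024-06-05 22:35:34' | rejoin
```

The first command prints two lines of the form TZ="Asia/Tehran" Wed Jun 5 08:45:01 PM 2024, and the second prints a single rejoined entry. In the real pipeline, date -f - sits between the two helpers.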
... Or better:
sed < datedlogs.txt 's/^started at \(.*\) \([+-][0-2][0-9][0-5][0-9]\) \(.*\) -- ended at \(.*\) \([+-][0-2][0-9][0-5][0-9]\) \(.*\)\./TZ="\2" \1 \3\nTZ="\5" \4 \6/' |
TZ="Asia/Tehran" date -f - +'%F %T' |
paste -d + - - |
sed 's/^\(.*\)+\(.*\)$/started at \1 -- ended at \2/'
Here the original time zone offsets are extracted from the input instead of being hard-coded.
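As a quick illustration of what this variant feeds to date, here is the first sed stage on its own. This is only a sketch, assuming GNU sed; extract_tz is a hypothetical name. Whether your date accepts a bare offset such as +0330 inside TZ="..." is worth checking on your own system before relying on it.

```shell
# First stage of the second pipeline: move each offset into a TZ="..." prefix.
# (extract_tz is a hypothetical helper name used only for this illustration.)
extract_tz() {
    sed 's/^started at \(.*\) \([+-][0-2][0-9][0-5][0-9]\) \(.*\) -- ended at \(.*\) \([+-][0-2][0-9][0-5][0-9]\) \(.*\)\./TZ="\2" \1 \3\nTZ="\5" \4 \6/'
}

# Show the two lines generated for a single log entry.
printf '%s\n' 'started at Thu Jun 6 01:30:01 AM +0330 2024 -- ended at Thu Jun 6 03:17:18 AM +0330 2024.' | extract_tz
```

Each log entry becomes two lines of the form TZ="+0330" Thu Jun 6 01:30:01 AM 2024, one for the start and one for the end.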
Perl can handle this.
First, create a log file with mixed timestamps:
cat >logfile <<END
started at Thu Jul 18 01:30:01 PM +0330 2024 -- ended at Thu Jul 18 05:48:36 PM +0330 2024.
started at Fri Jul 19 01:30:01 AM +0330 2024 -- ended at Fri Jul 19 04:47:38 AM +0330 2024.
started at Fri Jul 19 07:30:01 AM +0330 2024 -- ended at Fri Jul 19 10:43:25 AM +0330 2024.
started at Fri Jul 19 01:30:01 PM +0330 2024 -- ended at Fri Jul 19 05:51:24 PM +0330 2024.
started at 2024-07-19 19:30 -- ended at 2024-07-19 23:43.
started at 2024-07-20 01:30 -- ended at 2024-07-20 04:48.
started at 2024-07-20 07:30 -- ended at 2024-07-20 10:55.
END
Then normalize them:
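perl -MTime::Piece -pe '
    # Drop the numeric UTC offset (e.g. "+0330 ") so strptime sees plain local time
    s/[+-]\d{4} //g;
    # Rewrite each old-style timestamp; lines already in the new format are left untouched
    s{(started|ended) at \K(\w{3} \w{3} +\d{1,2} [\d:]{8} .. \d{4})}
     { Time::Piece->strptime($2, "%a %b %e %r %Y")->strftime("%Y-%m-%d %H:%M") }ge;
' logfile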
Output:
started at 2024-07-18 13:30 -- ended at 2024-07-18 17:48.
started at 2024-07-19 01:30 -- ended at 2024-07-19 04:47.
started at 2024-07-19 07:30 -- ended at 2024-07-19 10:43.
started at 2024-07-19 13:30 -- ended at 2024-07-19 17:51.
started at 2024-07-19 19:30 -- ended at 2024-07-19 23:43.
started at 2024-07-20 01:30 -- ended at 2024-07-20 04:48.
started at 2024-07-20 07:30 -- ended at 2024-07-20 10:55.
The first s/// command is needed to remove the time zone offset from the old timestamp format. With the offset present, Time::Piece used it to convert the parsed timestamp to UTC, so I was seeing 10:00 instead of 13:30.
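The 10:00 figure is simply the local time minus the offset. Below is a minimal sketch of that arithmetic in plain POSIX shell. to_utc is a hypothetical helper that takes hours, minutes, and an east-of-UTC offset in hours and minutes; it does not track the date, and its arguments must be given without leading zeros.

```shell
# to_utc H M OFF_H OFF_M: subtract an east-of-UTC offset from a local time.
# Hypothetical helper for illustration only. It wraps around midnight but
# does not adjust the date. Pass plain numbers (8, not 08), because shell
# arithmetic would read a leading zero as octal.
to_utc() {
    total=$(( ($1 * 60 + $2) - ($3 * 60 + $4) ))
    total=$(( (total + 1440) % 1440 ))
    printf '%02d:%02d\n' $(( total / 60 )) $(( total % 60 ))
}

to_utc 13 30 3 30   # 13:30 at +0330 -> prints 10:00
```

With GNU date, the same conversion can be cross-checked using TZ=UTC0 date -d '2024-07-18 13:30 +0330' +%H:%M.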