I have a data set in the format shown below.
The first and second fields are the start and end dates (M/D/YYYY) of a study.
How can one expand the data into the desired output format, taking leap years into account, using AWK or Bash scripts?
Your help is very much appreciated.
Input
7/2/2009 7/7/2009
2/28/1996 3/3/1996
12/30/2001 1/4/2002
Desired Output
7/7/2009
7/6/2009
7/5/2009
7/4/2009
7/3/2009
7/2/2009
3/3/1996
3/2/1996
3/1/1996
2/29/1996
2/28/1996
1/4/2002
1/3/2002
1/2/2002
1/1/2002
12/31/2001
12/30/2001
It can be done nicely in bash with seq and GNU date:
for i in $(seq 1 5)
do
    date -d "2017-12-01 $i days" +%Y-%m-%d
done
or with pipes:
seq 1 5 | xargs -I {} date -d "2017-12-01 {} days" +%Y-%m-%d
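The two commands above count forward from a fixed date and need the number of days up front. A minimal sketch of adapting the same seq and date idea to the question's input might look like the following. It assumes GNU date; the function name expand_range is made up for illustration, and -u is used so that the day count is computed in UTC.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the answer above): print every date from
# END back to START, one per line, in M/D/YYYY form. Assumes GNU date.
expand_range() {
    local start=$1 end=$2 s e n i
    s=$(date -u -d "$start" +%s)
    e=$(date -u -d "$end" +%s)
    n=$(( (e - s) / 86400 ))      # whole days between the two dates (UTC)
    for i in $(seq 0 "$n"); do
        # "N days ago" steps back by calendar days from END
        date -u -d "$end $i days ago" +%-m/%-d/%Y
    done
}
```

Calling expand_range 2/28/1996 3/3/1996 should list 3/3/1996 down to 2/28/1996, including 2/29/1996. To process a whole file, wrap the call in a while read -r start end loop fed from the input file.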
If you have gawk:
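#!/usr/bin/gawk -f
{
    # Split M/D/YYYY into month, day, year
    split($1, s, "/")
    split($2, e, "/")
    # mktime() takes "YYYY MM DD HH MM SS" and returns epoch seconds (local time)
    st = mktime(s[3] " " s[1] " " s[2] " 0 0 0")
    et = mktime(e[3] " " e[1] " " e[2] " 0 0 0")
    # Walk backwards from the end date in 24-hour steps
    for (i = et; i >= st; i -= 60*60*24)
        print strftime("%m/%d/%Y", i)
}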
Demonstration:
./daterange.awk inputfile
Output:
07/07/2009
07/06/2009
07/05/2009
07/04/2009
07/03/2009
07/02/2009
03/03/1996
03/02/1996
03/01/1996
02/29/1996
02/28/1996
01/04/2002
01/03/2002
01/02/2002
01/01/2002
12/31/2001
12/30/2001
Edit:
The script above naively assumes that every day is exactly 60*60*24 seconds long. It's a minor nit, but it could produce unexpected results under some circumstances, such as a range that spans a daylight saving time change. At least one other answer here has the same problem. Presumably, the date command does not have this issue when subtracting (or adding) a number of days.
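To make the day-length concern concrete, here is a small sketch, assuming GNU date, that measures the seconds between local midnight on two consecutive dates. The helper name seconds_between and the POSIX TZ string (US Eastern rules, chosen only for illustration) are assumptions and not part of the answer.

```shell
#!/usr/bin/env bash
# Hypothetical helper: seconds between local midnight on two dates (YYYY-MM-DD).
# The TZ string encodes US Eastern rules and is only an illustrative assumption.
seconds_between() {
    local tz='EST5EDT,M3.2.0,M11.1.0'
    local a b
    a=$(TZ=$tz date -d "$1" +%s)
    b=$(TZ=$tz date -d "$2" +%s)
    echo $(( b - a ))
}

seconds_between 2009-07-02 2009-07-03   # an ordinary day
seconds_between 2009-03-08 2009-03-09   # clocks spring forward on 2009-03-08
```

Under these rules the second call should report 82800 rather than 86400, so stepping back from midnight in fixed 86400-second steps can land on the wrong calendar day and skip a date.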
Some answers require you to know the number of days in advance.
Here's another method which hopefully addresses those concerns:
while read -r d1 d2
do
    t1=$(date -d "$d1 12:00 PM" +%s)
    t2=$(date -d "$d2 12:00 PM" +%s)
    if ((t2 > t1)) # swap times/dates if needed
    then
        temp_t=$t1; temp_d=$d1
        t1=$t2; d1=$d2
        t2=$temp_t; d2=$temp_d
    fi
    t3=$t1
    days=0
    while ((t3 > t2))
    do
        read -r -u 3 d3 t3 3<<< "$(date -d "$d1 12:00 PM - $days days" '+%m/%d/%Y %s')"
        ((++days))
        echo "$d3"
    done
done < inputfile
You can do this in the shell without awk, assuming you have GNU date (which is needed for the date -d @nnn form, and possibly for the ability to strip leading zeros on single-digit days and months):
while read -r start end; do
    for d in $(seq $(date +%s -d "$end") -86400 $(date +%s -d "$start")); do
        date +%-m/%-d/%Y -d "@$d"
    done
done
If your time zone observes daylight saving time, this can go wrong when the requested date range spans a daylight saving switch. Use -u to force UTC, in which every day is exactly 86400 seconds long. Like this:
while read -r start end; do
    for d in $(seq $(date -u +%s -d "$end") -86400 $(date -u +%s -d "$start")); do
        date -u +%-m/%-d/%Y -d "@$d"
    done
done
Just feed this your input on stdin.
The output for your data is:
7/7/2009
7/6/2009
7/5/2009
7/4/2009
7/3/2009
7/2/2009
3/3/1996
3/2/1996
3/1/1996
2/29/1996
2/28/1996
1/4/2002
1/3/2002
1/2/2002
1/1/2002
12/31/2001
12/30/2001