(This example has been edited, following a user's recommendation, to fix a mistake in how my table was displayed.)
I have a .csv table from which I need to extract certain information. My table looks like this:
Name, Birth
James,2001/02/03 California
Patrick,2001/02/03 Texas
Sarah,2000/03/01 Alabama
Sean,2002/02/01 New York
Michael,2002/02/01 Ontario
From this, I need to print only the unique birthdates, in ascending order, like this:
2000/03/01
2001/02/03
2002/02/01
I have thought of a regular expression to identify the dates, such as:
awk '/[0-9]{4}/[0-9]{2}/[0-9]/{2}/' students.csv
However, I'm getting a syntax error in the regex, and I don't know how to proceed from there.
Any hints?
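Use cut and sort with the -u option to print unique values. With the table as shown (a comma after the name and a space before the state), skip the header, take the second comma-separated field, then keep the part before the first space:
tail -n +2 students.csv | cut -d, -f2 | cut -d' ' -f1 | sort -u > out_file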
You can also use grep instead of cut:
grep -Po '\d\d\d\d/\d\d/\d\d' students.csv | sort -u > out_file
Here, GNU grep uses the following options:
-P : Use Perl regexes.
-o : Print only the matches (one match per line), not the entire lines.
SEE ALSO:
perlre - Perl regular expressions
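If -P is not available (it is a GNU grep extension), extended regexes should give the same result for this data. The following is a minimal sketch, not a definitive recipe: it assumes the question's sample is saved as students.csv (recreated here so the snippet runs on its own) and that your grep supports -o, as GNU and BSD grep do. The variable name dates is only for illustration.

```shell
# Recreate the sample table from the question (assumed file name: students.csv).
cat > students.csv <<'EOF'
Name, Birth
James,2001/02/03 California
Patrick,2001/02/03 Texas
Sarah,2000/03/01 Alabama
Sean,2002/02/01 New York
Michael,2002/02/01 Ontario
EOF

# -E: extended regexes, so {4} is an interval; -o: print only the matched text.
# In a grep pattern "/" is an ordinary character and needs no escaping.
dates=$(grep -Eo '[0-9]{4}/[0-9]{2}/[0-9]{2}' students.csv | sort -u)
printf '%s\n' "$dates"
```

The header line contains no date, so it produces no match and does not need to be skipped explicitly.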
Here is a GNU awk solution that does this in a single command:
awk -F', *' 'NR > 1 {split($2, a, " "); seen[a[1]]} END {
PROCINFO["sorted_in"]="@ind_str_asc"; for (i in seen) print i}' file
2000/03/01
2001/02/03
2002/02/01
Using any awk, this works whether your names have one word or more and whether or not blank characters follow the commas:
$ awk -F', *' 'NR>1{sub(/ .*/,"",$2); print $2}' file | sort -u
2000/03/01
2001/02/03
2002/02/01
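With your shown samples, please try the following. It was written and tested in GNU awk and should work in any awk that supports regex interval expressions such as {4}. Note that for (i in arrDate) visits the indices in an unspecified order, so pipe the output through sort if you need ascending order.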
awk '
match($0,/[0-9]{4}(\/[0-9]{2}){2}/){
  arrDate[substr($0,RSTART,RLENGTH)]
}
END{
  for(i in arrDate){
    print i
  }
}
'  Input_file
Explanation: a detailed, line-by-line explanation of the above.
awk '                                   ##Starting awk program from here.
match($0,/[0-9]{4}(\/[0-9]{2}){2}/){    ##using match function to match regex to match only date format.
  arrDate[substr($0,RSTART,RLENGTH)]    ##Creating array arrDate which has index as sub string of matched one.
}
END{                                    ##Starting END block of this awk program from here.
  for(i in arrDate){                    ##Traversing through arrDate here.
    print i                             ##Printing index of array here.
  }
}
'  Input_file                           ##Mentioning Input_file name here.
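As for the syntax error in the question's own attempt: inside an awk /.../ regex literal an unescaped / ends the regex, and the last quantifier was written as /{2} instead of {2}. Below is a hedged sketch of that attempt with the slashes escaped, using match() as above and sort -u for unique, ascending output. It spells out each digit instead of using {4} and {2} so that it does not rely on interval expressions, which some older awks lack. The file name students.csv comes from the question and is recreated here so the snippet is self-contained; the variable name dates is illustrative.

```shell
# Recreate the sample table from the question (assumed file name: students.csv).
cat > students.csv <<'EOF'
Name, Birth
James,2001/02/03 California
Patrick,2001/02/03 Texas
Sarah,2000/03/01 Alabama
Sean,2002/02/01 New York
Michael,2002/02/01 Ontario
EOF

# Each "/" inside the regex literal is escaped as "\/"; match() sets RSTART and
# RLENGTH, which substr() uses to print only the date, not the whole line.
dates=$(awk 'match($0, /[0-9][0-9][0-9][0-9]\/[0-9][0-9]\/[0-9][0-9]/) {
  print substr($0, RSTART, RLENGTH)
}' students.csv | sort -u)
printf '%s\n' "$dates"
```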