recently i've started to learn bash scripting and im wondering how i can count occurences in a column of a .csv file, the file is structured like this:
DAYS,SOMEVALUE,SOMEVALUE
sunday,something,something
monday,something,something
wednesday,something,something
sunday,something,something
monday,something,something
so my question is: how can i count each time every value of first column (days) appear? In this case the outputs must be:
Sunday : 2
Monday : 2
Wednesday: 1
The first column is named DAYS, so the script has to not take care of the single value DAYS, DAYS is just a way to identify the column.
if possible i want to see a solution without the awk command and without phyton ecc..
Thx guys and sorry for my bad English
Edit: I thought to do this:
count="$( cat "${FILE}" | grep -c "OCCURENCE")"
echo "OCCURENCE": ${count}
Where OCCURENCE is the single values (sunday,monday...) But this solution is not automatic, i need to make a list of single occurences in the first column of .csv file and put each one on an array and then count each one with the code i written before. I need some help to do this thx
import csv f = 'testfile. csv' d = '\t' reader = csv. reader(f,delimiter=d) for row in reader: if reader. line_num == 1: fields = len(row) if len(row) !=
All what has left is to simply use wc command to count number of characters.
Method 1: Using head + sed + wc Commands As you might have noted, the above command has outputted 6 commas. We will again pipe this output to the wc command which will count these comma characters plus the carriage (\n) character as the total number of columns.
cut -f1 -d, test.csv | tail -n +2 | sort | uniq -c
This gets you this far:
2 monday
2 sunday
1 wednesday
To get your format (Sunday : 1
), I think awk
would be an easy and clear way (something like awk '{print $2 " : " $1}'
, but if you really really must, here's a complete non-awk version:
cut -f1 -d, test.csv | uniq -c | tail -n +2 | while read line; do words=($line); echo ${words[1]} : ${words[0]}; done
A variation of @sneep's answer that uses sed
to format the result:
cut -f1 -d, /tmp/data | tail -n +2 | sort | uniq -c | sed 's|^ *\([0-9]*\) \(.*\)|\u\2: \1|g'
Output:
Monday: 2
Sunday: 2
Wednesday: 1
The sed
is matching:
^ *
: Beginning of line and then any number of spaces\([0-9]*\)
: Any number of numbers (storing them in a group \1)
: A single space\(.*\)
: Any character until the end, storing it in group \2And replaces the match with:
\u\2
: Second group, capitalizing first character: \1
: Colon, space and the first groupIf you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With