BASH: Finding maximum value in a particular CSV column

Tags: grep, bash, csv, sed, awk

I have a CSV file million_songs_metadata_and_sales.csv having the following schema.

track_id    
sales_date  
sales_count
title
song_id 
release 
artist_id   
artist_mbid 
artist_name 
duration    
artist_familiarity  
artist_hotttnesss
year

Sample data:

TRZZZZZ12903D05E3A,2014-06-19,79,Infra Stellar,SOZPUEF12AF72A9F2A,Archives Vol. 2,ARBG8621187FB54842,4279aba0-1bde-40a9-8fb2-c63d165dc554,Delerium,495.22893,0.69652442519,0.498471038842,2001

I need to write a query in BASH to find the artist_name with maximum sales using the file million_songs_metadata_and_sales.csv.

I have written the following script but it fails to give me the correct data:

awk 'max=="" || $3 > max {max=$3} END{ print $9}' FS="," million_songs_metadata_and_sales.csv

Is there a workaround for this issue? Thanks!

Asked by AngryPanda, Feb 11 '23 19:02


2 Answers

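Field references such as $9 are only meaningful while awk is processing a line. In the END block, $9 belongs to the last line read (or is empty in some awk implementations), not to the line with the largest sales_count. Store the name in a variable whenever the maximum changes: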

$ cat file.csv
TRZZZZZ12903D05E3A,2014-06-19,77,Infra Stellar,SOZPUEF12AF72A9F2A,Archives Vol. 2,ARBG8621187FB54842,4279aba0-1bde-40a9-8fb2-c63d165dc554,Delerium 1,495.22893,0.69652442519,0.498471038842,2001
TRZZZZZ12903D05E3A,2014-06-19,79,Infra Stellar,SOZPUEF12AF72A9F2A,Archives Vol. 2,ARBG8621187FB54842,4279aba0-1bde-40a9-8fb2-c63d165dc554,Delerium,495.22893,0.69652442519,0.498471038842,2001
TRZZZZZ12903D05E3A,2014-06-19,78,Infra Stellar,SOZPUEF12AF72A9F2A,Archives Vol. 2,ARBG8621187FB54842,4279aba0-1bde-40a9-8fb2-c63d165dc554,Delerium 2,495.22893,0.69652442519,0.498471038842,2001
$ awk 'BEGIN { max=0 } $3 > max { max=$3; name=$9 } END { print name }' FS="," file.csv
Delerium
$
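
A slightly more defensive variant of the command above can be sketched as follows. It wraps the same awk logic in a shell function (the name top_artist is hypothetical), and adds 0 to $3 to force a numeric comparison, which also makes a header row evaluate to 0 and get skipped. It still assumes that no field contains a comma and that at least one sales_count is positive.

```shell
# Hypothetical helper: print the artist_name (field 9) of the row with the
# largest sales_count (field 3). Assumes no field contains a comma.
top_artist() {
    awk -F, '
        # $3 + 0 forces a numeric comparison; a non-numeric header value
        # becomes 0 and never beats a positive count.
        $3 + 0 > max { max = $3 + 0; name = $9 }
        END { print name }
    ' "$1"
}
```

It would be called as: top_artist million_songs_metadata_and_sales.csv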
Answered by pynexj, Feb 13 '23 09:02


The pipeline

cut -d, -f3,9 < data.csv | sort -nr | head -1

will do it. It prints the largest sales_count together with the matching artist_name.

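It will give wrong results, however, as soon as any column contains a comma (for example, a quoted title such as "Hello, World"), because cut splits on every comma. For correct CSV parsing you need to use a CSV-parsing library.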
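
As one possible illustration of the library route, the sketch below hands the parsing to Python's standard csv module from within a shell function. The function name top_artist_csv is hypothetical, and the sketch assumes python3 is available on PATH; rows whose third column is not a plain non-negative integer (such as a header row) are skipped.

```shell
# Hypothetical helper: CSV-aware variant that delegates parsing to Python's
# standard csv module, so quoted fields containing commas stay intact.
# Assumes python3 is on PATH; column indexes are 0-based inside Python.
top_artist_csv() {
    python3 - "$1" <<'PY'
import csv
import sys

best_count, best_name = None, None
with open(sys.argv[1], newline="") as f:
    for row in csv.reader(f):
        # Skip a header row or any short/malformed row.
        if len(row) < 9 or not row[2].strip().isdigit():
            continue
        count = int(row[2])
        if best_count is None or count > best_count:
            best_count, best_name = count, row[8]

if best_name is not None:
    print(best_name)
PY
}
```

It would be called as: top_artist_csv million_songs_metadata_and_sales.csv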

Answered by jm666, Feb 13 '23 07:02