
Pythonic way to loop over dictionary

I am practicing Pandas and have the following task:

Create a list whose elements are the number of columns in each .csv file


The .csv file paths are stored in a dictionary named directory, keyed by year.

I use a dictionary comprehension to build dataframes (again keyed by year), which holds each .csv file as a pandas dataframe.

import pandas

directory = {2009: 'path_to_file/data_2009.csv', ... , 2018: 'path_to_file/data_2018.csv'}

dataframes = {year: pandas.read_csv(file) for year, file in directory.items()}

# My Approach 1 
columns = [df.shape[1] for year, df in dataframes.items()]

# My Approach 2
columns = [dataframes[year].shape[1] for year in dataframes]

Which way is more "Pythonic"? Or is there a better way to approach this?

Vivek Jha asked Dec 11 '22 09:12



1 Answer

Your method will get it done, but I don't like reading the entire file and creating a dataframe just to count the columns. You can get the same result by reading only the first line of each file and counting the commas. Notice that I add 1 because a line always has one fewer comma than it has columns. This assumes that no field in the first line contains a comma inside quotes.

columns = [open(f).readline().count(',') + 1 for f in directory.values()]
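If the header might contain quoted fields with commas in them, or if you would rather close each file explicitly, the same first-line idea can be sketched with the standard csv module. This is only a sketch, not a drop-in replacement: the helper names count_columns and columns_per_file are made up for illustration, and it assumes the files are ordinary comma-delimited text with the header on the first row.

```python
import csv


def count_columns(path):
    """Return the number of fields in the first row of a CSV file."""
    # newline="" is what the csv module recommends when opening files.
    with open(path, newline="") as handle:
        # Only the first row is parsed; an empty file yields an empty list.
        header = next(csv.reader(handle), [])
    return len(header)


def columns_per_file(directory):
    """Count columns for every path in a {year: path} dictionary."""
    # Only the paths are needed, so iterate over the values.
    return [count_columns(path) for path in directory.values()]
```

With the directory dictionary from the question, columns = columns_per_file(directory) would give one count per file, in the dictionary's insertion order.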
piRSquared answered Dec 20 '22 03:12
