
generate multiple pandas data frames

I am retrieving multiple data frames in CSV format from a website. I save the data frames in an empty list and then read them one by one. I cannot append them into a single data frame since they have different column names and column orders. So I have the following questions:

Can I create a data frame with a different name inside the loop I use to read the files, so that instead of saving them to a list I create a new data frame for every file retrieved? If this is not possible or not recommended, is there a way to iterate over my list to extract the data frames? Currently I read one data frame at a time, but I would love to automate this code to create something like data_1, data_2, etc. Right now my code is not terribly time-consuming since I only have 4 data frames, but this can become burdensome with more data. Here is my code:

import pandas as pd
import urllib2
import csv

#we write the names of the files in a list so we can iterate to download the files
periods=['2012-1st-quarter','2012-2nd-quarter', '2012-3rd-quarter', '2012-4th-quarter']
general=[]
#we generate a loop to read the files from the capital bikeshare website
for i in periods:
    url = 'https://www.capitalbikeshare.com/assets/files/trip-history-data/'+i+'.csv'
    response = urllib2.urlopen(url)
    x=pd.read_csv(response)
    general.append(x)
q1=pd.DataFrame(general[0])

Thanks!

asado23 asked Jan 18 '26 22:01


1 Answer

It would be better to use a dict keyed by period rather than a separate variable for each file. You can also pass a URL directly to pandas.read_csv, so the simplified code would look like this:

import pandas as pd

periods = ['2012-1st-quarter','2012-2nd-quarter', '2012-3rd-quarter', '2012-4th-quarter']
url = 'https://www.capitalbikeshare.com/assets/files/trip-history-data/{}.csv'
d = {period: pd.read_csv(url.format(period)) for period in periods}

Then you can access a specific DataFrame like this:

 d['2012-4th-quarter']
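
As a rough offline sketch of the same dict pattern, the snippet below builds the frames from in-memory CSV text instead of the website. The CSV contents and column names are invented for illustration, and deliberately differ between the two periods to show that each frame keeps its own columns:

```python
import io
import pandas as pd

# Hypothetical stand-ins for the downloaded files; the contents are made up
# and the columns differ in name and order between the two periods.
raw_csv = {
    '2012-1st-quarter': "Duration,Start station\n10,A\n20,B\n",
    '2012-2nd-quarter': "Start station,Bike number,Duration\nC,W001,30\n",
}

# One DataFrame per period, built the same way as d above.
frames = {period: pd.read_csv(io.StringIO(text))
          for period, text in raw_csv.items()}

q1 = frames['2012-1st-quarter']
q2 = frames['2012-2nd-quarter']
print(list(q1.columns))
print(list(q2.columns))
```

Because every frame is stored under its own key, mismatched columns are not a problem; nothing is appended or aligned.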

To iterate through all DataFrames:

for period, df in d.items():
    print(period)
    print(df)
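
If the frames are already sitting in a list, like general in the question, one possible way to get names such as data_1, data_2 is to key a dict by position instead of creating separate variables. This is only a sketch; the two small frames below are hypothetical placeholders for the downloaded data:

```python
import pandas as pd

# Hypothetical list standing in for `general` from the question.
general = [
    pd.DataFrame({'Duration': [10, 20]}),
    pd.DataFrame({'Start station': ['A'], 'Duration': [30]}),
]

# Key each frame as data_1, data_2, ... rather than making new variables.
named = {'data_{}'.format(i): df for i, df in enumerate(general, start=1)}

for name, df in named.items():
    print(name, df.shape)
```

A specific frame is then available as named['data_1'], and adding more files only lengthens the list, not the code.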

elyase answered Jan 20 '26 13:01