
Create multiple dataframes in loop

I have a list, with each entry being a company name

companies = ['AA', 'AAPL', 'BA', ....., 'YHOO']

I want to create a new dataframe for each entry in the list.

Something like

(pseudocode)

for c in companies:
     c = pd.DataFrame()

I have searched for a way to do this but can't find it. Any ideas?

Luis Ibáñez Herrera asked Jun 04 '15 04:06



3 Answers

Just to underline my comment to @maxymoo's answer, it's almost invariably a bad idea ("code smell") to add names dynamically to a Python namespace. There are a number of reasons, the most salient being:

  1. Created names might easily conflict with variables already used by your logic.

  2. Since the names are dynamically created, you typically also end up using dynamic techniques to retrieve the data.

This is why dicts were included in the language. The correct way to proceed is:

d = {}
for name in companies:
    d[name] = pd.DataFrame()

Nowadays you can write a single dict comprehension expression to do the same thing, but some people find it less readable:

d = {name: pd.DataFrame() for name in companies}

Once d is created the DataFrame for company x can be retrieved as d[x], so you can look up a specific company quite easily. To operate on all companies you would typically use a loop like:

for name, df in d.items():
    # operate on DataFrame 'df' for company 'name'
    print(name, df.shape)

In Python 2 you are better off writing

for name, df in d.iteritems():
    pass  # operate on DataFrame 'df' for company 'name'

because this avoids instantiating a list of (name, df) tuples.
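
As a rough illustration of the dict approach, here is a minimal sketch. The tickers, the `close` column, and its values are made up for the example, and `frames` is just a hypothetical name for the dict called `d` above:

```python
import pandas as pd

# Hypothetical tickers, used only for illustration.
companies = ['AA', 'AAPL', 'BA']

# One empty DataFrame per company, keyed by ticker.
frames = {name: pd.DataFrame() for name in companies}

# Replace one entry with a frame holding made-up values.
frames['AAPL'] = pd.DataFrame({'close': [100.0, 101.0]})

# Look up a single company by key...
aapl = frames['AAPL']

# ...or iterate over all of them.
row_counts = {name: len(df) for name, df in frames.items()}
print(row_counts)
```

Because the company name is an ordinary dict key, it cannot collide with a variable in your code, and no dynamic lookup is needed to get a frame back.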

holdenweb answered Oct 19 '22 08:10


You can do this (although obviously use exec with extreme caution if this is going to be public-facing code)

for c in companies:
     exec('{} = pd.DataFrame()'.format(c))

maxymoo answered Oct 19 '22 07:10


Adding to the answers above: they work well if you need empty DataFrames, but sometimes you need to create multiple DataFrames by filtering a larger one.

Suppose the list of companies is a column of some DataFrame df, and you want one DataFrame per unique company, taken from the bigger DataFrame:

  1. First take the unique names of the companies:

    compuniquenames = df.company.unique()

  2. Create a dictionary to store your data frames:

    companydict = {elem: pd.DataFrame() for elem in compuniquenames}

The two steps above are already covered by the earlier answers. The new step is to fill each entry with the rows for that company:

for key in companydict.keys():
    companydict[key] = df[df.company == key]

This gives you one data frame per unique company, containing only that company's matching records.
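
For comparison, the same split can probably be done in one pass with `groupby`. This is an alternative to the loop above, not what the answer itself uses. A minimal sketch, assuming a `company` column as in the answer and a made-up `value` column:

```python
import pandas as pd

# Made-up data; only the column name 'company' comes from the answer above.
df = pd.DataFrame({
    'company': ['AA', 'AAPL', 'AA', 'BA'],
    'value': [1, 2, 3, 4],
})

# Iterating over a groupby yields (key, sub-frame) pairs,
# so the dict can be built directly from them.
by_company = {name: group for name, group in df.groupby('company')}

print(sorted(by_company))
print(by_company['AA'])
```

Each value keeps the original row index of the bigger frame; call `reset_index(drop=True)` on a group if you want it renumbered.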

ak3191 answered Oct 19 '22 08:10