
python looping and creating new dataframe for each value of a column

Tags: python, loops

I want to create a new dataframe for each unique value of station.

I tried the loop below, but it leaves only the last station's data in the dataframe tai_new.i.

tai['station'].unique() has 500 values.

for i in tai['station'].unique():
   tai_new.i = tai[tai_2['station'] ==i]

Another approach would be to build a separate list of stations first:

tai_stations = tai['station'].unique()

and then write two loops over it. However, I do not want to type 500 if-conditions.

harold_noobie asked Sep 28 '17 18:09



1 Answer

You can create a dict of DataFrames by converting the groupby object to a tuple of (station, DataFrame) pairs and then to a dict:

dfs = dict(tuple(tai.groupby('station')))
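The one-liner works because iterating over a groupby object yields (group key, sub-DataFrame) pairs. As a minimal sketch of that idea, assuming a small made-up frame (the values here are illustrative, not the asker's data), an explicit dict comprehension should build the same mapping:

```python
import pandas as pd

# Small illustrative frame; the values are made up for this sketch.
tai = pd.DataFrame({'station': list('aabbcc'),
                    'B': [4, 5, 4, 5, 5, 4]})

# Iterating a groupby object yields (group key, sub-DataFrame) pairs.
pairs = list(tai.groupby('station'))

# dict(tuple(...)) consumes those pairs directly ...
dfs_a = dict(tuple(tai.groupby('station')))

# ... and a dict comprehension spells out the same thing step by step.
dfs_b = {name: group for name, group in tai.groupby('station')}
```

Either form gives one DataFrame per station, keyed by the station value, so no variable has to be created by hand for each station.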

Sample:

import pandas as pd

tai = pd.DataFrame({'A':list('abcdef'),
                   'B':[4,5,4,5,5,4],
                   'C':[7,8,9,4,2,3],
                   'D':[1,3,5,7,1,0],
                   'E':[5,3,6,9,2,4],
                   'station':list('aabbcc')})

print (tai)
   A  B  C  D  E station
0  a  4  7  1  5       a
1  b  5  8  3  3       a
2  c  4  9  5  6       b
3  d  5  4  7  9       b
4  e  5  2  1  2       c
5  f  4  3  0  4       c

dfs = dict(tuple(tai.groupby('station')))

#select each DataFrame by key - name of station
print (dfs['a'])
   A  B  C  D  E station
0  a  4  7  1  5       a
1  b  5  8  3  3       a

print (type(dfs['a']))
<class 'pandas.core.frame.DataFrame'>
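With the dict in place, a single loop over its items can process every station, which avoids writing one condition per station. The following is only a sketch: the names row_counts and c_means and the per-station summaries are assumptions chosen for illustration, not something the question asked for.

```python
import pandas as pd

# Reduced version of the sample frame above.
tai = pd.DataFrame({'B': [4, 5, 4, 5, 5, 4],
                    'C': [7, 8, 9, 4, 2, 3],
                    'station': list('aabbcc')})

dfs = dict(tuple(tai.groupby('station')))

# One loop handles every station, whether there are 3 or 500 of them.
row_counts = {}   # hypothetical summary: number of rows per station
c_means = {}      # hypothetical summary: mean of column C per station
for station, frame in dfs.items():
    row_counts[station] = len(frame)
    c_means[station] = frame['C'].mean()
```

The same loop body works unchanged when tai['station'] has 500 unique values, because the station names come from the data rather than from the code.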
jezrael answered Oct 13 '22 09:10
