
python split pd dataframe by column

Tags:

python

pandas

Is there a function that splits a pandas.DataFrame object into multiple sub-DataFrames by the values of a specific column? For example, if I have

A   1
B   2
A   3
B   4

I want the result as follows:

A   1
A   3

and

B   2
B   4

In R, this is the split function. How is it done in Python? I know I can subset within a for loop, but is there a function that does that? Thanks.

NewbieDave asked Nov 17 '25 11:11


1 Answer

You can use groupby() with a list comprehension to get a list of sub-DataFrames, each containing the rows for a single ind value:

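import pandas as pd
from io import StringIO  # Python 3; the Python 2 `StringIO` module no longer exists

# Build the example frame from whitespace-separated text
df = pd.read_csv(StringIO("""A   1
B   2
A   3
B   4"""), sep=r"\s+", names=['ind', 'value'])

# One sub-DataFrame per distinct value of 'ind'
lst = [g for _, g in df.groupby('ind')]

lst[0]
#   ind  value
# 0   A      1
# 2   A      3

lst[1]
#   ind  value
# 1   B      2
# 3   B      4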
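
If you would rather look up each piece by its group value than by list position (roughly what R's split gives you with its named list), the same groupby() iteration can feed a dict comprehension instead. This is only a sketch on the toy data above; the name parts is my own choice, not anything pandas defines.

```python
from io import StringIO

import pandas as pd

# Same toy data as in the answer above.
df = pd.read_csv(
    StringIO("A 1\nB 2\nA 3\nB 4"),
    sep=r"\s+",
    names=["ind", "value"],
)

# Map each distinct 'ind' value to its sub-DataFrame.
# `parts` is an illustrative name, not a pandas API.
parts = {key: group for key, group in df.groupby("ind")}

print(sorted(parts))                 # ['A', 'B']
print(parts["A"]["value"].tolist())  # [1, 3]
print(parts["B"]["value"].tolist())  # [2, 4]
```

Each sub-DataFrame keeps the original row index (0 and 2 for A, 1 and 3 for B). If you only need one group, df.groupby("ind").get_group("A") should return the same rows without building the whole dict.
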
Psidom answered Nov 19 '25 01:11


