I have one pandas DataFrame that I need to split into multiple DataFrames. The number of DataFrames depends on how many months of data I have, i.e. I need to create a new DataFrame for every month. So df:
MONTH NAME INCOME
201801 A 100$
201801 B 20$
201802 A 30$
So I need to create 2 DataFrames. The problem is that I don't know in advance how many months of data I will have. How do I do that?
The div() method performs element-wise division of one pandas DataFrame by another. DataFrame elements can also be divided by a pandas Series or by a Python sequence. Calling div() on a DataFrame instance is equivalent to using the division operator (/).
You could use split(), with rep() to create the groupings. How would I write code that iteratively saves each of the 10 chunks as a CSV file, each with a unique filename? The x and each arguments are flipped if the goal is to split the df into n parts.
You can use groupby to create a dictionary of data frames:
import pandas as pd
df['MONTH'] = pd.to_datetime(df['MONTH'], format='%Y%m')
dfs = dict(tuple(df.groupby(df['MONTH'].dt.month)))
dfs[1]
MONTH NAME INCOME
0 2018-01-01 A 100$
1 2018-01-01 B 20$
If your data spans multiple years, you will need to include the year in the grouping:
dfs = dict(tuple(df.groupby([df['MONTH'].dt.year, df['MONTH'].dt.month])))
dfs[(2018, 1)]
MONTH NAME INCOME
0 2018-01-01 A 100$
1 2018-01-01 B 20$
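For reference, here is a self-contained sketch of the year-and-month grouping above. The 2019 row is an assumed addition to the question's sample data, included only to show why the year matters, and the `year`/`month` names given to the grouping keys are my own choice, not part of the original answer.

```python
import pandas as pd

# Sample data from the question, plus one assumed row from 2019
df = pd.DataFrame({
    "MONTH": [201801, 201801, 201802, 201901],
    "NAME": ["A", "B", "A", "A"],
    "INCOME": ["100$", "20$", "30$", "5$"],
})

# Parse the YYYYMM values; casting to str first keeps the format unambiguous
df["MONTH"] = pd.to_datetime(df["MONTH"].astype(str), format="%Y%m")

# Group on (year, month); the number of groups is discovered from the data
year = df["MONTH"].dt.year.rename("year")
month = df["MONTH"].dt.month.rename("month")
dfs = {key: group for key, group in df.groupby([year, month])}

print(sorted(dfs))      # the (year, month) keys that were found
print(dfs[(2018, 1)])   # the two January 2018 rows
```

Grouping on month alone would put January 2018 and January 2019 into the same frame, which is why the tuple key is used here.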
You can use groupby to split a dataframe into a list of dataframes or a dictionary of dataframes:
Dictionary of dataframes:
dict_of_dfs = {}
for n, g in df.groupby(df['MONTH']):
    dict_of_dfs[n] = g
List of dataframes:
list_of_dfs = []
for _, g in df.groupby(df['MONTH']):
    list_of_dfs.append(g)
Or, as @BenMares suggests, use comprehensions:
dict_of_dfs = {
    month: group_df
    for month, group_df in df.groupby('MONTH')
}
list_of_dfs = [
    group_df
    for _, group_df in df.groupby('MONTH')
]
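Once the frames are in a dictionary, the month key can also serve as a unique filename if each one needs to be saved separately. The following is only a sketch: the `income_<month>.csv` naming scheme and the temporary output directory are assumptions for illustration, not something from the answers above.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "MONTH": [201801, 201801, 201802],
    "NAME": ["A", "B", "A"],
    "INCOME": ["100$", "20$", "30$"],
})

# One DataFrame per distinct MONTH value, however many there are
dict_of_dfs = {month: group_df for month, group_df in df.groupby("MONTH")}

out_dir = Path(tempfile.mkdtemp())  # assumed output location
written = []
for month, group_df in dict_of_dfs.items():
    path = out_dir / f"income_{month}.csv"  # hypothetical naming scheme
    group_df.to_csv(path, index=False)
    written.append(path.name)

print(written)
```

Because the key comes from the data, this works without knowing the number of months in advance.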