Is there a way to split a pandas data frame based on the column names? As an example, consider a data frame df with the columns ['A_x', 'B_x', 'C_x', 'A_y', 'B_y', 'C_y'], from which I want to create two data frames: X with the columns ['A_x', 'B_x', 'C_x'] and Y with the columns ['A_y', 'B_y', 'C_y'].
I know there is a possibility to do this:
d = {'A': df.A_x, 'B': df.B_x, 'C': df.C_x}
X = pd.DataFrame(data=d)
but this would not be ideal, as in my case df has 2200 columns. Is there a more elegant solution?
You could use df.filter(regex=...):
import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.randn(2, 10),
                  columns='Time A_x A_y A_z B_x B_y B_z C_x C_y C_z'.split())
X = df.filter(regex='_x')
Y = df.filter(regex='_y')
yields
In [15]: X
Out[15]:
        A_x       B_x       C_x
0 -0.706589  1.031368 -0.950931
1  0.727826  0.879408 -0.049865
In [16]: Y
Out[16]:
        A_y       B_y       C_y
0 -0.663647  0.635540 -0.532605
1  0.326718  0.189333 -0.803648
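With 2200 columns it may be worth tightening this a little. The pattern '_x' matches anywhere in a column name, so anchoring it with '$' restricts the match to names that end in the suffix. The sketch below wraps that in a helper and also strips the suffix, so the results have the columns A, B, C as in the question's dictionary approach. The helper name split_by_suffix and the small sample frame are made up for illustration, and it assumes every suffix is separated from the base name by a single underscore.

```python
import re

import numpy as np
import pandas as pd

# Small stand-in for the 2200-column frame (illustrative data only).
cols = ['Time', 'A_x', 'A_y', 'B_x', 'B_y', 'C_x', 'C_y']
df = pd.DataFrame(np.arange(14).reshape(2, 7), columns=cols)


def split_by_suffix(frame, suffixes):
    """Return a dict mapping each suffix to the sub-frame of matching columns.

    Hypothetical helper: assumes names look like '<base>_<suffix>'.
    """
    parts = {}
    for suffix in suffixes:
        # '$' anchors the match to the end of the column name.
        part = frame.filter(regex='_' + re.escape(suffix) + '$')
        # Drop the trailing '_<suffix>' so 'A_x' becomes 'A'.
        cut = len(suffix) + 1
        parts[suffix] = part.rename(columns=lambda name: name[:-cut])
    return parts


parts = split_by_suffix(df, ['x', 'y'])
X, Y = parts['x'], parts['y']
```

Columns that carry none of the listed suffixes, such as Time here, are simply left out of every part.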