I have a table with the following column titles and an example row:
  Subject  Test1-Result1  Test1-Result2  Test2-Result1  Test2-Result2
0    John             10            0.5             20            0.3
I would like to transform it to:
  Subject level_1  Result1  Result2
0    John   Test1       10      0.5
1    John   Test2       20      0.3
The list of subjects is repeated once for Test1 and then again for Test2.
I think I can do this using for loops, but is there a more pythonic way?
For extra complexity, I need to add an extra column of information for each test. I suppose I can use a dictionary, but how can I insert the information about, say, Test1 into each corresponding row?
You can use the following basic syntax to convert a pandas DataFrame from a wide format to a long format:
df = pd.melt(df, id_vars='col1', value_vars=['col2', 'col3', ...])
In this scenario, col1 is the column we use as an identifier and col2, col3, etc. are the columns we unpivot. The following example shows how to use this syntax in practice.
Reshaping data from wide to long in pandas is done with the melt() function, which is an efficient way to transform data from wide to long format. Let's create a simple data frame to demonstrate the reshape.
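A minimal sketch of such a data frame and the melt() call follows. The column names team, points, assists, and rebounds are assumed for illustration, and the values are invented; only pd.DataFrame and pd.melt are real pandas calls here.

```python
import pandas as pd

# Hypothetical example data: names and values are made up for illustration
df = pd.DataFrame({
    'team': ['A', 'B', 'C'],
    'points': [88, 91, 99],
    'assists': [12, 9, 15],
    'rebounds': [22, 28, 30],
})

# Keep 'team' as the identifier and unpivot the three stat columns.
# By default the new columns are named 'variable' and 'value'.
long_df = pd.melt(df, id_vars='team',
                  value_vars=['points', 'assists', 'rebounds'])
print(long_df)
```

Each of the three teams now appears once per unpivoted column, so the result has nine rows.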
The DataFrame is now in a long format. We used the ‘team’ column as the identifier column and we unpivoted the ‘points’, ‘assists’, and ‘rebounds’ columns. Note that we can also use the var_name and value_name arguments to specify the names of the columns in the new long DataFrame:
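As a rough illustration of var_name and value_name, the sketch below melts a small data frame and renames the two generated columns. The data and the names metric and amount are hypothetical, chosen only for this example.

```python
import pandas as pd

# Hypothetical example data: names and values are made up for illustration
df = pd.DataFrame({
    'team': ['A', 'B'],
    'points': [88, 91],
    'assists': [12, 9],
    'rebounds': [22, 28],
})

# var_name labels the column holding the old column names,
# value_name labels the column holding the values.
long_df = pd.melt(
    df,
    id_vars='team',
    value_vars=['points', 'assists', 'rebounds'],
    var_name='metric',    # assumed name, instead of the default 'variable'
    value_name='amount',  # assumed name, instead of the default 'value'
)
print(long_df)
```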
pandas.wide_to_long is another option. Each row of the wide variables is assumed to be uniquely identified by i (a single column name or a list of column names). All remaining variables in the data frame are left intact. Its main parameters are: df, the wide-format DataFrame; stubnames, the stub name(s), where the wide-format variables are assumed to start with the stub names; and i, the column(s) to use as id variable(s).
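Here is a sketch of how pd.wide_to_long might be applied to the table from the question. Because the wide columns must start with the stub name, this assumes the columns are first renamed from "Test1-Result1" to "Result1-Test1"; the helper swap_parts is hypothetical and written only for this example.

```python
import pandas as pd

# The example row from the question
df = pd.DataFrame({
    'Subject': ['John'],
    'Test1-Result1': [10],
    'Test1-Result2': [0.5],
    'Test2-Result1': [20],
    'Test2-Result2': [0.3],
})

def swap_parts(col):
    """Hypothetical helper: turn 'Test1-Result1' into 'Result1-Test1'."""
    if '-' not in col:
        return col
    test, result = col.split('-')
    return f'{result}-{test}'

renamed = df.rename(columns=swap_parts)

# Stubs are 'Result1' and 'Result2'; the part after '-' becomes the 'Test' level.
# suffix=r'\w+' is needed because the suffixes ('Test1', 'Test2') are not purely numeric.
long_df = pd.wide_to_long(
    renamed,
    stubnames=['Result1', 'Result2'],
    i='Subject',
    j='Test',
    sep='-',
    suffix=r'\w+',
).reset_index()
print(long_df)
```

Note that i has to identify each row uniquely, so this only works as written if every subject appears once in the wide table.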
You can split your columns into a multi-index column and then reshape your data frame:
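# Move 'Subject' into the index so only the Test/Result columns remain
df = df.set_index('Subject')
# Split 'Test1-Result1' into a two-level column index: ('Test1', 'Result1')
df.columns = df.columns.str.split("-", expand=True)
# Stack the test level into rows, name the index levels, and flatten the index
df.stack(level=0).rename_axis(['Subject', 'Test']).reset_index()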
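For the extra per-test column asked about in the question, one possible approach is to map the test names through a dictionary with Series.map after reshaping. The sketch below rebuilds the example row, applies the reshape above, and adds the column; the dictionary test_info, its values, and the column name Info are placeholders invented for this example.

```python
import pandas as pd

# The example row from the question
df = pd.DataFrame({
    'Subject': ['John'],
    'Test1-Result1': [10],
    'Test1-Result2': [0.5],
    'Test2-Result1': [20],
    'Test2-Result2': [0.3],
})

# Reshape: split the column names into (test, result) and stack the test level
df = df.set_index('Subject')
df.columns = df.columns.str.split('-', expand=True)
long_df = df.stack(level=0).rename_axis(['Subject', 'Test']).reset_index()

# Hypothetical per-test information, keyed by test name
test_info = {'Test1': 'baseline', 'Test2': 'follow-up'}

# map() looks up each row's test name in the dictionary;
# tests missing from the dictionary would get NaN
long_df['Info'] = long_df['Test'].map(test_info)
print(long_df)
```

Because the lookup is done per row, every subject's Test1 row receives the Test1 entry without any explicit loop.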