I have a dataframe as shown below (top 3 rows):
Sample_Name Sample_ID Sample_Type IS Component_Name IS_Name Component_Group_Name Outlier_Reasons Actual_Concentration Area Height Retention_Time Width_at_50_pct Used Calculated_Concentration Accuracy
Index
1 20170824_ELN147926_HexLacCer_Plasma_A-1-1 NaN Unknown True GluCer(d18:1/12:0)_LCB_264.3 NaN NaN NaN 0.1 2.733532e+06 5.963840e+05 2.963911 0.068676 True NaN NaN
2 20170824_ELN147926_HexLacCer_Plasma_A-1-1 NaN Unknown True GluCer(d18:1/17:0)_LCB_264.3 NaN NaN NaN 0.1 2.945190e+06 5.597470e+05 2.745026 0.068086 True NaN NaN
3 20170824_ELN147926_HexLacCer_Plasma_A-1-1 NaN Unknown False GluCer(d18:1/16:0)_LCB_264.3 GluCer(d18:1/17:0)_LCB_264.3 NaN NaN NaN 3.993535e+06 8.912731e+05 2.791991 0.059864 True 125.927659773487 NaN
When trying to generate a pivot table:
pivoted_report_conc = raw_report.pivot(index = "Sample_Name", columns = 'Component_Name', values = "Calculated_Concentration")
I get the following error:
ValueError: Index contains duplicate entries, cannot reshape
I tried resetting the index but it did not help. I couldn't find any duplicate values in the "Index" column. Could someone please help identify the problem here?
The expected output would be a reshaped dataframe with only the unique component names as columns and respective concentrations for each sample name:
Sample_Name GluCer(d18:1/12:0)_LCB_264.3 GluCer(d18:1/17:0)_LCB_264.3 GluCer(d18:1/16:0)_LCB_264.3
20170824_ELN147926_HexLacCer_Plasma_A-1-1 NaN NaN 125.927659773487
To clarify, I am not looking to aggregate the data, just reshape it.
You can avoid this by retaining the default index column (row #) and while setting the index using " id ", " date " and " location ", add it in " append " mode instead of the default overwrite mode. Once this is done, your index columns will still have the default index along with the set indexes.
Basically, the pivot_table() function is a generalization of the pivot() function that allows aggregation of values — for example, through the len() function in the previous example. Pivot only works — or makes sense — if you need to pivot a table and show values without any aggregation.
To use the pivot method in Pandas, you need to specify three parameters: Index: Which column should be used to identify and order your rows vertically. Columns: Which column should be used to create the new columns in our reshaped DataFrame.
You can use groupby()
and unstack()
to get around the error you're seeing with pivot()
.
Here's some example data, with a few edge cases added, and some column values removed or substituted for MCVE:
# df
Sample_Name Sample_ID IS Component_Name Calculated_Concentration Outlier_Reasons
Index
1 foo NaN True x NaN NaN
1 foo NaN True y NaN NaN
2 foo NaN False z 125.92766 NaN
2 bar NaN False x 1.00 NaN
2 bar NaN False y 2.00 NaN
2 bar NaN False z NaN NaN
(df.groupby(['Sample_Name','Component_Name'])
.Calculated_Concentration
.first()
.unstack()
)
Output:
Component_Name x y z
Sample_Name
bar 1.0 2.0 NaN
foo NaN NaN 125.92766
You should be able to accomplish what you are looking to do by using the the pandas.pivot_table()
functionality as documented here.
With your dataframe stored as df
use the following code:
import pandas as pd
df = pd.read_table('table_from_which_to_read')
new_df = pd.pivot_table(df,index=['Simple Name'], columns = 'Component_Name', values = "Calculated_Concentration")
If you want something other than the mean of the concentration value, you will need to change the aggfunc
parameter.
EDIT
Since you don't want to aggregate over the values, you can reshape the data by using the set_index
function on your DataFrame with documentation found here.
import pandas as pd
df = pd.DataFrame({'NonUniqueLabel':['Item1','Item1','Item1','Item2'],
'SemiUniqueValue':['X','Y','Z','X'], 'Value':[1.0,100,5,None])
new_df = df.set_index(['NonUniqueLabel','SemiUniqueLabel'])
The resulting table should look like what you expect the results to be and will have a multi-index.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With