ValueError: "cannot reindex from a duplicate axis" in groupby Pandas

Tags: python, pandas, pandas-groupby

My dataframe looks like this:

    SKU #       GRP  CATG      PRD
0   54995   9404000  4040    99999
1   54999   9404000  4040    99999
2   55037   9404000  4040  1556894
3   55148   9404000  4040  1556894
4   55254   9404000  4040  1556894
5   55291   9404000  4040  1556894
6   55294   9404000  4040  1556895
7   55445   9404000  4040  1556895
8   55807   9404001  4040  1556896
9   49021   9404002  4040  1556897
10  49035   9404002  4040  1556897
11  27538   9404000  4040  1556898
12  27539   9404000  4040  1556899
13  27540   9404000  4040  1556894
14  27542   9404000  4040  1556900
15  27543   9404000  4040  1556900
16  27544   9404003  4040  1556901
17  27546   9404004  4040  1556902
18  99111   9404005  4040  1556903
19  99112   9404006  4040  1556904
20  99113   9404007  4040  1556905
21  99116   9404008  4040  1556906
22  99119   9404009  4040  1556907
23  99122  94040010  4040  1556908
24  99125  94040011  4040  1556909
25  86007  94040012  4040  1556910
26  86010  94040013  4040  1556911

When I try to run a groupby operation on this dataframe, I get the "cannot reindex from a duplicate axis" error:

df.groupby(['GRP','CATG'],as_index=False)['PRD'].min()

I tried to find duplicated index values using:

df[df.index.duplicated()]

But it didn't return anything. How can I resolve this issue?


asked Feb 17 '20 20:02

vgaurav

1 Answer

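This error is often caused by duplicated column names (not necessarily duplicated values). Note that df.index.duplicated() only checks the row index, so it will not reveal duplicated columns.

First, check whether any column names are duplicated: df.columns.duplicated().any()

If it returns True, remove the duplicated columns, keeping the first occurrence of each name, and assign the result back:

df = df.loc[:, ~df.columns.duplicated()]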

After you remove the duplicated columns, you should be able to run your groupby operation.
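To make the steps above concrete, here is a minimal sketch. The small frame below is hypothetical (it is not the asker's data); it just assumes that a column name, "PRD" here, ended up appearing twice, for example after a concat or merge. Whether the original groupby actually raises on such a frame can depend on your pandas version, so the sketch only shows the check and the fix.

```python
import pandas as pd

# Hypothetical frame for illustration: the column name "PRD" appears twice.
df = pd.DataFrame(
    [
        [9404000, 4040, 99999, 99999],
        [9404000, 4040, 1556894, 1556894],
        [9404001, 4040, 1556896, 1556896],
    ],
    columns=["GRP", "CATG", "PRD", "PRD"],
)

# Step 1: check the column axis (df.index.duplicated() only covers rows).
has_dupes = bool(df.columns.duplicated().any())
dupe_names = df.columns[df.columns.duplicated()].tolist()

# Step 2: keep only the first occurrence of each column name.
deduped = df.loc[:, ~df.columns.duplicated()]

# Step 3: the groupby from the question, run on the deduplicated frame.
result = deduped.groupby(["GRP", "CATG"], as_index=False)["PRD"].min()
print(result)
```

If the duplicated columns hold different data that you need to keep, renaming them before the groupby may be a better choice than dropping them.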


answered Oct 13 '22 17:10

Gene Burinsky
