I am trying to short zipcodes into various files but I keep getting
ValueError: cannot reindex from a duplicate axis
I've read through other documentation on Stackoverflow, but I haven't been about to figure out why its duplicating axis.
import csv
import pandas as pd
from pandas import DataFrame as df
fp = '/Users/User/Development/zipcodes/file.csv'
file1 = open(fp, 'rb').read()
df = pd.read_csv(fp, sep=',')
df = df[['VIN', 'Reg Name', 'Reg Address', 'Reg City', 'Reg ST', 'ZIP',
'ZIP', 'Catagory', 'Phone', 'First Name', 'Last Name', 'Reg NFS',
'MGVW', 'Make', 'Veh Model','E Mfr', 'Engine Model', 'CY2010',
'CY2011', 'CY2012', 'CY2013', 'CY2014', 'CY2015', 'Std Cnt',
]]
#reader.head(1)
df.head(1)
zipBlue = [65355, 65350, 65345, 65326, 65335, 64788, 64780, 64777, 64743,
64742, 64739, 64735, 64723, 64722, 64720]
Also contains zipGreen, zipRed, zipYellow, ipLightBlue
But did not include in example.
def IsInSort():
blue = df[df.ZIP.isin(zipBlue)]
green = df[df.ZIP.isin(zipGreen)]
red = df[df.ZIP.isin(zipRed)]
yellow = df[df.ZIP.isin(zipYellow)]
LightBlue = df[df.ZIP.isin(zipLightBlue)]
def SaveSortedZips():
blue.to_csv('sortedBlue.csv')
green.to_csv('sortedGreen.csv')
red.to_csv('sortedRed.csv')
yellow.to_csv('sortedYellow.csv')
LightBlue.to_csv('SortedLightBlue.csv')
IsInSort()
SaveSortedZips()
1864 # trying to reindex on an axis with duplicates 1865
if not self.is_unique and len(indexer): -> 1866 raise ValueError("cannot reindex from a duplicate axis") 1867 1868 def reindex(self, target, method=None, level=None, limit=None):ValueError: cannot reindex from a duplicate axis
I'm pretty sure your problem is related to your mask
df = df[['VIN', 'Reg Name', 'Reg Address', 'Reg City', 'Reg ST', 'ZIP',
'ZIP', 'Catagory', 'Phone', 'First Name', 'Last Name', 'Reg NFS',
'MGVW', 'Make', 'Veh Model','E Mfr', 'Engine Model', 'CY2010',
'CY2011', 'CY2012', 'CY2013', 'CY2014', 'CY2015', 'Std Cnt',
]]
'ZIP'
is in there twice. Removing one of them should solve the problem.
The error ValueError: cannot reindex from a duplicate axis
is one of these very very cryptic pandas errors which simply does not tell you what the error is.
The error is often related to two columns being named the same either before or after (internally in) the operation.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With