
Pandas isin returns an error when encountering an empty DF after using concat

When using concat and then isin to drop all rows, I encounter:

ValueError: cannot compute isin with a duplicate axis.

If any rows are left in the DataFrame, there is no issue. Without concat it works in every case and gracefully returns an empty DataFrame. I'm using concat because it's slightly faster in my use case.

Pandas 0.24.2 (latest).

import pandas as pd

df = pd.DataFrame()
for r in range(5):
    df = df.append({'type':'teine', 'id':r}, ignore_index=True)

# Problem line
df = pd.concat([df.reset_index(drop=True), pd.DataFrame({'type':'teine', 'id':5}, index=[0])], sort=True)
dfCopy = df.copy()
df.query("(type == 'teine')", inplace=True)
df = dfCopy[~dfCopy.isin(df).all(axis=1)]
misantroop asked Nov 24 '25 13:11


2 Answers

You can avoid the duplicate-index problem by not creating a duplicate index label in the first place:

df = pd.DataFrame()
for r in range(5):
    df = df.append({'type':'teine', 'id':r}, ignore_index=True)

# Fixed line: the new row gets the next unused index label
df = pd.concat([df.reset_index(drop=True), pd.DataFrame({'type':'teine', 'id':5}, index=[max(df.index.values)+1])], sort=True)
dfCopy = df.copy()
df.query("(type == 'teine')", inplace=True)
df = dfCopy[~dfCopy.isin(df).all(axis=1)]

Or, better yet, reset the index after concatenating:

df = pd.DataFrame()
for r in range(5):
    df = df.append({'type':'teine', 'id':r}, ignore_index=True)

# Fixed line: reset the index after concatenating
df = pd.concat([df, pd.DataFrame({'type':'teine', 'id':5}, index=[0])], sort=True).reset_index(drop=True)
dfCopy = df.copy()
df.query("(type == 'teine')", inplace=True)
df = dfCopy[~dfCopy.isin(df).all(axis=1)]
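
To see why either change helps, it can be useful to inspect the index directly. The following is a minimal sketch, not part of the original code: it builds the same six rows with a plain constructor (the names base, extra and dup are made up for illustration) and checks whether the labels are unique. As far as I can tell, DataFrame.isin raises this ValueError when the frame passed as its argument has a non-unique index or non-unique columns, which is the case once the filter keeps both rows labelled 0.

```python
import pandas as pd

# Hypothetical stand-ins for the frames in the question.
base = pd.DataFrame({"type": ["teine"] * 5, "id": range(5)})
extra = pd.DataFrame({"type": "teine", "id": 5}, index=[0])

# concat keeps the original labels, so label 0 shows up twice.
dup = pd.concat([base, extra], sort=True)
has_unique_index = dup.index.is_unique
duplicated_labels = dup.index[dup.index.duplicated()].tolist()

# Passing a frame with duplicate labels to isin triggers the error.
try:
    dup.isin(dup)
    isin_error = None
except ValueError as exc:
    isin_error = str(exc)

print(has_unique_index, duplicated_labels, isin_error)
```

Checking index.is_unique right after the concat is a cheap way to catch this before calling isin.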
Bruno Aquino answered Nov 26 '25 04:11


In the line:

df = pd.concat([df.reset_index(drop=True), pd.DataFrame({'type':'teine', 'id':5}, index=[0])], sort=True)

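there is no need to reset the index inside the concat call (df.reset_index(drop=True)), because df already has a unique index at that point. What you do need is to reset the index after the concat: the new row's label 0 duplicates an existing label, and that is what triggers your error. Here is what it looks like: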

df = pd.concat([df, pd.DataFrame({'type':'teine', 'id':5}, index=[0])], sort=True).reset_index(drop=True)
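
The same effect can probably be had in one step with concat's ignore_index=True parameter, which relabels the result 0..n-1 instead of keeping the inputs' labels. A minimal sketch, with the frames built by a plain constructor and the names base, extra, combined, matched and remaining chosen only for illustration:

```python
import pandas as pd

# Hypothetical stand-ins for the frames in the question.
base = pd.DataFrame({"type": ["teine"] * 5, "id": range(5)})
extra = pd.DataFrame({"type": "teine", "id": 5}, index=[0])

# ignore_index=True discards the incoming labels and numbers the rows afresh.
combined = pd.concat([base, extra], sort=True, ignore_index=True)

# Same filter as in the question: every row matches, so nothing should remain.
matched = combined.query("type == 'teine'")
remaining = combined[~combined.isin(matched).all(axis=1)]

print(combined.index.tolist())
print(remaining.empty)
```

Whether this or a trailing reset_index(drop=True) reads better is mostly a matter of taste; both leave the result with a unique index.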
vlemaistre answered Nov 26 '25 02:11


