Pandas unable to reset index because name exist

Tags: python, pandas

I have a multi-level pandas DataFrame that I'm trying to flatten. I use reset_index, but it gives me an error saying that the name already exists.

I don't want to use reset_index(drop=True) because I still want to keep one of the index levels as a column.

I want my new DataFrame to have these columns:

country,listing_neighborhood,count

Right now, df.columns only gives count.

My code:

df.columns = ['count']
df.reset_index()  # ValueError: cannot insert country, already exists

I also tried:

df.columns.droplevel(0)  # AttributeError: 'Index' object has no attribute 'droplevel'
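
A minimal sketch that reproduces the error, assuming the frame carries the same name on two levels of its row index. The original screenshot is not available, so the data and the helper column names (`outer`, `inner`, `hood`) are made up. It suggests that the duplicate lives in `df.index.names`, not in `df.columns`, which is why renaming the columns does not help:

```python
import pandas as pd

# Made-up data; only the index level names mirror the question.
df = (
    pd.DataFrame(
        {
            "outer": ["US", "US", "FR"],
            "inner": ["US", "US", "FR"],
            "hood": ["Soho", "Mission", "Marais"],
            "count": [3, 5, 2],
        }
    )
    .set_index(["outer", "inner", "hood"])
    .rename_axis(["country", "country", "listing_neighborhood"])
)

print(list(df.index.names))  # the row index carries the duplicated name
print(df.columns.nlevels)    # the columns are a flat, single-level Index

# reset_index() tries to insert a column named "country" twice and fails.
try:
    df.reset_index()
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

Checking `df.index.names` first is a quick way to see which level is duplicated before deciding which one to drop or rename.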

asked Feb 13 '18 07:02 by jxn


2 Answers

You need to remove the first duplicated level:

import pandas as pd

df = pd.DataFrame({
    'A': list('abcdef'),
    'B': [4, 5, 4, 5, 5, 4],
    'C': [7, 8, 9, 4, 2, 3],
    'F': list('aaabbb'),
})

# Build a sample MultiIndex in which two levels share the name 'country'
df = (df.set_index(['A', 'F', 'C'])
        .rename_axis(['country', 'country', 'listing_neighborhood'])
        .rename(columns={'B': 'count'}))

print(df)
                                      count
country country listing_neighborhood       
a       a       7                         4
b       a       8                         5
c       a       9                         4
d       b       4                         5
e       b       2                         5
f       b       3                         4

# Drop the first (duplicated) level by position, then turn the remaining levels into columns
df = df.reset_index(level=0, drop=True).reset_index()
print(df)
  country  listing_neighborhood  count
0       a                     7      4
1       a                     8      5
2       a                     9      4
3       b                     4      5
4       b                     2      5
5       b                     3      4

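Or, in pandas 0.24+, drop the level directly on the multi-indexed frame (instead of the first reset_index call):

df = df.droplevel(0).reset_index()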
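
As a sketch of how to choose which of the two duplicated levels survives, the snippet below rebuilds the same sample frame and drops either level 0 or level 1. The variable names `keep_inner` and `keep_outer` are just illustrative. The levels are addressed by position here because the name "country" refers to two levels at once:

```python
import pandas as pd

df = (
    pd.DataFrame(
        {
            "A": list("abcdef"),
            "B": [4, 5, 4, 5, 5, 4],
            "C": [7, 8, 9, 4, 2, 3],
            "F": list("aaabbb"),
        }
    )
    .set_index(["A", "F", "C"])
    .rename_axis(["country", "country", "listing_neighborhood"])
    .rename(columns={"B": "count"})
)

# Drop level 0: the values of the second "country" level are kept.
keep_inner = df.droplevel(0).reset_index()

# Drop level 1: the values of the first "country" level are kept.
keep_outer = df.droplevel(1).reset_index()

print(keep_inner["country"].tolist())
print(keep_outer["country"].tolist())
```

Both results have the columns country, listing_neighborhood, count; they differ only in which level's values end up in the country column.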
answered Sep 23 '22 03:09 by jezrael


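You can change the existing name so that it is no longer duplicated. Note that `name=` is accepted only by Series.reset_index (where it names the values column); on a DataFrame it raises a TypeError. For a DataFrame, rename the index levels first, assuming the three levels country, country, listing_neighborhood:

df.rename_axis(['new_name', 'country', 'listing_neighborhood']).reset_index()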
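
A small sketch of both cases, with made-up data and a hypothetical replacement name `country_outer`. The first part uses `Series.reset_index(name=...)`, which names the values column; the second renames the index levels of a DataFrame with `rename_axis` so that `reset_index()` can keep every level. Newer pandas releases (1.5 and later, as far as I know) also accept a `names=` argument on `DataFrame.reset_index`, which is not used here:

```python
import pandas as pd

# Case 1: a Series, where reset_index(name=...) names the values column.
index = pd.MultiIndex.from_tuples(
    [("a", 7), ("a", 8), ("b", 4)],
    names=["country", "listing_neighborhood"],
)
counts = pd.Series([4, 5, 4], index=index)
flat_from_series = counts.reset_index(name="count")

# Case 2: a DataFrame with a duplicated level name; rename the levels first.
df = (
    pd.DataFrame(
        {
            "A": list("abc"),
            "F": list("aab"),
            "C": [7, 8, 4],
            "count": [4, 5, 4],
        }
    )
    .set_index(["A", "F", "C"])
    .rename_axis(["country", "country", "listing_neighborhood"])
)
flat_from_frame = df.rename_axis(
    ["country_outer", "country", "listing_neighborhood"]  # hypothetical new name
).reset_index()

print(list(flat_from_series.columns))
print(list(flat_from_frame.columns))
```

Renaming keeps both levels as columns, whereas dropping a level discards one of them, so the choice depends on whether the duplicated level holds information you still need.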

Hope this helps.

answered Sep 23 '22 03:09 by Catbuilts