
pandas - change df.index from float64 to unicode or string

I want to change a DataFrame's index (row labels) from float64 to string or unicode.

I thought this would work but apparently not:

    # check type
    type(df.index)
    'pandas.core.index.Float64Index'

    # change type to unicode
    if not isinstance(df.index, unicode):
        df.index = df.index.astype(unicode)

error message:

TypeError: Setting <class 'pandas.core.index.Float64Index'> dtype to anything other than float64 or object is not supported 
Boosted_d16 asked Feb 12 '16 17:02





1 Answer

You can do it this way:

    # for Python 2
    df.index = df.index.map(unicode)

    # for Python 3 (the unicode type does not exist and is replaced by str)
    df.index = df.index.map(str)
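For anyone who wants to try the Python 3 variant end to end, here is a minimal sketch. The frame, its column name `value`, and the labels are made up for illustration and only stand in for the asker's `df`.

```python
import pandas as pd

# Hypothetical frame with a float index, standing in for the asker's df.
df = pd.DataFrame({"value": [10, 20, 30]}, index=[1.0, 2.5, 4.0])

# Apply str to every label; each float is replaced by its string form.
df.index = df.index.map(str)

labels = list(df.index)
print(labels)

# Rows are now looked up by string label, not by float.
print(df.loc["2.5", "value"])
```

Note that `str` keeps the decimal part, so the label 1.0 becomes '1.0' rather than '1'.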

As for why this works differently from converting int to float, that's a peculiarity of numpy (the library on which pandas is built).

Every numpy array has a dtype, which is basically the machine type of its elements. That way numpy deals directly with native types rather than Python objects, which explains why it is so fast. So when you change the dtype from int64 to float64, numpy casts each element in C code.
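A small sketch of that native cast, using a made-up array (the variable names are only illustrative):

```python
import numpy as np

# An array of native 64-bit integers.
ints = np.array([1, 2, 3], dtype=np.int64)

# astype returns a new array whose elements were cast to 64-bit floats.
floats = ints.astype(np.float64)

print(ints.dtype, floats.dtype)
print(floats.tolist())
```

The original array is left untouched; `astype` produces a copy with the new dtype.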

There's also a special dtype, object, which basically stores a pointer to a Python object.

If you want strings, you therefore have to use the object dtype. But using .astype(object) would not give you the result you were looking for: it would create an index with object dtype, but with Python float objects inside.
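To see this for yourself, here is a short check on a hypothetical float index (the labels are invented for the example):

```python
import pandas as pd

# Hypothetical float index.
idx = pd.Index([1.0, 2.5, 4.0])

# Changing only the dtype to object does not turn the labels into strings.
as_object = idx.astype(object)

print(as_object.dtype)
print([type(x).__name__ for x in as_object])

# The labels are still floats, so none of them is a str.
still_floats = all(isinstance(x, float) for x in as_object)
print(still_floats)
```

The dtype is reported as object, but every label is still a float, which is why a per-element conversion such as map is needed.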

Here, by using map, we convert each index label to a string with the appropriate function. numpy gets the string objects and understands that the index has to have object dtype, because that's the only dtype that can accommodate strings.

Arthur answered Oct 16 '22 02:10
