
How to convert column with dtype as object to string in Pandas Dataframe [duplicate]

Tags:

python

pandas



Since strings have variable length, pandas stores them as object dtype by default. If you want to store them as a fixed-width type instead, you can do something like this. Note that the NumPy S dtype holds byte strings, not Python str values.

df['column'] = df['column'].astype('|S80')  # maximum length is set to 80 bytes

or alternatively

df['column'] = df['column'].astype('|S')  # length defaults to the longest value it encounters
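
As a rough illustration of what the S dtype does, here is a minimal sketch. The sample values are made up, and the dtype pandas reports for the result may differ between versions. The point is only that the converted values are byte strings rather than Python str:

```python
import pandas as pd

# Hypothetical sample data, stored explicitly with object dtype.
df = pd.DataFrame({"column": pd.Series(["abc", "de"], dtype=object)})

# 'S' is NumPy's byte-string dtype, so each value becomes bytes.
as_bytes = df["column"].astype("S")

first_value = as_bytes.iloc[0]
print(type(first_value), first_value)
```

If the goal is ordinary text rather than bytes, this is probably not the conversion you want, and text that cannot be encoded as ASCII is likely to raise an encoding error.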

Did you try assigning it back to the column?

df['column'] = df['column'].astype('str') 
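
A small sketch of the assignment above, using made-up sample data. It also shows the dedicated "string" extension dtype, which assumes pandas 1.0 or later; the column name as_string is only for illustration:

```python
import pandas as pd

# Hypothetical mixed column; pandas stores it with object dtype.
df = pd.DataFrame({"column": [10, 20, "abc"]})

# Assign the result back, otherwise the DataFrame is left unchanged.
df["column"] = df["column"].astype(str)
values = df["column"].tolist()
print(values)

# Optional: opt in to the dedicated string dtype (assumes pandas >= 1.0).
df["as_string"] = df["column"].astype(str).astype("string")
print(df.dtypes)
```

After astype(str) every value is a Python str. Whether the reported dtype is still object depends on the pandas version, so it is worth checking df.dtypes rather than relying on it.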

Referring to this question, a pandas DataFrame stores pointers to the strings, which is why the column is of type 'object'. As per the docs, you could try:

df['column_new'] = df['column'].str.split(',') 
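
For instance, with some made-up comma-separated values (a sketch only, not taken from the original question), the split produces one list per row:

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({"column": ["a,b,c", "d,e"]})

# .str.split works on the column as-is; no dtype conversion is needed first.
df["column_new"] = df["column"].str.split(",")

first_row = list(df["column_new"].iloc[0])
print(first_row)
```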

Not answering the question directly, but it might help someone else.

I have a column called Volume that contains both '-' (for invalid/NaN entries) and numbers formatted with commas.

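df['Volume'] = df['Volume'].astype('str')  # cast first so the .str accessor works on every value
df['Volume'] = df['Volume'].str.replace(',', '', regex=False)  # strip the commas
df['Volume'] = pd.to_numeric(df['Volume'], errors='coerce')  # non-numeric entries such as '-' become NaN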

Casting to string first is required so that str.replace can be applied to the column.
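
Here is a self-contained version of the same steps, as a sketch with invented Volume values. regex=False is passed so the comma is treated as a literal character regardless of the pandas version:

```python
import pandas as pd

# Hypothetical data: comma-formatted numbers, a '-' placeholder, and a real int.
df = pd.DataFrame({"Volume": ["1,234", "-", 5678, "12,345,678"]})

df["Volume"] = df["Volume"].astype("str")
df["Volume"] = df["Volume"].str.replace(",", "", regex=False)
df["Volume"] = pd.to_numeric(df["Volume"], errors="coerce")

missing = df["Volume"].isna().tolist()
print(df)
```

The '-' row ends up as NaN, which is why the resulting column is floating point rather than integer.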

pandas.Series.str.replace
pandas.to_numeric


You could try using the df['column'].str accessor and then call any string method on it. The pandas documentation lists the available methods, such as split.
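
A few of those accessor methods, sketched on made-up data (the variable names here are only for illustration):

```python
import pandas as pd

# Hypothetical sample column.
df = pd.DataFrame({"column": ["Alpha", "beta"]})

upper = df["column"].str.upper().tolist()            # uppercase each value
lengths = df["column"].str.len().tolist()            # length of each value
has_eta = df["column"].str.contains("eta").tolist()  # substring test per row

print(upper, lengths, has_eta)
```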