
How do I change data-type of pandas data frame to string with a defined format?

I'm starting to tear my hair out with this - so I hope someone can help. I have a pandas DataFrame that was created from an Excel spreadsheet using openpyxl. The resulting DataFrame looks like:

print image_name_data
     id           image_name
0  1001  1001_mar2014_report
1  1002  1002_mar2014_report
2  1003  1003_mar2014_report

[3 rows x 2 columns]

…with the following datatypes:

print image_name_data.dtypes
id            float64
image_name     object
dtype: object

The issue is that the numbers in the id column are, in fact, identification numbers and I need to treat them as strings. I've tried converting the id column to strings using:

image_name_data['id'] = image_name_data['id'].astype('str') 

This seems a bit ugly, but it does produce a column of dtype 'object' rather than 'float64':

print image_name_data.dtypes
id            object
image_name    object
dtype: object

However, the strings that are created have a decimal point, as shown:

print image_name_data
       id           image_name
0  1001.0  1001_mar2014_report
1  1002.0  1002_mar2014_report
2  1003.0  1003_mar2014_report

[3 rows x 2 columns]

How can I convert a float64 column in a pandas DataFrame to a string with a given format (in this case, for example, '%10.0f')?

user1718097 asked Mar 08 '14 23:03





2 Answers

I'm unable to reproduce your problem but have you tried converting it to an integer first?

image_name_data['id'] = image_name_data['id'].astype(int).astype('str') 

As for your more general question, you could use map with a format string (as in this answer). In your case:

image_name_data['id'] = image_name_data['id'].map('{:.0f}'.format) 
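For what it's worth, here is a minimal sketch comparing the two suggestions side by side. The DataFrame is a hypothetical stand-in built by hand to mirror the one in the question (it is not loaded from the original spreadsheet), and the padded variant is only my guess at how the '%10.0f' width from the question would translate to str.format.

```python
import pandas as pd

# Hypothetical frame mirroring the question: ids stored as float64.
df = pd.DataFrame({
    "id": [1001.0, 1002.0, 1003.0],
    "image_name": [
        "1001_mar2014_report",
        "1002_mar2014_report",
        "1003_mar2014_report",
    ],
})

# Option 1: cast to int first, then to str.
# Note: astype(int) raises if the column contains NaN.
via_int = df["id"].astype(int).astype(str)

# Option 2: format each float with zero decimal places (the value is rounded).
via_format = df["id"].map("{:.0f}".format)

# Adding a width mimics '%10.0f': right-aligned, padded to 10 characters.
padded = df["id"].map("{:10.0f}".format)

print(via_int.tolist())
print(via_format.tolist())
print(padded.tolist())
```

Both options give strings without the trailing ".0". The map version is probably the safer choice if the column might contain missing values, since the integer cast would fail on them.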
exp1orer answered Sep 23 '22 08:09



If you can reload the data, you might be able to use the dtype argument so that the column is read in as a string from the start:

pd.read_csv(..., dtype={'COL_NAME':'str'}) 
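A small self-contained sketch of that idea, assuming the data were available as CSV. The CSV text below is a made-up stand-in for the real file, fed through io.StringIO so the snippet runs without anything on disk.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for the real file.
csv_text = (
    "id,image_name\n"
    "1001,1001_mar2014_report\n"
    "1002,1002_mar2014_report\n"
)

# dtype={"id": str} asks the parser to keep the id column as text
# instead of inferring a numeric type.
df = pd.read_csv(io.StringIO(csv_text), dtype={"id": str})

print(df["id"].tolist())
```

Since the question's data came from Excel rather than CSV, it may be worth checking whether the pd.read_excel in your installed pandas accepts a dtype argument as well. I believe recent versions do, but I have not verified that against the asker's setup.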
smishra answered Sep 24 '22 08:09
