Import pandas dataframe column as string not int

People also ask

Can we convert DataFrame to string in Python?

You can convert the column "Fee" to strings by calling apply(str) on it and assigning the result back to df["Fee"].
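A minimal sketch of that conversion, using a made-up DataFrame (only the column name "Fee" comes from the text; the values are assumptions). Note that converting after loading cannot bring back anything already lost at parse time, such as leading zeros.

```python
import pandas as pd

# Hypothetical data; the values are illustrative only.
df = pd.DataFrame({"Fee": [20000, 25000, 22000]})

# Convert each value to a Python str and assign the column back.
df["Fee"] = df["Fee"].apply(str)

print(df["Fee"].tolist())
```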

Can a pandas column have different data types?

pandas uses its own names for data types rather than Python's; for example, object for textual data. Each DataFrame column has a single dtype, which you can check with the column's dtype attribute. Make conscious decisions about how to manage missing data.
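As a small illustration of checking dtypes (the frame and its column names are made up for this sketch):

```python
import pandas as pd

# Hypothetical frame; column names are illustrative only.
df = pd.DataFrame({"count": [1, 2, 3], "price": [1.5, 2.0, 2.5]})

# dtype on a single column (a Series), dtypes on the whole frame.
print(df["count"].dtype)
print(df.dtypes)
```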

What does Parse_dates in pandas do?

We can use the parse_dates parameter to tell pandas to turn columns into real datetime types. parse_dates accepts a list of columns, since you may want to parse several columns into datetimes.
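A short sketch of parse_dates with read_csv, assuming an in-memory CSV with a hypothetical "created" column in ISO format:

```python
import io

import pandas as pd

# Hypothetical CSV content; the column names are assumptions.
csv_text = "id,created\n1,2021-01-05\n2,2021-02-10\n"

# Ask read_csv to parse the "created" column into datetimes.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["created"])

print(df.dtypes)
```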

Just want to reiterate this will work in pandas >= 0.9.1:

In [2]: read_csv('sample.csv', dtype={'ID': object})
Out[2]: 
                           ID
0  00013007854817840016671868
1  00013007854817840016749251
2  00013007854817840016754630
3  00013007854817840016781876
4  00013007854817840017028824
5  00013007854817840017963235
6  00013007854817840018860166

I'm creating an issue about detecting integer overflows also.

EDIT: See resolution here: https://github.com/pydata/pandas/issues/2247
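To see why the dtype argument matters, here is a small sketch that reads the same data twice. It uses an in-memory CSV with shortened, made-up IDs instead of sample.csv, so the values are assumptions and not the data shown above.

```python
import io

import pandas as pd

# Hypothetical IDs with leading zeros (shorter than the ones in the answer).
csv_text = "ID\n00013007\n00013008\n"

# Default type inference parses the column as integers.
as_default = pd.read_csv(io.StringIO(csv_text))

# Forcing object dtype keeps the raw text of each field.
as_text = pd.read_csv(io.StringIO(csv_text), dtype={"ID": object})

print(as_default["ID"].tolist())  # leading zeros are gone
print(as_text["ID"].tolist())     # leading zeros are preserved
```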

Update as it helps others:

To have all columns as str, one can do this (from the comment):

pd.read_csv('sample.csv', dtype=str)
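A small sketch of what dtype=str does, using an in-memory CSV in place of sample.csv (the data and the "qty" column are assumptions):

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for sample.csv.
csv_text = "prefix,serial,qty\n007,000123,5\n"

# Every column is read as text, including the numeric-looking "qty".
df_all_str = pd.read_csv(io.StringIO(csv_text), dtype=str)

print(df_all_str.iloc[0].tolist())
```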

To have most or selective columns as str, one can do this:

# names of the columns that should be read as strings
lst_str_cols = ['prefix', 'serial']
# build the column -> dtype mapping with a dict comprehension
dict_dtypes = {x: 'str' for x in lst_str_cols}
# pass the mapping to read_csv through the dtype argument
pd.read_csv('sample.csv', dtype=dict_dtypes)
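Here is a self-contained sketch of the selective approach. It assumes an in-memory CSV with an extra, made-up "qty" column to show that columns left out of the mapping are still inferred as usual.

```python
import io

import pandas as pd

# Hypothetical CSV content; "qty" is an assumption added for contrast.
csv_text = "prefix,serial,qty\n007,000123,5\n042,000456,12\n"

str_cols = ["prefix", "serial"]
dtypes = {col: str for col in str_cols}

df_some_str = pd.read_csv(io.StringIO(csv_text), dtype=dtypes)

print(df_some_str["prefix"].tolist())  # text, zeros kept
print(df_some_str["qty"].tolist())     # still parsed as integers
```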

This probably isn't the most elegant way to do it, but it gets the job done.

In[1]: import numpy as np

In[2]: import pandas as pd

In[3]: df = pd.DataFrame(np.genfromtxt('/Users/spencerlyon2/Desktop/test.csv', dtype=str)[1:], columns=['ID'])

In[4]: df
Out[4]: 
                           ID
0  00013007854817840016671868
1  00013007854817840016749251
2  00013007854817840016754630
3  00013007854817840016781876
4  00013007854817840017028824
5  00013007854817840017963235
6  00013007854817840018860166

Just replace '/Users/spencerlyon2/Desktop/test.csv' with the path to your file.

Since pandas 1.0 this has become much more straightforward. This will read column 'ID' as dtype 'string':

pd.read_csv('sample.csv', dtype={'ID': 'string'})

As the Getting started guide shows, pandas 1.0 introduced a dedicated 'string' dtype (before that, strings were treated as dtype 'object').
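A quick way to check the result, sketched with an in-memory CSV and made-up IDs in place of sample.csv (requires pandas 1.0 or later):

```python
import io

import pandas as pd

# Hypothetical IDs with leading zeros.
csv_text = "ID\n00013007\n00013008\n"

df_string = pd.read_csv(io.StringIO(csv_text), dtype={"ID": "string"})

# The column uses the dedicated string extension dtype.
print(df_string["ID"].dtype)
print(df_string["ID"].tolist())
```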

Tags: python, type-conversion, casting, pandas, dtype
