
Prevent pandas.read_csv from inferring dtypes

Tags: python, pandas

How can I prevent pandas.read_csv() from inferring data types? For example, it converts the strings true and false to the booleans True and False. There are many columns across many files, so it is not feasible to run:

df['field_name'] = df['field_name'].astype(np.float64)

for each column in each file. I would prefer pandas to read the file as it is, with no type inference.

DougKruger asked Nov 22 '16 20:11



1 Answer

Pass dtype=object to pandas.read_csv() so that pandas keeps the values in every column as the strings found in the file, instead of inferring types at load time.
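As a rough illustration, the sketch below reads the same small CSV twice, once with default inference and once with dtype=object. The column names and sample data are made up for the example. The keep_default_na=False argument is an extra that goes beyond the answer above: it is included on the assumption that "as it is" also means empty fields and markers such as NA should stay as text rather than become NaN.

```python
import io

import pandas as pd

# Hypothetical sample data; the column names are invented for this sketch.
csv_text = "flag,value,note\ntrue,1.50,\nfalse,2,NA\n"

# Default behaviour: pandas infers bool and float64 for these columns.
df_inferred = pd.read_csv(io.StringIO(csv_text))

# dtype=object: values are kept as the strings found in the file.
df_raw = pd.read_csv(io.StringIO(csv_text), dtype=object)

# Missing-value markers are still converted to NaN unless that is
# disabled as well (assumption: you want those kept as text too).
df_verbatim = pd.read_csv(
    io.StringIO(csv_text), dtype=object, keep_default_na=False
)

print(df_inferred.dtypes)
print(df_raw["flag"].tolist(), df_raw["value"].tolist())
print(df_verbatim["note"].tolist())
```

With a real file, the same arguments can be passed alongside the path, for example pd.read_csv(path, dtype=object), where path is whatever file is being loaded.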

Zeugma answered Sep 19 '22 11:09