Pandas read scientific notation and change

Q: Can pandas read scientific notation?

Yes. pandas parses values such as 5.3e-23 as floats when reading data. Scientific notation isn't helpful when you are trying to make quick comparisons across your dataset, however, and pandas displays very small or very large float values in scientific notation by default.
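As a rough illustration of the display behaviour described above, the sketch below prints the same float Series twice: once with the default repr and once with a fixed-point format. The Series values are hypothetical and chosen only for the example. The `display.float_format` option changes how floats are printed, not how they are stored.

```python
import pandas as pd

# Hypothetical values, chosen only to illustrate the display behaviour.
s = pd.Series([0.14, 9.49242000872744e-05])

# Default repr: pandas may choose scientific notation for small values.
default_text = repr(s)

# Temporarily print every float with 10 fixed decimals (display only).
with pd.option_context("display.float_format", "{:.10f}".format):
    fixed_text = repr(s)

print(default_text)
print(fixed_text)
```

The underlying dtype stays float64 in both cases, so comparisons and arithmetic are unaffected by the display setting.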

Q: Can Python interpret scientific notation?

Python can deal with floating point numbers in both scientific and standard notation.
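A minimal sketch of this in plain Python, using tokens like the ones that appear in the question's data. The variable names are illustrative only.

```python
import math

# float() accepts scientific notation, plain decimals, NaN and Infinity.
a = float("5.3e-23")
b = float("0.14")
nan_value = float("NaN")
inf_value = float("Infinity")

print(a == 5.3e-23)                                    # same value as the literal
print(math.isnan(nan_value), math.isinf(inf_value))

# Formatting changes how a float is shown, not how it is stored.
x = 9.49242000872744e-05
fixed = f"{x:.10f}"   # fixed-point text
sci = f"{x:.3e}"      # scientific-notation text
print(fixed, sci)
```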

Tags:

python

pandas

csv

I have a dataframe in pandas that I'm reading in from a CSV.

One of my columns has values that include NaN, floats, and scientific notation, e.g. 5.3e-23

My trouble is that as I read in the CSV, pandas treats this column as an object dtype, not the float32 that it should be. I guess that's because it thinks the scientific notation entries are strings.

I've tried to convert the dtype using df['speed'].astype(float) after it's been read in, and I've tried to specify the dtype as it's being read in using df = pd.read_csv('path/test.csv', dtype={'speed': np.float64}, na_values=['n/a']). The latter throws the error ValueError: cannot safely convert passed user dtype of <f4 for object dtyped data in column ...

So far neither of these methods has worked. Am I missing something that is an incredibly easy fix?

This question seems to suggest I can specify known numbers that might throw an error, but I'd prefer to convert the scientific notation back to a float if possible.

EDITED TO SHOW DATA FROM CSV AS REQUESTED IN COMMENTS

7425616,12375,28,2015-08-09 11:07:56,0,-8.18644,118.21463,2,0,2
7425615,12375,28,2015-08-09 11:04:15,0,-8.18644,118.21463,2,NaN,2
7425617,12375,28,2015-08-09 11:09:38,0,-8.18644,118.2145,2,0.14,2
7425592,12375,28,2015-08-09 10:36:34,0,-8.18663,118.2157,2,0.05,2
65999,1021,29,2015-01-30 21:43:26,0,-8.36728,118.29235,1,0.206836151554794,2
204958,1160,30,2015-02-03 17:53:37,2,-8.36247,118.28664,1,9.49242000872744e-05,7
384739,,32,2015-01-14 16:07:02,1,-8.36778,118.29206,2,Infinity,4
275929,1160,30,2015-02-17 03:13:51,1,-8.36248,118.28656,1,113.318511172611,5

asked Dec 01 '15 06:12

hselbie

1 Answer

It's hard to say without seeing your data, but it seems the problem is that some of your rows contain something other than numbers and 'n/a' values. You could load your dataframe and then convert it to numeric as shown in the answers to that question. If you have pandas version >= 0.17.0, you could use the following:

df1 = df.apply(pd.to_numeric, args=('coerce',))

Then you could drop the rows with NA values using dropna, or fill them with zeros using fillna.
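The approach above can be sketched end to end for a single column. This is only an illustration under stated assumptions: the inline CSV text, the column names id and speed, and the stray token not_a_number are all made up for the example, and the conversion is applied to one column rather than to the whole dataframe.

```python
import io
import pandas as pd

# Hypothetical sample: a 'speed' column mixing plain floats, scientific
# notation, NaN, and one non-numeric token that prevents float parsing.
csv_text = """id,speed
1,0.14
2,9.49242000872744e-05
3,NaN
4,not_a_number
"""

df = pd.read_csv(io.StringIO(csv_text))
raw_is_float = pd.api.types.is_float_dtype(df["speed"])
print(raw_is_float)  # the stray token keeps the column from being float

# Coerce the column: unparseable entries become NaN instead of raising.
df["speed"] = pd.to_numeric(df["speed"], errors="coerce")
print(df["speed"].dtype)

dropped = df.dropna(subset=["speed"])   # remove rows with a missing speed
filled = df.fillna({"speed": 0.0})      # or replace a missing speed with 0
print(dropped)
print(filled)
```

Coercing silently turns every unparseable entry into NaN, so it may be worth inspecting which original values were affected before dropping or filling them.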

answered Sep 21 '22 19:09

Anton Protopopov
