I use pandas read_csv to read a simple CSV file. However, it raises ValueError: could not convert string to float, and I do not understand why.
The code is simply:
rawdata = pd.read_csv(r'Journal_input.csv',
                      dtype={'Base Amount': 'float64'},
                      thousands=',',
                      decimal='.',
                      encoding='ISO-8859-1')
But I get this error:
pandas\parser.pyx in pandas.parser.TextReader.read (pandas\parser.c:10415)()
pandas\parser.pyx in pandas.parser.TextReader._read_low_memory (pandas\parser.c:10691)()
pandas\parser.pyx in pandas.parser.TextReader._read_rows (pandas\parser.c:11728)()
pandas\parser.pyx in pandas.parser.TextReader._convert_column_data (pandas\parser.c:13162)()
pandas\parser.pyx in pandas.parser.TextReader._convert_tokens (pandas\parser.c:14487)()
ValueError: could not convert string to float: '79,026,695.50'
How is it possible to get an error when converting the string '79,026,695.50' to float? I have already specified these two options:
thousands = ',' ,
decimal = '.',
Is there some problem with my code, or is this a bug in pandas?
The problem seems to be with quoting: if the field separator is , and the thousands separator is also , then the values containing it have to be quoted in the CSV file:
import csv
from io import StringIO  # pandas.compat.StringIO is not available in current pandas

import pandas as pd

# sample data quoted with single quotes
temp = """'a','Base Amount'
'11','79,026,695.50'"""

# after testing, replace StringIO(temp) with 'filename.csv'
df = pd.read_csv(StringIO(temp),
                 dtype={'Base Amount': 'float64'},
                 thousands=',',
                 quotechar="'",
                 quoting=csv.QUOTE_ALL,
                 decimal='.',
                 encoding='ISO-8859-1')
print(df)
    a  Base Amount
0  11   79026695.5
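To see why the quoting matters, here is a small sketch that uses only the standard csv module rather than pandas. The two sample strings are made up for illustration; they are not taken from the file in the question. Without quotes, the commas inside the number are treated as field separators and the value is split into pieces before any thousands handling could apply.

```python
import csv
from io import StringIO

# Hypothetical sample rows: the same value, unquoted and quoted
unquoted = "a,Base Amount\n11,79,026,695.50\n"
quoted = 'a,Base Amount\n11,"79,026,695.50"\n'

unquoted_rows = list(csv.reader(StringIO(unquoted)))
quoted_rows = list(csv.reader(StringIO(quoted)))

# Unquoted: every comma starts a new field, so the amount is split up
print(unquoted_rows[1])  # ['11', '79', '026', '695.50']

# Quoted: the amount survives as one field
print(quoted_rows[1])    # ['11', '79,026,695.50']
```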
# the same data quoted with double quotes
temp = '''"a","Base Amount"
"11","79,026,695.50"'''

# after testing, replace StringIO(temp) with 'filename.csv'
df = pd.read_csv(StringIO(temp),
                 dtype={'Base Amount': 'float64'},
                 thousands=',',
                 quotechar='"',
                 quoting=csv.QUOTE_ALL,
                 decimal='.',
                 encoding='ISO-8859-1')
print(df)
    a  Base Amount
0  11   79026695.5
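If the column still fails to convert in your environment, one possible workaround is to read it as text and strip the thousands separators yourself. This is only a sketch under assumptions: the inline sample stands in for the real file, and I have not tested it against the file from the question.

```python
from io import StringIO

import pandas as pd

# Hypothetical sample standing in for the real CSV file
csv_text = '"a","Base Amount"\n"11","79,026,695.50"\n'

# Read the amount column as plain text so no float conversion is attempted
raw = pd.read_csv(StringIO(csv_text), dtype={"Base Amount": str})

# Remove the thousands separators, then convert to float explicitly
raw["Base Amount"] = (
    raw["Base Amount"].str.replace(",", "", regex=False).astype("float64")
)

print(raw)
```

This avoids relying on the thousands option at all, at the cost of one extra conversion step per column.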