I am using fread from data.table to load CSV files. However, my CSV files use a comma as the decimal separator, i.e. dec="," (1.23 is written as 1,23). Unlike in read.csv, it seems that dec is not an allowed parameter:
R) args(fread)
function (input = "test.csv", sep = "auto", sep2 = "auto", nrows = -1,
header = "auto", na.strings = "NA", stringsAsFactors = FALSE,
verbose = FALSE, autostart = 30)
Do you see a workaround (an R option to set, maybe) that would let me use fread? It is so much faster that it saves me a lot of time.
PS: colClasses is not yet implemented, so setAs cannot be used as in this post.
Update Oct 2014: As of v1.9.5, fread accepts dec=',' (and other non-'.' decimal separators), #917. A new paragraph has been added to ?fread. If you are located in a country that uses dec=',' then it should just work. If not, you will need to read the paragraph for an extra step. In case it somehow breaks dec='.', this new feature can be turned off with options(datatable.fread.dec.experiment=FALSE).
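As a minimal sketch of what the update describes, the call below passes the decimal mark explicitly. It assumes a data.table release in which fread exposes both the dec and text arguments (check ?fread for your installed version); the sample data is made up for illustration.

```r
library(data.table)

# Semicolon-separated fields, comma as the decimal mark (illustrative data).
csv_text <- "A;B\n3,14;123\n4,22;456\n"

# Assumption: this data.table version supports the `dec` and `text` arguments.
DT <- fread(text = csv_text, sep = ";", dec = ",")

str(DT)  # column A should come back as numeric, B as integer
```

Passing dec explicitly avoids depending on the session locale, which is why it is usually preferable to the locale workaround in the previous answer.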
Previous answer ...
Matt Dowle found a nice work-around with locales.
First, my session info:
sessionInfo()
R version 2.15.2 (2012-10-26)
Platform: i386-w64-mingw32/i386 (32-bit)
locale:
[1] LC_COLLATE=French_France.1252 LC_CTYPE=French_France.1252 LC_MONETARY=French_France.1252 LC_NUMERIC=C
[5] LC_TIME=C
...
Trying the following shows the culprit:
Sys.localeconv()["decimal_point"]
decimal_point
"."
Setting LC_NUMERIC worked on Ubuntu (Matthew) and Windows XP (me):
Sys.setlocale("LC_NUMERIC", "French_France.1252")
[1] "French_France.1252"
Message d'avis :
In Sys.setlocale("LC_NUMERIC", "French_France.1252") :
changer 'LC_NUMERIC' peut résulter en un fonctionnement étrange de R
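Because R warns that changing LC_NUMERIC may cause it to behave strangely, it may be safer to change the locale only for the duration of the read and restore it afterwards. The helper below is a hypothetical sketch in base R (with_numeric_locale is not part of R or data.table). The demonstration uses the "C" locale only because it exists on every platform; on Windows you would pass "French_France.1252" as above. Whether fread honours the locale depends on the data.table version.

```r
# Hypothetical helper: run `code` with LC_NUMERIC temporarily set to `locale`,
# then restore the previous setting even if `code` fails.
with_numeric_locale <- function(locale, code) {
  old <- Sys.getlocale("LC_NUMERIC")
  on.exit(suppressWarnings(Sys.setlocale("LC_NUMERIC", old)), add = TRUE)
  set <- suppressWarnings(Sys.setlocale("LC_NUMERIC", locale))
  # Sys.setlocale() returns "" when the OS cannot honour the request.
  if (!nzchar(set)) stop("locale not available on this system: ", locale)
  code()
}

before <- Sys.getlocale("LC_NUMERIC")

# "C" is used here only so the example runs anywhere.
dec_mark <- with_numeric_locale("C", function() Sys.localeconv()[["decimal_point"]])

after <- Sys.getlocale("LC_NUMERIC")  # same as `before`
```

In practice the function passed as code would wrap the fread call, so the rest of the session keeps its original LC_NUMERIC.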
With LC_NUMERIC set this way, fread reads "," as the decimal separator:
DT = fread("A,B\n3,14;123\n4,22;456\n",sep=";")
str(DT)
Classes ‘data.table’ and 'data.frame': 2 obs. of 2 variables:
$ V1: num 3.14 4.22
$ V2: int 123 456
Values that use "." as the decimal separator are now loaded as strings (as they should be); previously it was the opposite:
DT = fread("A,B\n3.14;123\n4.22;456\n",sep=";")
str(DT)
Classes ‘data.table’ and 'data.frame': 2 obs. of 2 variables:
$ V1: chr "3.14" "4.22"
$ V2: int 123 456
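If changing the locale is not an option, one fallback that does not depend on the data.table version is to let the column load as character, as in the output above, and convert it afterwards. The sketch below uses base R only; comma_to_numeric is a hypothetical helper, and the data frame stands in for whatever the reader returned. It assumes the values contain no thousands separators.

```r
# Hypothetical helper: turn "3,14"-style strings into numeric values.
# Assumes a single decimal comma and no thousands separators.
comma_to_numeric <- function(x) as.numeric(sub(",", ".", x, fixed = TRUE))

# Stand-in for a table whose first column was read as character.
raw <- data.frame(V1 = c("3,14", "4,22"),
                  V2 = c(123L, 456L),
                  stringsAsFactors = FALSE)

raw$V1 <- comma_to_numeric(raw$V1)
str(raw)
```

This costs an extra pass over the column, so it is slower than having the reader parse the numbers directly.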