I have a series of CSV files where numbers are formatted in the European style, using commas instead of decimal points, i.e. 0,5 instead of 0.5.
There are too many of these files to edit them before importing to R. I was hoping there is an easy parameter for the read.csv()
function, or a method to apply to the extracted dataset in order for R to treat the data as a number rather than a string.
When you check ?read.table you will probably find all the answers that you need.
There are two issues with (continental) European csv files:

What does the c in csv stand for? For standard csv this is a ",", for European csv this is a ";". sep is the corresponding argument in read.table.

What is the character for the decimal point? For standard csv this is a ".", for European csv this is a ",". dec is the corresponding argument in read.table.
To read standard csv use read.csv, to read European csv use read.csv2. These two functions are just wrappers to read.table that set the appropriate arguments. If your file does not follow either of these standards, set the arguments manually.
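As a sketch of both routes, writing a small European-style file to a temporary path and reading it back (the file contents and column names here are made up for illustration):

```r
# A tiny European-style csv: ";" separates fields, "," marks decimals
tmp <- tempfile(fileext = ".csv")
writeLines(c("id;value", "1;0,5", "2;3,25"), tmp)

# read.csv2 assumes sep = ";" and dec = ","
d <- read.csv2(tmp)

# Setting the arguments manually on read.table gives the same result
d2 <- read.table(tmp, header = TRUE, sep = ";", dec = ",")

str(d)  # value is parsed as numeric, not character
```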
From ?read.table:

dec: the character used in the file for decimal points.

And yes, you can use that for read.csv as well.
Alternatively, you can also use read.csv2, which assumes a "," decimal separator and a ";" column separator.
read.csv(..., sep=";", dec=",")
Suppose this imported field is called "amount"; if your numbers are being read in as character, you can fix the type this way:
d$amount <- sub(",",".",d$amount)
d$amount <- as.numeric(d$amount)
This happens to me frequently, along with a bunch of other little annoyances, when importing from Excel or Excel-generated csv. Since there seems to be no consistent way to ensure you get what you expect when importing into R, post-hoc fixes seem to be the best method. By that I mean: LOOK at what you imported, make sure it is what you expected, and fix it if it is not.
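A minimal sketch of that post-hoc check and fix, assuming a data frame d with a character column amount as in the snippet above (the data values are made up):

```r
# A column that came in as character because of the "," decimal mark
d <- data.frame(amount = c("0,5", "4,2311"), stringsAsFactors = FALSE)

# LOOK at what you imported: str() reveals the wrong type
str(d)

# Fix: swap "," for "." and convert; fixed = TRUE treats "," literally
d$amount <- as.numeric(sub(",", ".", d$amount, fixed = TRUE))

str(d)  # amount is now numeric
```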
read.table with dec="," can be used as follows:
mydata <- read.table(fileIn, dec=",")
input file (fileIn):
D:\TEST>more input2.txt
06-05-2014 09:19:38 3,182534 0
06-05-2014 09:19:51 4,2311 0