
How to read in numbers with a comma as decimal separator?

I have a series of CSV files where numbers are formatted in the European style, using commas instead of decimal points, e.g. 0,5 instead of 0.5.

There are too many of these files to edit them before importing to R. I was hoping there is an easy parameter for the read.csv() function, or a method to apply to the extracted dataset in order for R to treat the data as a number rather than a string.

klonq asked May 25 '11 10:05



4 Answers

If you check ?read.table, you will probably find all the answers you need.

There are two issues with (continental) European csv files:

  1. What does the "c" in csv stand for? For standard csv it is a comma (,); for European csv it is a semicolon (;).
    sep is the corresponding argument in read.table.
  2. What is the character for the decimal point? For standard csv it is a period (.); for European csv it is a comma (,).
    dec is the corresponding argument in read.table.

To read standard csv use read.csv; to read European csv use read.csv2. Both functions are just wrappers around read.table that set the appropriate arguments.

If your file does not follow either of these standards, set the arguments manually.
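
As a minimal sketch of both routes, the snippet below parses a small inline sample, first with read.csv2 and then with read.table and explicit arguments. The sample text and its column names (id, amount) are invented here for illustration; with a real file you would pass the file path instead of text =.

```r
# Hypothetical European-style CSV content: ";" separates fields, "," marks decimals.
csv_text <- "id;amount\n1;0,5\n2;1,25\n"

# read.csv2 defaults to sep = ";" and dec = ",".
a <- read.csv2(text = csv_text)

# Roughly equivalent call with the arguments set by hand.
b <- read.table(text = csv_text, header = TRUE, sep = ";", dec = ",")

str(a$amount)  # the column is numeric, not character
```

For a whole directory of such files, something along the lines of lapply(list.files(pattern = "\\.csv$"), read.csv2) should avoid editing each file by hand, assuming they all share this layout.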

Henrik answered Nov 03 '22 09:11


From ?read.table:

dec     the character used in the file for decimal points.

And yes, read.csv accepts dec as well. Note, however, that read.csv also uses "," as its default field separator, so you would normally have to change sep too.

Alternatively, you can also use

read.csv2

which assumes a "," decimal separator and a ";" for column separators.

aL3xa answered Nov 03 '22 10:11


read.csv(... , sep=";")

Suppose the imported field is called "amount". If its numbers are being read in as character, you can fix the type this way:

d$amount <- sub(",",".",d$amount)
d$amount <- as.numeric(d$amount)

This happens to me frequently, along with a bunch of other little annoyances, when importing from Excel or Excel-generated csv. Since there seems to be no consistent way to guarantee you get what you expect when importing into R, post-hoc fixes seem to be the best method. By that I mean: LOOK at what you imported, make sure it's what you expected, and fix it if it's not.
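
The "look, then fix" routine could be sketched roughly as follows. The data frame, the second column (price) and the helper to_number are hypothetical, and the sketch assumes the values contain no thousands separators.

```r
# Hypothetical data frame as it might look after an import that kept
# comma-decimal numbers as text.
d <- data.frame(amount = c("0,5", "1,25", "3"),
                price  = c("10,0", "2,75", "n/a"),
                stringsAsFactors = FALSE)

# Replace the decimal comma literally and convert. Values that cannot be
# parsed become NA; as.numeric would warn about them, hence suppressWarnings.
to_number <- function(x) {
  suppressWarnings(as.numeric(sub(",", ".", x, fixed = TRUE)))
}

cols <- c("amount", "price")
d[cols] <- lapply(d[cols], to_number)

str(d)             # inspect the resulting column types
colSums(is.na(d))  # count the values that failed to convert, per column
```

Checking the NA counts afterwards is one way to spot entries that were not numbers to begin with.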

Brandon Bertelsen answered Nov 03 '22 09:11


The dec argument can be used as follows:

mydata <- read.table(fileIn, dec=",")

input file (fileIn):

D:\TEST>more input2.txt
06-05-2014 09:19:38     3,182534        0
06-05-2014 09:19:51     4,2311          0
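
To try this without the file, the two sample rows can be fed to read.table through its text argument. This is only a sketch: the column names are made up, since the sample has no header.

```r
# The two sample rows, supplied inline instead of via a file.
txt <- "06-05-2014 09:19:38     3,182534        0
06-05-2014 09:19:51     4,2311          0"

# The default sep = "" splits on any run of whitespace, so the date and the
# time end up in separate columns. The col.names are hypothetical.
mydata <- read.table(text = txt, dec = ",",
                     col.names = c("date", "time", "value", "flag"))

str(mydata)  # value is read as numeric thanks to dec = ","
```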

Lowreno answered Nov 03 '22 10:11