
Preserving large numbers

I am trying to read a CSV file that has barcodes in the first column, but when R gets it into a data.frame, it converts 1665535004661 to 1.67E+12.

Is there a way to preserve this number in an integer format? I tried assigning a class of "double", but that didn't work, nor did assigning a class of "character". Once it is in the 1.67E+12 format, any attempt to convert it back to an integer returns 1670000000000.

James asked May 22 '12 23:05


2 Answers

It's not in a "1.67E+12 format"; it just doesn't print in full using the defaults. R is reading it in just fine and the whole number is there.

> x <- 1665535004661
> x
[1] 1.665535e+12
> print(x, digits = 16)
[1] 1665535004661

See, the digits were there all along. They don't get lost unless the number is really large: R stores such values as doubles, which represent integers exactly up to 2^53 (about 9 x 10^15), so a 13-digit barcode is safe. Sorting on what you brought in will work fine, and to see the full values in your data.frame you can call print() explicitly with the digits argument instead of printing implicitly by typing the name.
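
To check this without relying on print settings, here is a minimal sketch. It assumes the barcode was read in as an ordinary double; the variable names full and limit are made up for illustration. It also shows why converting to a true integer fails: R's integer type is 32-bit, so the value does not fit.

```r
x <- 1665535004661

# The stored value is exact; only the default display is abbreviated.
x == 1665535004661            # TRUE

# sprintf() with "%.0f" writes out every digit, whatever options(digits) is.
full <- sprintf("%.0f", x)
full                          # "1665535004661"

# Doubles hold every integer exactly up to 2^53.
limit <- 2^53
sprintf("%.0f", limit)        # "9007199254740992"
x < limit                     # TRUE

# R's integer type is 32-bit, so as.integer() cannot hold the value.
.Machine$integer.max          # 2147483647
suppressWarnings(as.integer(x))  # NA
```

So the choice is between keeping the column as a double (exact at this size) or as character, not converting it to integer.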

John answered Oct 12 '22 23:10


Picking up on what you said in the comments, you can directly import the text as a character by specifying the colClasses in read.table(). For example:

num <- "1665535004661"
dat.char <- read.table(text = num, colClasses = "character")
str(dat.char)
#------
# 'data.frame':   1 obs. of  1 variable:
#  $ V1: chr "1665535004661"
dat.char
#------
#              V1
# 1 1665535004661
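
For a CSV with several columns, the same idea can be limited to the barcode column. The following is only a sketch: the inline CSV text and the column names barcode and qty are hypothetical stand-ins for the real file, and read.csv() is used because the question mentions a CSV. A named colClasses vector applies to the named column only, and the other columns are still type-guessed.

```r
# Hypothetical two-column CSV; the column names are made up for illustration.
csv_text <- "barcode,qty
1665535004661,3
0012345678905,1"

# Read only the barcode column as character; qty is still type-guessed.
dat.csv <- read.csv(text = csv_text, colClasses = c(barcode = "character"))

str(dat.csv)
print(dat.csv$barcode)
```

Reading barcodes as character also keeps any leading zeros, which a numeric column would drop.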

Alternatively (and for other uses), you can set the digits option with options(). The default is 7 digits and the acceptable range is 1-22. To be clear, setting this option in no way alters the underlying data; it merely controls how values are displayed on screen when printed. From the help page ?options:

controls the number of digits to print when printing numeric values. It is a suggestion only.
Valid values are 1...22 with default 7. See the note in print.default about values greater than
15.

Example illustrating this:

options(digits = 7)
dat <- read.table(text = num)

dat
#------
#             V1
# 1 1.665535e+12

options(digits = 22)
dat
#------
#              V1
# 1 1665535004661

To flesh this out completely and to account for the cases when setting a global setting is not preferable, you can specify digits directly as an argument to print(foo, digits = bar). You can read more about this under ?print.default. This is what John describes in his answer so credit should go to him for illuminating that nuance.
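
As a rough sketch of both approaches side by side (the object names dat and old are just for illustration): passing digits to print() affects a single call, while options() returns the previous settings so that a temporary global change can be undone afterwards.

```r
dat <- read.table(text = "1665535004661")

# One-off: pass digits to print() and leave the global option untouched.
print(dat, digits = 16)

# Temporary global change: options() returns the previous values,
# so they can be restored afterwards.
old <- options(digits = 16)
print(dat)
options(old)

# The global setting is back to what it was before.
getOption("digits")
```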

Chase answered Oct 13 '22 00:10