
Long Numbers As A Character String

As part of my dataset, one of the columns is a series of 24-digit numbers.

Example:

bigonumber <- 429382748394831049284934 

When I import it using either data.table::fread or read.csv, it shows up as numeric in exponential format (e.g. 4.293827e+23).

options(digits=...) won't work since the number is longer than 22 digits.

When I do

as.character(bigonumber)  

what I get is "4.29382748394831e+23"

Is there a way to get bigonumber converted to a character string and show all of the digits as characters? I don't need to do any math on it, but I do need to search against it and do dplyr joins on it.

I need to do this after import, since the column number varies from month to month.

(Yes, in the perfect world, my upstream data provider would use a hash instead of a long number and a static number of columns that stay the same every month, but I don't get to dictate that to them.)

asked Sep 01 '15 19:09 by ClintWeathers


2 Answers

You can specify colClasses on your fread or read.csv statement.

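# Contents of bignums.txt (one header line, then one number per line):
bignums
429382748394831049284934
429382748394831049284935
429382748394831049284936
429382748394831049284937
429382748394831049284938
429382748394831049284939

# Read every column as character so that no digits are lost
bignums <- read.csv("~/Desktop/bignums.txt", sep="", colClasses = 'character')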
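
Since the question mentions that the column's position changes from month to month, one option may be to pass a named colClasses, so that the column is matched by name rather than by position. The following is a minimal sketch, assuming the column is called bignums as in the file above; the temporary file and the value column are made up for illustration.

```r
# Hypothetical input: an ID column named "bignums" plus one made-up column.
csv_lines <- c(
  "bignums,value",
  "429382748394831049284934,1",
  "429382748394831049284935,2"
)
path <- tempfile(fileext = ".csv")
writeLines(csv_lines, path)

# Naming the entry in colClasses ties it to the column called "bignums",
# wherever that column sits in the file. Columns that are not named
# are still type-guessed as usual.
dat <- read.csv(path, colClasses = c(bignums = "character"))

str(dat)

# The IDs are plain strings, so exact matching keeps every digit.
dat[dat$bignums == "429382748394831049284935", ]
```

data.table::fread also accepts a named colClasses vector, so the same idea should carry over there. Once the column is character, it can be used as a join key like any other string column.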
answered Dec 24 '22 14:12 by Randy Zwitch


You can suppress the scientific notation with

options(scipen=999) 

If you then define the number

bigonumber <- 429382748394831049284934 

you can convert it into a string:

big.o.string <- as.character(bigonumber) 

Unfortunately, this does not work because R converts the number to a double, which holds only about 15 to 17 significant decimal digits, thereby losing precision:

#[1] "429382748394831019507712" 

The last digits are not preserved, as pointed out by @SabDeM. Even setting

options(digits=22) 

doesn't help. In any case, 22 is the largest value that is allowed, and your number has 24 digits. So it seems that you will have to read the data directly as character or factor. Great answers have been posted showing how this can be achieved.
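
To see why the trailing digits are lost, it may help to look at the limits of a double directly. This is a small base R sketch; the names x and y are only for illustration.

```r
# A double has a 53-bit significand, so above 2^53 not every integer
# can be represented exactly.
.Machine$double.digits   # 53 on IEEE 754 platforms
2^53 == 2^53 + 1         # TRUE: both sides are the same double

# Two different 24-digit IDs typed as numeric literals end up
# as the same double, so they can no longer be told apart.
x <- 429382748394831049284934
y <- 429382748394831049284935
x == y                   # TRUE

# Kept as strings, the IDs stay distinct.
"429382748394831049284934" == "429382748394831049284935"   # FALSE
```

This is also why joins on the numeric version of such a column could silently match the wrong rows, whereas joins on the character version compare every digit.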

As a side note, there is a package called gmp that allows using arbitrarily large integer numbers. However, there is a catch: they have to be read as characters (again, in order to prevent R's internal conversion into double).

library(gmp)
bigonumber <- as.bigz("429382748394831049284934")
> bigonumber
Big Integer ('bigz') :
[1] 429382748394831049284934
> class(bigonumber)
[1] "bigz"

The advantage is that you can indeed treat these entries as numbers and perform calculations while preserving all the digits.

> bigonumber * 2
#Big Integer ('bigz') :
#[1] 858765496789662098569868

This package and my answer may not solve your problem, because reading the numbers directly as characters is an easier way to achieve your goal. I thought I would post this anyway as information for users who need to work with large integers of more than 22 digits.

answered Dec 24 '22 13:12 by RHertel