R cut() results in odd handling of zero

Tags: r, cut

I am using cut() to bin a vector of values, ranging from negative to positive, into intervals of width 0.05. However, the label for the break at zero changes depending on the range passed to seq() (see the example below): I get either [-0.05,0), [0,0.05) or [-0.05,2.78e-17), [2.78e-17,0.05). I would like the zero break to be displayed as 0.

For my purposes I would like a range generic enough to cover whatever list I supply. I have tried dig.lab, and I have tried explicit breakpoints instead of seq(), but neither helps, at least on my setup (R v3.0.2 on a 64-bit Win7 machine).

I am sure I am missing something obvious, but I just can't figure it out. Any help or guidance is greatly appreciated. Many thanks in advance!

The example I am having trouble with is:

x<-c(-0.0262, 0.0426, 0.0212, 0.0166, 0.0225
, -0.0089, 0.0418, 0.0246, -0.0128, -0.0841)
y1<-cut(x, breaks=seq(from= -0.15, to=0.1, by=0.05), right=FALSE)
y1 # undesired handling of 0 by using a more generic range in seq
y2<-cut(x, breaks=seq(from= -0.1, to=0.1, by=0.05), right=FALSE)
y2 # desired handling of 0

For y1:

[1] [-0.05,2.78e-17) [2.78e-17,0.05)  [2.78e-17,0.05)  [2.78e-17,0.05)  [2.78e-17,0.05) 
 [6] [-0.05,2.78e-17) [2.78e-17,0.05)  [2.78e-17,0.05)  [-0.05,2.78e-17) [-0.1,-0.05)    
Levels: [-0.15,-0.1) [-0.1,-0.05) [-0.05,2.78e-17) [2.78e-17,0.05) [0.05,0.1)

For y2:

[1] [-0.05,0)    [0,0.05)     [0,0.05)     [0,0.05)     [0,0.05)     [-0.05,0)    [0,0.05)    
 [8] [0,0.05)     [-0.05,0)    [-0.1,-0.05)
Levels: [-0.1,-0.05) [-0.05,0) [0,0.05) [0.05,0.1)
user2892709 asked Mar 19 '23 16:03

1 Answer

Dealing with floating point numbers is a notoriously messy problem in computer science. Since computers store numbers in base 2 rather than base 10, certain numbers that we commonly use in base 10, such as 0.05, simply cannot be represented exactly in base 2. That is why the break that should be zero in y1, nominally -0.15 + 3 * 0.05, comes out as roughly 2.78e-17 instead. I'd recommend doing as much of the work as possible with integers. For example, this should work for y1:

y1<-cut(x, breaks=seq(from= -15, to=10, by=5)/100, right=FALSE)
levels(y1)
#[1] "[-0.15,-0.1)" "[-0.1,-0.05)" "[-0.05,0)"    "[0,0.05)"     "[0.05,0.1)" 
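
If you want the breaks to follow whatever vector you supply, the same integer trick can be wrapped in a small helper. This is only a sketch: make_breaks() and its step and per arguments are names I made up for illustration, not anything from base R, and it assumes the bin width can be written as step/per with whole numbers (5/100 here).

```r
x <- c(-0.0262, 0.0426, 0.0212, 0.0166, 0.0225,
       -0.0089, 0.0418, 0.0246, -0.0128, -0.0841)

# The fractional sequence from the question: its fourth break is not exactly 0.
naive <- seq(from = -0.15, to = 0.1, by = 0.05)
naive[4] == 0   # FALSE, which is where the 2.78e-17 label comes from

# Hypothetical helper (not part of base R): build the grid in whole units of
# 1/per, wide enough to cover range(x), and divide only once at the end.
make_breaks <- function(x, step = 5, per = 100) {
  lo <- floor(min(x) * per / step) * step        # grid point at or below min(x)
  hi <- (floor(max(x) * per / step) + 1) * step  # grid point strictly above max(x)
  seq(from = lo, to = hi, by = step) / per
}

breaks <- make_breaks(x)
y <- cut(x, breaks = breaks, right = FALSE)
levels(y)
```

Because the sequence is built from whole numbers, the zero break is exactly 0 and is labelled as such. Values that sit exactly on a grid point can still be affected by rounding in min(x) * per, so check the edge bins if your data contains such values.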
MrFlick answered Mar 29 '23 09:03