
How do I only keep observations based on the max values after their decimal point?

I want to make this dataframe:

(edited to show that it's an actual data frame with more than 1 column)

ID = c(100.00, 100.12, 100.36, 101.00, 102.00, 102.24, 103.00, 103.36, 103.90)
blood = c(55, 54, 74, 42, 54, 45, 65, 34, 44)
df = data.frame(ID, blood)

  ID       blood
1 100.00    55
2 100.12    54
3 100.36    74
4 101.00    42
5 102.00    54
6 102.24    45
7 103.00    65
8 103.36    34
9 103.90    44

Become this one:

ID2 = c(100.36, 101.00, 102.24, 103.90)
blood2 = c(74, 42, 45, 44)
df2 = data.frame(ID2, blood2)

  ID2        blood2
1 100.36     74
2 101.00     42
3 102.24     45
4 103.90     44

In other words, for any given whole number (like 102) I only want to keep the highest decimal version of it. So basically I need to tell R to only keep the highest "version" of each whole number. Any ideas how?

StatsNTats asked Jan 02 '23 02:01



2 Answers

> ID = c(100.00, 100.12, 100.36, 101.00, 102.00, 102.24, 103.00, 103.36)
> ID2 <- tapply( ID, floor(ID), FUN=max)
> ID2
   100    101    102    103 
100.36 101.00 102.24 103.36 
> (df2 <- data.frame(ID2))
       ID2
100 100.36
101 101.00
102 102.24
103 103.36
> (df2 <- data.frame(ID=as.vector(ID2)))
      ID
1 100.36
2 101.00
3 102.24
4 103.36

Expanded to the full data frame, so that the other columns are kept:

> ID = c(100.00, 100.12, 100.36, 101.00, 102.00, 102.24, 103.00, 103.36, 103.9)
> blood = c(55, 54, 74, 42, 54, 45, 65, 34, 44)
> df = data.frame(ID, blood)
> 
> tmp <- tapply( df$ID, floor(df$ID), FUN=function(x) x==max(x))
> 
> (df2 <- df[unlist(tmp),])
      ID blood
3 100.36    74
4 101.00    42
6 102.24    45
9 103.90    44
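
One caveat worth noting: `tapply` returns its results grouped by `floor(df$ID)`, so `unlist(tmp)` lines up with the rows of `df` only when the rows are already sorted by whole number, as they are here. Below is a minimal sketch for unsorted input, assuming it is acceptable to reorder the rows. The names `shuffled`, `sorted` and `keep` are made up for illustration, and `duplicated(..., fromLast = TRUE)` is a swapped-in technique, not the `tapply` approach above.

```r
ID <- c(100.00, 100.12, 100.36, 101.00, 102.00, 102.24, 103.00, 103.36, 103.90)
blood <- c(55, 54, 74, 42, 54, 45, 65, 34, 44)
df <- data.frame(ID, blood)

# Hypothetical unsorted input: the same rows in a different order
shuffled <- df[c(9, 2, 6, 1, 4, 8, 3, 5, 7), ]

# Sort by ID so the largest ID of each whole number comes last in its group
sorted <- shuffled[order(shuffled$ID), ]

# Keep only the last row of each floor(ID) group
keep <- !duplicated(floor(sorted$ID), fromLast = TRUE)
df2 <- sorted[keep, ]
df2
```

Sorting first means the result comes back in ascending `ID` order rather than in the original row order.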
Greg Snow answered Jan 04 '23 15:01



Here is an option using base R:

df[with(df, ave(ID, floor(ID), FUN = max) == ID),, drop = FALSE]
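
To see why this works, the one-liner can be unpacked into steps. This is only a sketch of the same idea; the intermediate names `group`, `group_max`, `keep` and `result` are introduced here for illustration and are not part of the answer itself.

```r
ID <- c(100.00, 100.12, 100.36, 101.00, 102.00, 102.24, 103.00, 103.36, 103.90)
blood <- c(55, 54, 74, 42, 54, 45, 65, 34, 44)
df <- data.frame(ID, blood)

# Whole-number part of each ID, used as the grouping key
group <- floor(df$ID)

# ave() returns a vector as long as ID, holding each row's group maximum
group_max <- ave(df$ID, group, FUN = max)

# A row is kept when its own ID equals the maximum of its group
keep <- df$ID == group_max

# drop = FALSE keeps the result a data frame even with a single column
result <- df[keep, , drop = FALSE]
result
```

Because `ave` returns its values in the original row order, the comparison is row-aligned and does not depend on `df` being sorted.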
akrun answered Jan 04 '23 15:01
