Returning TRUE if the values in different columns are a close match in R

Tags:

r

The data is as follows:

a <- c('id1','id2','id3','id4','id5')
b <- c(5,10,7,2,3)
d <- c(5.2,150,123,5,7)
e <- c(5.4,0,10,3,5)

df1 <- data.frame(a,b,d,e)

I want to create a new column in this data frame containing TRUE or FALSE. It should be TRUE if all the values in a row are within 5% of each other, and FALSE otherwise.

For example, for 'id1' the values in columns b, d and e are 5, 5.2 and 5.4 respectively. These are all within 5% of each other, so new_col should be TRUE. For 'id2' the values are 10, 150 and 0, which are not within 5% of each other, so new_col should be FALSE.

Desired output: (image not included)

asked Nov 24 '20 05:11

Rohan Bali

3 Answers

This checks, for each row, whether 1.05 times the minimum value is greater than 0.95 times the maximum value. (I assumed that's what you meant by within 5% of each other.)

# For each row: TRUE when 1.05 * min exceeds 0.95 * max of columns b, d, e
sapply(seq_len(nrow(df1)), function(i) (min(df1[i, 2:4]) * 1.05) > 
     (0.95 * max(df1[i, 2:4])))
# [1]  TRUE FALSE FALSE FALSE FALSE

A slightly different way to do the same thing:

# range() gives c(min, max); after scaling, diff() is 0.95 * max - 1.05 * min
sapply(seq_len(nrow(df1)), function(i) diff(range(df1[i, 2:4]) * 
    c(1.05, 0.95)) <= 0)
# [1]  TRUE FALSE FALSE FALSE FALSE
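
As a possible variation (not part of the original answer), the same min/max criterion can be sketched without a per-row loop by using base R's `pmin()` and `pmax()`, which work element-wise across vectors. The intermediate names `row_min` and `row_max` are illustrative only, and the data is rebuilt so the snippet runs on its own.

```r
# Rebuild the example data from the question.
df1 <- data.frame(
  a = c("id1", "id2", "id3", "id4", "id5"),
  b = c(5, 10, 7, 2, 3),
  d = c(5.2, 150, 123, 5, 7),
  e = c(5.4, 0, 10, 3, 5)
)

# Element-wise (per-row) minimum and maximum of the three numeric columns.
row_min <- pmin(df1$b, df1$d, df1$e)
row_max <- pmax(df1$b, df1$d, df1$e)

# Same criterion as above: 1.05 * min must exceed 0.95 * max.
df1$new_col <- row_min * 1.05 > row_max * 0.95

df1$new_col
# [1]  TRUE FALSE FALSE FALSE FALSE
```

Because the comparison is vectorised, `new_col` is stored directly as a logical column, which is what the question asks for.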

answered Oct 21 '22 17:10

Suren

Does this work?

library(dplyr)
library(data.table)
df1 %>% rowwise() %>%
  mutate(new_col = case_when(between(d, 0.95*b, 1.05*b) & between(e, 0.95*d, 1.05*d) ~ 'TRUE',
                             TRUE ~ 'FALSE'))
# A tibble: 5 x 5
# Rowwise: 
  a         b     d     e new_col
  <chr> <dbl> <dbl> <dbl> <chr>  
1 id1       5   5.2   5.4 TRUE   
2 id2      10 150     0   FALSE  
3 id3       7 123    10   FALSE  
4 id4       2   5     3   FALSE  
5 id5       3   7     5   FALSE
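
One thing worth checking with a chained comparison like this is which pairs are actually compared: `d` is tested against `b`, and `e` against `d`, but `e` is never tested against `b`. The base R sketch below makes that explicit. The helper `within_tol()` and the variable names `chained` and `all_pairs` are hypothetical, and the stricter "all pairs" reading is only an assumption about what "within 5% of each other" could mean.

```r
# Rebuild the example data from the question.
df1 <- data.frame(
  a = c("id1", "id2", "id3", "id4", "id5"),
  b = c(5, 10, 7, 2, 3),
  d = c(5.2, 150, 123, 5, 7),
  e = c(5.4, 0, 10, 3, 5)
)

# Hypothetical helper: is x within +/- tol (as a fraction) of ref?
within_tol <- function(x, ref, tol = 0.05) {
  x >= (1 - tol) * ref & x <= (1 + tol) * ref
}

# Chained check as in the answer: d against b, then e against d.
chained <- within_tol(df1$d, df1$b) & within_tol(df1$e, df1$d)

# Stricter variant (an assumption about intent): also compare e against b.
all_pairs <- chained & within_tol(df1$e, df1$b)

chained
# [1]  TRUE FALSE FALSE FALSE FALSE
all_pairs
# [1] FALSE FALSE FALSE FALSE FALSE
```

For the sample data, the chained check is TRUE only for id1, while the stricter variant is FALSE for every row, because 5.4 is 8% above 5. Since the question expects TRUE for id1, the chained reading matches the desired result.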

answered Oct 21 '22 16:10

Karthik S

Is this what you're after?

a <- c('id1','id2','id3','id4','id5')
b <- c(5,10,7,2,3)
d <- c(5.2,150,123,5,7)
e <- c(5.4,0,10,3,5)

df1 <- data.frame(a,b,d,e)
library(tidyverse)
df1 %>% 
  mutate(new_col = ifelse((b >= (0.95 * d) & b <= (1.05 * d) & d >= (0.95 * e) & d <= (1.05 * e)),
                          "TRUE", "FALSE"))

    a  b     d    e new_col
1 id1  5   5.2  5.4    TRUE
2 id2 10 150.0  0.0   FALSE
3 id3  7 123.0 10.0   FALSE
4 id4  2   5.0  3.0   FALSE
5 id5  3   7.0  5.0   FALSE
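
Note that `ifelse(..., "TRUE", "FALSE")` produces a character column. If an actual logical column is wanted, as the question suggests, the comparison can be used directly, since it already evaluates to TRUE or FALSE. The following is a minimal base R sketch of that idea (not the original answer's code), using `transform()` with the same condition:

```r
# Rebuild the example data from the question.
df1 <- data.frame(
  a = c("id1", "id2", "id3", "id4", "id5"),
  b = c(5, 10, 7, 2, 3),
  d = c(5.2, 150, 123, 5, 7),
  e = c(5.4, 0, 10, 3, 5)
)

# The comparison already yields a logical vector, so ifelse() is not needed.
df1 <- transform(
  df1,
  new_col = b >= 0.95 * d & b <= 1.05 * d & d >= 0.95 * e & d <= 1.05 * e
)

class(df1$new_col)   # "logical"
sum(df1$new_col)     # number of rows that pass the check
df1[df1$new_col, ]   # keep only the passing rows
```

A logical column can be used directly for counting and subsetting, whereas the character version would first need converting with `as.logical()`.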

answered Oct 21 '22 17:10

jared_mamrot
