
Summarize Table based on a Threshold

Tags: r, dplyr, plyr

It might be a very simple problem, but I failed to solve it using the dplyr functions I know. Here's the data:

tab1 <- read.table(header=TRUE, text="
    Col1  A1  A2  A3  A4  A5
    ID1   43  52  33  25  59
    ID2   27  41  20  71  22
    ID3   37  76  36  27  44
    ID4   23  71  62  25  63
")
tab1
  Col1 A1 A2 A3 A4 A5
1  ID1 43 52 33 25 59
2  ID2 27 41 20 71 22
3  ID3 37 76 36 27 44
4  ID4 23 71 62 25 63

I want to get a long-format table like the following, keeping only the values lower than 30.

Col1  Col2  Val
ID1   A4    25
ID2   A1    27
ID2   A3    20
ID2   A5    22
ID3   A4    27
ID4   A1    23
ID4   A4    25
S Das asked Mar 14 '23 21:03


2 Answers

If you insist on dplyrness, you can gather the data into long format first (using tidyr) and then filter as desired:

library(dplyr)
library(tidyr)
tab1 %>%
  gather(Col2, Val, -Col1) %>%
  filter(Val < 30)

#   Col1 Col2 Val
# 1  ID2   A1  27
# 2  ID4   A1  23
# 3  ID2   A3  20
# 4  ID1   A4  25
# 5  ID3   A4  27
# 6  ID4   A4  25
# 7  ID2   A5  22
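
A note for current tidyr versions: gather() is superseded, and pivot_longer() (available since tidyr 1.0.0) is the recommended replacement. The following is a minimal sketch of the same idea, assuming tidyr >= 1.0.0 is installed; the result name res is only illustrative. It returns a tibble, and because pivot_longer() walks the data row by row, the rows should come out in the ID-wise order shown in the question.

```r
library(dplyr)
library(tidyr)

# Same data as in the question
tab1 <- read.table(header = TRUE, text = "
Col1 A1 A2 A3 A4 A5
ID1  43 52 33 25 59
ID2  27 41 20 71 22
ID3  37 76 36 27 44
ID4  23 71 62 25 63
")

# Reshape every column except Col1 into (Col2, Val) pairs,
# then keep only the values below the threshold
res <- tab1 %>%
  pivot_longer(-Col1, names_to = "Col2", values_to = "Val") %>%
  filter(Val < 30)

res
```
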
David Arenburg answered Mar 17 '23 14:03


Use the reshape2 package with melt:

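library(reshape2)
# Name the id column explicitly; otherwise melt() guesses it and prints a message
tab2 <- melt(tab1, id.vars = "Col1")
# Keep only the rows whose value is below the threshold
tab2[tab2$value < 30, ]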

output:

   Col1 variable value
2   ID2       A1    27
4   ID4       A1    23
10  ID2       A3    20
13  ID1       A4    25
15  ID3       A4    27
16  ID4       A4    25
18  ID2       A5    22
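
If you would rather avoid extra packages, the same result can be sketched in base R. This is only one possible approach, not part of the answers above: it uses which(..., arr.ind = TRUE) to find the row and column positions of the values below 30. The names vals, idx and res are illustrative.

```r
# Same data as in the question
tab1 <- read.table(header = TRUE, text = "
Col1 A1 A2 A3 A4 A5
ID1  43 52 33 25 59
ID2  27 41 20 71 22
ID3  37 76 36 27 44
ID4  23 71 62 25 63
")

# Numeric part of the table as a matrix (drops the Col1 id column)
vals <- as.matrix(tab1[-1])

# Row/column positions of every value below the threshold
idx <- which(vals < 30, arr.ind = TRUE)

# Look up the id, the column name and the value for each position
res <- data.frame(
  Col1 = tab1$Col1[idx[, "row"]],
  Col2 = colnames(vals)[idx[, "col"]],
  Val  = vals[idx]
)

# Sort by id, then by column, to match the layout asked for in the question
res <- res[order(res$Col1, res$Col2), ]
rownames(res) <- NULL
res
```

The final order() call is optional; without it the rows come out column by column, as in the melt() output above.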
pcantalupo answered Mar 17 '23 15:03