Keep rows with certain values within a data frame and delete all others [R]

Question

I am using R. Here is a reproducible example:

set.seed(1)
Data <- data.frame(id = seq(1, 10), 
               Diag1 = sample(c("A123", "B123", "C123"), 10, replace = TRUE), 
               Diag2 = sample(c("D123", "E123", "F123"), 10, replace = TRUE), 
               Diag3 = sample(c("G123", "H123", "I123"), 10, replace = TRUE), 
               Diag4 = sample(c("A123", "B123", "C123"), 10, replace = TRUE), 
               Diag5 = sample(c("J123", "K123", "L123"), 10, replace = TRUE), 
               Diag6 = sample(c("M123", "N123", "O123"), 10, replace = TRUE), 
               Diag7 = sample(c("P123", "Q123", "R123"), 10, replace = TRUE))
Data

I have a data frame like this. In reality it has 34 variables and 1.5 million observations. It contains patient data: an ID and diagnoses (ICD-10 codes). A123 and B123 stand for certain diagnoses, and I want to extract all patients with these diagnoses. In fact, I am looking for 6 diagnoses among hundreds of different ICD-10 diagnoses. Each of the diagnoses I am looking for can appear in any column, but they are mutually exclusive. In the end I will have a data frame of about 4000 observations instead of 1.5 million.

My goal is to get a data frame that keeps only the rows containing A123 or B123. A123 and B123 cannot occur in the same row, but each of them can appear in any column.

I can do this for a single variable like this:

DataA123 <- Data[Data$Diag1 == "A123", ]

But I want to do it for every variable, and for A123 and B123 together (there are actually 6 such codes).

Is this possible?

ROLO · Accepted Answer

How about this?

Select all rows with A123 and/or B123:

Data[apply(Data, 1, function(x) any(c("A123", "B123") %in% x)), ]
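With 1.5 million rows, a row-wise apply() may be slow, because it converts the whole data frame to a character matrix and calls the function once per row. A minimal column-wise sketch of the same "any match" filter is shown below. The toy data and the names codes, diag_cols, hits and keep are illustrative assumptions, as is the convention that every diagnosis column name starts with "Diag".

```r
# Hypothetical toy data in the same shape as the question's example
Data <- data.frame(
  id    = 1:5,
  Diag1 = c("A123", "C123", "B123", "C123", "C123"),
  Diag2 = c("D123", "E123", "F123", "A123", "E123"),
  stringsAsFactors = FALSE
)

codes     <- c("A123", "B123")            # diagnoses to keep
diag_cols <- grep("^Diag", names(Data))   # assumes diagnosis columns start with "Diag"

# One logical vector per diagnosis column: TRUE where the value is a wanted code
hits <- lapply(Data[diag_cols], function(col) col %in% codes)

# Combine the columns with OR: TRUE if any diagnosis column matched
keep <- Reduce(`|`, hits)

Data[keep, ]
```

This does one vectorised %in% per column instead of one function call per row, and it leaves the id column out of the comparison.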

Select all rows with either A123 or B123, but not both:

Data[apply(Data, 1, function(x) Reduce(xor, c("A123", "B123") %in% x)), ]
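One caveat when extending this to the six codes from the question: chaining xor over more than two values is TRUE whenever an odd number of them is present, so a row containing three of the codes would also pass. If "exactly one of the codes" is what is wanted, counting the codes per row is more direct. The sketch below uses hypothetical toy data, and the names has_code and n_codes are illustrative.

```r
# Hypothetical toy data; row 2 contains both wanted codes
Data <- data.frame(
  id    = 1:4,
  Diag1 = c("A123", "A123", "C123", "B123"),
  Diag2 = c("D123", "B123", "E123", "F123"),
  stringsAsFactors = FALSE
)

codes     <- c("A123", "B123")            # extend to all six codes as needed
diag_cols <- grep("^Diag", names(Data))   # assumes diagnosis columns start with "Diag"

# For each code: does the row contain it in any diagnosis column?
has_code <- sapply(codes, function(code) {
  rowSums(Data[diag_cols] == code, na.rm = TRUE) > 0
})

# Number of distinct wanted codes found in each row
n_codes <- rowSums(has_code)

# Keep rows with exactly one of the wanted codes
Data[n_codes == 1, ]
```

Since the question states that the codes are mutually exclusive, n_codes should never exceed 1 in the real data, and the simpler any-match filter would return the same rows.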

Tags:

r

rows

Roccer
