R - How to make a subset of columns based on values in a row in a data frame

Tags: dataframe, r, subset

I have a data frame that I would like to subset and eventually use to make a plot. The data are counts of specific blood markers for each patient in a population. It looks like this:

    df <- data.frame(MarkerID=c("Class","A123","A124"),
             MarkerName=c("","X","Y"),
             Patient.1=c(0,1,5),
             Patent.2=c(1,2,6),
             Patent.3=c(0,3,7),
             Patient.4=c(1,4,8))

I would like to make a data frame of all of the patients (columns 3-6) that have a class value of zero (1st row) and a second data frame of all of the patients with a class value of 1.

In the past I have used the subset function to select rows based on the values in a column. Is it possible to select a subset of columns based on the values in a row?

I've tried this:

    x <- subset(data, data[1,] == 0)

However, when I run dim(x), the number of columns is the same as in dim(data) but the number of rows is different. Any ideas on how I can make this return just those columns whose value in row 1 is 0?

Roland, yes. Your example df is what the data frame looks like. There are ~30,000 markers and >400 patients in the data frame, so I didn't post the dput(head(data)). Thanks for the reshaping tip, I'll give that a try.

Your example code worked for subsetting the columns based on the values in the first row:

    data[,c(TRUE,TRUE,data[1,-(1:2)]==1)]

Applied to my data, it gave me a data frame with all of the rows and only the columns of the indicated class.

asked Jan 26 '13 17:01

jeran stratford

1 Answer

Your data is not arranged in a good way. It would be better to reshape it.

In the absence of input data, this is just a guess:

df <- data.frame(MarkerID=c("Class","A123","A124"),
                 MarkerName=c("","X","Y"),
                 Patient.1=c(0,1,5),
                 Patent.2=c(1,2,6),
                 Patent.3=c(0,3,7),
                 Patient.4=c(1,4,8))

#  MarkerID MarkerName Patient.1 Patent.2 Patent.3 Patient.4
#1    Class                    0        1        0         1
#2     A123          X         1        2        3         4
#3     A124          Y         5        6        7         8

df[,c(TRUE,TRUE,df[1,-(1:2)]==0)]

#  MarkerID MarkerName Patient.1 Patent.3
#1    Class                    0        0
#2     A123          X         1        3
#3     A124          Y         5        7

Here c(TRUE,TRUE,df[1,-(1:2)]==0) creates a logical vector that is TRUE for the first two columns and for those columns that have a 0 in the first row. Then I subset the columns based on this vector.
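To see what that logical vector looks like on its own, the one-liner can be split into named steps. This is only a sketch on the example df; the names class_row, keep and class0 are introduced here for illustration and are not part of the original code.

```r
df <- data.frame(MarkerID=c("Class","A123","A124"),
                 MarkerName=c("","X","Y"),
                 Patient.1=c(0,1,5),
                 Patent.2=c(1,2,6),
                 Patent.3=c(0,3,7),
                 Patient.4=c(1,4,8))

# Class values from row 1, patient columns only (illustrative name)
class_row <- unlist(df[1, -(1:2)], use.names = FALSE)

# One entry per column of df: always keep the two ID columns,
# then keep a patient column only if its class value is 0
keep <- c(TRUE, TRUE, class_row == 0)

# Column subset; all rows are retained
class0 <- df[, keep]
```

Printing keep before indexing is a quick way to check that its length matches ncol(df), which matters when the data frame has hundreds of patient columns.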

df[,c(TRUE,TRUE,df[1,-(1:2)]==1)]

#  MarkerID MarkerName Patent.2 Patient.4
#1    Class                   1         1
#2     A123          X        2         4
#3     A124          Y        6         8
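If both data frames are needed at once, the two calls above could be folded into a single step. The following is a sketch in base R, assuming the first two columns are always the ID columns and row 1 always holds the class; the names patient_cols, class_values and by_class are hypothetical.

```r
df <- data.frame(MarkerID=c("Class","A123","A124"),
                 MarkerName=c("","X","Y"),
                 Patient.1=c(0,1,5),
                 Patent.2=c(1,2,6),
                 Patent.3=c(0,3,7),
                 Patient.4=c(1,4,8))

id_cols      <- names(df)[1:2]       # MarkerID, MarkerName
patient_cols <- names(df)[-(1:2)]    # all patient columns

# Class value of each patient, taken from row 1
class_values <- unlist(df[1, patient_cols], use.names = FALSE)

# Group patient column names by class, then build one data frame per class
by_class <- lapply(split(patient_cols, class_values),
                   function(cols) df[, c(id_cols, cols)])

# by_class[["0"]] holds the class-0 patients, by_class[["1"]] the class-1 patients
```

The list is named after the class values, so the same code would also cover more than two classes without changes.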

This would reshape your data into a more common format (for statistical software):

library(reshape2)  
df2 <- merge(melt(df[1,],variable.name="Patient",value.name="class")[-(1:2)],
             melt(df[-1,],variable.name="Patient"),all=TRUE)

#    Patient class MarkerID MarkerName value
#1  Patent.2     1     A123          X     2
#2  Patent.2     1     A124          Y     6
#3  Patent.3     0     A123          X     3
#4  Patent.3     0     A124          Y     7
#5 Patient.1     0     A123          X     1
#6 Patient.1     0     A124          Y     5
#7 Patient.4     1     A123          X     4
#8 Patient.4     1     A124          Y     8

You could then use subset:

subset(df2,class==0)

#    Patient class MarkerID MarkerName value
#3  Patent.3     0     A123          X     3
#4  Patent.3     0     A124          Y     7
#5 Patient.1     0     A123          X     1
#6 Patient.1     0     A124          Y     5
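The reshaping step above relies on reshape2. Where that package is not available, roughly the same long format can be built with base R only. This is a sketch under the same assumptions as before (two ID columns, class in row 1); the name long and the intermediate variables are illustrative, and the row order differs from the merge() result.

```r
df <- data.frame(MarkerID=c("Class","A123","A124"),
                 MarkerName=c("","X","Y"),
                 Patient.1=c(0,1,5),
                 Patent.2=c(1,2,6),
                 Patent.3=c(0,3,7),
                 Patient.4=c(1,4,8))

markers      <- df[-1, ]             # marker rows without the class row
patient_cols <- names(df)[-(1:2)]
class_values <- unlist(df[1, patient_cols], use.names = FALSE)
n_markers    <- nrow(markers)

# One row per patient/marker combination; values are stacked column by column,
# so Patient and class are repeated once per marker
long <- data.frame(
  Patient    = rep(patient_cols, each = n_markers),
  class      = rep(class_values, each = n_markers),
  MarkerID   = rep(markers$MarkerID, times = length(patient_cols)),
  MarkerName = rep(markers$MarkerName, times = length(patient_cols)),
  value      = unlist(markers[patient_cols], use.names = FALSE)
)

class0_long <- subset(long, class == 0)
```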

answered Sep 21 '22 09:09

Roland
