
Extracting column names with condition from a data frame

Tags: dataframe, r

dput(new)
structure(list(ID = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 
13, 14, 15, 16, 17, 18, 19, 20, 21, 22), A1 = c(1, 1, 1, 1, 0, 
0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), A2 = c(1, 
1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
), A3 = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0), A4 = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0), A5 = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0), A6 = c(0, 0, 0, 0, 0, 
0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), A7 = c(0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0
), A8 = c(0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 
1, 1, 1, 0, 0), A9 = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0)), row.names = c(NA, -22L), class = c("tbl_df", 
"tbl", "data.frame"))

I have the data frame above. I need to extract and print the IDs together with the comma-separated names of the columns that contain a 1. For example:

1 A1,A2
2 A1,A2
3 A1
4 A1
6 A2,A8
7 A6,A8

and so on...

How should I proceed?

This is my attempt:

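vec_ID <- c()
vec_JOB <- c()
# Record the ID of every row that has a 1 in any of the columns A1..A9
for (i in seq_len(nrow(new))) {
  for (j in 2:10) {
    if (new[[i, j]] == 1) {
      vec_ID[i] <- new$ID[i]
    }
  }
}
print(vec_ID)
# Rows without any 1 were left as NA; drop them
vec_ID <- vec_ID[!is.na(vec_ID)]
print(vec_ID)

# Keep only the rows whose ID was recorded
new_df <- new[new$ID %in% vec_ID, ]
View(new_df)

for (i in seq_along(vec_ID)) {

}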
Dovini Jayasinghe asked Mar 13 '20 10:03




1 Answer

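With the data frame stored as df, you can apply a function over the rows of every column except the first (ID) and collapse the names of the columns that contain a 1: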

apply(df[-1], 1, function(x) toString(names(df[-1])[as.logical(x)]))

 [1] "A1, A2" "A1, A2" "A1"     "A1"     ""       "A2, A8" "A6, A8" "A1, A8" "A6, A8" "A8"     "A1, A8" "A6"    
[13] "A5, A8" ""       "A8"     "A8"     "A8"     "A8"     "A8"     "A8"     "A7"     ""
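
The call above returns one string per row, including empty strings for rows with no 1, and it does not show the IDs. A minimal sketch of how the result could be paired with the IDs and filtered, assuming the data frame is named new as in the question (the names flags, cols and result are made up for this illustration, and the data is rebuilt here as a plain data.frame rather than a tibble):

```r
# Rebuild the question's data as a plain data frame (values taken from the dput output).
new <- data.frame(
  ID = 1:22,
  A1 = c(1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0),
  A2 = c(1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0),
  A3 = rep(0, 22),
  A4 = rep(0, 22),
  A5 = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0),
  A6 = c(0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0),
  A7 = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0),
  A8 = c(0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 0, 0),
  A9 = rep(0, 22)
)

# Every column except ID (hypothetical helper name).
flags <- new[-1]

# For each row, join the names of the columns whose value is 1 with commas.
cols <- apply(flags, 1, function(x) paste(names(flags)[x == 1], collapse = ","))

# Pair each ID with its column names and drop the rows that have no 1 at all.
result <- data.frame(ID = new$ID, columns = cols)
result <- result[result$columns != "", ]

print(result, row.names = FALSE)
```

Using paste(..., collapse = ",") instead of toString() gives "A1,A2" without the space after the comma, which matches the format requested in the question. Rows 5, 14 and 22 contain no 1 and are dropped.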
Ritchie Sacramento answered Oct 30 '22 21:10
