I have some columns in R, and for each row there will only ever be a value in one of them; the rest will be NAs. I want to combine these into one column holding the non-NA value. Does anyone know of an easy way of doing this? For example, I could have the following:
data <- data.frame('a' = c('A','B','C','D','E'), 'x' = c(1,2,NA,NA,NA), 'y' = c(NA,NA,3,NA,NA), 'z' = c(NA,NA,NA,4,5))
So I would have

    a  x  y  z
    A  1 NA NA
    B  2 NA NA
    C NA  3 NA
    D NA NA  4
    E NA NA  5

And I would like to get

    a mycol
    A     1
    B     2
    C     3
    D     4
    E     5
The names of the columns containing NAs change depending on code earlier in the query, so I won't be able to call the column names explicitly, but I have the names of the columns that contain NAs stored as a vector, e.g. in this example cols <- c('x','y','z'), so I could call the columns using data[, cols].
Any help would be appreciated.
Thanks
How do I concatenate two columns in R? To concatenate two columns you can use the paste() function. For example, to combine the two columns A and B in the data frame df, you can use: df['AB'] <- paste(df$A, df$B).
To remove all rows containing NA, we can use the na.omit() function. For example, if a data frame called df contains some NA values, we can remove every row that contains at least one NA with the command na.omit(df).
In pandas (Python), to drop columns that contain NA, we can pass axis='columns' to DataFrame.dropna(). With the default how='any', it removes every column in which one or more values are missing.
The easiest way to replace NAs with the mean in multiple columns is to use the dplyr functions mutate_at() and vars(). These functions let you select the columns in which you want to replace the missing values. To actually replace each NA with the mean, you can use the replace_na() function from tidyr together with mean().
A dplyr::coalesce based solution could be:

    library(dplyr)

    data %>%
      mutate(mycol = coalesce(x, y, z)) %>%
      select(a, mycol)
    #   a mycol
    # 1 A     1
    # 2 B     2
    # 3 C     3
    # 4 D     4
    # 5 E     5
Data
data <- data.frame('a' = c('A','B','C','D','E'), 'x' = c(1,2,NA,NA,NA), 'y' = c(NA,NA,3,NA,NA), 'z' = c(NA,NA,NA,4,5))
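Since the question says the column names are not known in advance and are only available in the vector cols, here is a minimal sketch of the same coalesce idea without hard-coding x, y and z. It assumes dplyr is installed; the name result is only illustrative. do.call passes each selected column to coalesce as a separate argument.

```r
library(dplyr)

data <- data.frame('a' = c('A','B','C','D','E'),
                   'x' = c(1,2,NA,NA,NA),
                   'y' = c(NA,NA,3,NA,NA),
                   'z' = c(NA,NA,NA,4,5))

# Column names to combine, as stored in the question
cols <- c('x','y','z')

# Pass the selected columns to coalesce() as separate, unnamed arguments;
# for each row, coalesce() keeps the first non-NA value it finds
data$mycol <- do.call(coalesce, unname(as.list(data[cols])))

# Keep only the id column and the combined column (illustrative name)
result <- data[c('a', 'mycol')]
print(result)
```

Dropping the names with unname() is a precaution so that a column name can never be mistaken for a named argument of coalesce.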
You can use unlist to turn the columns into one vector. Afterwards, na.omit can be used to remove the NAs.

    cbind(data[1], mycol = na.omit(unlist(data[-1])))
    #    a mycol
    # x1 A     1
    # x2 B     2
    # y3 C     3
    # z4 D     4
    # z5 E     5
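One caveat worth keeping in mind: unlist stacks the columns one after another, so the values come out in column order, not row order. It lines up in this example only because the non-NA values move from x to y to z as the rows go down. A row-wise sketch in base R that does not depend on that ordering, assuming cols holds the column names as in the question (first_non_na and result are illustrative names, not part of the answer above):

```r
data <- data.frame('a' = c('A','B','C','D','E'),
                   'x' = c(1,2,NA,NA,NA),
                   'y' = c(NA,NA,3,NA,NA),
                   'z' = c(NA,NA,NA,4,5))

# Column names to combine, as stored in the question
cols <- c('x','y','z')

# Element-wise: keep the value from a, falling back to b where a is NA
first_non_na <- function(a, b) ifelse(is.na(a), b, a)

# Fold the helper over the selected columns, left to right
data$mycol <- Reduce(first_non_na, data[cols])

# Keep only the id column and the combined column
result <- data[c('a', 'mycol')]
print(result)
```

Because the fold works element by element, each row keeps its own value regardless of which column it came from.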