I have a dataset that looks something like this:
site <- c("A", "B", "C", "D", "E")
D01_1 <- c(1, 0, 0, 0, 1)
D01_2 <- c(1, 1, 0, 1, 1)
D02_1 <- c(1, 0, 1, 0, 1)
D02_2 <- c(0, 1, 0, 0, 1)
D03_1 <- c(1, 1, 0, 0, 0)
D03_2 <- c(0, 1, 0, 0, 1)
df <- data.frame(site, D01_1, D01_2, D02_1, D02_2, D03_1, D03_2)
I am trying to unite the D0x_1 and D0x_2 columns so that the values in the columns are separated by a slash. I can do this with the following code and it works just fine:
library(dplyr)
library(tidyr)
df.unite <- df %>%
  unite(D01, D01_1, D01_2, sep = "/", remove = TRUE) %>%
  unite(D02, D02_1, D02_2, sep = "/", remove = TRUE) %>%
  unite(D03, D03_1, D03_2, sep = "/", remove = TRUE)
...but the problem is that it requires me to type out each unite pair by hand, which is unwieldy across the large number of columns in my dataset. Is there a way in dplyr to unite across similarly patterned column names and then loop across the columns? unite_each doesn't seem to exist.
To concatenate two columns you can use the paste() function. For example, to combine the columns A and B of the data frame df, you can use df['AB'] <- paste(df$A, df$B). By default paste() separates the values with a space; pass sep = "/" to use a slash instead.
To combine several data frame columns into a single column, use the unite() function from the tidyr package.
tidyr provides three main functions for tidying your messy data: gather(), separate() and spread(). Sometimes two variables are clumped together in one column; separate() allows you to tease them apart (extract() works similarly but uses regexp groups instead of a splitting pattern or position).
Like readr, dplyr and tidyr are part of the tidyverse, so calling library(tidyverse) loads them together.
Two options, which are really the same thing rearranged.
First, you can use lapply to apply unite_ (the standard-evaluation version, to which you can pass strings) programmatically across columns. To do so, you'll need to build a list of names for it to use, then wrap the lapply in do.call(cbind, ...) to bind the resulting columns together, and cbind site back onto the result. Altogether:
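# Prefixes shared by each pair of columns: "D01", "D02", "D03"
cols <- unique(substr(names(df)[-1], 1, 3))
cbind(site = df$site, do.call(cbind,
    # For each prefix, unite the matching columns and keep only the new column
    lapply(cols, function(x){unite_(df, x, grep(x, names(df), value = TRUE),
                                    sep = '/', remove = TRUE) %>% select_(x)})
))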
# site D01 D02 D03
# 1 A 1/1 1/0 1/0
# 2 B 0/1 0/1 1/1
# 3 C 0/0 1/0 0/0
# 4 D 0/1 0/0 0/0
# 5 E 1/1 1/1 0/1
Alternatively, if you really like pipes, you can hack the whole thing into a chain (lapply included!), swapping out a few of the base functions for dplyr ones:
df %>% select(-site) %>% names() %>% substr(1,3) %>% unique() %>%
    lapply(function(x){unite_(df, x, grep(x, names(df), value = TRUE),
                              sep = '/', remove = TRUE) %>% select_(x)}) %>%
    bind_cols() %>% mutate(site = as.character(df$site)) %>% select(site, starts_with('D'))
# Source: local data frame [5 x 4]
#
# site D01 D02 D03
# (chr) (chr) (chr) (chr)
# 1 A 1/1 1/0 1/0
# 2 B 0/1 0/1 1/1
# 3 C 0/0 1/0 0/0
# 4 D 0/1 0/0 0/0
# 5 E 1/1 1/1 0/1
Check out the intermediate products to see how it fits together, but it's pretty much the same logic as the base approach.
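Note that unite_() and select_() are deprecated in current releases of tidyr and dplyr. Below is a minimal sketch of the same idea using unite() directly. It assumes a tidyr version in which unite() accepts an unquoted string (!!) as the new column name and tidyselect helpers such as starts_with() for the columns to combine. The names prefixes and out are illustrative and not part of the original answer.

```r
library(dplyr)
library(tidyr)

# Example data from the question
df <- data.frame(
  site  = c("A", "B", "C", "D", "E"),
  D01_1 = c(1, 0, 0, 0, 1), D01_2 = c(1, 1, 0, 1, 1),
  D02_1 = c(1, 0, 1, 0, 1), D02_2 = c(0, 1, 0, 0, 1),
  D03_1 = c(1, 1, 0, 0, 0), D03_2 = c(0, 1, 0, 0, 1)
)

# Strip the trailing "_<digits>" to get one prefix per column group
prefixes <- unique(sub("_\\d+$", "", names(df)[-1]))

# Unite one group at a time; each pass replaces the D0x_* columns with D0x
out <- df
for (p in prefixes) {
  out <- unite(out, !!p, starts_with(paste0(p, "_")), sep = "/", remove = TRUE)
}
out
```

Because unite() puts the new column where the first of its inputs was, site stays first and no extra cbind step is needed.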
This is a solution with base functions. First, I looked up the indexes of the ***_1 columns. I also created the column names for the final result, using gsub() and unique(). The sapply part pastes two columns together with /: for each index x it takes columns x and x + 1 (for example, x = 2 is D01_1 and x + 1 = 3 is D01_2), so you always pick two adjacent columns and paste them. Then I added site with cbind() and created a data frame. The last step is to assign the column names.
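library(magrittr)
# Positions of the columns whose names end in "1" (D01_1, D02_1, D03_1)
ind <- grep(pattern = "1$", x = names(df))
# Final column names: "site", "D01", "D02", "D03"
names <- unique(gsub(pattern = "_\\d+$",
                     replacement = "", x = names(df)))
# Paste each ***_1 column with the column immediately to its right
sapply(ind, function(x){
    foo <- paste(df[,x], df[, x+1], sep = "/")
    foo
}) %>%
    cbind(as.character(df$site), .) %>%
    data.frame -> out
names(out) <- names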
# site D01 D02 D03
#1 A 1/1 1/0 1/0
#2 B 0/1 0/1 1/1
#3 C 0/0 1/0 0/0
#4 D 0/1 0/0 0/0
#5 E 1/1 1/1 0/1
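The approach above relies on each ***_1 column sitting immediately to the left of its ***_2 partner. If the column order is not guaranteed, a variant that groups columns by name prefix may be safer. This is only a sketch in base R, and the names value_cols, groups and united are illustrative rather than taken from the answer.

```r
# Example data from the question
df <- data.frame(
  site  = c("A", "B", "C", "D", "E"),
  D01_1 = c(1, 0, 0, 0, 1), D01_2 = c(1, 1, 0, 1, 1),
  D02_1 = c(1, 0, 1, 0, 1), D02_2 = c(0, 1, 0, 0, 1),
  D03_1 = c(1, 1, 0, 0, 0), D03_2 = c(0, 1, 0, 0, 1)
)

# Every column except site, grouped by its prefix (the name minus "_<digits>")
value_cols <- names(df)[-1]
groups <- split(value_cols, sub("_\\d+$", "", value_cols))

# Paste all columns of a group together row by row, separated by "/"
united <- lapply(groups, function(cols) {
  do.call(paste, c(unname(df[cols]), sep = "/"))
})

# The named list becomes one column per prefix, next to site
out <- data.frame(site = df$site, united, stringsAsFactors = FALSE)
out
```

This also works unchanged if a group has more than two columns, since paste() receives however many columns share the prefix.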