(Related question that does not include sorting. It's easy to just use paste
when you don't need to sort.)
I have a less-than-ideally-structured table with character columns that are generic "item1","item2" etc. I would like to create a new character variable that is the alphabetized, comma-separated concatenation of these columns. So for example, in row 5, if item1 = "milk", item2 = "eggs", and item3 = "butter", the new variable in row 5 might be "butter, eggs, milk"
I wrote a function f()
below that works on two character variables. However, I am having trouble
mapply
or other "vectorization" (I know it's really just a for loop)Any help much appreciated.
df <- data.frame(a =c("foo","bar"),
b= c("baz","qux"))
paste(df$a,df$b, sep=", ")
# returns [1] "foo, baz" "bar, qux" ... but I want [1] "baz, foo" "bar, qux"
f <- function(a,b) paste(c(a,b)[order(c(a,b))],collapse=", ")
f("foo","baz")
# returns [1] "baz, foo" ... which is what I want ... how to vectorize?
df$new_var <- mapply(f, df$a, df$b)
df
# a b new_var <- new_var is not what I want
# 1 foo baz 1, 2
# 2 bar qux 1, 2
# Interestingly, data.table is smart enough to fix my bad mapply
library(data.table)
dt <- data.table(a =c("foo","bar"),
b= c("baz","qux"))
dt[,new_var:=mapply(f, a, b)]
dt
# a b new_var <- new var IS what I want
# 1: foo baz baz, foo
# 2: bar qux bar, qux
Concatenate function can take two or more arrays of the same shape and by default it concatenates row-wise i.e. axis=0. The resulting array after row-wise concatenation is of the shape 6 x 3, i.e. 6 rows and 3 columns.
rowwise() allows you to compute on a data frame a row-at-a-time. This is most useful when a vectorised function doesn't exist. Most dplyr verbs preserve row-wise grouping. The exception is summarise() , which return a grouped_df.
How do I concatenate two columns in R? To concatenate two columns you can use the <code>paste()</code> function. For example, if you want to combine the two columns A and B in the dataframe df you can use the following code: <code>df['AB'] <- paste(df$A, df$B)</code>.
Use pandas.concat() method to concat two DataFrames by rows meaning appending two DataFrames. By default, it performs append operations similar to a union where it bright all rows from both DataFrames to a single DataFrame.
Just apply down rows:
apply(df,1,function(x){
paste(sort(x),collapse = ",")
})
Wrap it in a function if you want. You'll either have to define which columns to send or assume all. i.e. apply(df[ ,2:3],1,f()...
sort(x) is the same as x[order(x)]
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With