I have a data.frame that is almost entirely blank, but each row contains exactly one non-blank value. How can I use a vectorized or otherwise idiomatic R approach to merge the contents of each row into a single vector?
Sample data:
raw_data <- structure(
  list(
    col1 = c("", "", "", "", ""),
    col2 = c("", "", "", "", ""),
    col3 = c("", "", "", "", ""),
    col4 = c("", "", "", "Millburn - Union", ""),
    col5 = c("", "", "Cranston (aka Garden City Center)", "", ""),
    col6 = c("", "", "", "", ""),
    col7 = c("", "", "", "", ""),
    col8 = c("", "", "", "", "Colorado Blvd"),
    col9 = c("", "", "", "", ""),
    col10 = c("", "", "", "", ""),
    col11 = c("Palo Alto", "Castro (aka Market St)", "", "", "")
  ),
  .Names = c("col1", "col2", "col3", "col4", "col5", "col6", "col7", "col8", "col9", "col10", "col11"),
  row.names = c(5L, 4L, 3L, 2L, 1L),
  class = "data.frame"
)
This is what I tried but it fails, as it returns a 2-dimensional matrix instead of the desired vector:
raw_data$test <- apply(raw_data, MAR=1, FUN=paste0)
You can do this very simply with a single index operation:
raw_data[raw_data!='']
Demo:
R> raw_data[raw_data!=''];
[1] "Millburn - Union" "Cranston (aka Garden City Center)" "Colorado Blvd" "Palo Alto" "Castro (aka Market St)"
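To see why the values come out in this order, it may help to look at the logical matrix that drives the subsetting. The following is only a sketch on a small hypothetical data.frame `df` (not the question's data), and it assumes every cell is a character string with no NA values:

```r
# Hypothetical 3x3 example; each row holds exactly one non-blank value.
df <- data.frame(
  a = c("", "x", ""),
  b = c("y", "", ""),
  c = c("", "", "z"),
  stringsAsFactors = FALSE
)

# Comparing a data.frame with a scalar yields a logical matrix of the same shape.
mask <- df != ""

# Indexing with that matrix walks the cells column by column,
# so the values come out in column order, not row order.
by_column <- df[mask]

# Transposing first makes the column-by-column walk follow the original rows.
by_row <- t(df)[t(df) != ""]
```

Here `by_column` follows the columns `a`, `b`, `c`, while `by_row` follows rows 1 to 3.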
If you want the values in row order (top to bottom), rather than the column-by-column order that the operation above produces (columns left to right, and top to bottom within each column), you can transpose the input data.frame:
R> t(raw_data)[t(raw_data)!=''];
[1] "Palo Alto" "Castro (aka Market St)" "Cranston (aka Garden City Center)" "Millburn - Union" "Colorado Blvd"
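If the goal is a new column aligned with the rows, as in the question's `raw_data$test` attempt, one possible alternative is to keep `apply()` and pass `collapse = ""` to `paste0`, so that each row is reduced to a single string. This is a sketch rather than a definitive fix: `df` is a hypothetical stand-in for `raw_data`, and it assumes blanks are empty strings, not NA.

```r
# Hypothetical stand-in for raw_data; each row holds exactly one non-blank value.
df <- data.frame(
  a = c("", "x", ""),
  b = c("y", "", ""),
  c = c("", "", "z"),
  stringsAsFactors = FALSE
)

# Without collapse, paste0 returns one string per cell, so apply() builds a matrix.
# With collapse = "", each row is pasted into a single string.
# unname() drops the row names that apply() attaches to its result.
df$merged <- unname(apply(df, 1, paste0, collapse = ""))
```

Unlike the index operation, this keeps one result per row: a row that is entirely blank yields `""`, and a row with several values yields them concatenated.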