I have data frames A, B, C, ... and want to modify each data frame in the same way, e.g. re-ordering the levels of a factor which is present in all of the data frames:
A = data.frame( x=c('x','x','y','y','z','z') )
B = data.frame( x=c('x','y','z') )
C = data.frame( x=c('x','x','x','y','y','y','z','z','z') )
A$x = factor( A$x, levels=c('z','y','x') )
B$x = factor( B$x, levels=c('z','y','x') )
C$x = factor( C$x, levels=c('z','y','x') )
This gets laborious if there are lots of data frames and/or lots of modifications to be done. How can I do it concisely, using a loop or something better? A straightforward approach like
for ( D in list( A, B, C ) ) {
D$x = factor( D$x, levels=c('z','y','x') )
}
does not work, because it doesn't modify the original data frames.
EDIT: added definitions of A, B, and C to make it reproducible.
To join more than two R data frames, use the reduce() function from the purrr package (part of the tidyverse) together with a join function such as dplyr's inner_join(). reduce() takes the data frames as a list and applies the join pairwise, so every data frame ends up joined on the specified column.
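As a rough sketch of the same idea without extra packages, base R's Reduce() and merge() can stand in for reduce() and a dplyr join. The data frames df1, df2, df3 and the key column id are made up for illustration.

```r
# Hypothetical example data: three data frames sharing an 'id' column
df1 <- data.frame(id = c(1, 2, 3), a = c(10, 20, 30))
df2 <- data.frame(id = c(1, 2, 4), b = c("p", "q", "r"))
df3 <- data.frame(id = c(1, 3, 4), flag = c(TRUE, FALSE, TRUE))

# Reduce() applies the two-argument join pairwise across the list;
# merge() with default settings keeps only ids present on both sides
joined <- Reduce(function(left, right) merge(left, right, by = "id"),
                 list(df1, df2, df3))
```

Only ids present in every data frame survive this inner join; passing all = TRUE to merge() would keep all ids instead.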
The base R rbind() function and dplyr's bind_rows() function are commonly used to stack data frames by row. rbind() binds data frames that have the same columns (matching names, not just the same column count), whereas bind_rows() also accepts differing columns and fills the gaps with NA.
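A minimal illustration of row-binding with base R, using two made-up data frames (top and bottom) that have the same column names:

```r
# Hypothetical example data with identical column names
top    <- data.frame(id = 1:2, x = c("a", "b"))
bottom <- data.frame(id = 3:4, x = c("c", "d"))

# rbind() stacks the rows; columns are matched by name
stacked <- rbind(top, bottom)
```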
Within the for-loop we have performed three steps: First, we have created a vector object containing the values that we wanted to add as a column to our data frame. Second, we added the new column at the end of our data frame. Third, we renamed our new column (this step is optional).
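The loop being described is not shown here, so the following is only a guess at what such a loop might look like; the data frame, the column x1, and the names new1, new2, new3 are assumptions for illustration.

```r
# Hypothetical starting data frame
data <- data.frame(x1 = 1:5)

for (i in 1:3) {
  new <- data$x1 * i                              # step 1: build the new values
  data[, ncol(data) + 1] <- new                   # step 2: append as last column
  colnames(data)[ncol(data)] <- paste0("new", i)  # step 3: rename (optional)
}
```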
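One thing to note about R is that the assignment operator <- is right-associative, so assignments can be chained (= chains the same way, but the two cannot be mixed freely in one chain). Thus, if the x column is identical in all of your data frames, you should be able to do something like the following. Note that with the example A, B, and C above, whose x columns have different lengths, this raises an error, because the 9-element factor built from C$x cannot be assigned to the 3-row B: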
A$x <- B$x <- C$x <- factor( C$x, levels=c('z','y','x') )
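When the data frames differ in length, as A, B, and C in the question do, one possible approach is to keep them in a named list and apply the modification with lapply(). This is a sketch rather than the only way to do it; the name dfs and the final list2env() step that copies the results back over the original variables are choices made for this example.

```r
A <- data.frame(x = c('x','x','y','y','z','z'))
B <- data.frame(x = c('x','y','z'))
C <- data.frame(x = c('x','x','x','y','y','y','z','z','z'))

# Collect the data frames in a named list so they can be processed together
dfs <- list(A = A, B = B, C = C)

# lapply() passes each data frame to the function and collects the
# modified copies; each data frame keeps its own rows
dfs <- lapply(dfs, function(d) {
  d$x <- factor(d$x, levels = c('z','y','x'))
  d
})

# Optional: overwrite the original variables A, B, C with the modified copies
invisible(list2env(dfs, envir = environment()))
```

Keeping the data frames in the list and working with dfs$A, dfs$B, and so on avoids the last step entirely, and further modifications can be added inside the same function.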