I'd like to split a dataframe into several component dataframes based on the values in one column. In my example, I want to split dat into dat.1, dat.2 and dat.3 using the values in column "cond". Is there a simple command which could achieve this?
dat
sub cond trial time01 time02
1 1 1 2774 8845
1 1 2 2697 9945
1 2 1 2219 9291
1 2 2 3886 7890
1 3 1 4011 9032
2 2 1 3478 8827
2 2 2 2263 8321
2 3 1 4312 7576
3 1 1 4219 7891
3 3 1 3992 6674
dat.1
sub cond trial time01 time02
1 1 1 2774 8845
1 1 2 2697 9945
3 1 1 4219 7891
dat.2
sub cond trial time01 time02
2 2 1 3478 8827
2 2 2 2263 8321
1 2 1 2219 9291
1 2 2 3886 7890
dat.3
sub cond trial time01 time02
1 3 1 4011 9032
2 3 1 4312 7576
3 3 1 3992 6674
Perhaps because I'm an R novice I've still not determined how to do this despite browsing and trying the solutions proposed in several similar forum queries. Thank you in advance for any replies.
A dput() of the data is:
structure(list(sub = c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L
), cond = c(1L, 1L, 2L, 2L, 3L, 2L, 2L, 3L, 1L, 3L), trial = c(1L,
2L, 1L, 2L, 1L, 1L, 2L, 1L, 1L, 1L), time01 = c(2774L, 2697L,
2219L, 3886L, 4011L, 3478L, 2263L, 4312L, 4219L, 3992L), time02 = c(8845L,
9945L, 9291L, 7890L, 9032L, 8827L, 8321L, 7576L, 7891L, 6674L
)), .Names = c("sub", "cond", "trial", "time01", "time02"), class = "data.frame", row.names = c(NA,
-10L))
The most general way to subset a data frame by rows and/or columns is base R's extraction operator [ (documented under ?Extract), written with matched square brackets rather than the usual matched parentheses. For a data frame named d the general form is d[rows, columns].
To pick rows by the value in a column, you can use this bracket notation, the base R subset() function, or filter() from the dplyr package. subset() is a generic function that returns the rows and columns (in R terms, observations and variables) of a data frame that meet a condition.
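As a rough sketch of the two base R approaches mentioned above, using the data from the question (the names dat.1 and dat.2 simply mirror the question, nothing here is required naming):

```r
# Rebuild the example data from the question
dat <- data.frame(
  sub    = c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L),
  cond   = c(1L, 1L, 2L, 2L, 3L, 2L, 2L, 3L, 1L, 3L),
  trial  = c(1L, 2L, 1L, 2L, 1L, 1L, 2L, 1L, 1L, 1L),
  time01 = c(2774L, 2697L, 2219L, 3886L, 4011L, 3478L, 2263L, 4312L, 4219L, 3992L),
  time02 = c(8845L, 9945L, 9291L, 7890L, 9032L, 8827L, 8321L, 7576L, 7891L, 6674L)
)

# Square brackets: keep the rows where cond is 1, and all columns
dat.1 <- dat[dat$cond == 1, ]

# subset(): same idea, with column names evaluated inside the data frame
dat.2 <- subset(dat, cond == 2)
```

This works, but you have to write one line per condition, which gets tedious once there are more than a few values.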
I think the easiest way is via split:
split(dat, dat$cond)
Note however, that split returns a list of the data.frames.
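In case it helps, here is a small sketch of how the pieces of that list can be pulled out. The name parts is just something I picked for illustration; the list elements are named after the values of cond.

```r
# Rebuild the example data from the question
dat <- data.frame(
  sub    = c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L),
  cond   = c(1L, 1L, 2L, 2L, 3L, 2L, 2L, 3L, 1L, 3L),
  trial  = c(1L, 2L, 1L, 2L, 1L, 1L, 2L, 1L, 1L, 1L),
  time01 = c(2774L, 2697L, 2219L, 3886L, 4011L, 3478L, 2263L, 4312L, 4219L, 3992L),
  time02 = c(8845L, 9945L, 9291L, 7890L, 9032L, 8827L, 8321L, 7576L, 7891L, 6674L)
)

# One data frame per distinct value of cond, collected in a list
parts <- split(dat, dat$cond)

# The list elements are named after the cond values (as character strings)
names(parts)

# Extract a single data frame by name
cond2 <- parts[["2"]]

# Backticks are needed with $ because the name starts with a digit
cond3 <- parts$`3`
```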
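To obtain separate data.frames from the list you could proceed as follows, using a loop (implicit in the lapply statement) to create the individual objects:
tmp <- split(dat, dat$cond)
# create dat.1, dat.2, ... in the global environment, one per element of tmp;
# using names(tmp) keeps the suffix tied to the actual cond value
invisible(lapply(names(tmp), function(nm) assign(paste0("dat.", nm), tmp[[nm]], envir = .GlobalEnv)))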
However, using a list is probably more R-ish and will be more useful in the long run.
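To illustrate why the list tends to pay off, here is a minimal sketch that runs the same computation on every piece at once. The per-condition mean of time01 is only an example I made up; the question does not say what is to be done with the pieces.

```r
# Rebuild the example data from the question
dat <- data.frame(
  sub    = c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L),
  cond   = c(1L, 1L, 2L, 2L, 3L, 2L, 2L, 3L, 1L, 3L),
  trial  = c(1L, 2L, 1L, 2L, 1L, 1L, 2L, 1L, 1L, 1L),
  time01 = c(2774L, 2697L, 2219L, 3886L, 4011L, 3478L, 2263L, 4312L, 4219L, 3992L),
  time02 = c(8845L, 9945L, 9291L, 7890L, 9032L, 8827L, 8321L, 7576L, 7891L, 6674L)
)

parts <- split(dat, dat$cond)

# Number of rows in each piece, returned as a named vector
rows_per_cond <- sapply(parts, nrow)

# Mean of time01 within each condition (illustrative computation only)
mean_time01 <- sapply(parts, function(d) mean(d$time01))
```

With separate objects like dat.1, dat.2 and dat.3 you would have to repeat each of those calls once per object.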
Thanks to Gavin for posting the data!
Is there anything not satisfying about split(dat, dat$cond)? You do have R and split as tags, you know...