I have a large data frame that consists of data that looks something like this:
    date     w   x   y   z   region
1   2012 01  21  43  12  3   NORTH
2   2012 02  32  54  21  16  NORTH
3   2012 03  14  32  65  32  NORTH
4   2012 04  65  33  75  21  NORTH
:   :        :   :   :   :   :
:   :        :   :   :   :   :
12  2012 12  32  58  53  17  NORTH
13  2012 01  12  47  43  23  SOUTH
14  2012 02  87  43  21  76  SOUTH
:   :        :   :   :   :   :
25  2012 01  12  46  84  29  EAST
26  2012 02  85  29  90  12  EAST
:   :        :   :   :   :   :
:   :        :   :   :   :   :
I want to extract the sections of the data that share the same date value. For example, to do this just for 2012 01, I would create a subset of the data:
data_1 <- subset(data, date == "2012 01")
This gives me all the data for 2012 01, and I then go on to apply a function to it. I would like to apply my function to every such subset of my data, so ideally I would loop through the large data frame, extract the data for 2012 01, 2012 02, 2012 03, 2012 04, and so on, and apply the function to each of these subsets separately.
This should keep working if the length of my data frame changes. The dates may not always run from 2012 01 to 2012 12; sometimes the range might be, for example, 2011 03 to 2013 01.
The difference between subset() and sample() is that subset() selects the data in a dataset that meets a given condition, while sample() randomly selects a sample of size 'n' from it.
To get a subset of the rows and columns of a data frame in R, use the subset() function, filter() from the dplyr package, or base R square-bracket notation, df[]. subset() is a generic R function that returns selected rows and columns (in R terms, observations and variables) of a data frame.
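As a rough illustration of the point above, the sketch below selects the same rows once with subset() and once with square brackets. The small data frame is a made-up stand-in for the one in the question (only a few columns and rows are included), so treat it as an assumption rather than the real data.

```r
# Hypothetical stand-in for the data frame in the question
data <- data.frame(
  date   = c("2012 01", "2012 02", "2012 01", "2012 02"),
  w      = c(21, 32, 12, 87),
  region = c("NORTH", "NORTH", "SOUTH", "SOUTH"),
  stringsAsFactors = FALSE
)

# Rows for one date, using subset()
by_subset <- subset(data, date == "2012 01")

# The same rows, using base R bracket notation (note the trailing comma)
by_bracket <- data[data$date == "2012 01", ]

print(by_subset)
print(by_bracket)
```

Both calls keep every column and only the rows whose date matches.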
Loop through each unique date and build the subset:
# column names are case-sensitive, so use data$date, not data$Date
uniq <- unique(data$date)
for (i in seq_along(uniq)) {
  data_1 <- subset(data, date == uniq[i])
  # apply your desired function to data_1 here
}
Is this what you want? split() returns a list of data frames, one for each distinct date:
df_list <- split(data, as.factor(data$date))
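One way to carry on from split() is to pass the resulting list to lapply(), which calls a function once per date. This is only a minimal sketch: the data frame is a made-up stand-in for the one in the question, and my_fun is a hypothetical placeholder for whatever function you actually want to apply.

```r
# Hypothetical stand-in for the data frame in the question
data <- data.frame(
  date   = c("2012 01", "2012 02", "2012 01", "2012 02"),
  w      = c(21, 32, 12, 87),
  x      = c(43, 54, 47, 43),
  region = c("NORTH", "NORTH", "SOUTH", "SOUTH"),
  stringsAsFactors = FALSE
)

# Placeholder for "your desired function": total of w within one subset
my_fun <- function(d) sum(d$w)

# One data frame per distinct date, named by the date value
df_list <- split(data, data$date)

# Apply the function to each subset; the result is a list named by date
results <- lapply(df_list, my_fun)

results[["2012 01"]]  # 33
results[["2012 02"]]  # 119
```

Because split() builds the groups from whatever values are present in data$date, the same code should work unchanged if the dates run from 2011 03 to 2013 01 instead.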