I have a CSV file named "table_parameter" (please download it from here). The data look like this:
time avg.PM10 sill range nugget
1 2012030101 52.2692307692308 0.11054330 45574.072 0.0372612157
2 2012030102 55.3142857142857 0.20250974 87306.391 0.0483153769
3 2012030103 56.0380952380952 0.17711558 56806.827 0.0349567088
4 2012030104 55.9047619047619 0.16466350 104767.669 0.0307528346
.
.
.
25 2012030201 67.1047619047619 0.14349774 72755.326 0.0300378129
26 2012030202 71.6571428571429 0.11373430 72755.326 0.0320594776
27 2012030203 73.352380952381 0.13893530 72755.326 0.0311135434
28 2012030204 70.2095238095238 0.12642303 29594.037 0.0281416079
.
.
My data frame has a variable named time that contains hourly timestamps from 1 March 2012 to 7 March 2012 in numeric form. For example, 1 March 2012, 1:00 a.m. is written as 2012030101, and so on.
From this dataset I want to create 24*11 subset data frames, as in the table below:

For example, for 1 a.m. (2012030101, 2012030201, ..., 2012030701) and avg.PM10 <= 10, I want one data frame. You will probably find that some of these data frames contain no observations. That is okay, because I will eventually work with a very large dataset.
I can do this subsetting manually by writing 24*11 = 264 lines of code like this:
table_par<-read.csv("table_parameter.csv")
times<-as.numeric(substr(table_par$time,9,10))
par_1am_0to10 <-subset(table_par,times ==1 & avg.PM10<=10)
par_1am_10to20 <-subset(table_par,times ==1 & avg.PM10>10 & avg.PM10<=20)
par_1am_20to30 <-subset(table_par,times ==1 & avg.PM10>20 & avg.PM10<=30)
.
.
.
par_24pm_80to90 <-subset(table_par,times ==24 & avg.PM10>80 & avg.PM10<=90)
par_24pm_90to100 <-subset(table_par,times==24 & avg.PM10>90 & avg.PM10<=100)
par_24pm_100up <-subset(table_par,times ==24 & avg.PM10>100)
But I understand that this code is very inefficient. Is there a way to do it efficiently, for example with a loop?
FYI: Later on, I want to draw some plots from these 24*11 datasets.
Update: After this subsetting, I want to draw boxplots of the range variable for every dataset. The problem is that I want to show all 24*11 boxplots of range [like the figure above] in one plot, laid out like a matrix. If you have any further questions, please let me know. Thanks a lot in advance.
You can do this with a bit of plyr, dplyr and tidyr magic:
library(tidyr)
library(dplyr)
# plyr is not loaded here because it conflicts with dplyr; only plyr::round_any is needed
# Read data
dfData <- read.csv("table_parameter.csv")
dfData %>%
  # Extract the hour and round avg.PM10 down to the nearest 10 using round_any
  # (column name avg.PM10 as shown in the question's data)
  mutate(hour = as.numeric(substr(time, 9, 10)),
         roundedPM.10 = plyr::round_any(avg.PM10, 10, floor),
         # Collapse everything above 100 into a single "100 and up" bin
         roundedPM.10 = ifelse(roundedPM.10 > 100, 100, roundedPM.10)) %>%
  # Keep only the relevant columns
  select(hour, roundedPM.10) %>%
  # Count the number of occurrences per PM10 bin and hour
  count(roundedPM.10, hour) %>%
  # Use spread (from tidyr) to transform it into wide format
  spread(hour, n)
If you plan on using ggplot2, you can skip tidyr and the last line of the code to keep the data frame in long format, which is easier to plot.
EDIT: After reading your comment, I realised I had misunderstood your question. This will give you a boxplot for each pair of hour and avg.PM10 interval:
library(tidyr)
library(dplyr)
library(ggplot2)
# plyr is not loaded here because it conflicts with dplyr; only
# plyr::round_any is needed
# Read data
dfData <- read.csv("table_parameter.csv")
dfDataPlot <- dfData %>%
  # Extract the hour and round avg.PM10 down to the nearest 10 using round_any
  # (column name avg.PM10 as shown in the question's data)
  mutate(hour = as.numeric(substr(time, 9, 10)),
         roundedPM.10 = plyr::round_any(avg.PM10, 10, floor),
         # Collapse everything above 100 into a single "100 and up" bin
         roundedPM.10 = ifelse(roundedPM.10 > 100, 100, roundedPM.10)) %>%
  # Keep only the relevant columns
  select(roundedPM.10, hour, range)
# Plot range as a function of hour (as a factor, to get one box per hour)
# and facet by roundedPM.10 along the y axis
ggplot(dfDataPlot, aes(factor(hour), range)) +
  geom_boxplot() +
  facet_grid(roundedPM.10 ~ .)
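If you still want the 24*11 separate data frames themselves, a base R sketch along these lines might work. It is only an illustration under a few assumptions: the small table_par below is a hypothetical stand-in for the CSV, built from the rows shown in the question, and the names hour, pm_bin and subsets are made up for this example. Note that cut() with right = TRUE reproduces the question's "> lower & <= upper" intervals, whereas the floor-based rounding above produces left-closed bins, so values lying exactly on a multiple of 10 land in different bins.

```r
# Hypothetical stand-in for table_parameter.csv (rows copied from the question)
table_par <- data.frame(
  time = c(2012030101, 2012030102, 2012030103, 2012030104,
           2012030201, 2012030202, 2012030203, 2012030204),
  avg.PM10 = c(52.2692307692308, 55.3142857142857, 56.0380952380952,
               55.9047619047619, 67.1047619047619, 71.6571428571429,
               73.352380952381, 70.2095238095238),
  range = c(45574.072, 87306.391, 56806.827, 104767.669,
            72755.326, 72755.326, 72755.326, 29594.037)
)

# Hour of day = last two digits of the timestamp; fixing the levels to 1..24
# keeps hours that have no observations
table_par$hour <- factor(table_par$time %% 100, levels = 1:24)

# Eleven right-closed bins: (-Inf,10], (10,20], ..., (90,100], (100,Inf)
breaks <- c(-Inf, seq(10, 100, by = 10), Inf)
labels <- c("0to10",
            paste0(seq(10, 90, by = 10), "to", seq(20, 100, by = 10)),
            "100up")
table_par$pm_bin <- cut(table_par$avg.PM10, breaks = breaks,
                        labels = labels, right = TRUE)

# One data frame per (hour, bin) pair; drop = FALSE keeps the empty ones
subsets <- split(table_par, list(table_par$hour, table_par$pm_bin),
                 drop = FALSE)

length(subsets)        # 264, i.e. 24 hours * 11 bins
subsets[["1.50to60"]]  # 1 a.m. rows with 50 < avg.PM10 <= 60
```

The list elements are named "<hour>.<bin>", so subsets[["24.100up"]] plays the role of par_24pm_100up in the question. Keeping everything in one named list, rather than 264 separate variables, also makes it easy to loop over the subsets with lapply().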