I'm trying to take a column that has specific values for each type of element for each type of gridNumber and dcast it so that it creates 3 separate columns from the element column. I'm not sure exactly how to do this.
dput:
df <- structure(list(date = structure(c(-25584, -25584, -25584, -25583,
-25583, -25583, -25582, -25582, -25582, -25581), class = "Date"),
year = c(1899, 1899, 1899, 1899, 1899, 1899, 1899, 1899,
1899, 1899), month = c(12, 12, 12, 12, 12, 12, 12, 12, 12,
12), day = c(15, 15, 15, 16, 16, 16, 17, 17, 17, 18), gridNumber = c(526228,
526228, 526228, 526228, 526228, 526228, 526229, 526229, 526229,
526229), element = c("PPT", "TMAX", "TMIN", "PPT", "TMAX",
"TMIN", "PPT", "TMAX", "TMIN", "PPT"), value = c(0, 43.4782,
21.7403, 0, 43.3297, 20.751, 0, 57.3625, 25.8157, 0.2105)), .Names = c("date",
"year", "month", "day", "gridNumber", "element", "value"), row.names = c(NA,
10L), class = "data.frame")
data.frame:
date year month day gridNumber element value
1 1899-12-15 1899 12 15 526228 PPT 0.0000
2 1899-12-15 1899 12 15 526228 TMAX 43.4782
3 1899-12-15 1899 12 15 526228 TMIN 21.7403
4 1899-12-16 1899 12 16 526228 PPT 0.0000
5 1899-12-16 1899 12 16 526228 TMAX 43.3297
6 1899-12-16 1899 12 16 526228 TMIN 20.7510
7 1899-12-17 1899 12 17 526229 PPT 0.0000
8 1899-12-17 1899 12 17 526229 TMAX 57.3625
9 1899-12-17 1899 12 17 526229 TMIN 25.8157
10 1899-12-18 1899 12 18 526229 PPT 0.2105
dcast try:
newdat <- dcast(df, date ~ element)
Desired output columns:
date year month day gridNumber PPT TMAX TMIN value
This might not be exactly what you want, because your desired output still lists a separate value column. If value stays, what would go under PPT, TMAX and TMIN?
Here's how to put each value under the appropriate column with dplyr and tidyr:
library(dplyr)
library(tidyr)
df %>%
  spread(element, value)
date year month day gridNumber PPT TMAX TMIN
1 1899-12-15 1899 12 15 526228 0.0000 43.4782 21.7403
2 1899-12-16 1899 12 16 526228 0.0000 43.3297 20.7510
3 1899-12-17 1899 12 17 526229 0.0000 57.3625 25.8157
4 1899-12-18 1899 12 18 526229 0.2105 NA NA
This can be written in one line using tidyr only:
spread(df, element, value)
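In tidyr 1.0.0 and later, spread() is superseded by pivot_wider(). A minimal sketch of the same reshape, assuming a recent tidyr is installed; the data frame below is a trimmed, hand-typed copy of the question's data (four rows, fewer columns), used only for illustration:

```r
library(tidyr)

# Trimmed copy of the question's data: one complete date and one
# date that only has a PPT reading.
df <- data.frame(
  date = as.Date(c("1899-12-15", "1899-12-15", "1899-12-15", "1899-12-18")),
  gridNumber = c(526228, 526228, 526228, 526229),
  element = c("PPT", "TMAX", "TMIN", "PPT"),
  value = c(0, 43.4782, 21.7403, 0.2105)
)

# Column names come from `element`, cell contents from `value`.
# Every other column (date, gridNumber) identifies a row.
wide <- pivot_wider(df, names_from = element, values_from = value)
print(wide)
```

Combinations with no reading, such as TMAX and TMIN on 1899-12-18, are filled with NA, as with spread().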
We can use dcast. The ... on the lhs of ~ includes all variables that are not specified on the rhs or in value.var.
library(reshape2)
dcast(df, ...~element, value.var='value')
# date year month day gridNumber PPT TMAX TMIN
#1 1899-12-15 1899 12 15 526228 0.0000 43.4782 21.7403
#2 1899-12-16 1899 12 16 526228 0.0000 43.3297 20.7510
#3 1899-12-17 1899 12 17 526229 0.0000 57.3625 25.8157
#4 1899-12-18 1899 12 18 526229 0.2105 NA NA
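To make the role of ... concrete, here is a small sketch comparing it with a formula that spells out the id columns by hand. It assumes reshape2 is installed and uses a trimmed, hand-typed copy of the question's data rather than the full dput:

```r
library(reshape2)

# Trimmed copy of the question's data, for illustration only.
df <- data.frame(
  date = as.Date(c("1899-12-15", "1899-12-15", "1899-12-15", "1899-12-18")),
  gridNumber = c(526228, 526228, 526228, 526229),
  element = c("PPT", "TMAX", "TMIN", "PPT"),
  value = c(0, 43.4782, 21.7403, 0.2105)
)

# `...` stands for every column not named elsewhere in the call.
short_form <- dcast(df, ... ~ element, value.var = "value")

# The same reshape with the id columns written out explicitly.
long_form <- dcast(df, date + gridNumber ~ element, value.var = "value")

print(short_form)
```

With the full data, the explicit form would be date + year + month + day + gridNumber ~ element, which is why ... is the more convenient spelling here.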