I'm trying to create a heatmap out of the following data:
> head(myData.aggregated)
datetime value date time
1 2016-03-31 14:19:00 3 2016-03-31 2016-06-11 14:19:00
2 2016-03-31 14:49:00 69 2016-03-31 2016-06-11 14:49:00
3 2016-03-31 15:49:00 5 2016-03-31 2016-06-11 15:49:00
4 2016-03-31 16:19:00 7 2016-03-31 2016-06-11 16:19:00
5 2016-03-31 17:49:00 2 2016-03-31 2016-06-11 17:49:00
6 2016-03-31 18:19:00 7 2016-03-31 2016-06-11 18:19:00
> tail(myData.aggregated)
datetime value date time
90 2016-04-06 13:19:00 1 2016-04-06 2016-06-11 13:19:00
91 2016-04-06 13:49:00 25 2016-04-06 2016-06-11 13:49:00
92 2016-04-06 14:19:00 7 2016-04-06 2016-06-11 14:19:00
93 2016-04-06 14:49:00 1 2016-04-06 2016-06-11 14:49:00
94 2016-04-06 22:19:00 3 2016-04-06 2016-06-11 22:19:00
95 2016-04-06 22:49:00 14 2016-04-06 2016-06-11 22:49:00
And the following ggplot2 commands.
ggplot(myData.aggregated, aes(x = time, y = date, fill = scale(value))) + geom_tile() + coord_equal()
As soon as I add coord_equal() the result is a blank graph. Can someone explain to me why this is happening and how I can fix it? My goal is to get a heatmap with square tiles for each 30-minute interval.
Update 1:
> dput(head(myData.aggregated))
structure(list(datetime = structure(c(1459426740, 1459428540,
1459432140, 1459433940, 1459439340, 1459441140), class = c("POSIXct",
"POSIXt"), tzone = ""), value = c(3L, 69L, 5L, 7L, 2L, 7L), date = structure(c(16891,
16891, 16891, 16891, 16891, 16891), class = "Date"), time = structure(c(1465647540,
1465649340, 1465652940, 1465654740, 1465660140, 1465661940), class = c("POSIXct",
"POSIXt"), tzone = "")), .Names = c("datetime", "value", "date",
"time"), row.names = c(NA, 6L), class = "data.frame")
TL;DR: The y-axis spans six units and the x-axis spans tens of thousands of units. When you add coord_equal, the y-axis gets squashed to roughly 1/10,000th the physical length of the x-axis, effectively making the plot area disappear. The date column (y-axis) happens to be in days and the time column (x-axis) in seconds, but both are treated as unitless numbers by ggplot. You can denominate the y-axis in seconds also, but that will still give you a plot with an undesirable aspect ratio of at least 6:1. See below for code and additional detail.
Here's what's happening: date is in Date format and is therefore denominated in days, with a range of 6 days. time is in POSIXct format, which is denominated in seconds, with a range (since we're only interested in the time of day, regardless of date) of tens of thousands of seconds (up to a maximum of 86,400 seconds, or the length of one day).
The underlying values of Date and POSIXct formats are just numeric values with, respectively, Date and POSIXct classes attached. As a result, when you add coord_equal, one unit on the y-axis takes up the same physical distance as one unit on the x-axis, because ggplot (apparently) calculates coord_equal based on the numeric magnitudes of the values, without regard to their date-time class. But the entire y-axis spans 6 units while the x-axis spans tens of thousands of units. Thus, when you require coord_equal, the y:x aspect ratio gets squashed to on the order of 1:10,000 or so, making the plot disappear for all practical purposes.
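To make the mismatch concrete, here is a small base-R sketch of the numeric ranges ggplot sees. The two time-of-day values are illustrative picks from the rows printed in the question, and tz = "UTC" is an assumption made only so the result is reproducible.

```r
# Dates from the first and last rows of the question's data
d <- as.Date(c("2016-03-31", "2016-04-06"))
# Two illustrative time-of-day values (tz = "UTC" is an assumption)
tm <- as.POSIXct(c("2016-06-11 13:19:00", "2016-06-11 22:49:00"), tz = "UTC")

y_span <- diff(as.numeric(d))    # Date is stored as days since 1970-01-01
x_span <- diff(as.numeric(tm))   # POSIXct is stored as seconds since 1970-01-01

y_span            # 6 (days)
x_span            # 34200 (seconds, i.e. 9.5 hours)
y_span / x_span   # roughly 1/5700: about the y:x aspect ratio coord_equal imposes
```

With times spread over a full day the x span approaches 86,400, which pushes the ratio toward the 1:10,000 order of magnitude mentioned above.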
You can denominate both the x and y axes in seconds, but even then the y-axis (6 days) will span at least six times the range of the x-axis (one day at most), resulting in a y:x aspect ratio of at least 6:1 with coord_equal, which is better than 1:10,000, but still not very practical.
Here's an example with fake data:
library(ggplot2)

# Fake data
set.seed(4959)
dat = data.frame(datetime=seq(as.POSIXct("2016-03-31"), as.POSIXct("2016-04-06"), by="hour"))
dat$value = sample(1:50, nrow(dat), replace=TRUE)

# Both axes denominated in seconds
ggplot(dat,
       aes(x = as.POSIXct(as.numeric(datetime) %% 86400,
                          tz="UTC", origin=as.Date("2016-01-01")),
           y = as.POSIXct(as.Date(datetime)),
           fill = scale(value))) +
  geom_tile() +
  labs(y="Date", x="Time") +
  scale_x_datetime(date_labels="%H:%M") +   # %M is minutes; %m would print the month
  coord_equal()
In the code above, to create the y values we first convert to Date format, which eliminates the time of day, and then convert back to POSIXct, which converts the unit to seconds, but with time equal to midnight on that day for all datetime values on a given date.
To create the x values, we just want time of day in seconds after midnight, so we calculate the remainder of the numeric value of datetime after division by 86400 (number of seconds in a day). The tz="UTC" is necessary to get the hours right and origin (which can be any date; we just want the time of day) is necessary to get the function to run without an error.
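As a quick check of the modulo step, here is a base-R sketch on a single timestamp from the question. The tz = "UTC" on the input is an assumption: the remainder is counted from midnight UTC, so for timestamps stored in another time zone the result is the UTC time of day rather than the local one.

```r
# One timestamp from the question's data, interpreted as UTC (an assumption)
x <- as.POSIXct("2016-03-31 14:19:00", tz = "UTC")

# Seconds since midnight UTC: 14 * 3600 + 19 * 60
secs <- as.numeric(x) %% 86400
secs   # 51540

# Re-attach a dummy date so ggplot can treat the value as a datetime
tod <- as.POSIXct(secs, tz = "UTC", origin = as.Date("2016-01-01"))
format(tod, "%Y-%m-%d %H:%M")   # "2016-01-01 14:19"
```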
Below is what the plot looks like with and without coord_equal. Note that with coord_equal the x-axis, which spans one day of time (from midnight to midnight), has the same length as one day on the y-axis. That's because we denominated both the y and x values in seconds. However, as long as the y-axis spans several days and the x-axis spans only one day, coord_equal will result in an undesirable aspect ratio.
Below is a demonstration of how the y-axis gets squashed relative to the x-axis if the y values are denominated in days rather than seconds and coord_equal is specified:
# y in days (Date), x in seconds (POSIXct)
ggplot(dat,
       aes(x = as.POSIXct(as.numeric(datetime) %% 86400,
                          tz="UTC", origin=as.Date("2016-01-01")),
           y = as.Date(datetime),
           fill = scale(value))) +
  geom_tile() +
  labs(y="Date", x="Time") +
  scale_x_datetime(date_labels="%H:%M") +   # %M is minutes; %m would print the month
  coord_equal()
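For the square 30-minute tiles the question asks for, one possible approach (a sketch, not something tested against the original data) is to leave y in days and x in seconds and tell ggplot how many x units should match one y unit, using coord_fixed(ratio = ...). Each tile is 1 day tall and 1800 seconds wide, so ratio = 1800 should draw it square. The fake data, the tz = "UTC" setting, and the names tile_secs and tod are assumptions introduced for this example.

```r
library(ggplot2)

# Fake half-hourly data over 7 days (tz = "UTC" is an assumption)
set.seed(4959)
dat <- data.frame(datetime = seq(as.POSIXct("2016-03-31 00:00:00", tz = "UTC"),
                                 as.POSIXct("2016-04-06 23:30:00", tz = "UTC"),
                                 by = "30 min"))
dat$value <- sample(1:50, nrow(dat), replace = TRUE)

# y: calendar date, in days; x: time of day, in seconds on a dummy date
dat$date <- as.Date(dat$datetime)
dat$tod  <- as.POSIXct(as.numeric(dat$datetime) %% 86400,
                       tz = "UTC", origin = as.Date("2016-01-01"))

tile_secs <- 1800   # width of one tile on the x-axis: 30 minutes in seconds

p <- ggplot(dat, aes(x = tod, y = date, fill = value)) +
  geom_tile() +
  labs(y = "Date", x = "Time") +
  scale_x_datetime(date_labels = "%H:%M") +
  # one y unit (1 day) is drawn as long as tile_secs x units (1800 seconds)
  coord_fixed(ratio = tile_secs)

p
```

With 7 dates and 48 half-hour slots per day, the panel comes out about 7 tiles tall by 48 tiles wide instead of collapsing.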