In ggplot2's built-in mpg dataset there is a variable called "fl", which is a factor with levels "c", "d", "e", "p", and "r".
Does anyone know what those letters are supposed to stand for? Needless to say, googling those letters has yet to give me any relevant leads...
library(ggplot2)
data(mpg)
str(mpg)
?mpg
[Note: There was a similar question on SO re: the mtcars dataset, which gave me the impression that this would be an appropriate forum for this sort of question.]
drv: the type of drive train, where f = front-wheel drive, r = rear wheel drive, 4 = 4wd
cty: city miles per gallon
fl appears to be the fuel type, with e most likely ethanol:
subset(mpg, fl=="e")
pulls up only "new" American cars, and their fuel economy is much lower than that of the corresponding (presumably gasoline) models, which lines up with the lower energy content of ethanol.
Note: I have no reason to know this other than an educated guess based on the rest of the data, but here is some graphical evidence:
ggplot(mpg, aes(x=fl, y=hwy)) + geom_boxplot() + facet_wrap(~cyl, nrow=1)
Notice how e is consistently low, d is consistently high, at least where there is more than 1 data point (diesel has higher energy content), and p is consistently higher than r (premium allows cars to run at higher compression ratios and efficiency, though premium actually has lower energy content than regular) for each cylinder category (facets are the number of cylinders).
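For anyone who prefers numbers to boxplots, the same comparison can be sketched as a table of median highway mileage per fuel code within each cylinder count. This is only a rough check under the same guesswork as above; the names med, n, and fl_summary are made up for illustration, and the count column is there because some fuel codes have very few cars.

```r
library(ggplot2)  # provides the mpg dataset

# Median highway mpg for each fuel code within each cylinder count
med <- aggregate(hwy ~ fl + cyl, data = mpg, FUN = median)
names(med)[3] <- "median_hwy"

# Number of cars behind each median, so single-car groups are easy to spot
n <- aggregate(hwy ~ fl + cyl, data = mpg, FUN = length)
names(n)[3] <- "n"

# Join the two summaries on their shared columns (fl and cyl)
fl_summary <- merge(med, n)
print(fl_summary)
```

Rows with n equal to 1 should be read with caution, for the same reason as the single-point boxes in the plot.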
UPDATE: as per @naught101, this now appears to be documented.