I'm trying to graph the frequency of observations over time. I have a dataset where hundreds of laws are coded 0-3. I'd like to know if outcomes 2-3 are occurring more often as time progresses. Here is a sample of mock data:
Data <- data.frame(
year = sample(1998:2004, 200, replace = TRUE),
score = sample(1:4, 200, replace = TRUE)
)
If I plot
plot(Data$year, Data$score)
I get a checkered matrix where every single spot is filled in, but I can't tell which numbers occur more often. Is there a way to color or to change the size of each point by the number of observations of a given row/year?
A few notes may help in answering the question:
1). I don't know how to sample data so that certain numbers occur more frequently than others. My sampling procedure draws equally from all numbers. If there is a better way to create reproducible data with more observations in later years, I would like to know how.
2). This seemed like it would be best to visualize in a scatter plot, but I could be wrong. I'm open to other visualizations.
Thanks!
The coplot() function plots two variables but each plot is conditioned ( | ) by a third variable. This third variable can be either numeric or a factor.
In a histogram, each bar or column represents the frequency of occurrence of values of a continuous quantitative variable. The basic advantage of a histogram is that it displays graphically a large amount of data that would be difficult to interpret in tabular form.
Unlike a bar or line graph, a pie graph is used when there is only one variable and is best for comparing parts of a whole. The sum of the pieces always equals 100 percent, and the visual conveys a relative value or frequency.
Here's how I would approach this (hope this is what you need)
Create the data (note: when using sample() in questions, always use set.seed() too so it will be reproducible)
set.seed(123)
Data <- data.frame(
year = sample(1998:2004, 200, replace = TRUE),
score = sample(1:4, 200, replace = TRUE)
)
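Regarding note 1 in the question: sample() takes a prob argument with one weight per element, so mock data can be skewed toward later years or particular scores. A minimal sketch; the weights and the names Skewed and year_weights are made up for illustration:

```r
set.seed(123)

years <- 1998:2004
# Hypothetical weights: each later year is more likely than the one before
year_weights <- seq_along(years)

Skewed <- data.frame(
  year  = sample(years, 200, replace = TRUE, prob = year_weights),
  # Hypothetical weights: scores 3 and 4 are drawn twice as often as 1 and 2
  score = sample(1:4, 200, replace = TRUE, prob = c(1, 1, 2, 2))
)

# Counts per year should tend to rise toward 2004
table(Skewed$year)
```

The weights do not need to sum to 1; sample() treats them as relative weights.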
Find frequencies of score per year using table()
Data2 <- as.data.frame.matrix(table(Data))
Data2$year <- row.names(Data2)
Use melt() to convert it back to long format
library(reshape2)
Data2 <- melt(Data2, "year")
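As a possible shortcut, calling as.data.frame() on the table itself already returns long format, with one row per year/score combination and a Freq column, so the reshape2 step is optional. A sketch, with Counts as an illustrative name:

```r
set.seed(123)
Data <- data.frame(
  year = sample(1998:2004, 200, replace = TRUE),
  score = sample(1:4, 200, replace = TRUE)
)

# One row per year/score pair; year and score come back as factors
Counts <- as.data.frame(table(Data))
head(Counts)
```

With this layout, the plotting calls would map score and Freq where the melted data uses variable and value.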
Plot the data, showing a different color per group and a point size relative to the frequency
library(ggplot2)
ggplot(Data2, aes(year, variable, size = value, color = variable)) +
geom_point()
Alternatively, you could use both fill and size to describe frequency, something like
ggplot(Data2, aes(year, variable, size = value, fill = value)) +
geom_point(shape = 21)
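Since the question started from base plot(), the same idea can be sketched there as well by scaling cex with the count. The name Counts and the scaling factor of 3 are arbitrary choices for illustration:

```r
set.seed(123)
Data <- data.frame(
  year = sample(1998:2004, 200, replace = TRUE),
  score = sample(1:4, 200, replace = TRUE)
)

# Long-format counts; keep year and score as plain integers for plotting
Counts <- as.data.frame(table(Data), stringsAsFactors = FALSE)
Counts$year  <- as.integer(Counts$year)
Counts$score <- as.integer(Counts$score)

# Point area (not radius) tracks the count, hence the square root
plot(Counts$year, Counts$score,
     cex = 3 * sqrt(Counts$Freq / max(Counts$Freq)),
     pch = 19, xlab = "year", ylab = "score", yaxt = "n")
axis(2, at = 1:4)
```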
Here's another approach:
ggplot(Data, aes(year)) + geom_histogram(aes(fill = after_stat(count))) + facet_wrap(~ score)
Each facet represents one "score" value, as noted in the title of each facet. You can easily get a feeling for the counts by looking at the height of the bars and the colour (lighter blue indicating more counts).
Of course you could also do this only for score %in% 2:3, if you don't want scores 1 and 4 included. In such a case, you could do:
ggplot(Data[Data$score %in% 2:3,], aes(year)) +
geom_histogram(aes(fill = after_stat(count))) + facet_wrap(~ score)
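To get at the original question more directly (are scores 2-3 becoming more common over time?), one option is to plot the yearly share of those scores rather than raw counts, which removes the effect of some years simply having more observations. A sketch in base R, with share as an illustrative name:

```r
set.seed(123)
Data <- data.frame(
  year = sample(1998:2004, 200, replace = TRUE),
  score = sample(1:4, 200, replace = TRUE)
)

# Proportion of observations in each year that have score 2 or 3
share <- tapply(Data$score %in% 2:3, Data$year, mean)

plot(as.integer(names(share)), as.numeric(share), type = "b",
     xlab = "year", ylab = "share of scores 2-3", ylim = c(0, 1))
```

With the uniformly sampled mock data the line should hover around one half; a rising line on the real data would suggest that outcomes 2-3 are becoming more frequent.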