Multiple frequency lines on same graph where y is a character value

I'm trying to create a frequency plot of the number of appearances of each graph type by year. I have played around with ggplot2 for a while, but I think this is over my head (I'm just getting started with R).

I attached a schematic of what I would like the result to look like. One of the other issues I'm running into is that there are many years in which some graph types don't appear. Is there a way to exclude a graph type if it does not appear that year?

E.g. in 1940 there is no "Sociogram", and I don't want to have a bunch of lines at 0...

year <- c("1940","1940","1940","1940","1940","1940","1940","1940","1940","1940","1940","1941","1941","1941","1941","1941","1941","1941","1941","1941","1941","1941","1941","1941","1941")
type <- c("Line","Column", "Stacked Column", "Scatter with line", "Scatter with line", "Scatter with line", "Scatter with line", "Map with distribution","Line","Line","Line","Bar","Bar","Stacked bar","Column","Column","Sociogram","Sociogram","Column","Column","Column","Line","Line","Line","Line")
ytmatrix <- cbind(as.Date(as.character(year), "%Y"), type)

Please let me know if something doesn't make sense. StackOverflow is quickly becoming one of my favorite sites!

Thanks, Jon


Here's a working version of what I have so far. Thank you again for all your help!

And here's how I did it. I can't share the data file yet, since we're hoping to use it for a publication, but the ggplot part is probably the more interesting bit anyway, even though I didn't really do anything that wasn't already discussed in the post:

AJS = read.csv(data) #read in file
Type = AJS[,17] #select and name "Type" column from csv
Year = AJS[,13] #select and name "Year" column from csv
Year = substr(Year,9,12) #get rid of junk from year column
Year = as.Date(Year, "%Y") #convert the year character to a date
Year = format(Year, "%Y") #get rid of the dummy month and day
Type = as.data.frame(Type) #create data frame
yt <- cbind(Year,Type) #bind the year and type together
library(ggplot2) 

trial <- ggplot(yt, aes(Year, ..count.., group = Type)) + #plot the data: aes(x-axis, y-axis, group the lines)
  geom_density(alpha = 0.25, aes(fill = Type)) +
  theme(axis.text.x = element_text(angle = 90, hjust = 0)) + #rotate the x axis tick labels to vertical
  ggtitle("Trends in the Use of Visualizations in The American Journal of Sociology") + #add title
  scale_y_continuous('Appearances (10 or more)') #change y-axis label
trial
crock1255 asked Oct 15 '11 16:10


1 Answer

This might be a more interesting dataframe to experiment with:

df1 <- data.frame(date = as.Date(10*365*rbeta(100, .5, .1), origin = "1970-01-01"), group = "a")
df2 <- data.frame(date = as.Date(10*365*rbeta(50, .1, .5), origin = "1970-01-01"), group = "b")
df3 <- data.frame(date = as.Date(10*365*rbeta(25, 3, 3), origin = "1970-01-01"), group = "c")
dfrm <- rbind(df1, df2, df3)

I thought working with an example in the help(stat_density) page would work, but it does not:

m <- ggplot(dfrm, aes(x=date), group=group)
m+ geom_histogram(aes(y=..density..)) + geom_density(fill=NA, colour="black")

However, a search of the archives turned up a posting by @Hadley Wickham with an example that does work:

m+ geom_density(aes(fill=group), colour="black")
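
If you would rather have plain frequency lines than density curves, here is a minimal sketch using the sample data from the question. It is only one possible approach, not something taken from the posting above: it counts each year/type combination with table(), drops the combinations that never occur (so there are no lines sitting at 0), and draws one line per type. The names counts and p are mine, and I am assuming the years can be treated as plain integers. Types that appear in only one year show up as single points, because a line needs at least two years.

```r
library(ggplot2)

# Sample data from the question, with years kept as plain integers
year <- c(rep(1940L, 11), rep(1941L, 14))
type <- c("Line", "Column", "Stacked Column", "Scatter with line",
          "Scatter with line", "Scatter with line", "Scatter with line",
          "Map with distribution", "Line", "Line", "Line",
          "Bar", "Bar", "Stacked bar", "Column", "Column",
          "Sociogram", "Sociogram", "Column", "Column", "Column",
          "Line", "Line", "Line", "Line")

# Count every year/type combination (this includes the zero cells)
counts <- as.data.frame(table(year = year, type = type),
                        stringsAsFactors = FALSE)

# Keep only the combinations that actually occur
counts <- counts[counts$Freq > 0, ]
counts$year <- as.integer(counts$year)

# One line per graph type; points keep single-year types visible
p <- ggplot(counts, aes(x = year, y = Freq, colour = type, group = type)) +
  geom_line() +
  geom_point() +
  scale_x_continuous(breaks = unique(counts$year)) +
  labs(x = "Year", y = "Appearances", colour = "Graph type")
print(p)
```

With the real data you would build counts from your Year and Type columns in the same way.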


IRTFM answered Oct 06 '22 01:10