Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

ggplot2 line chart gives "geom_path: Each group consist of only one observation. Do you need to adjust the group aesthetic?"

Tags:

r

ggplot2

You only have to add group = 1 into the ggplot or geom_line aes().

For line graphs, the data points must be grouped so that it knows which points to connect. In this case, it is simple -- all points should be connected, so group=1. When more variables are used and multiple lines are drawn, the grouping for lines is usually done by variable.

Reference: Cookbook for R, Chapter: Graphs Bar_and_line_graphs_(ggplot2), Line graphs.

Try this:

plot5 <- ggplot(df, aes(year, pollution, group = 1)) +
         geom_point() +
         geom_line() +
         labs(x = "Year", y = "Particulate matter emissions (tons)", 
              title = "Motor vehicle emissions in Baltimore")

You get this error because one of your variables is actually a factor variable . Execute

str(df) 

to check this. Then do this double variable change to keep the year numbers instead of transforming into "1,2,3,4" level numbers:

df$year <- as.numeric(as.character(df$year))

EDIT: it appears that your data.frame has a variable of class "array" which might cause the pb. Try then:

df <- data.frame(apply(df, 2, unclass))

and plot again?


I had similar problem with the data frame:

group time weight.loss
1 Control  wl1    4.500000
2    Diet  wl1    5.333333
3  DietEx  wl1    6.200000
4 Control  wl2    3.333333
5    Diet  wl2    3.916667
6  DietEx  wl2    6.100000
7 Control  wl3    2.083333
8    Diet  wl3    2.250000
9  DietEx  wl3    2.200000

I think the variable for x axis should be numeric, so that geom_line knows how to connect the points to draw the line.

after I change the 2nd column to numeric:

 group time weight.loss
1 Control    1    4.500000
2    Diet    1    5.333333
3  DietEx    1    6.200000
4 Control    2    3.333333
5    Diet    2    3.916667
6  DietEx    2    6.100000
7 Control    3    2.083333
8    Diet    3    2.250000
9  DietEx    3    2.200000

then it works.


Start up R in a fresh session and paste this in:

library(ggplot2)

df <- structure(list(year = c(1, 2, 3, 4), pollution = structure(c(346.82, 
134.308821199349, 130.430379885892, 88.275457392443), .Dim = 4L, .Dimnames = list(
    c("1999", "2002", "2005", "2008")))), .Names = c("year", 
"pollution"), row.names = c(NA, -4L), class = "data.frame")

df[] <- lapply(df, as.numeric) # make all columns numeric

ggplot(df, aes(year, pollution)) +
           geom_point() +
           geom_line() +
           labs(x = "Year", 
                y = "Particulate matter emissions (tons)", 
                title = "Motor vehicle emissions in Baltimore")

I got a similar prompt. It was because I had specified the x-axis in terms of some percentage (for example: 10%A, 20%B,....). So an alternate approach could be that you multiply these values and write them in the simplest form.