
Do I need to reshape this wide data to effectively use ggplot2?

Tags: r, ggplot2

I have a data.frame that looks like

  Year Crustaceans       Cod       Tuna    Herring Scorpion.fishes
1 1950    58578630   2716706   69690537   87161396        15250015
2 1951    59194582   3861166   34829755   51215349        15454659
3 1952    47562941   4396174   31061481   13962479        12541484
4 1953    68432658   3901176   23225423   13229061         9524564
5 1954    64395489   4412721   20798126   25285539         9890656
6 1955    76111004   4774045   13992697   18910756         8446391

There are several more species (columns), and the years run from 1950 to 2006. I'd like to explore the data with ggplot2 (which I'm just learning). Do I need to transform this data so that species is a factor in order to use ggplot2 on it effectively? If not, how do I avoid having to create a layer for each species individually? If yes (or really in either case), a quick pointer on using reshape or plyr to turn column names into a factor would be much appreciated.

Gregor Thomas asked Oct 27 '11 00:10



2 Answers

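A simple transformation using melt (from the reshape or reshape2 package) would suffice. I would do

library(reshape2)  # provides melt()
library(ggplot2)   # provides qplot()
# melt() stacks every column except Year into variable/value pairs
qplot(Year, value, colour = variable, data = melt(df, id.vars = 'Year'), geom = 'line')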
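
To make the reshaping step easier to inspect, here is a minimal sketch of the same idea written with ggplot() rather than qplot(), which is deprecated in recent ggplot2 releases. The data frame df is rebuilt from the first three rows of the question with only three species, and the column name Species is my own choice, not something from the question.

```r
library(reshape2)
library(ggplot2)

# First three rows of the question's data, three species only (illustrative subset)
df <- data.frame(
  Year        = 1950:1952,
  Crustaceans = c(58578630, 59194582, 47562941),
  Cod         = c(2716706, 3861166, 4396174),
  Tuna        = c(69690537, 34829755, 31061481)
)

# Wide -> long: one row per (Year, Species) pair.
# The former column names become the levels of the factor "Species".
long <- melt(df, id.vars = "Year", variable.name = "Species")

# A single geom_line() layer covers every species; colour maps to the factor
p <- ggplot(long, aes(x = Year, y = value, colour = Species)) +
  geom_line()
print(p)
```

Running str(long) before plotting is a quick way to confirm that Species is a factor and that value holds the numbers.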
Ramnath answered Oct 09 '22 02:10


I found the following link extremely helpful for learning reshape. The reshape and plyr functions are very easy to use once you have the format of how they work down, though they are not necessarily the fastest (the data.table package is written partly in C, so it's much faster). This tutorial PDF is a great resource for learning it. I also suggest copying the lines from example(cast) into a script and running them one at a time to see the result.

http://had.co.nz/stat405/lectures/19-tables.pdf
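
As a small taste of the melt/cast idea, the sketch below melts a toy data frame and casts it back. It assumes reshape2, where the data-frame version of cast() is called dcast(); the toy data frame wide is a two-row, two-species subset of the question's data used only for illustration.

```r
library(reshape2)

# Toy wide data: one column per species (illustrative subset)
wide <- data.frame(
  Year = 1950:1951,
  Cod  = c(2716706, 3861166),
  Tuna = c(69690537, 34829755)
)

# melt(): wide -> long, giving columns Year, variable, value
long <- melt(wide, id.vars = "Year")

# dcast(): long -> wide again, one column per level of "variable"
back <- dcast(long, Year ~ variable, value.var = "value")

print(long)
print(back)
```

Stepping through the two calls one at a time, as suggested above for example(cast), shows how the column names move into the variable column and back out again.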

Tyler Rinker answered Oct 09 '22 01:10