
R: Plotting each column against each column

I have a data frame ("data") with 7 columns (2 factor, 5 numeric). The first column contains the names of 7 different countries, and the following columns hold data for different parameters (population, GDP, etc.) characterizing each country. The last column is a factor variable indicating which continent each country belongs to.

The data looks like this:

structure(list(Country = structure(c(5L, 4L, 7L, 2L, 1L, 6L, 
3L), .Label = c("Brazil", "Chile", "China", "France", "Germany", 
"India", "Netherlands"), class = "factor"), GDP = c(0.46, 0.57, 
0.75, 0.56, 0.28, 0.88, 1), Population = c(0.18, 0.09, 0.54, 
0.01, 0.02, 0.17, 0.84), Birth.rate = c(87.21, 18.34, 63.91, 
14.21, 5.38, 51.19, 209.26), Income = c(43.89, 18.23, 63.91, 
12.3, 0.1, 14.61, 160.82), Savings = c(43.32, 0.11, 0, 1.91, 
5.29, 36.58, 50.38), Continent = structure(c(2L, 2L, 2L, 3L, 
3L, 1L, 1L), .Label = c("Asia", "Europe", "South America"), class = "factor")), .Names = c("Country", 
"GDP", "Population", "Birth.rate", "Income", "Savings", "Continent"
), class = "data.frame", row.names = c(NA, -7L))

I need some sort of loop that plots (e.g. as a scatter plot) every column against every other column, excluding the first and the last, i.e. the two factor variables. Each pair should get its own chart rather than all plots sharing one figure. Preferably, all these plots would be saved to a folder on my local machine.

It would also be great if the x and y axes were labeled with the names of the two columns being plotted. Moreover, it would be convenient to have a label next to each point showing the respective country name. Lastly, it would be nice to color the points by continent, using three different colors for the three continents.

So far I only have a piece of code that goes like

for (i in seq(1,length(data),1)) {
   plot(data[,i], ylab=names(data[i]), xlab="Country", 
   text(i, labels=Country, pos=4, cex =.5)) 
} 

As you can see it only plots each column against the first column ("Country") which is not what I want in the end.

Do you have any idea how I could achieve this? Thank you!

Jonathan Rhein asked Apr 12 '16 19:04



1 Answer

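You can use pairs() from base R, which draws a scatterplot matrix of every column against every other column in a single figure. Note that dt represents your dataset (its definition is given below). Factor columns such as Country and Continent are plotted as their integer codes.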

pairs(dt)

[Image: scatterplot matrix produced by pairs(dt)]
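
If only the numeric columns are of interest, the same pairs() call can be restricted to them and colored by continent. The following is a minimal sketch, not part of the original answer: the data frame is rebuilt from the values in the question, and the names num_cols, continent_palette and point_cols are made up for illustration.

```r
# Rebuild the example data (values copied from the dput() output in the question).
dt <- data.frame(
  Country    = factor(c("Germany", "France", "Netherlands", "Chile",
                        "Brazil", "India", "China")),
  GDP        = c(0.46, 0.57, 0.75, 0.56, 0.28, 0.88, 1),
  Population = c(0.18, 0.09, 0.54, 0.01, 0.02, 0.17, 0.84),
  Birth.rate = c(87.21, 18.34, 63.91, 14.21, 5.38, 51.19, 209.26),
  Income     = c(43.89, 18.23, 63.91, 12.3, 0.1, 14.61, 160.82),
  Savings    = c(43.32, 0.11, 0, 1.91, 5.29, 36.58, 50.38),
  Continent  = factor(c("Europe", "Europe", "Europe", "South America",
                        "South America", "Asia", "Asia"))
)

# Keep only the numeric columns (drops Country and Continent).
num_cols <- names(dt)[sapply(dt, is.numeric)]

# One color per continent level (arbitrary choice of colors).
# Indexing by a factor uses its integer codes: Asia, Europe, South America.
continent_palette <- c("red", "blue", "darkgreen")
point_cols <- continent_palette[dt$Continent]

# Scatterplot matrix of the numeric columns, points colored by continent.
pairs(dt[num_cols], col = point_cols, pch = 19)
```

This still draws everything in one figure, so it does not cover the request for one chart per pair.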

dt <- structure(list(Country = structure(c(5L, 4L, 7L, 2L, 1L, 6L, 
3L), .Label = c("Brazil", "Chile", "China", "France", "Germany", 
"India", "Netherlands"), class = "factor"), GDP = c(0.46, 0.57, 
0.75, 0.56, 0.28, 0.88, 1), Population = c(0.18, 0.09, 0.54, 
0.01, 0.02, 0.17, 0.84), Birth.rate = c(87.21, 18.34, 63.91, 
14.21, 5.38, 51.19, 209.26), Income = c(43.89, 18.23, 63.91, 
12.3, 0.1, 14.61, 160.82), Savings = c(43.32, 0.11, 0, 1.91, 
5.29, 36.58, 50.38), Continent = structure(c(2L, 2L, 2L, 3L, 
3L, 1L, 1L), .Label = c("Asia", "Europe", "South America"), class = "factor")), .Names = c("Country", 
"GDP", "Population", "Birth.rate", "Income", "Savings", "Continent"
), class = "data.frame", row.names = c(NA, -7L))
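
pairs() puts all panels in one figure. For one chart per pair of columns, saved to disk with axis labels, country labels and continent colors as the question asks, a loop over the column pairs is one option. The sketch below is an assumption-laden illustration rather than the answerer's method: it writes PDF files via pdf() (png() could be swapped in), uses a hypothetical output folder under tempdir(), and the names out_dir, col_pairs and saved are invented here. combn() yields each unordered pair once, so 5 numeric columns give 10 plots.

```r
# Rebuild the example data (values copied from the dput() output in the question).
dt <- data.frame(
  Country    = factor(c("Germany", "France", "Netherlands", "Chile",
                        "Brazil", "India", "China")),
  GDP        = c(0.46, 0.57, 0.75, 0.56, 0.28, 0.88, 1),
  Population = c(0.18, 0.09, 0.54, 0.01, 0.02, 0.17, 0.84),
  Birth.rate = c(87.21, 18.34, 63.91, 14.21, 5.38, 51.19, 209.26),
  Income     = c(43.89, 18.23, 63.91, 12.3, 0.1, 14.61, 160.82),
  Savings    = c(43.32, 0.11, 0, 1.91, 5.29, 36.58, 50.38),
  Continent  = factor(c("Europe", "Europe", "Europe", "South America",
                        "South America", "Asia", "Asia"))
)

num_cols <- names(dt)[sapply(dt, is.numeric)]       # the five numeric columns
continent_palette <- c("red", "blue", "darkgreen")  # arbitrary colors, one per level
point_cols <- continent_palette[dt$Continent]

# Hypothetical output folder; replace with a folder of your choice.
out_dir <- file.path(tempdir(), "pair_plots")
dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)

# Every unordered pair of numeric columns, as a list of length-2 character vectors.
col_pairs <- combn(num_cols, 2, simplify = FALSE)

# Draw one scatter plot per pair and save each to its own file.
saved <- vapply(col_pairs, function(p) {
  x <- p[1]
  y <- p[2]
  path <- file.path(out_dir, sprintf("%s_vs_%s.pdf", y, x))
  pdf(path, width = 7, height = 5)
  plot(dt[[x]], dt[[y]], xlab = x, ylab = y,
       col = point_cols, pch = 19, main = paste(y, "vs", x))
  # Country name next to each point.
  text(dt[[x]], dt[[y]], labels = as.character(dt$Country), pos = 4, cex = 0.7)
  legend("topleft", legend = levels(dt$Continent),
         col = continent_palette, pch = 19)
  dev.off()
  path
}, character(1))
```

Afterwards, saved holds the paths of the written files. If both orientations of each pair are wanted (x against y and y against x), the pairs could instead be built with expand.grid() and filtered to drop identical columns.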
Worice answered Sep 18 '22 16:09