I'm having trouble getting the chart's y-axis to display the way I'd like. What I want is for the point size in each separate plot to depend on that plot's own score. See the image below for what I've got and what I want.
Basically, I want each plot to be scaled to its own index, i.e. Silhouette, Davies-Bouldin, etc., just as the first graph (Carlinski-Harabasz, on the left) shows.
This is the data and the code so far:
library(ggplot2)
algorithms <- as.factor(c(rep("kmeans", 4), rep("pam", 4), rep("cmeans", 4)))
index <- as.factor(c(rep("Silhouette", 12), rep("Carlinski-Harabasz", 12),
                     rep("Davies-Bouldin", 12)))
score <- c(0.12, 0.08, 0.07, 0.05, 0.1, 0.07, 0.09, 0.08, 0.13, 0.11, 0.1, 0.1,
           142, 106, 84, 74, 128, 99, 91, 81, 156, 123, 105, 95,
           2.23, 2.31, 2.25, 2.13, 2.55, 2.12, 2.23, 2.08, 2.23, 2.12, 2.17, 1.97)
clusters <- as.factor(rep(3:6, 9))
# algorithms has 12 values; data.frame() recycles it to match the 36 scores
indices <- data.frame(algorithms, index, score, clusters)
# Some ggplot code
ggplot(indices, aes(clusters, score)) +
  geom_point(aes(size = score, color = algorithms)) +
  facet_grid(. ~ index, scales = "free_y")
# I thought the scales argument in facet_grid might do the trick...
To my understanding I have to work around the Y-axis scale. However, this proves to be quite tricky for me.
ggplot(indices, aes(clusters, score)) +
  geom_point(aes(size = score, color = algorithms)) +
  facet_wrap(~ index, scales = "free_y")
Switching from facet_grid to facet_wrap did the trick. Thanks for pointing it out.
In addition, thanks to camille, a better visualization is to use facet_grid with two variables. The final code is therefore:
ggplot(indices, aes(clusters, score)) +
  geom_point() +
  facet_grid(index ~ algorithms, scales = "free_y") +
  theme_bw() +
  labs(y = "Score per index", x = "Clusters")
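A note on the size aesthetic from the earlier attempts: scales = "free_y" only frees the position axes, so size = score is still mapped on a single scale shared by all facets, and the large Carlinski-Harabasz values dominate it. Below is a minimal sketch of one workaround, assuming a within-index min-max rescaling is acceptable. rel_score is a hypothetical column name introduced here; it is not part of the original code.

```r
library(ggplot2)

# Same data as in the question, rebuilt so this block runs on its own.
indices <- data.frame(
  algorithms = factor(rep(rep(c("kmeans", "pam", "cmeans"), each = 4), times = 3)),
  index = factor(rep(c("Silhouette", "Carlinski-Harabasz", "Davies-Bouldin"), each = 12)),
  score = c(0.12, 0.08, 0.07, 0.05, 0.1, 0.07, 0.09, 0.08, 0.13, 0.11, 0.1, 0.1,
            142, 106, 84, 74, 128, 99, 91, 81, 156, 123, 105, 95,
            2.23, 2.31, 2.25, 2.13, 2.55, 2.12, 2.23, 2.08, 2.23, 2.12, 2.17, 1.97),
  clusters = factor(rep(3:6, 9))
)

# rel_score (hypothetical helper column): score rescaled to [0, 1] within each index.
indices$rel_score <- ave(indices$score, indices$index,
                         FUN = function(x) (x - min(x)) / (max(x) - min(x)))

# Position still uses the raw score on a free y-axis; size uses the rescaled value.
p <- ggplot(indices, aes(clusters, score)) +
  geom_point(aes(size = rel_score, color = algorithms)) +
  facet_wrap(~ index, scales = "free_y", nrow = 1)
p
```

The size legend then reads as a relative position within each index (0 for the lowest score in that facet, 1 for the highest) rather than as a raw score.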
I've had this problem, and realized the scales have slightly different interpretations: in facet_grid, the scales are free to change per row / column of facets, whereas with facet_wrap, the scales are free to change per facet, since there isn't a hard & fast meaning given to the rows or columns. Think of it like grid does macro-level scaling and wrap does micro-level.
One advantage that facet_grid has is quickly putting all values of one variable in a row or column together, making it easy to see what's going on. But you can mimic that in facet_wrap by setting the facets up on a single row or column, as I did below with nrow = 1.
library(tidyverse)
algorithms <- as.factor(c(rep("kmeans", 4), rep("pam", 4), rep("cmeans", 4)))
index <- as.factor(c(rep("Silhouette", 12), rep("Carlinski-Harabasz", 12),
                     rep("Davies-Bouldin", 12)))
score <- c(0.12, 0.08, 0.07, 0.05, 0.1, 0.07, 0.09, 0.08, 0.13, 0.11, 0.1, 0.1,
           142, 106, 84, 74, 128, 99, 91, 81, 156, 123, 105, 95,
           2.23, 2.31, 2.25, 2.13, 2.55, 2.12, 2.23, 2.08, 2.23, 2.12, 2.17, 1.97)
clusters <- as.factor(rep(3:6, 9))
indices <- data.frame(algorithms, index, score, clusters)
ggplot(indices, aes(clusters, score)) +
  geom_point(aes(size = score, color = algorithms)) +
  facet_grid(. ~ index, scales = "free_y")
ggplot(indices, aes(clusters, score)) +
  geom_point(aes(size = score, color = algorithms)) +
  facet_wrap(~ index, scales = "free_y", nrow = 1)
The difference is clearer when you're using facet_grid with two variables. Using the mpg dataset from ggplot2, this first plot doesn't have free scales, so each row's y-axis has tick marks between 10 and 35. That is, the y-axes of all rows of facets are fixed to the same range. With facet_wrap, this scaling would take place for each facet.
ggplot(mpg, aes(x = hwy, y = cty)) +
  geom_point() +
  facet_grid(class ~ fl)
Setting scales = "free_y" in facet_grid means that each row of facets can set its y-axis independent of the other rows. So, for example, all facets of compact cars are subject to one y-scale, but they're unaffected by the y-scale of pickups.
ggplot(mpg, aes(x = hwy, y = cty)) +
  geom_point() +
  facet_grid(class ~ fl, scales = "free_y")
Created on 2018-08-03 by the reprex package (v0.2.0).
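To check the per-row versus per-facet behaviour without eyeballing the plots, the facet layout can be inspected. This is only a sketch: it relies on the SCALE_Y column of the layout table returned by ggplot_build(), which is an internal detail of ggplot2 and may differ between versions. The names base, grid_layout and wrap_layout are introduced here for illustration.

```r
library(ggplot2)

base <- ggplot(mpg, aes(x = hwy, y = cty)) + geom_point()

# Layout tables: one row per panel; SCALE_Y identifies the y scale that panel uses.
# (Assumption: this internal structure is present in the installed ggplot2 version.)
grid_layout <- ggplot_build(base + facet_grid(class ~ fl, scales = "free_y"))$layout$layout
wrap_layout <- ggplot_build(base + facet_wrap(~ class + fl, scales = "free_y"))$layout$layout

# facet_grid: number of distinct y scales, expected to match the number of rows (classes)
length(unique(grid_layout$SCALE_Y))
length(unique(mpg$class))

# facet_wrap: number of distinct y scales, expected to match the number of facets
length(unique(wrap_layout$SCALE_Y))
nrow(wrap_layout)
```

If the two pairs of numbers agree, that matches the description above: facet_grid shares one y scale across each row, while facet_wrap gives every facet its own.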