I want to calculate the average clustering coefficient of a graph using the igraph
package. However, I am not sure which approach I should follow.
library(igraph)
graph <- erdos.renyi.game(10000, 10000, type = "gnm")
# Global clustering coefficient
transitivity(graph)
# Average clustering coefficient
transitivity(graph, type = "average")
# The same as above
mean(transitivity(graph, type = "local"), na.rm = TRUE)
I would be grateful for some guidance.
The clustering coefficient for the graph is the average, $C = \frac{1}{n} \sum_{v \in G} c_v$, where $n$ is the number of nodes in $G$.
Specifically, the clustering coefficient is a measure of the density of a 1.5-degree egocentric network. When these connections are dense, the clustering coefficient is high. If your “friends” (alters) all know each other, you have a high clustering coefficient.
The optimal solution of adding two links to the landscape-based networks increases the clustering coefficient by 0.05 on average.
transitivity(graph) computes the global clustering coefficient (transitivity):
This is simply the ratio of the triangles and the connected triples in the graph. For directed graph the direction of the edges is ignored.
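As a rough check of that definition, the sketch below recomputes the global coefficient by hand on a small toy graph (a triangle with one pendant vertex, chosen here purely for illustration and not taken from the question). It assumes that closed triples can be counted as the sum of the per-vertex triangle counts, and that the triples centered on a vertex of degree k number choose(k, 2).

```r
library(igraph)

# Toy graph (an assumption for illustration): triangle 1-2-3 plus pendant edge 3-4
g <- make_graph(c(1, 2,  2, 3,  1, 3,  3, 4), directed = FALSE)

tri_per_vertex <- count_triangles(g)   # triangles each vertex takes part in
k <- degree(g)                         # vertex degrees
triples_per_vertex <- choose(k, 2)     # connected triples centered on each vertex

# Each triangle is counted once per corner, so the sum gives the closed triples
manual_global <- sum(tri_per_vertex) / sum(triples_per_vertex)

manual_global                          # 3 / 5 = 0.6
transitivity(g, type = "global")       # should agree with the manual value
```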
Meanwhile, transitivity(graph, type = "average"), being an average of transitivity(graph, type = "local"), first computes the local clustering coefficients and then averages them:
The local transitivity of an undirected graph, this is calculated for each vertex given in the vids argument. The local transitivity of a vertex is the ratio of the triangles connected to the vertex and the triples centered on the vertex. For directed graph the direction of the edges is ignored.
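To make the local definition concrete, here is a minimal sketch on a toy graph (a triangle with one pendant vertex, an assumption made for illustration). It computes the per-vertex ratio by hand and compares the average with igraph's own result. Note that the degree-1 vertex has no triples, so its local value is NaN and it is left out of the average under the default isolates = "NaN" setting.

```r
library(igraph)

# Toy graph (an assumption for illustration): triangle 1-2-3 plus pendant edge 3-4
g <- make_graph(c(1, 2,  2, 3,  1, 3,  3, 4), directed = FALSE)

# Local coefficient by hand: triangles at the vertex / triples centered on it
manual_local <- count_triangles(g) / choose(degree(g), 2)
manual_local                               # 1, 1, 1/3, NaN (vertex 4 has degree 1)

local_cc <- transitivity(g, type = "local")

# Averaging while skipping the undefined (NaN) vertices
manual_avg <- mean(local_cc, na.rm = TRUE) # (1 + 1 + 1/3) / 3 = 7/9
igraph_avg <- transitivity(g, type = "average")
```

On this graph the average (7/9) is noticeably larger than the global value (0.6), because the two degree-2 vertices count as much as the degree-3 one.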
See, e.g., ?transitivity and the Wikipedia article "Clustering coefficient".
Firstly, both are valid measures, and the choice should depend on your purpose. The difference between them is stated clearly (see the Wikipedia page):
It is worth noting that this metric places more weight on the low degree nodes, while the transitivity ratio places more weight on the high degree nodes. In fact, a weighted average where each local clustering score is weighted by k_i(k_i-1) is identical to the global clustering coefficient
where k_i is the number of neighbours of vertex i. Hence, reporting both of them would be reasonable too.
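The weighting claim can be checked numerically. The sketch below uses a small toy graph (an assumption for illustration, not the graph from the question) and sets isolates = "zero" so that the degree-1 vertex gets a local score of 0 instead of NaN; its weight k_i(k_i-1) is 0 anyway, so this does not change the weighted result.

```r
library(igraph)

# Toy graph (an assumption for illustration): triangle 1-2-3 plus pendant edge 3-4
g <- make_graph(c(1, 2,  2, 3,  1, 3,  3, 4), directed = FALSE)

k <- degree(g)                                   # k_i: number of neighbours
local_cc <- transitivity(g, type = "local", isolates = "zero")

# Weight each local score by k_i * (k_i - 1)
w <- k * (k - 1)
weighted_cc <- weighted.mean(local_cc, w)        # (2 + 2 + 2 + 0) / 10 = 0.6

global_cc <- transitivity(g, type = "global")    # should equal weighted_cc
```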