
Average clustering coefficient of a network (igraph)

I want to calculate the average clustering coefficient of a graph using the igraph package, but I am not sure which approach to use.

library(igraph)
graph <- erdos.renyi.game(10000, 10000, type = "gnm")

# Global clustering coefficient
transitivity(graph)
# Average clustering coefficient
transitivity(graph, type = "average")
# The same as above
mean(transitivity(graph, type = "local"), na.rm = TRUE)

I would be grateful for some guidance.

abu asked Feb 18 '18 15:02



1 Answer

transitivity(graph) computes the global clustering coefficient (the transitivity). From the documentation:

This is simply the ratio of the triangles and the connected triples in the graph. For directed graph the direction of the edges is ignored.

Meanwhile, transitivity(graph, type = "average") is the mean of transitivity(graph, type = "local"): it first computes the local clustering coefficient of each vertex and then averages them. From the documentation for the local version:

The local transitivity of an undirected graph, this is calculated for each vertex given in the vids argument. The local transitivity of a vertex is the ratio of the triangles connected to the vertex and the triples centered on the vertex. For directed graph the direction of the edges is ignored.

See, e.g., ?transitivity and Clustering coefficient.
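To make the two definitions concrete, here is a minimal sketch on a hand-picked toy graph (my own example, not taken from the question): a triangle with one extra vertex hanging off it. By hand, the graph has 1 triangle and 5 connected triples, so the global coefficient is 3 * 1 / 5 = 0.6. The local coefficients are 1, 1 and 1/3 for the triangle vertices and undefined (NaN) for the degree-one vertex, so the mean of the defined local values is 7/9.

```r
library(igraph)

# Toy graph (illustrative): triangle 1-2-3 plus a pendant vertex 4 attached to 3
g <- make_graph(c(1, 2, 2, 3, 1, 3, 3, 4), directed = FALSE)

# Global coefficient: 3 * triangles / connected triples
global_cc <- transitivity(g, type = "global")

# Local coefficient per vertex; vertices with degree < 2 give NaN
local_cc <- transitivity(g, type = "local")

# Mean of the local coefficients, skipping the NaN entries
mean_local <- mean(local_cc, na.rm = TRUE)

# igraph's built-in average of the local coefficients
avg_cc <- transitivity(g, type = "average")

print(global_cc)
print(local_cc)
print(mean_local)
print(avg_cc)
```

The two measures already disagree on four vertices, because the high-degree vertex 3 dominates the triple count in the global ratio but counts as only one of three terms in the average.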

Both are valid measures, and the choice should depend on your purpose. The difference between them is explained on the Wikipedia page:

It is worth noting that this metric places more weight on the low degree nodes, while the transitivity ratio places more weight on the high degree nodes. In fact, a weighted average where each local clustering score is weighted by k_i(k_i-1) is identical to the global clustering coefficient

where k_i is the number of neighbours of vertex i. So reporting both of them would be reasonable too.
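The weighted-average identity quoted above can be checked numerically. This is a rough sketch, assuming a simple undirected graph; the graph size and seed are arbitrary choices of mine, and sample_gnm() is used here as the newer igraph name for the "gnm" generator in the question. Vertices with degree below 2 have weight 0 and an undefined local coefficient, so they are left out of the sum.

```r
library(igraph)

set.seed(42)                      # arbitrary seed, only for reproducibility
g <- sample_gnm(200, 1000)        # simple undirected G(n, m) random graph

local_cc <- transitivity(g, type = "local")
k <- degree(g)
w <- k * (k - 1)                  # weights k_i * (k_i - 1)

# Drop vertices with degree < 2: their weight is 0 and their local value is NaN
keep <- w > 0
weighted_cc <- sum(w[keep] * local_cc[keep]) / sum(w[keep])

global_cc <- transitivity(g, type = "global")

# Should be TRUE up to floating-point tolerance
print(all.equal(weighted_cc, global_cc))
```

The unweighted mean of local_cc will generally differ from global_cc on the same graph, which is the low-degree versus high-degree weighting the quote describes.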

Julius Vainora answered Sep 23 '22 03:09