in Networkx, how can I cluster nodes based on nodes color? E.g., I have 100 nodes, some of them are close to black, while others are close to white. In the graph layout, I want nodes with similar color stay close to each other, and nodes with very different color stay away from each other. How can I do that? Basically, how does the edge weight influence the layout of spring_layout? If NetworkX cannot do that, is there any other tools can help to calculate the layout?
Thanks
Algorithms to characterize the number of triangles in a graph. triangles (G[, nodes]) Compute the number of triangles.
Compute the clustering coefficient for nodes. For unweighted graphs, the clustering of a node u is the fraction of possible triangles through that node that exist, cu=2T(u)deg(u)(deg(u)−1), where T(u) is the number of triangles through node u and deg(u) is the degree of u.
The global clustering coefficient is the number of closed triplets (or 3 x triangles) over the total number of triplets (both open and closed): CC = 3 × number of triangles number of triplets = number of closed triplets number of triplets . 〈C〉 = E[C]=1/3 for the above graph.
To check connectivity of a graph, we will try to traverse all nodes using any traversal algorithm. After completing the traversal, if there is any node, which is not visited, then the graph is not connected. For the directed graph, we will start traversing from all nodes to check connectivity.
Ok, lets build us adjacency matrix W for that graph following the simple procedure: if both of adjacent vertexes i-th and j-th are of the same color then weight of the edge between them W_{i,j} is big number (which you will tune in your experiments later) and else it is some small number which you will figure out analogously.
Now, lets write Laplacian of the matrix as L = D - W, where D is a diagonal matrix with elements d_{i,i} equal to the sum of W i-th row.
Now, one can easily show that the value of fLf^T, where f is some arbitrary vector, is small if vertexes with huge adjustments weights are having close f values. You may think about it as of the way to set a coordinate system for graph with i-the vertex has f_i coordinate in 1D space.
Now, let's choose some number of such vectors f^k which give us representation of the graph as a set of points in some euclidean space in which, for example, k-means works: now you have i-th vertex of the initial graph having coordinates f^1_i, f^2_i, ... and also adjacent vectors of the same color on the initial graph will be close in this new coordinate space.
The question about how to choose vectors f is a simple one: just take couple of eigenvectors of matrix L as f which correspond to small but nonzero eigenvalues.
This is a well known method called spectral clustering.
Further reading: The Elements of Statistical Learning: Data Mining, Inference, and Prediction. by Trevor Hastie, Robert Tibshirani and Jerome Friedman
which is available for free from the authors page http://www-stat.stanford.edu/~tibs/ElemStatLearn/
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With