Given some graph, I would like to determine how likely it is that it was generated randomly. I was told that a comparison to the Erdős–Rényi model was a good way to get this information, but I can't quite figure out how to do that.
Any advice?
“A random graph model is given by a sequence of graph-valued random variables, one for each possible value of n: $M = (G_n ; n \in \mathbb{N})$” [53]. “In general, a random graph is a model network in which a specific set of parameters take fixed values, but the network is random in other respects” [100].
A random graph is simple to define. One takes some number N of nodes or “vertices” and places connections or “edges” between them, such that each pair of vertices i, j has a connecting edge with independent probability p. We show an example of such a random graph in Fig.
The Erdős–Rényi random network (ER random network) is a nice, tractable network model that reduces the large dimension of random networks to a small number of parameters. In the traditional ER random network model, every possible edge is present independently with the same probability p.
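To make the definition concrete, here is a minimal sketch of sampling one such graph using only the Python standard library. The function name `sample_er_graph` and the parameter values are assumptions chosen for illustration; they do not come from any of the quoted sources.

```python
import itertools
import random


def sample_er_graph(n, p, seed=None):
    """Return the edge set of one G(n, p) sample on vertices 0..n-1."""
    rng = random.Random(seed)
    # Visit every unordered pair once and keep it with independent probability p.
    return {
        (i, j)
        for i, j in itertools.combinations(range(n), 2)
        if rng.random() < p
    }


# Illustrative values only: 50 vertices, edge probability 0.1.
edges = sample_er_graph(n=50, p=0.1, seed=0)
expected_edges = 0.1 * 50 * 49 / 2  # p times the number of vertex pairs
print(len(edges), expected_edges)
```

The loop is quadratic in n, which is fine for small graphs; a graph library would be a better choice for large ones.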
The simplest way would probably be to compare the number of links you observe in the given graph with the number expected under the model: for n nodes and edge probability p, that is p·n(n−1)/2. A slightly smarter method would be to examine the degree distribution. Erdős–Rényi graphs have a binomial degree distribution, while many real-world networks have heavy-tailed degree distributions that are often described as power laws.
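As a rough sketch of the degree-distribution check, the code below fits p as the observed fraction of possible links and compares the observed degree counts with the Binomial(n − 1, p) counts the model predicts. The helper names `binomial_pmf` and `compare_with_er` and the star-graph example are assumptions for illustration. Note that when p is estimated from the graph itself, the edge count matches the model by construction, so the edge-count comparison is only informative if you have an independent value of p.

```python
import math
from collections import Counter


def binomial_pmf(k, trials, p):
    """P(X = k) for X ~ Binomial(trials, p)."""
    return math.comb(trials, k) * p**k * (1 - p) ** (trials - k)


def compare_with_er(n, edges):
    """Compare a simple undirected graph on vertices 0..n-1 with a fitted G(n, p)."""
    pairs = n * (n - 1) // 2
    p_hat = len(edges) / pairs  # fraction of possible links that are present

    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    observed = Counter(degree[v] for v in range(n))  # degree -> number of vertices

    # Under G(n, p) each vertex degree is Binomial(n - 1, p).
    expected = {k: n * binomial_pmf(k, n - 1, p_hat) for k in range(n)}

    mean = sum(k * c for k, c in observed.items()) / n
    variance = sum(c * (k - mean) ** 2 for k, c in observed.items()) / n
    return {
        "p_hat": p_hat,
        "observed_counts": dict(observed),
        "expected_counts": expected,
        "degree_variance": variance,
        "binomial_variance": (n - 1) * p_hat * (1 - p_hat),
    }


# Hypothetical example: a star on 6 vertices, which is far from Erdős–Rényi.
star = [(0, v) for v in range(1, 6)]
result = compare_with_er(6, star)
print(result["degree_variance"], result["binomial_variance"])
```

A degree variance well above the binomial variance suggests more hubs than the model would produce. For a formal test you could feed the observed and expected counts into a chi-squared test, after pooling sparse degree bins.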
It might also be easier to test if you had an idea as to what other kinds of models were being used to generate the graph.
You can have a look at the ERGM package for R (www.r-project.org), available at www.statnet.org. Although you will not be able to say with 100% certainty that your observed network was produced by a random process, you will be able to assess how plausible it is that it was produced by random rather than non-random partner selection. ERGM has a function called gof, which stands for goodness of fit. It compares your observed network with simulated random networks on network statistics such as the geodesic distance distribution, edgewise shared partner distribution, degree distribution, and triad census distribution. This will allow you to make an informed decision about whether to consider your network random.
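The simulate-and-compare idea can also be sketched by hand. The Python code below is not the ERGM implementation; it is a simplified Monte Carlo check, assuming a fitted Erdős–Rényi null model and using the triangle count as the only statistic. The function names and the two-clique example are made up for illustration.

```python
import itertools
import random


def count_triangles(n, edges):
    """Count triangles in a simple undirected graph on vertices 0..n-1."""
    adj = [set() for _ in range(n)]
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    # Common neighbours of an edge's endpoints close a triangle on that edge;
    # every triangle is seen once per edge, hence the division by 3.
    return sum(len(adj[u] & adj[v]) for u, v in edges) // 3


def er_monte_carlo_pvalue(n, edges, n_sim=200, seed=None):
    """Upper-tail Monte Carlo p-value for the triangle count under fitted G(n, p)."""
    rng = random.Random(seed)
    pairs = list(itertools.combinations(range(n), 2))
    p_hat = len(edges) / len(pairs)
    observed = count_triangles(n, edges)
    at_least = 0
    for _ in range(n_sim):
        simulated = [pair for pair in pairs if rng.random() < p_hat]
        if count_triangles(n, simulated) >= observed:
            at_least += 1
    # Add-one correction so the estimate is never exactly zero.
    return (at_least + 1) / (n_sim + 1)


# Hypothetical example: two disjoint 5-cliques, a strongly clustered graph.
cliques = list(itertools.combinations(range(5), 2)) + list(
    itertools.combinations(range(5, 10), 2)
)
print(er_monte_carlo_pvalue(10, cliques, n_sim=500, seed=1))
```

A small p-value means the observed graph has more triangles than the fitted random model usually produces. This is a one-sided test on a single statistic; comparing several statistics gives a fuller picture.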