Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Generate a scale-free network with a power-law degree distributions

I'm trying to generate couples of scale-free networks having:

  • degree distributions following power laws with the same exponent
  • the exact same number of nodes.

I need to build at least 60 of those couples and run a simulation for each.

In order to do this, I need a way to generate a network of exactly n nodes with the above properties.

Right now, I'm able to generate a graph with a degree distribution following a power laws with a given exponent, using the NetworkX Python library, with this code

import networkx as nx
import matplotlib.pyplot as plt

#create a graph with degrees following a power law distribution
s = nx.utils.powerlaw_sequence(100, 2.5) #100 nodes, power-law exponent 2.5
G = nx.expected_degree_graph(s, selfloops=False)

print(G.nodes())
print(G.edges())

#draw and show graph
pos = nx.spring_layout(G)
nx.draw_networkx(G, pos)
plt.show()

However, this generates a graph with many isolated nodes, and usually more than a single connected component.

I can throw away the isolated nodes, but then my final network will have less nodes than I intended, and it may not be a single network. It may be 2 or more separate connected components.

like image 696
Agostino Avatar asked Mar 07 '15 22:03

Agostino


2 Answers

First a question - is there a reason you don't want isolated nodes or multiple connected components? In principle, a true "random" power-law graph will have these.

So a few comments:

1) If you use the expected_degree_graph, you're going to have a very hard time eliminating isolated nodes. This is because there are many nodes with an expected degree of around 1 (but the actual degree is from a Poisson distribution). Which means that there is a good chance they will have a degree less than 1. To show yourself that, print s.

2) An alternative option is to make the network using a configuration model graph instead. For this, we take numbers from the powerlaw sequence and take the integer part. Throw out the ones that are 0. Then use nx.configuration_model to create the graph. You can then remove self-loops and repeated edges. However, you should be careful - the high degree nodes may have many self-loops or repeated edges. So you need to be careful that the tail of the powerlaw isn't being cut down too fast. This will not solve the problem of having multiple components. You'll generally have one very large component and a few very small isolated components. Unless you have a good reason to throw these out, throwing out cases like this will bias your sample.

import networkx as nx
import matplotlib.pyplot as plt

#create a graph with degrees following a power law distribution

#I believe we can eliminate this loop to find s by using the call   
#nx.utils.create_degree_sequence(100,powerlaw_sequence) with 
#appropriate modification
while True:  
    s=[]
    while len(s)<100:
        nextval = int(nx.utils.powerlaw_sequence(1, 2.5)[0]) #100 nodes, power-law exponent 2.5
        if nextval!=0:
            s.append(nextval)
    if sum(s)%2 == 0:
        break
G = nx.configuration_model(s)
G=nx.Graph(G) # remove parallel edges
G.remove_edges_from(G.selfloop_edges())

#draw and show graph
pos = nx.spring_layout(G)
nx.draw_networkx(G, pos)
plt.savefig('test.pdf')

PS: I came up with the algorithm for the expected_degree_graph implementation in networkx (not the model, but the algorithm to implement it). If you get free time, read over it. It's cool.

like image 80
Joel Avatar answered Oct 24 '22 02:10

Joel


When I encountered a similar problem to yours, I went the chicken way out and simply used the BA generator with varying number of links for new nodes being added, e.g., 1,..., 5. This guarantees a single connected component and no parallel edges.

Since you want fixed power-law exponents, I suggest the following: Typically, in cascading failures two different types of topologies are studied. If that is the case for you, generate a "scale-free" network from the largest connected component (LCC) as shown in my ipython notebook

import networkx as nx
from networkx.utils import (powerlaw_sequence, create_degree_sequence)
sequence = create_degree_sequence(num, powerlaw_sequence, exponent=exp)
graph = nx.configuration_model(sequence, seed=seed)
loops = graph.selfloop_edges()
# remove parallel edges and self-loops
graph = nx.Graph(graph)
graph.remove_edges_from(loops)
# get largest connected component
# unfortunately, the iterator over the components is not guaranteed to be sorted by size
components = sorted(nx.connected_components(graph), key=len, reverse=True)
lcc = graph.subgraph(components[0])

and take the number of nodes to generate an ER graph as the connected topology since the number of nodes in the LCC is much more reliable above a certain edge probability.

As you can see in the linked degree distributions, the topology of the LCC is still what I would consider "scale-free". When you consider networks of several thousand nodes it should not be a problem that your 60 networks have not exactly the same number of nodes as long as your two connected networks have the same number.

If you want to connect two "scale-free" networks, I don't see how to do that other than deleting random nodes from the larger of the two LCCs until you arrive at the same number.

Let us know how you solve it.

like image 22
Midnighter Avatar answered Oct 24 '22 04:10

Midnighter