Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Textrank: complementing pagerank for sentence extraction using networkx

I am trying to implement textrank algorithm for sentence extraction as described here. For that in need to complement pagerank algorithm with weighted edges and get it to run on undirected graphs. Networkx pagerank algorithm implementation allows me to easely integrate weighted edges and is said to convert directed graphs to undirected: see here. However, when I tested it still seems to use directed graph. What am I missing here? Help greatly appriciated.

Example:

import networkx as nx
D=nx.DiGraph()
D.add_weighted_edges_from([('A','B',0.5),('A','C',1)])
print nx.pagerank(D)

Outpunt: {'A': 0.25974025929223499, 'C': 0.40692640737443164, 'B': 0.33333333333333331}

like image 981
root Avatar asked Feb 12 '12 08:02

root


1 Answers

I think you misinterpreted the note on the networkx documentation. Though, I must admit it might be worded better.

The PageRank algorithm was designed for directed graphs but this algorithm does not check if the input graph is directed and will execute on undirected graphs by converting each oriented edge in the directed graph to two edges.

What this tells is that, PageRank algorithm is designed for directed graphs, but it can be used for undirected graphs. To do so, it converts the undirected network to a directed network by replacing each edge with two directed edges (in and out).

Therefore, if you give it a directed network, it will calculate the PageRank according to the directed structure. So either start with an undirected network:

import networkx as nx

# Undirected Network
D = nx.Graph()
D.add_weighted_edges_from([('A', 'B', 0.5),('A', 'C', 1)])

# Default max number of iterations failed to converge for me
print nx.pagerank(D, max_iter=200)

# Outputs:
{'A': 0.48648648872844047, 'C': 0.32567567418103965, 'B': 0.18783783709051982}

or if you already have a directed network, convert it to an undirected one:

import networkx as nx

# Directed Network
D = nx.DiGraph()
D.add_weighted_edges_from([('A', 'B', 0.5), ('A', 'C', 1)])

# Convert to undirected
G = D.to_undirected()

# Default max number of iterations failed to converge for me
print nx.pagerank(G, max_iter=200)

# Outputs:
{'A': 0.48648648872844047, 'C': 0.32567567418103965, 'B': 0.18783783709051982}
like image 154
Avaris Avatar answered Nov 19 '22 05:11

Avaris