
Apache Spark GraphX connected components

How can I use the subgraph function to get a graph that includes only the vertices and edges of a specific connected component? Say I know the connected component ID; the goal is to create a new graph from that component while keeping the vertex attributes of the original graph.

Oleg Baydakov asked May 25 '15 21:05


2 Answers

You have to join the graph with the component IDs to the original graph, filter (take the subgraph) by the component ID, and then discard the component ID.

import scala.reflect.ClassTag
import org.apache.spark.graphx._
import org.apache.spark.graphx.lib.ConnectedComponents

def getComponent[VD: ClassTag, ED: ClassTag](
    g: Graph[VD, ED], component: VertexId): Graph[VD, ED] = {
  // Label every vertex with its component ID
  // (GraphX uses the lowest vertex ID in the component).
  val cc: Graph[VertexId, ED] = ConnectedComponents.run(g)
  // Join component ID to the original graph.
  val joined = g.outerJoinVertices(cc.vertices) {
    (vid, vd, compId) => (vd, compId)
  }
  // Filter by component ID.
  val filtered = joined.subgraph(vpred = {
    (vid, vdcc) => vdcc._2 == Some(component)
  })
  // Discard component IDs, restoring the original vertex attributes.
  filtered.mapVertices {
    (vid, vdcc) => vdcc._1
  }
}
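The component ID passed to a function like the one above is the label GraphX assigns, which is the lowest vertex ID in the component. If you only know some vertex in the component, a minimal sketch for looking up its component ID could look like this. The name componentOf is hypothetical, and it recomputes the connected components on every call, so treat it as an illustration, not a tuned implementation.

```scala
import scala.reflect.ClassTag
import org.apache.spark.graphx._

// Hypothetical helper: returns the component ID of vertex v,
// or None if v is not a vertex of the graph.
def componentOf[VD: ClassTag, ED: ClassTag](
    g: Graph[VD, ED], v: VertexId): Option[VertexId] = {
  g.connectedComponents().vertices
    .filter { case (vid, _) => vid == v } // keep only the vertex of interest
    .map { case (_, compId) => compId }   // extract its component label
    .collect()
    .headOption
}
```

The result can then be passed on, for example componentOf(g, someVertexId).map(id => getComponent(g, id)), where someVertexId is a placeholder for a vertex you already know.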
Daniel Darabos answered Sep 18 '22 19:09


I take your question to be: given a VertexId in a source graph, create a new graph containing the nodes and edges connected to this VertexId in the source graph. The code below keeps the target vertex and its direct neighbors.

Given that, here's what I would do:

val targetVertexId = ...
val graph = Graph(..., ...)
val newGraph = Graph(
  graph.vertices.filter{case (vid,attr) => vid == targetVertexId} ++
  graph.collectNeighbors(EdgeDirection.Either)
    .filter{ case (vid,arr) => vid == targetVertexId}
    .flatMap{ case (vid,arr) => arr},
  graph.edges
).subgraph(vpred = { case (vid,attr) => attr != null})

A couple of things to note:

You can change EdgeDirection.Either to EdgeDirection.In or EdgeDirection.Out as needed.

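The .subgraph at the end removes all vertices whose attribute is null. These are the vertices that appear in graph.edges but not in the filtered vertex set, which Graph(...) creates with a null default attribute. If the original graph has vertices whose attributes are null, this will not work, because those vertices would be removed as well. Otherwise it works without having to know the vertex attribute type in advance.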
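If null attributes are a concern, one possible variation is to filter by vertex ID instead of relying on a null sentinel. The sketch below is not the code from this answer: the name neighborhood is hypothetical, and it assumes the neighbor ID set of the target vertex is small enough to collect on the driver.

```scala
import scala.reflect.ClassTag
import org.apache.spark.graphx._

// Hypothetical helper: keeps the target vertex, its direct neighbors,
// and the edges among them, preserving the original attributes.
def neighborhood[VD: ClassTag, ED: ClassTag](
    graph: Graph[VD, ED], target: VertexId): Graph[VD, ED] = {
  // Collect the IDs of the target's direct neighbors on the driver.
  val neighborIds: Set[VertexId] =
    graph.collectNeighborIds(EdgeDirection.Either)
      .filter { case (vid, _) => vid == target }
      .flatMap { case (_, ids) => ids }
      .collect()
      .toSet
  val keep = neighborIds + target
  // Keep only the selected vertices; subgraph drops edges that lose an endpoint.
  graph.subgraph(vpred = (vid, _) => keep.contains(vid))
}
```

As in the original snippet, EdgeDirection.Either can be swapped for EdgeDirection.In or EdgeDirection.Out.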

David Griffin answered Sep 16 '22 19:09