Constructing a graph from streaming data using Spark Streaming

Question

I am new to Spark. I need to construct a co-occurrence graph from streaming data such as Twitter tweets: each word in a tweet becomes a node, and two words that appear in the same tweet are joined by an edge. Can Spark Streaming be used to build a live co-occurrence graph of tweets? Is Spark Streaming meant for this use case? I am not sure whether it can be done with Spark Streaming. If not, what are the alternatives?

jayprich · Accepted Answer

The co-occurrence frequency can be seen as a graph or an adjacency matrix, but it is really a large, sparse histogram (frequency count) over the product space of your word list. Most likely you want to detect moving-window correlation, so you should design a sketch data structure that tracks unusual increases or decreases in the rate of occurrence in the stream, e.g. a counting Bloom filter or a count-min sketch applied to every word pair. See http://twitter.github.io/algebird/#com.twitter.algebird.CMSCounting
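To make the word-pair idea concrete, here is a minimal sketch in plain Python. It is not Algebird's CMS and it does not use Spark. The names `word_pairs`, `CountMinSketch`, and `pair_key`, the whitespace tokenization, and the `width`/`depth` values are all assumptions made for illustration. It shows how each tweet expands into unordered word pairs (the graph edges) and how a count-min sketch keeps approximate pair counts in fixed memory.

```python
import hashlib
from itertools import combinations


def word_pairs(tweet):
    """Return the unordered word pairs (graph edges) found in one tweet."""
    # Assumption: naive lowercase + whitespace tokenization.
    words = sorted(set(tweet.lower().split()))
    return list(combinations(words, 2))


def pair_key(pair):
    """Turn a (word, word) pair into a single string key for hashing."""
    return "\t".join(pair)


class CountMinSketch:
    """Approximate counter: estimates never undercount but may overcount."""

    def __init__(self, width=2048, depth=4):
        self.width = width
        self.depth = depth
        self.rows = [[0] * width for _ in range(depth)]

    def _columns(self, key):
        # One salted hash per row, so each row uses a different hash function.
        for row in range(self.depth):
            digest = hashlib.blake2b(
                key.encode("utf-8"),
                digest_size=8,
                salt=row.to_bytes(2, "big"),
            ).digest()
            yield row, int.from_bytes(digest, "big") % self.width

    def add(self, key, count=1):
        for row, col in self._columns(key):
            self.rows[row][col] += count

    def estimate(self, key):
        # The minimum across rows is the count least inflated by collisions.
        return min(self.rows[row][col] for row, col in self._columns(key))


# Hypothetical sample input standing in for one batch of the stream.
tweets = ["spark streaming graph", "spark graph", "kafka streaming"]

sketch = CountMinSketch()
for tweet in tweets:
    for pair in word_pairs(tweet):
        sketch.add(pair_key(pair))
```

Calling `sketch.estimate(pair_key(("graph", "spark")))` returns an upper bound on how often those two words appeared together. One possible way to approximate a moving window, again as an assumption rather than something the answer prescribes, is to keep one sketch per time interval and discard the oldest as the window advances.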

Tags:

apache-spark

spark-streaming

Naren
