Recommendations for using graphs theory in machine learning? [closed]

Tags:

I have been learning alot about using graphs for machine learning by watching Christopher Bishops videos( http://videolectures.net/mlss04_bishop_gmvm/ ). I find it very interesting and watched a few others in the same categories(machine learning/graph) but was wondering if anyone had any recommendations for ways of learning more?

My problem is, although the videos gave a great high level understanding, I don't have much practical skills in it yet. I've read Bishops book on machine learning/patterns as well as Norvig's AI book but both don't seem to touch upon specific using graphs much. With the emergence of search engines and social networking, I would think machine learning on graphs would be popular.

If possible, can anyone suggestion an a resource to learn from? (I'm new to this field and development is a hobby for me, so I'm sorry in advance if there's a super obvious resource to learn from..I tried google and university sites).

Thanks in advance!

969

asked Dec 12 '11 15:12

Lostsoul

2 Answers

First, i would strongly recommend the book Social Network Analysis for Startups by Maksim Tsvetovat and Alexander Kouznetsov. A book like this is a godsend for programmers who need to quickly acquire a basic fluency in a specific discipline (in this case, graph theory) so that they can begin writing code to solve problems in this domain. Both authors are academically trained graph theoreticians but the intended audience of their book is programmers. Nearly all of the numerous examples presented in the book are in python using the networkx library.

Second, for the projects you have in mind, two kinds of libraries are very helpful if not indispensible:

graph analysis: e.g., the excellent networkx (python), or igraph (python, R, et. al.) are two that i can recommend highly; and
graph rendering: the excellent graphViz, which can be used stand-alone from the command line but more likely you will want to use it as a library; there are graphViz bindings in all major languages (e.g., for python there are at least three i know of, though pygraphviz is my preference; for R there is rgraphviz which is part of the bioconductor package suite). Rgraphviz has excellent documentation (see in particular the Vignette included with the Package).

It is very easy to install and begin experimenting with these libraries and in particular using them

to learn the essential graph theoretic lexicon and units of analysis (e.g., degree sequence distribution, nodes traversal, graph operators);
to distinguish critical nodes in a graph (e.g., degree centrality, eigenvector centrality, assortivity); and
to identify prototype graph substructures (e.g., bipartite structure, triangles, cycles, cliques, clusters, communities, and cores).

The value of using a graph-analysis library to quickly understand these essential elements of graph theory is that for the most part there is a 1:1 mapping between the concepts i just mentioned and functions in the (networkx or igraph) library.

So e.g., you can quickly generate two random graphs of equal size (node number), render and then view them, then easily calculate for instance the average degree sequence or betweenness centrality for both and observer first-hand how changes in the value of those parameters affects the structure of a graph.

W/r/t the combination of ML and Graph Theoretic techniques, here's my limited personal experience. I use ML in my day-to-day work and graph theory less often, but rarely together. This is just an empirical observation limited to my personal experience, so the fact that i haven't found a problem in which it has seemed natural to combine techniques in these two domains. Most often graph theoretic analysis is useful in ML's blind spot, which is the availability of a substantial amount of labeled training data--supervised ML techniques depend heavily on this.

One class of problems to illustrate this point is online fraud detection/prediction. It's almost never possible to gather data (e.g., sets of online transactions attributed to a particular user) that you can with reasonable certainty separate and label as "fraudulent account." If they were particularly clever and effective then you will mislabel as "legitimate" and for those accounts for which fraud was suspected, quite often the first-level diagnostics (e.g., additional id verification or an increased waiting period to cash-out) are often enough to cause them to cease further activity (which would allow for a definite classification). Finally, even if you somehow manage to gather a reasonably noise-free data set for training your ML algorithm, it will certainly be seriously unbalanced (i.e., much more "legitimate" than "fraud" data points); this problem can be managed with statistics pre-processing (resampling) and by algorithm tuning (weighting) but it's still a problem that will likely degrade the quality of your results.

So while i have never been able to successfully use ML techniques for these types of problems, in at least two instances, i have used graph theory with some success--in the most recent instance, by applying a model adapted from the project by a group at Carnegie Mellon initially directed to detection of online auction fraud on ebay.

186

answered Nov 15 '22 22:11

doug

MacArthur Genius Grant recipient and Stanford Professor Daphne Koller co-authored a definitive textbook on Bayesian networks entitled Probabalistic Graphical Models, which contains a rigorous introduction to graph theory as applied to AI. It may not exactly match what you're looking for, but in its field it is very highly regarded.

answered Nov 15 '22 22:11

templatetypedef

Related questions
                            
                                Generating power set recursively without any loops
                            
                                How to reverse a graph in linear time?
                            
                                Time complexity of Uniform-cost search
                            
                                Function to solve cubic equation analytically
                            
                                Algorithm to find if there is any i so that array[i] equals i
                            
                                Algorithm to determine coin combinations
                            
                                Can bottom-up dynamic programming be done in Lisp?
                            
                                Generate a random point on a rectangle's perimeter with uniform distribution
                            
                                Generating a uniform distribution of INTEGERS in C
                            
                                Finding path with maximum minimum capacity in graph
                            
                                Perfect square and perfect cube
                            
                                Using Python's PIL, how do I enhance the contrast/saturation of an image?
                            
                                Permutation of String letters: How to remove repeated permutations?
                            
                                Exactly how many comparisons does merge sort make?
                            
                                Prime Factorization
                            
                                Algo for a stable 'download-time-remaining' in a download window
                            
                                Reducing graph data without losing graph shape
                            
                                Reversible shuffle algorithm using a key
                            
                                Lexicographical sorting
                            
                                How do we achieve "substring-match" under O(n) time?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Recommendations for using graphs theory in machine learning? [closed]

Tags:

algorithm

math

artificial-intelligence

machine-learning

graph-theory

Lostsoul

People also ask

2 Answers

doug

templatetypedef

Recent Activity

Donate For Us