Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to find Relationships between Objects

For People With A Similar Question (written after finding a solution):

This problem, as you might notice according to the answers below, has a lot of different solutions. I only chose Evan's because it was the easiest one for me implement into my own code. However, from what I tried, every other answer also worked. @SalvadorDali linked this Kaggle page which was definitely interesting and I reccomend reading if you are interested. Prolog was also brought up as a possible solution, I'm unfamiliar with it, but if you already know it -- it's probably worth considering. Also, if you just want to get code to use there are working Javascript and Python examples below. However, each one had a different approach to the solution and I'm not sure which is most effecient (feel free to test it yourself).

For further approaches/reading:

http://en.wikipedia.org/wiki/Breadth-first_search

Prolog and ancestor relationship

https://www.kaggle.com/c/word2vec-nlp-tutorial/details/part-2-word-vectors


Sorry for the confusing title, I can't figure out a way to properly word my question -- any better ideas are welcome.

Because I'm having such a difficult time describing my question, I'll try to explain my goal and code as much as needed:

Note: my code here is Go, but I'd be happy with answers in other languages as well, if you have any questions I'll try to answer as quick as possible

Basically, I have an array of "Word" objects that look like this:

type Word struct{
     text     string
     synonyms []string
}

This is an example of 4 words within the array:

  []Word{
      {text: "cat" synonyms: ["feline", "kitten", "mouser"]}
      {text: "kitten" synonyms: ["kitty", "kit"]} 
      {text: "kit" synonyms: ["pack", "bag", "gear"]}
      {text: "computer" synonyms: ["electronics", "PC", "abacus"]}
   }

My challenge is writing a method to test for a relationship between 2 words. Of course, testing between 2 words like "cat" and "kitten" would be easy with the example above. I could just check "Cat"s list of synonyms and test to see if it contains "kitten." With code like this:

areWordsRelated(word1 Word, word2 Word) bool{
    for _, elem := range word1.synonyms{
         if elem == word2.text{
             return true
         }
    }
    return false
}

However, I can't figure out how to test for a more distant relationship.

For example:

areWordsRelated("cat","pack") //should return true 
//because "cat" is related to "kitten" which is related to "pack"
areWordsRelated("cat", "computer") //should return false

I tried to do it recursively, but all my attempts don't seem to work. Any example code (My code is in Go, but Python, Java, or Javascript are also fine), pseudocode or just explanations would be really great.

like image 229
Maximilian Sun Avatar asked Jun 09 '15 19:06

Maximilian Sun


People also ask

Where is there a relationship between two objects?

The relationship between objects defines how these objects will interact or collaborate to perform an operation in an application. In any application, objects of user interface classes interact with the business layer objects in order to perform an operation.

What denotes the relationship between two objects?

Relationships between objects can be classified as either "is-a" (inheritance) or "has-a" (composition). These two relationships enable the OO software designer to create abstract models of the desired system. An object-oriented system can be characterized as a system of cooperating objects.

What are the three types of object relationships?

Generally, these four types of relationships (Inheritance, Composition, Association and Aggregation) are used in OOP.


4 Answers

If you give me some feedback on this I can edit it because it doesn't do exactly what you asked but it is the jist. I'll edit with a technical explanation of what has to be changed to meet your exact example.

package main

import "fmt"

func main() {
    words := []Word{
            {text: "cat", synonyms: []string{"feline", "kitten", "mouser"}},
            {text: "kitten", synonyms: []string{"kitty", "kit"}} ,
            {text: "kit", synonyms: []string{"pack", "bag", "gear"}},
            {text: "computer", synonyms: []string{"electronics", "PC", "abacus"}},
    }

    fmt.Println(areWordsRelated(words, words[0], words[2]))
    fmt.Println(areWordsRelated(words, words[0], words[3]))
}

type Word struct{
     text     string
     synonyms []string
}

func areWordsRelated(words []Word, word1, word2 Word) bool {
    for _, elem := range word1.synonyms{
        if elem == word2.text{
            return true
        } else {
            for _, word := range words {
                if word.text == elem {
                    if (areWordsRelated(words, word, word2)) {
                        return true
                    }
                }
            }
        }
    }
    return false
}

EDIT: This doesn't do quite what you asked because it doesn't make the connection between "pack" and "cat" as pack is not represented by an actual word object and I defined the method to receive word2 as an object (just working off your example). I could instead make that a string so it can check for "pack" in the synonyms array of "kit" before returning but the idea is the same none the less... Here's a high level explanation of the algorithm.

Iterate the synonyms, if it isn't a match, find that Word object back in the original collection and call myself with it as the first argument. This will recursively exhaust every path until it finds a match, or there are none left in which case you're outside the loop returning false. The code above runs in go playground and correctly returns true\nfalse. Notice that the recursive call is made within an if to protect from returning false prematurely (also a performance enhancement because we return as soon as true is found rather than continue to recurse the paths).

https://play.golang.org/p/gCeY0SthU1

like image 154
evanmcdonnal Avatar answered Sep 17 '22 22:09

evanmcdonnal


A Python solution:

class Word:

   # Dictionary of Words, keyed by name.
   word_dict = {}

   def __init__(self, name, synonyms):
      self.name = name
      self.synonyms = synonyms

      # Update the dictionary.
      Word.word_dict[name] = self
      for s in synonyms:
         if not s in Word.word_dict:
            Word.word_dict[s] = Word(s, [])

   def isAncestor(self, other):
      if other in self.synonyms:
         return True
      for s in self.synonyms:
         if Word.word_dict[s].isAncestor(other):
            return True
      return False

def areWordsRelated(word1, word2):
   if not word1 in Word.word_dict or not word2 in Word.word_dict:
      return False
   return Word.word_dict[word1].isAncestor(word2) or Word.word_dict[word2].isAncestor(word1)

words = []
words.append(Word("cat", ["feline", "kitten", "mouser"]))
words.append(Word("kitten", ["kitty", "kit"]))
words.append(Word("kit", ["patck", "bag", "gear"]))
words.append(Word("computer", ["electronics", "PC", "abacus"]))

print(areWordsRelated("cat", "kit"))
print(areWordsRelated("kit", "cat"))
print(areWordsRelated("cat", "computer"))
print(areWordsRelated("dog", "computer"))

Output:

True
True
False
False
like image 27
R Sahu Avatar answered Sep 21 '22 22:09

R Sahu


First of all it is not clear how do you define relationship here. If your "cat" has synonyms: ["feline", "kitten", "mouser"], does that mean that "mouser" has a synonym "cat".

Based on my understanding the answer is no. So here is a solution in python:

G = {
    "cat": ["feline", "kitten", "mouser"],
    "kitten": ["kitty", "kit"],
    "kit": ["pack", "bag", "gear"],
    "computer": ["electronics", "PC", "abacus"]
}

def areWordsRelated(G, w1, w2):
    if w1 == w2:
        return True

    frontier = [w1]
    checked = set()
    while len(frontier):
        el = frontier.pop()
        if el in G:
            neighbors = G[el]
            for i in neighbors:
                if i == w2:
                    return True
                if i not in checked:
                    frontier.append(i)
                    checked.add(i)

    return False

areWordsRelated(G, "cat", "pack") #true
areWordsRelated(G, "cat", "computer") #false

So what are we doing here? At first you have your graph, which is just dictionary (map in go) which shows your relationship (I basically took your slice).

Our algorithm grows like a mold, maintaining a set of checked elements and a current frontier. If frontier is empty (nothing to explore, then the elements are not connected). We extract one element at a time from a frontier and check all the neighbors. If any of them is the element we are looking for - then there is a connection. Otherwise check if we already have seen such element (and if not than add it to the frontier and to the set of checked).

Notice that if your relationship works in a slightly different way, all you need is to modify a graph.


One last word, if you are looking for a normal way to find synonyms take a look at word to vector algorithm and a nice implementation in python. This will allow you to find really complicated relationship even between words like finding that California and Golden Gate are related even without having this relationship specified.

like image 43
Salvador Dali Avatar answered Sep 19 '22 22:09

Salvador Dali


You're looking at a 2nd degree relationship (as opposed to the 'easy' 1st place example you already know how to find), meaning you have to do one of two things:

(1) The storage-heavy solution requires maintaining a separate list of 2nd-degree relationships and then simply do a search within that (longer) list - this requires maintaining (potentially MUCH) more data about word relationships. For example, if you have 10000 words, and each has roughly 10 synonyms, that's 100,000 first-degree relationships stored. But then you'd have something like a billion 2nd-degree relationships. So of course that gets unwieldy quickly.

In that case, each entry looks like this: {text: "cat" synonyms: ["feline", "kitten", "mouser"] seconds:["pack",...]} ... and you simply write a separate function that will check for relationships in EITHER 'synonyms' or 'seconds'.

(2) The programmatic solution would be to still only store the 1st-degree relationships and then do an embedded loop.

In this case:

//// This checks for 1st degree relationship
areWordsRelated1(word1 Word, word2 Word) bool{
    for _, elem := range word1.synonyms{
         if elem == word2.text{
             return true
         }
    }
    return false
}

//// This checks for 2nd degree by checking 1st and then, if not, 
//// then trying the 1st degree function on the children of word2
//// before giving up and returning false
areWordsRelated2(word1 Word, word2 Word) bool{
    for _, elem1 := range word1.synonyms{
         if elem1 == word2.text{
             return true
         } else {
         for _, elem2 := range elem1.synonyms{
             if areWordsRelated1(word1, elem2) {
                 return true
             }
         }
    }
    return false
}

NOTE: I noticed that in your sample data, "cat" was related to "kitten", but "kitten" was not conversely related to "cat".

like image 45
Jonathan Tweedy Avatar answered Sep 18 '22 22:09

Jonathan Tweedy