
Using SVD to plot word vectors to measure similarity

This is the code I am using to plot word vectors from a co-occurrence matrix of immediate-neighbor counts. I found it on the net; it uses SVD.

 import numpy as np
 import matplotlib.pyplot as plt
 la = np.linalg
 words = ['I', 'like', 'enjoying', 'deep', 'learning', 'NLP', 'flying', '.']
 # Co-occurrence matrix of immediate-neighbor counts: how many times the word
 # before and after a particular word appears (e.g. 'like' appears next to 'I' 2 times)
 arr = np.array([[0,2,1,0,0,0,0,0],
                 [2,0,0,1,0,1,0,0],
                 [1,0,0,0,0,0,1,0],
                 [0,0,0,1,0,0,0,1],
                 [0,1,0,0,0,0,0,1],
                 [0,0,1,0,0,0,0,8],
                 [0,2,1,0,0,0,0,0],
                 [0,0,1,1,1,0,0,0]])
 u, s, v = la.svd(arr, full_matrices=False)
 for i in range(len(words)):
     plt.text(u[i,0], u[i,1], words[i])

In the last line of code, the first element of each row of U is used as the x-coordinate and the second element as the y-coordinate to project the words and see their similarity. What is the intuition behind this approach? Why are they taking the 1st and 2nd elements of each row (each row represents a word) as x and y to represent that word? Please help.
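For context, here is a minimal sketch of how a co-occurrence matrix of immediate-neighbor counts like the one above could be built; the toy corpus and variable names are made up purely for illustration:

```python
import numpy as np

# Hypothetical toy corpus, only to show how neighbor counts are gathered.
corpus = [['I', 'like', 'deep', 'learning', '.'],
          ['I', 'like', 'NLP', '.'],
          ['I', 'enjoying', 'flying', '.']]

words = ['I', 'like', 'enjoying', 'deep', 'learning', 'NLP', 'flying', '.']
index = {w: i for i, w in enumerate(words)}

cooc = np.zeros((len(words), len(words)), dtype=int)
for sentence in corpus:
    for left, right in zip(sentence, sentence[1:]):   # adjacent word pairs
        cooc[index[left], index[right]] += 1          # count the word that follows
        cooc[index[right], index[left]] += 1          # and the word that precedes

print(cooc)   # symmetric matrix of immediate-neighbor counts
```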

Asked Jul 17 '15 by Seja Nair

People also ask

Can SVD on clusters be used to characterize document similarity?

Yes. SVD on clusters, which constructs latent subspaces on document clusters, can characterize document similarity more accurately and appropriately than other SVD-based methods. The variances of the methods mentioned are regarded as comparable because they have similar values.

Do all document vectors have the same cosine similarity?

Theoretically, document vectors that are not in the same cluster submatrix would have zero cosine similarity. In practice, however, all document vectors share the same terms in their representation, and the dimension expansion of the document vectors is obtained by merely copying the original space.

How do I calculate the similarity between two documents and spans?

To calculate the similarity of spans and documents, which don’t have their own word vectors, spaCy averages the word vectors of the tokens they contain. You can calculate the semantic similarity of two container objects even if the two objects are different.
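A minimal sketch of that behavior, assuming a spaCy model that ships with word vectors (e.g. en_core_web_md) is installed; the example texts are made up:

```python
import spacy

nlp = spacy.load("en_core_web_md")   # a model with word vectors

doc1 = nlp("I like deep learning.")
doc2 = nlp("I enjoy flying.")

# Doc and Span vectors are the average of their tokens' vectors,
# so similarity works even between objects of different types.
print(doc1.similarity(doc2))          # Doc vs Doc
print(doc1[2:4].similarity(doc2))     # Span ("deep learning") vs Doc
```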

What is the best way to measure similarity between two features?

You can use any similarity measure that best fits your data. The idea is always the same: two samples with very similar feature vectors (in my case, embeddings) will have a similarity score close to 1; the more the vectors differ, the closer the score will be to zero.
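As a small illustration of that idea, a cosine-similarity sketch with made-up vectors:

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1 = same direction, 0 = orthogonal."""
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

v1 = np.array([0.9, 0.1, 0.3])
v2 = np.array([0.8, 0.2, 0.25])   # similar direction -> score close to 1
v3 = np.array([0.0, 1.0, 0.0])    # very different direction -> score close to 0

print(cosine_similarity(v1, v2))
print(cosine_similarity(v1, v3))
```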


1 Answer

import numpy as np
import matplotlib.pyplot as plt
la = np.linalg
words = ["I", "like", "enjoy", "deep", "learning", "NLP", "flying", "."]
# Symmetric co-occurrence matrix of immediate-neighbor counts for `words`
X = np.array([[0,2,1,0,0,0,0,0],
              [2,0,0,1,0,1,0,0],
              [1,0,0,0,0,0,1,0],
              [0,1,0,0,1,0,0,0],
              [0,0,0,1,0,0,0,1],
              [0,1,0,0,0,0,0,1],
              [0,0,1,0,0,0,0,1],
              [0,0,0,0,1,1,1,0]])
U, s, Vh = la.svd(X, full_matrices=False)

# Plot each word at the coordinates given by the first two columns of U,
# i.e. the directions associated with the two largest singular values.
for i in range(len(words)):
    plt.text(U[i, 0], U[i, 1], words[i])
plt.show()

In the plot, pan the axes to the left and you will see all the words (plt.text does not rescale the axes automatically, so the labels can start out of view).
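For the intuition asked about in the question: NumPy returns the singular values in descending order, so the first two columns of U correspond to the two largest singular values and span the best rank-2 approximation of X. Plotting each word at (U[i,0], U[i,1]) is therefore a 2-D embedding that preserves as much of the co-occurrence structure as possible. A minimal sketch of that idea (the rank-2 reconstruction and the cosine check are my own illustration):

```python
import numpy as np

words = ["I", "like", "enjoy", "deep", "learning", "NLP", "flying", "."]
X = np.array([[0,2,1,0,0,0,0,0],
              [2,0,0,1,0,1,0,0],
              [1,0,0,0,0,0,1,0],
              [0,1,0,0,1,0,0,0],
              [0,0,0,1,0,0,0,1],
              [0,1,0,0,0,0,0,1],
              [0,0,1,0,0,0,0,1],
              [0,0,0,0,1,1,1,0]])

U, s, Vh = np.linalg.svd(X, full_matrices=False)

# Keeping only the two largest singular values gives the best rank-2
# approximation of X in the least-squares sense (Eckart-Young theorem).
X2 = U[:, :2] @ np.diag(s[:2]) @ Vh[:2, :]
print(np.linalg.norm(X - X2))   # reconstruction error using only 2 dimensions

# Each word's 2-D coordinates are the rows of U[:, :2] (optionally scaled by s[:2]).
coords = U[:, :2] * s[:2]

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

print(cosine(coords[words.index("like")], coords[words.index("enjoy")]))
```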

Answered Sep 25 '22 by Aakash Saxena