Google states that a "term-vector algorithm" can be used to determine popular keywords. I have studied http://en.wikipedia.org/wiki/Vector_space_model, but I can't understand the term "term-vector algorithm".
Please explain it briefly and in very simple language, as if the reader were a child.
I believe "vector" refers to the mathematical definition: a quantity having direction as well as magnitude. How can keywords have a quantity that moves in a direction?
http://en.wikipedia.org/wiki/Vector_space_model states "Each dimension corresponds to a separate term." I thought dimension relates to cardinality. Is that correct?
From the book Hadoop In Practice, by Alex Holmes, page 12.
It means that each word forms a separate dimension:
Example: (shamelessly taken from here)
For a model containing only three words you would get:
dict = { dog, cat, lion }
Document 1: “cat cat” → (0,2,0)
Document 2: “cat cat cat” → (0,3,0)
Document 3: “lion cat” → (0,1,1)
Document 4: “cat lion” → (0,1,1)
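The counting in the example above can be reproduced with a few lines of Python. This is only a minimal sketch, assuming words are separated by whitespace and that each vector position follows the order of the dictionary (dog, cat, lion). The names `term_vector`, `vocabulary`, and `documents` are made up for illustration and do not come from the book or from any library.

```python
from collections import Counter

def term_vector(text, vocabulary):
    """Count how often each vocabulary word appears in the text.

    Each position in the result is one dimension, i.e. one word.
    """
    counts = Counter(text.lower().split())
    # Counter returns 0 for words that never occur in the text.
    return tuple(counts[term] for term in vocabulary)

# Assumed ordering of dimensions: dog, cat, lion.
vocabulary = ["dog", "cat", "lion"]
documents = ["cat cat", "cat cat cat", "lion cat", "cat lion"]

vectors = [term_vector(doc, vocabulary) for doc in documents]

for doc, vec in zip(documents, vectors):
    print(f"{doc!r} -> {vec}")
```

Documents 3 and 4 end up with the same vector, because this kind of counting ignores the order of the words. Documents 1 and 2 differ only in how many times "cat" occurs, so their vectors point along the same "cat" axis and differ only in length.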