How to evaluate Word2Vec model

I have my own corpus and have trained several Word2Vec models on it. What is the best way to evaluate them against each other and choose the best one? (Not manually, obviously; I am looking for quantitative measures.)

It's worth noting that the embeddings are for items, not words, so I can't use any existing benchmarks.

Thanks!

asked Oct 04 '18 11:10

oren_isp

1 Answer

There's no generic way to assess token-vector quality, especially when your tokens aren't real words, so the usual word-based evaluations (like the popular analogy-solving task) can't be applied.

If you have a custom ultimate task, you have to devise your own repeatable scoring method. That will likely either be some subset of your actual final task, or something well-correlated with that ultimate task. Essentially, whatever ad-hoc method you may be using to 'eyeball' the results for sanity should be systematized, saving your judgements from each evaluation, so that they can be run repeatedly against iterative model improvements.
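As a rough illustration of what "systematizing" eyeballed judgements could look like, here is a minimal sketch. It assumes each judgement is saved as a triple (anchor, item that should be closer, item that should be farther) and scores a model by the fraction of triples it agrees with. The item names, the toy vectors, and the `triplet_accuracy` helper are all hypothetical; this is one possible scoring scheme, not a standard benchmark.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def triplet_accuracy(vectors, judgements):
    """Fraction of saved judgements the embedding agrees with.

    vectors: mapping item -> vector (hypothetical; built from a trained model)
    judgements: list of (anchor, closer, farther) item triples
    Triples containing an item missing from `vectors` are skipped.
    """
    hits = total = 0
    for anchor, closer, farther in judgements:
        if not all(item in vectors for item in (anchor, closer, farther)):
            continue
        total += 1
        sim_closer = cosine(vectors[anchor], vectors[closer])
        sim_farther = cosine(vectors[anchor], vectors[farther])
        if sim_closer > sim_farther:
            hits += 1
    return hits / total if total else 0.0


# Hypothetical saved judgements: "anchor should be nearer to closer than to farther".
JUDGEMENTS = [
    ("item_a", "item_b", "item_c"),
    ("item_c", "item_d", "item_a"),
]

# Two toy "models", each just a dict of item -> vector for illustration.
model_1 = {"item_a": [1.0, 0.0], "item_b": [0.9, 0.1],
           "item_c": [0.0, 1.0], "item_d": [0.1, 0.9]}
model_2 = {"item_a": [1.0, 0.0], "item_b": [0.0, 1.0],
           "item_c": [0.9, 0.1], "item_d": [0.1, 0.9]}

# Score every candidate against the same saved judgements and pick the best.
scores = {name: triplet_accuracy(vecs, JUDGEMENTS)
          for name, vecs in [("model_1", model_1), ("model_2", model_2)]}
best = max(scores, key=scores.get)
print(scores, best)
```

With a gensim model, the `vectors` mapping could probably be built from `model.wv` (for example `{item: model.wv[item] for item in items}`), but that depends on your setup. The important part is that the judgement list is stored once and re-run unchanged against each new model.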

(I'd need more info about your data/items and ultimate goals to make further suggestions.)

answered Sep 20 '22 16:09

gojomo

Tags: python, nlp, embedding, word-embedding, word2vec