
How to get all words from spacy vocab?

I need all the words from the spaCy vocab. Suppose I initialize my spaCy model as

nlp = spacy.load('en')

How do I get the text of words from nlp.vocab?

pauli asked Feb 02 '19 17:02


2 Answers

You can get it as a list like this:

list(nlp.vocab.strings)
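
Note that nlp.vocab.strings is spaCy's string store, so it can also hold entries that are not words, such as label names, numbers and punctuation. Below is a minimal sketch of one way to narrow the list down, assuming that "word" means a purely alphabetic string. The helpers alphabetic_words and words_from_vocab are hypothetical names for illustration, not part of spaCy:

```python
def alphabetic_words(strings):
    """Return the sorted, de-duplicated purely alphabetic entries of an iterable of strings."""
    # str.isalpha() is False for empty strings, numbers and anything with punctuation.
    return sorted({s for s in strings if s.isalpha()})


def words_from_vocab(nlp):
    """Apply the filter to the string store of a loaded pipeline (assumes nlp.vocab.strings is iterable)."""
    return alphabetic_words(nlp.vocab.strings)
```

With a loaded model this would be called as words_from_vocab(nlp). Alphabetic label names (for example part-of-speech tags) still pass this filter, so treat the result as a rough word list rather than a clean dictionary.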
David answered Oct 13 '22 21:10



As of spaCy v3.0, the 'en' shortcut no longer works, so we first need to download a model by its full name:

python -m spacy download en_core_web_sm

and then load it by that name, e.g.

import spacy
nlp = spacy.load("en_core_web_sm")
words = set(nlp.vocab.strings)
word = 'would'
print(f"Is '{word}' an English word: {word in words}")  # True
tyrex answered Oct 13 '22 20:10