How to find the most common words using spacy?

I'm using spaCy with Python and it's working fine for tagging each word, but I was wondering whether it is possible to find the most common words in a string. Is it also possible to get the most common nouns, verbs, adverbs and so on?

There's a count_by function included, but I can't seem to get it to run in any meaningful way.

asked May 16 '16 11:05

Harry Loyd

1 Answer

I recently had to count the frequency of all the tokens in a text file. You can filter the tokens by part of speech using the pos_ attribute. Here is a simple example:

    import spacy
    from collections import Counter

    # The 'en' shortcut was removed in spaCy 3; load a named model instead
    nlp = spacy.load('en_core_web_sm')
    doc = nlp('Your text here')

    # all tokens that aren't stop words or punctuation
    words = [token.text
             for token in doc
             if not token.is_stop and not token.is_punct]

    # noun tokens that aren't stop words or punctuation
    nouns = [token.text
             for token in doc
             if (not token.is_stop and
                 not token.is_punct and
                 token.pos_ == "NOUN")]

    # five most common tokens
    word_freq = Counter(words)
    common_words = word_freq.most_common(5)

    # five most common noun tokens
    noun_freq = Counter(nouns)
    common_nouns = noun_freq.most_common(5)
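If you want nouns, verbs, adverbs and so on all at once, one option is to group the counts by pos_ in a single pass instead of writing one list comprehension per tag. The following is only a rough sketch of that idea: most_common_by_pos and fake_token are hypothetical names made up for illustration, not part of spaCy, and the stand-in tokens exist only so the snippet runs without a model installed.

```python
from collections import Counter, defaultdict
from types import SimpleNamespace


def most_common_by_pos(tokens, n=5):
    """Return {pos_tag: [(word, count), ...]} with the n most common words per tag.

    Hypothetical helper: `tokens` is any iterable of objects exposing
    text, pos_, is_stop and is_punct, such as a spaCy Doc.
    """
    counts = defaultdict(Counter)
    for token in tokens:
        # Skip stop words and punctuation, same filter as the example above
        if token.is_stop or token.is_punct:
            continue
        counts[token.pos_][token.text] += 1
    return {pos: freq.most_common(n) for pos, freq in counts.items()}


def fake_token(text, pos, is_stop=False, is_punct=False):
    # Stand-in for a spaCy token (assumption for demo purposes only)
    return SimpleNamespace(text=text, pos_=pos, is_stop=is_stop, is_punct=is_punct)


sample = [
    fake_token("dogs", "NOUN"),
    fake_token("chase", "VERB"),
    fake_token("the", "DET", is_stop=True),
    fake_token("cats", "NOUN"),
    fake_token(".", "PUNCT", is_punct=True),
    fake_token("dogs", "NOUN"),
    fake_token("bark", "VERB"),
]

result = most_common_by_pos(sample)
print(result["NOUN"])
print(result["VERB"])
```

With a real pipeline you would pass the Doc directly, for example most_common_by_pos(nlp(text)). Note that this counts token.text as-is, so "Dogs" and "dogs" are counted separately; counting a lowercased form or the lemma instead may suit some texts better.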

answered Oct 19 '22 04:10

Paras Dahal
