
NLTK - Download all nltk data except corpora from command line without Downloader UI

Tags: python, nlp, nltk, corpus, nltk-trainer

We can download all nltk data using:

> import nltk
> nltk.download('all')

Or specific data using:

> nltk.download('punkt')
> nltk.download('maxent_treebank_pos_tagger')

But I want to download all data except the corpora files, for example all chunkers, grammars, models, stemmers, taggers, tokenizers, etc.

Is there any way to do so without the Downloader UI? Something like:

> nltk.download('all-taggers')


asked Jun 25 '16 16:06

RAVI

People also ask

Where can I download NLTK data?

Download individual packages from https://www.nltk.org/nltk_data/ (see the “download” links). Unzip them to the appropriate subfolder.

1 Answer

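List all corpora packages and set _status_cache[pkg.id] = 'installed' for each one.

This marks every corpora package as 'installed', so the downloader treats them as up to date and skips them when download() is called on the same Downloader instance. Note that _status_cache is a private attribute, so this may stop working in a future NLTK release.

If you're unsure which corpora or packages you need, use nltk.download('popular') rather than downloading all corpora and models. The example below uses 'popular'; pass 'all' to fetch every remaining package.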

import nltk

dwlr = nltk.downloader.Downloader()

for pkg in dwlr.corpora():
    dwlr._status_cache[pkg.id] = 'installed'

dwlr.download('popular')
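
The same trick can be wrapped in a small helper that skips any set of folders, not just corpora. This is only a sketch: mark_subdirs_installed and download_all_but_corpora are hypothetical names, not part of NLTK, and the helper still depends on the private _status_cache attribute used above.

```python
def mark_subdirs_installed(downloader, subdirs=("corpora",)):
    """Mark every package located in `subdirs` as already installed.

    `subdirs` is an iterable of folder names such as "corpora" or "models".
    Relies on the private `_status_cache` dict (an implementation detail of
    NLTK's Downloader), so treat this as a workaround, not a stable API.
    Returns the ids of the packages that will now be skipped.
    """
    skipped = []
    for pkg in downloader.packages():
        if pkg.subdir in subdirs:
            downloader._status_cache[pkg.id] = "installed"
            skipped.append(pkg.id)
    return skipped


def download_all_but_corpora():
    """Usage sketch: requires NLTK to be installed and network access."""
    import nltk

    dwlr = nltk.downloader.Downloader()
    mark_subdirs_installed(dwlr, subdirs=("corpora",))
    # Packages marked above are reported as up to date and not fetched.
    dwlr.download("all")
```

Calling download_all_but_corpora() from a script or from python -c gives a way to run this without opening the Downloader UI.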

To download all packages in a specific folder:

import nltk

dwlr = nltk.downloader.Downloader()

# Available folders (pkg.subdir values): chunkers, corpora, grammars, help,
# misc, models, sentiment, stemmers, taggers, tokenizers
for pkg in dwlr.packages():
    if pkg.subdir == 'taggers':
        dwlr.download(pkg.id)
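
The folder filter can also be inverted to answer the original question directly: download every package whose folder is not in an exclusion list. The sketch below uses only packages(), pkg.subdir, pkg.id and download() as shown in the snippet above; package_ids_excluding and download_excluding are hypothetical helper names, not NLTK functions.

```python
def package_ids_excluding(downloader, excluded_subdirs=("corpora",)):
    """Return the sorted ids of all packages outside the excluded folders."""
    excluded = set(excluded_subdirs)
    return sorted(
        pkg.id for pkg in downloader.packages() if pkg.subdir not in excluded
    )


def download_excluding(downloader, excluded_subdirs=("corpora",)):
    """Download each remaining package one by one and return their ids."""
    ids = package_ids_excluding(downloader, excluded_subdirs)
    for pkg_id in ids:
        downloader.download(pkg_id)
    return ids
```

With NLTK installed, download_excluding(nltk.downloader.Downloader()) would fetch the chunkers, grammars, models, stemmers, taggers, tokenizers and so on while leaving corpora alone. Unlike the status-cache approach, it does not depend on a private attribute, at the cost of one download() call per package.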


answered Sep 25 '22 00:09

RAVI
