Learning Python with the Natural Language Toolkit has been great fun, and it works fine on my local machine, though I had to install several packages in order to use it. Exactly how the NLTK resources are now integrated on my system remains a mystery to me, though it seems clear that the NLTK source code is not simply sitting someplace where the Python interpreter knows to find it.
I would like to use the Toolkit on my website, which is hosted by another company. Simply uploading the NLTK source code files to my server and telling scripts in the root directory to "import nltk" has not worked; I kind of doubted it would.
What, then, is the difference between whatever the NLTK install routine does and straightforward imports, and why should the toolkit be inaccessible to straightforward imports? Is there a way to use the NLTK source files without essentially altering my host's Python?
Many thanks for your thoughts and notes. -G
Not only do you need NLTK on your PYTHONPATH (as @dhg points out), you need whatever dependencies it has; a quick local test indicates that this is really only PyYAML. You should just use pip to install packages. It's much less error-prone than trying to manually figure out all the dependencies and tweak the PYTHONPATH accordingly. If this is a shared host where you don't have the proper access to run a pip install, you should ask the host to do it for you.
To address the more general "Whatever the install script is doing" portion of your question: most Python packages are managed using setup.py, which is built on top of distutils (and sometimes setuptools). If this is something you're really interested in, check out The Hitchhiker’s Guide to Packaging.
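As a rough illustration of what an install leaves behind: the package typically ends up in a directory that is already on the interpreter's search path. The sketch below asks Python where it would load a module from. The helper name locate is made up for this example, and it assumes Python 3.4 or later for importlib.util.find_spec.

```python
import importlib.util
import sys

def locate(module_name):
    """Return the file Python would load for a top-level module, or None."""
    spec = importlib.util.find_spec(module_name)
    return None if spec is None else spec.origin

# Directories the interpreter searches, in order
for entry in sys.path:
    print(entry)

# None here means the module is not on any of the directories above
print("nltk would be loaded from: %s" % locate("nltk"))
```

Running this on your local machine and again on the host shows whether the two interpreters see NLTK at all, and from where.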
You don't need system-install support, just the right modules where Python can find them. I've set up the NLTK without system install rights with relatively little trouble, but I did have command-line access so I could see what I was doing.
To get this working, you should put together a local install on a computer you do control, ideally one that never had NLTK installed, since you may have forgotten (or never known) what was configured for you. Once you figure out what you need, copy the bundle to the hosting computer. But at that point, check that you're using the module versions that are appropriate for the webserver's architecture. Numpy in particular has different 32/64 bit versions, IIRC.
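One way to compare the two machines is to run the same small script on each and look at the output side by side. This is only a sketch; the function name interpreter_bits is invented for the example, and it reports the pointer width of the running Python, which is what matters for compiled modules.

```python
import platform
import struct
import sys

def interpreter_bits():
    """Pointer width of the running interpreter, in bits."""
    # "P" is the struct format code for a C pointer (void *)
    return struct.calcsize("P") * 8

print("Python version: %s" % sys.version.split()[0])
print("Interpreter:    %d-bit" % interpreter_bits())
print("Machine:        %s" % platform.machine())
```

If the version or bit width differs between your local install and the host, compiled packages copied from one to the other are unlikely to import cleanly.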
It's also worth your while to figure out how to see the error messages from the hosting computer. If you can't see them by default, you could catch ImportError and display the message it contains, or you could redirect stderr... it depends on your configuration.
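A minimal sketch of the catch-and-display approach, assuming your script's standard output ends up somewhere you can read it (how output reaches the browser depends on the host's setup). The helper try_import is a name made up for this example.

```python
import importlib
import sys

def try_import(module_name):
    """Return (module, None) on success or (None, message) on failure."""
    try:
        return importlib.import_module(module_name), None
    except ImportError as exc:
        # The message names the module that was missing, which may be
        # one of NLTK's dependencies rather than nltk itself.
        return None, "%s: %s" % (type(exc).__name__, exc)

module, error = try_import("nltk")
if error is not None:
    print("Import failed: " + error)
    print("Search path was: " + repr(sys.path))
```

Printing sys.path alongside the message helps tell "the files are not where Python looks" apart from "a dependency is missing".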
Let's assume that you have the NLTK source located in /some/dir/, so that
dhg /some/dir/$ ls nltk
...
app
book.py
ccg
chat
chunk
classify
...
You can either launch the Python interpreter from the directory in which the nltk source directory is found:
dhg /some/dir/$ python
Python 2.7.1 (r271:86882M, Nov 30 2010, 10:35:34)
>>> import nltk
Or you can add its location to the PYTHONPATH environment variable, which makes NLTK available from anywhere:
dhg /whatever/$ export PYTHONPATH="$PYTHONPATH:/some/dir/"
dhg /whatever/$ python
Python 2.7.1 (r271:86882M, Nov 30 2010, 10:35:34)
>>> import nltk
Any other dependencies, including those that NLTK depends on, can also be added to the PYTHONPATH in the same way.
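If you can't set environment variables on the host, the same effect can be had from inside the script by extending sys.path before the import. This is a sketch rather than the only way to do it; add_to_search_path is a hypothetical helper, and /some/dir is the example location used above.

```python
import os
import sys

def add_to_search_path(directory):
    """Put a directory at the front of sys.path if it is not already there."""
    directory = os.path.abspath(directory)
    if directory not in sys.path:
        sys.path.insert(0, directory)
    return directory

# Point at the directory that *contains* the nltk folder,
# not at the nltk folder itself.
add_to_search_path("/some/dir")

# An "import nltk" placed after this line would search /some/dir first.
```

This only changes the search path for the running script, so it leaves the host's Python untouched.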