Using the Python NLTK (2.0b5) on the Google App Engine

I have been trying to make the NLTK (Natural Language Toolkit) work on the Google App Engine. The steps I followed are:

  1. Download the installer and run it (a .dmg file, as I am using a Mac).
  2. Copy the nltk folder out of the Python site-packages directory and place it as a sub-folder in my project folder.
  3. Create a Python module in the folder that contains the nltk sub-folder and add the line: from nltk.tokenize import *

Unfortunately, after launching it I get this error (note that this error is raised deep within NLTK and I'm seeing it for my system installation of python as opposed to the one that is in the sub-folder of the GAE project):

 <type 'exceptions.ImportError'>: No module named nltk
Traceback (most recent call last):
  File "/base/data/home/apps/xxxx/1.335654715894946084/main.py", line 13, in <module>
    from lingua import reducer
  File "/base/data/home/apps/xxxx/1.335654715894946084/lingua/reducer.py", line 11, in <module>
    from nltk.tokenizer import *
  File "/base/data/home/apps/xxxx/1.335654715894946084/lingua/nltk/__init__.py", line 73, in <module>
    from internals import config_java
  File "/base/data/home/apps/xxxx/1.335654715894946084/lingua/nltk/internals.py", line 19, in <module>
    from nltk import __file__

Note: this is how the error looks in the logs when uploaded to GAE. If I run it locally I get the same error (except it seems to originate inside my site-packages instance of NLTK ... so no difference there). And "xxxx" signifies the project name.

So in summary:

  • Is what I am trying to do even possible? Will NLTK even run on the App Engine?
  • Is there something I missed? That is: copying "nltk" to the GAE project isn't enough?

EDIT: fixed typo and removed unnecessary step

Ryan Delucchi asked Aug 17 '09 05:08


2 Answers

oakmad has managed to work through deploying several NLTK modules to GAE. Hope this helps. But, to be honest, I still don't think it's true, even after reading the post.

sunqiang answered Sep 28 '22 17:09


The problem here is that nltk is attempting a recursive (circular) import: when nltk/__init__.py is imported, it imports nltk/internals.py, which then attempts to import nltk again. Since nltk is itself still in the middle of being imported, this fails with a (rather unhelpful) error. Whatever they're doing is pretty weird anyway; it's unsurprising that something like from nltk import __file__ breaks.
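To make the import cycle described above concrete, here is a minimal sketch that reproduces the general pattern with a throwaway package. Everything in it is hypothetical: the package name demo_pkg, the VERSION attribute, and the helper functions are made up for illustration and are not taken from nltk. It also assumes a modern Python 3 interpreter, where the exact error text differs from the Python 2 message in the question.

```python
import importlib
import os
import sys
import tempfile


def build_circular_package(root, name="demo_pkg"):
    """Write a hypothetical package whose __init__ and submodule import each other."""
    pkg_dir = os.path.join(root, name)
    os.makedirs(pkg_dir)
    # __init__.py pulls in the submodule BEFORE defining VERSION.
    with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
        f.write("from %s import internals\nVERSION = '1.0'\n" % name)
    # internals.py reaches back into the half-initialised package.
    with open(os.path.join(pkg_dir, "internals.py"), "w") as f:
        f.write("from %s import VERSION\n" % name)
    return pkg_dir


def try_import(root, name="demo_pkg"):
    """Import the package and return the ImportError text, or None on success."""
    sys.path.insert(0, root)
    importlib.invalidate_caches()
    try:
        importlib.import_module(name)
        return None
    except ImportError as exc:
        return str(exc)
    finally:
        # Undo the side effects so the sketch can be re-run cleanly.
        sys.path.remove(root)
        for key in list(sys.modules):
            if key == name or key.startswith(name + "."):
                del sys.modules[key]


with tempfile.TemporaryDirectory() as root:
    build_circular_package(root)
    error_message = try_import(root)
    print(error_message)
```

The import fails because internals.py asks the package for a name that __init__.py has not defined yet. Moving the VERSION assignment above the import line in __init__.py makes this toy example import cleanly, which is one common way such cycles get untangled.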

This looks like a problem with nltk itself - does it work when imported directly from a Python console? If so, they must be doing some sort of trickery in the installed version. I'd suggest asking on the nltk groups what they're up to and how to work around it.
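As a quick way to run the check suggested above, the following sketch reports which file a module name would be loaded from, without importing it. The helper name locate_module is made up for illustration, and the sketch assumes a Python 3 interpreter with importlib.util, which is not the Python 2 runtime the question was written against.

```python
import importlib.util


def locate_module(name):
    """Return the file a top-level module would load from, or None if not found."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


# Shows whether "nltk" resolves at all, and to which copy (for example the
# system site-packages directory or a folder inside the project).
print(locate_module("nltk"))
```

If this prints a path under site-packages rather than the project folder, the interpreter is picking up the system installation instead of the bundled copy.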

Nick Johnson answered Sep 28 '22 16:09