I'm trying to use NLTK, the Natural Language Toolkit. After installing the required files, I started executing the demo code from http://www.nltk.org/index.html:
>>> import nltk
>>> sentence = """At eight o'clock on Thursday morning
... Arthur didn't feel very good."""
>>> tokens = nltk.word_tokenize(sentence)
>>> tokens
['At', 'eight', "o'clock", 'on', 'Thursday', 'morning',
'Arthur', 'did', "n't", 'feel', 'very', 'good', '.']
>>> tagged = nltk.pos_tag(tokens)
>>> tagged[0:6]
[('At', 'IN'), ('eight', 'CD'), ("o'clock", 'JJ'), ('on', 'IN'),
('Thursday', 'NNP'), ('morning', 'NN')]
>>> entities = nltk.chunk.ne_chunk(tagged)
>>> entities
Then I get this message:
LookupError:
===========================================================================
NLTK was unable to find the gs file!
Use software specific configuration parameters or set the PATH environment variable.
I tried Googling, but I couldn't find anyone explaining what the missing gs file is.
I came across this error too. gs stands for Ghostscript. You get the error because your chunker is trying to use Ghostscript to draw a parse tree of the sentence.
I was using IPython; to debug the issue I set the traceback verbosity to verbose with the command %xmode verbose, which prints the local variables of each stack frame (see the full traceback below). The file names are:
file_names=['gs', 'gswin32c.exe', 'gswin64c.exe']
A quick Google search for gswin32c.exe told me it was Ghostscript.
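To check up front whether any of the binaries NLTK looks for is already on your PATH, a quick stdlib sketch using the same file names from the traceback would be:

```python
import shutil

# The three binary names NLTK searches for (taken from the traceback below)
for name in ['gs', 'gswin32c.exe', 'gswin64c.exe']:
    path = shutil.which(name)
    print(name, '->', path if path else 'not found on PATH')
```

If all three print "not found on PATH", that is exactly the situation that triggers the LookupError.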
/Users/jasonwirth/anaconda/lib/python3.4/site-packages/nltk/__init__.py in find_file_iter(filename='gs', env_vars=['PATH'], searchpath=(), file_names=['gs', 'gswin32c.exe', 'gswin64c.exe'], url=None, verbose=False)
517 (filename, url))
518 div = '='*75
--> 519 raise LookupError('\n\n%s\n%s\n%s' % (div, msg, div))
520
521 def find_file(filename, env_vars=(), searchpath=(),
LookupError:
===========================================================================
NLTK was unable to find the gs file!
Use software specific configuration parameters or set the PATH environment variable.
===========================================================================
Just to add to the previous answers: if you replace entities with print(entities), you won't get the error.
Without print(), the console/notebook tries to render the tree object as an image (which needs Ghostscript); print() falls back to the plain-text rendering.
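To see the difference without downloading any taggers or chunkers, you can build a small chunk tree by hand (this is a minimal sketch; ne_chunk produces trees of the same shape). Bare evaluation in IPython goes through the tree's rich-display hook, which is what shells out to Ghostscript; print() only uses the text form:

```python
import nltk

# A hand-built chunk tree of the same shape ne_chunk returns:
# leaves are (word, tag) tuples, named entities are nested Trees
tree = nltk.Tree('S', [nltk.Tree('PERSON', [('Arthur', 'NNP')]),
                       ('slept', 'VBD')])

print(tree)  # plain-text rendering, no Ghostscript needed
```

In IPython, evaluating `tree` bare instead of printing it asks for the image representation, and that is the code path that raises the LookupError when Ghostscript is missing.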
A small addition to Jason Wirth's answer. On Windows, this code will search for "gswin64c.exe" in the directories listed in the PATH environment variable; however, the Ghostscript installer does not add the binary to PATH, so for this to work you'll need to find where Ghostscript is installed and add its bin subfolder to PATH.
For example, in my case I added C:\Program Files\gs\gs9.19\bin to PATH.
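If you'd rather not edit the system-wide PATH, you can prepend the Ghostscript bin folder for the current process before calling NLTK. The path below is the one from my machine; adjust it to your install location and version:

```python
import os

# Adjust to your Ghostscript install location and version
gs_bin = r"C:\Program Files\gs\gs9.19\bin"
os.environ["PATH"] = gs_bin + os.pathsep + os.environ.get("PATH", "")
```

NLTK reads PATH when it goes looking for the gs binary, so this change takes effect immediately for the running interpreter.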
If Ghostscript for some reason is not available for your platform, or fails to install, you can also use the wonderful networkx package to visualize such trees:
import nltk
import matplotlib.pyplot as plt
import networkx as nx
from networkx.drawing.nx_agraph import graphviz_layout

def drawNodes(G, nodeLabels, parent, lvl=0):
    def addNode(G, nodeLabels, label):
        n = G.number_of_nodes()
        G.add_node(n)
        nodeLabels[n] = label
        return n

    def findNode(nodeLabels, label):
        # Travel backwards from the end to find the right parent
        for i in reversed(range(len(nodeLabels))):
            if nodeLabels[i] == label:
                return i

    if lvl == 0:
        addNode(G, nodeLabels, parent.label())
    for node in parent:
        if isinstance(node, nltk.Tree):
            # Subtree: add a node for its label and recurse
            n = addNode(G, nodeLabels, node.label())
            G.add_edge(findNode(nodeLabels, parent.label()), n)
            drawNodes(G, nodeLabels, node, lvl + 1)
        else:
            # Leaf: a (word, tag) tuple
            n1 = addNode(G, nodeLabels, node[1])
            n0 = addNode(G, nodeLabels, node[0])
            G.add_edge(findNode(nodeLabels, parent.label()), n1)
            G.add_edge(n0, n1)

G = nx.Graph()
nodeLabels = {}
drawNodes(G, nodeLabels, entities)

options = {
    'node_color': 'white',
    'node_size': 100
}
plt.figure(1, figsize=(12, 6))
pos = graphviz_layout(G, prog='dot')
nx.draw(G, pos, font_weight='bold', arrows=False, **options)
nx.draw_networkx_labels(G, pos, nodeLabels)
Instead of entities, write entities.draw(). It should work.