I've installed syntaxnet and am able to run the parser with the provided demo script. Ideally, I would like to run it directly from python. The only code I found was this:
import subprocess
import os

os.chdir(r"../models/syntaxnet")
# With shell=True the command should be a single string, not a list
subprocess.call(
    "echo 'Bob brought the pizza to Alice.' | syntaxnet/demo.sh",
    shell=True,
)
which is a complete disaster: inefficient and overly complex (calling Python from Python should not have to go through a shell).
How can I call the Python APIs directly, without going through shell scripts, standard I/O, etc.?
EDIT - Why isn't this as easy as opening syntaxnet/demo.sh and reading it?
This shell script calls two Python scripts (parser_eval and conll2tree) that are written as standalone scripts and cannot be imported into a Python module without raising multiple errors. A closer look reveals additional script-like layers and native code. These upper layers would need to be refactored in order to run the whole thing in a Python context. Has anyone forked syntaxnet with such a modification, or does anyone intend to?
All in all, it doesn't look like it would be a problem to refactor the two scripts demo.sh runs (https://github.com/tensorflow/models/blob/master/syntaxnet/syntaxnet/parser_eval.py and https://github.com/tensorflow/models/blob/master/syntaxnet/syntaxnet/conll2tree.py) into a Python module that exposes a Python API you can call.
Both scripts use TensorFlow's tf.app.flags API (described in this SO question: What's the purpose of tf.app.flags in TensorFlow?), so those flags would have to be refactored into regular arguments, because tf.app.flags is a process-level singleton.
So yeah, you'd just have to do the work to make these callable as a Python API :)
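As a sketch of what that refactor looks like (all names here are illustrative, not the real parser_eval internals, and none of the TensorFlow graph wiring is shown): replace the module-level FLAGS lookups with ordinary function parameters, so the entry point can be imported and called from other Python code.

```python
# Hypothetical sketch of the flags-to-arguments refactor described above.
# The real parser_eval.py defines values like task_context / arg_prefix /
# batch_size via tf.app.flags and reads them inside main().

# Before (flags-based, a process-level singleton):
#   FLAGS = tf.app.flags.FLAGS
#   tf.app.flags.DEFINE_string('task_context', '', 'Path to task context.')
#   def main(unused_argv):
#       ...uses FLAGS.task_context...

# After (a callable API, importable from other Python modules):
def run_parser_eval(task_context, arg_prefix='brain_parser', batch_size=32):
    """Entry point taking plain arguments instead of tf.app.flags."""
    config = {
        'task_context': task_context,
        'arg_prefix': arg_prefix,
        'batch_size': batch_size,
    }
    # ...here the real code would build and run the TensorFlow graph...
    return config

# The old flag-driven script behaviour then becomes a thin wrapper that
# parses argv (e.g. with argparse) and calls run_parser_eval().
```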
There is a REST API here for both syntaxnet and dragnn.
I have run them successfully on my cloud server. Some points I want to share:
1. Build the Docker image
sudo docker build - < ./Dockerfile
Some errors may occur when building syntaxnet; if so, just follow ./Dockerfile and run its build steps manually. It's easy to follow.
2. Download a pre-trained model
The model for syntaxnet is here, e.g. the Chinese model: http://download.tensorflow.org/models/parsey_universal/Chinese.zip
The model for dragnn is located here.
Unzip them into folders, e.g. ./synataxnet_data, so you have something like ./synataxnet_data/Chinese.
3. Run and test
3.1 SyntaxNet
run
docker run -p 9000:9000 -v "$(pwd)/synataxnet_data":/models ljm625/syntaxnet-rest-api
(docker requires an absolute path for -v bind mounts, hence $(pwd))
test
curl -X POST -d '{ "strings": [["今天天气很好","猴子爱吃 桃子"]] }' -H "Content-Type: application/json" http://xxx.xxx.xxx.xxx:9000/api/v1/query/Chinese
3.2 dragnn
run
sudo docker run -p 9001:9000 -v "$(pwd)/dragnn_data":/models ljm625/syntaxnet-rest-api:dragnn
test
http://Yourip:9001/api/v1/use/Chinese
curl -X POST -d '{ "strings": ["今天 天气 很好","猴子 爱 吃 桃子"],"tree":true }' -H "Content-Type: application/json" http://xxx.xx.xx.xx:9001/api/v1/query
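The curl calls above can also be driven from Python using only the standard library. A minimal sketch, assuming the same /api/v1/query endpoint and JSON payload as the dragnn example (the host is a placeholder you must replace):

```python
# Sketch of calling the dragnn REST endpoint from Python instead of curl.
# Endpoint path and payload shape are taken from the curl examples above;
# the host/port values are placeholders.
import json
from urllib import request


def build_query(strings, tree=True):
    """Build the JSON payload the /api/v1/query endpoint expects."""
    return json.dumps({"strings": strings, "tree": tree}).encode("utf-8")


def query_dragnn(host, strings, port=9001):
    """POST a list of pre-segmented sentences and return the parsed JSON."""
    req = request.Request(
        "http://{}:{}/api/v1/query".format(host, port),
        data=build_query(strings),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Usage (replace the placeholder with your server's address):
#   result = query_dragnn("xxx.xx.xx.xx", ["今天 天气 很好", "猴子 爱 吃 桃子"])
```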
4. Test results and problems
From my testing with the Chinese model, syntaxnet is slow: it spends 3 seconds on a single query and 9 seconds on a batch of 50 queries. There is a fixed cost for loading the model.
The dragnn model is fast, but I'm not satisfied with its parsing results (I only tested Chinese).
PS: I don't like the way syntaxnet works (e.g. using bazel and reading data from stdin); if you want to customize it, you can find some info here.
Another resource that helps: https://github.com/dsindex/syntaxnet/blob/master/README_api.md
The best way to integrate SyntaxNet with your own code is to have it as a web service. I did that to parse Portuguese text.
I started by adapting an existing Docker container with SyntaxNet and TensorFlow Serving to run only for Portuguese, in order to keep memory usage low. It runs really fast and is easy to integrate with your code.
I wrote a blog post about it, and you can easily adapt it to any other language:
http://davidsbatista.net/blog/2017/07/22/SyntaxNet-API-Portuguese/