I want to use spaCy for NLP in an online service. Each time a user makes a request I call the script "my_script.py"
which starts with:
from spacy.en import English
nlp = English()
The problem I'm having is that those two lines take over 10 seconds. Is it possible to keep English() in RAM, or is there some other option to reduce this load time to less than a second?
You said that you want to launch a freestanding script (my_script.py) whenever a request comes in. This will use capabilities from spacy.en without the overhead of loading spacy.en. With this approach, the operating system will always create a new process when you launch your script. So there is only one way to avoid loading spacy.en each time: have a separate process that is already running, with spacy.en loaded, and have your script communicate with that process. The code below shows a way to do that. However, as others have said, you will probably benefit from changing your server architecture so spacy.en is loaded within your web server (e.g., using a Python-based web server).
The most common form of inter-process communication is via TCP/IP sockets. The code below implements a small server which keeps spacy.en loaded and processes requests from the client. It also has a client which transmits requests to that server and receives results back. It's up to you to decide what to put into those transmissions.
There is also a third script. Since both client and server need send and receive functions, those functions are in a shared script called comm.py. (Note that the client and server each load a separate copy of comm.py; they do not communicate through a single module loaded into shared memory.)
I assume both scripts are run on the same machine. If not, you will need to put a copy of comm.py on both machines and change comm.server_host to the machine name or IP address of the server.
Run nlp_server.py as a background process (or just in a different terminal window for testing). This waits for requests, processes them, and sends the results back:
import comm
import socket
from spacy.en import English

nlp = English()

def process_connection(sock):
    print "processing transmission from client..."
    # receive data from the client
    data = comm.receive_data(sock)
    # do something with the data
    result = {"data received": data}
    # send the result back to the client
    comm.send_data(result, sock)
    # close the socket with this particular client
    sock.close()
    print "finished processing transmission from client..."

server_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# open socket even if it was used recently (e.g., server restart)
server_sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
server_sock.bind((comm.server_host, comm.server_port))
# queue up to 5 connections
server_sock.listen(5)
print "listening on port {}...".format(comm.server_port)

try:
    while True:
        # accept connections from clients
        (client_sock, address) = server_sock.accept()
        # process this connection
        # (this could be launched in a separate thread or process)
        process_connection(client_sock)
except KeyboardInterrupt:
    print "Server process terminated."
finally:
    server_sock.close()
Run my_script.py as a quick-running script to request a result from the nlp server (e.g., python my_script.py here are some arguments):
import socket, sys
import comm
# data can be whatever you want (even just sys.argv)
data = sys.argv
print "sending to server:"
print data
# send data to the server and receive a result
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# disable Nagle algorithm (probably only needed over a network)
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, True)
sock.connect((comm.server_host, comm.server_port))
comm.send_data(data, sock)
result = comm.receive_data(sock)
sock.close()
# do something with the result...
print "result from server:"
print result
comm.py contains code that is used by both the client and server:
import sys, struct
import cPickle as pickle

# pick a port that is not used by any other process
server_port = 17001
server_host = '127.0.0.1'    # localhost
message_size = 8192
# code to use with struct.pack to convert transmission size (int)
# to a byte string
header_pack_code = '>I'
# number of bytes used to represent size of each transmission
# (corresponds to header_pack_code)
header_size = 4

def send_data(data_object, sock):
    # serialize the data so it can be sent through a socket
    data_string = pickle.dumps(data_object, -1)
    data_len = len(data_string)
    # send a header showing the length, packed into 4 bytes
    sock.sendall(struct.pack(header_pack_code, data_len))
    # send the data
    sock.sendall(data_string)

def receive_data(sock):
    """ Receive a transmission via a socket, and convert it back into a binary object. """
    # This runs as a loop because the message may be broken into arbitrary-size chunks.
    # This assumes each transmission starts with a 4-byte binary header showing the size of the transmission.
    # See https://docs.python.org/3/howto/sockets.html
    # and http://code.activestate.com/recipes/408859-socketrecv-three-ways-to-turn-it-into-recvall/
    header_data = ''
    header_done = False
    # set dummy values to start the loop
    received_len = 0
    transmission_size = sys.maxint
    while received_len < transmission_size:
        sock_data = sock.recv(message_size)
        if not header_done:
            # still receiving header info
            header_data += sock_data
            if len(header_data) >= header_size:
                header_done = True
                # split the already-received data between header and body
                messages = [header_data[header_size:]]
                received_len = len(messages[0])
                header_data = header_data[:header_size]
                # find actual size of transmission
                transmission_size = struct.unpack(header_pack_code, header_data)[0]
        else:
            # already receiving data
            received_len += len(sock_data)
            messages.append(sock_data)
    # combine messages into a single string
    data_string = ''.join(messages)
    # convert to an object
    data_object = pickle.loads(data_string)
    return data_object
Note: you should make sure the result sent from the server only uses native data structures (dicts, lists, strings, etc.). If the result includes an object defined in spacy.en, then the client will automatically import spacy.en when it unpacks the result, in order to provide the object's methods.
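For example, process_connection above could convert a parsed document into plain dicts before sending it back. This is only a sketch, and the token attributes shown (text, lemma_, pos_) assume they are available in your spaCy version:

def doc_to_native(doc):
    # turn a spaCy Doc into a list of plain dicts so the client can
    # unpickle the result without importing spacy.en itself
    return [{"text": token.text,
             "lemma": token.lemma_,
             "pos": token.pos_}
            for token in doc]

# inside process_connection, something like:
# result = {"tokens": doc_to_native(nlp(text))}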
This setup is very similar to the HTTP protocol (the server waits for connections, the client connects, the client sends a request, the server sends a response, both sides disconnect). So you might do better to use a standard HTTP server and client instead of this custom code. That would be a "RESTful API", which is a popular term these days (with good reason). Using standard HTTP packages would save you the trouble of managing your own client/server code, and you might even be able to call your data-processing server directly from your existing web server instead of launching my_script.py. However, you will have to translate your request into something compatible with HTTP, e.g., a GET or POST request, or maybe just a specially formatted URL.
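To make that concrete, here is a minimal sketch of the same idea using Flask; the framework, route name, and JSON request format are my assumptions, not part of the setup above:

# nlp_http_server.py -- hypothetical RESTful alternative to nlp_server.py
from flask import Flask, request, jsonify
from spacy.en import English

app = Flask(__name__)
nlp = English()    # loaded once, when the server process starts

@app.route("/parse", methods=["POST"])
def parse():
    # expects a JSON body like {"text": "here are some arguments"}
    doc = nlp(request.json["text"])
    return jsonify(tokens=[token.text for token in doc])

if __name__ == "__main__":
    app.run(port=5000)

Your existing web server (or my_script.py) would then just make a POST request to /parse instead of opening a raw socket.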
Another option would be to use a standard interprocess communication package such as PyZMQ, redis, mpi4py or maybe zmq_object_exchanger. See this question for some ideas: Efficient Python IPC
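As a taste of what that looks like, here is a sketch of the server side using PyZMQ's REQ/REP pattern; the port and message format are arbitrary choices, and pyzmq handles the pickling that comm.py does by hand:

# zmq_nlp_server.py -- sketch of the same server using PyZMQ
import zmq
from spacy.en import English

nlp = English()
context = zmq.Context()
socket = context.socket(zmq.REP)
socket.bind("tcp://127.0.0.1:17002")    # any free port

while True:
    text = socket.recv_pyobj()           # pyzmq unpickles the request for you
    doc = nlp(text)
    socket.send_pyobj([token.text for token in doc])

The client side is symmetric: create a zmq.REQ socket, connect to the same address, send_pyobj the text, and recv_pyobj the result.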
Or you may be able to save a copy of the spacy.en object to disk using the dill package (https://pypi.python.org/pypi/dill) and then restore it at the start of my_script.py. That may be faster than importing/reconstructing it each time, and simpler than using interprocess communication.
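A sketch of that idea is below; whether the English object actually survives a dill round trip in your spaCy version is something you would need to verify:

import dill
from spacy.en import English

# one-time setup: build the model and save it to disk
nlp = English()
with open("nlp.dill", "wb") as f:
    dill.dump(nlp, f)

# at the start of my_script.py: restore it instead of rebuilding it
with open("nlp.dill", "rb") as f:
    nlp = dill.load(f)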
Your target should be to initialize the spaCy models only once. Use a class, and make the spaCy object a class attribute. Whenever you use it, it will be the same instance of that attribute.
from spacy.en import English
class Spacy():
    nlp = English()
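The class body runs once, when the module containing it is first imported, so every later reference reuses the same English instance. A brief illustration, assuming the class above lives in a module of your own (the module name here is made up):

# elsewhere in your request-handling code
from spacy_singleton import Spacy    # hypothetical module holding the class above

def handle_request(text):
    # English() was built at import time; this call reuses that instance
    doc = Spacy.nlp(text)
    return [token.text for token in doc]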
Since you are using Python, you can set up some sort of worker processes (I think at some point you will need to scale your application anyway) where this initialisation is done only once. We have tried Gearman for a similar use case and it works well.
Cheers
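For reference, a worker along those lines might look like the sketch below, using the python-gearman client library; the job name, server address, and JSON response format are illustrative assumptions:

# nlp_worker.py -- sketch of a Gearman worker that loads spaCy once
import json
import gearman
from spacy.en import English

nlp = English()    # loaded once per worker process

def parse_text(gearman_worker, gearman_job):
    doc = nlp(gearman_job.data.decode("utf-8"))
    return json.dumps([token.text for token in doc])

worker = gearman.GearmanWorker(["localhost:4730"])
worker.register_task("parse_text", parse_text)
worker.work()    # block and handle jobs as they arrive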