
Is it possible to keep spaCy in memory to reduce the load time? [closed]

Tags:

python

nlp

spacy

I want to use spaCy for NLP in an online service. Each time a user makes a request I call the script "my_script.py", which starts with:

from spacy.en import English
nlp = English()

The problem I'm having is that those two lines take over 10 seconds. Is it possible to keep English() in RAM, or is there some other way to reduce this load time to less than a second?

asked Apr 22 '17 by Luis Ramon Ramirez Rodriguez

3 Answers

You said that you want to launch a freestanding script (my_script.py) whenever a request comes in, and have it use capabilities from spacy.en without the overhead of loading spacy.en each time. With this approach the operating system always creates a new process when you launch your script, so there is only one way to avoid loading spacy.en on every request: have a separate process already running with spacy.en loaded, and have your script communicate with that process. The code below shows one way to do that. However, as others have said, you will probably benefit from changing your server architecture so that spacy.en is loaded within your web server itself (e.g., using a Python-based web server).

The most common form of inter-process communication is via TCP/IP sockets. The code below implements a small server which keeps spacy.en loaded and processes requests from the client. It also has a client which transmits requests to that server and receives results back. It's up to you to decide what to put into those transmissions.

There is also a third script. Since both client and server need send and receive functions, those functions are in a shared script called comm.py. (Note that the client and server each load a separate copy of comm.py; they do not communicate through a single module loaded into shared memory.)

I assume both scripts are run on the same machine. If not, you will need to put a copy of comm.py on both machines and change comm.server_host to the machine name or IP address for the server.

Run nlp_server.py as a background process (or just in a different terminal window for testing). This waits for requests, processes them and sends the results back:

import comm
import socket
from spacy.en import English
nlp = English()

def process_connection(sock):
    print "processing transmission from client..."
    # receive data from the client
    data = comm.receive_data(sock)
    # do something with the data
    result = {"data received": data}
    # send the result back to the client
    comm.send_data(result, sock)
    # close the socket with this particular client
    sock.close()
    print "finished processing transmission from client..."

server_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# open socket even if it was used recently (e.g., server restart)
server_sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
server_sock.bind((comm.server_host, comm.server_port))
# queue up to 5 connections
server_sock.listen(5)
print "listening on port {}...".format(comm.server_port)
try:
    while True:
        # accept connections from clients
        (client_sock, address) = server_sock.accept()
        # process this connection 
        # (this could be launched in a separate thread or process)
        process_connection(client_sock)
except KeyboardInterrupt:
    print "Server process terminated."
finally:
    server_sock.close()

Load my_script.py as a quick-running script to request a result from the nlp server (e.g., python my_script.py here are some arguments):

import socket, sys
import comm

# data can be whatever you want (even just sys.argv)
data = sys.argv

print "sending to server:"
print data

# send data to the server and receive a result
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# disable Nagle algorithm (probably only needed over a network) 
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, True)
sock.connect((comm.server_host, comm.server_port))
comm.send_data(data, sock)
result = comm.receive_data(sock)
sock.close()

# do something with the result...
print "result from server:"
print result

comm.py contains code that is used by both the client and server:

import sys, struct
import cPickle as pickle

# pick a port that is not used by any other process
server_port = 17001
server_host = '127.0.0.1' # localhost
message_size = 8192
# code to use with struct.pack to convert transmission size (int) 
# to a byte string
header_pack_code = '>I'
# number of bytes used to represent size of each transmission
# (corresponds to header_pack_code)
header_size = 4  

def send_data(data_object, sock):
    # serialize the data so it can be sent through a socket
    data_string = pickle.dumps(data_object, -1)
    data_len = len(data_string)
    # send a header showing the length, packed into 4 bytes
    sock.sendall(struct.pack(header_pack_code, data_len))
    # send the data
    sock.sendall(data_string)

def receive_data(sock):
    """ Receive a transmission via a socket, and convert it back into a binary object. """
    # This runs as a loop because the message may be broken into arbitrary-size chunks.
    # This assumes each transmission starts with a 4-byte binary header showing the size of the transmission.
    # See https://docs.python.org/3/howto/sockets.html
    # and http://code.activestate.com/recipes/408859-socketrecv-three-ways-to-turn-it-into-recvall/

    header_data = ''
    header_done = False
    # set dummy values to start the loop
    received_len = 0
    transmission_size = sys.maxint

    while received_len < transmission_size:
        sock_data = sock.recv(message_size)
        if not header_done:
            # still receiving header info
            header_data += sock_data
            if len(header_data) >= header_size:
                header_done = True
                # split the already-received data between header and body
                messages = [header_data[header_size:]]
                received_len = len(messages[0])
                header_data = header_data[:header_size]
                # find actual size of transmission
                transmission_size = struct.unpack(header_pack_code, header_data)[0]
        else:
            # already receiving data
            received_len += len(sock_data)
            messages.append(sock_data)

    # combine messages into a single string
    data_string = ''.join(messages)
    # convert to an object
    data_object = pickle.loads(data_string)
    return data_object

Note: you should make sure the result sent from the server only uses native data structures (dicts, lists, strings, etc.). If the result includes an object defined in spacy.en, then the client will automatically import spacy.en when it unpacks the result, in order to provide the object's methods.
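For example, here is a rough sketch of what process_connection() could send back instead of spaCy objects; the token attributes (.orth_, .pos_) assume the old spacy.en API used in the question:

# inside process_connection(), after receiving `data` from the client
text = u" ".join(data)  # hypothetical: treat the client's arguments as the text to parse
doc = nlp(text)
# build the result from native types only (strings, lists, dicts),
# so unpickling it on the client side does not pull in spacy.en
result = {
    "tokens": [token.orth_ for token in doc],
    "pos_tags": [token.pos_ for token in doc],
}
comm.send_data(result, sock)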

This setup is very similar to the HTTP protocol (server waits for connections, client connects, client sends a request, server sends a response, both sides disconnect). So you might do better to use a standard HTTP server and client instead of this custom code. That would be a "RESTful API", which is a popular term these days (with good reason). Using standard HTTP packages would save you the trouble of managing your own client/server code, and you might even be able to call your data-processing server directly from your existing web server instead of launching my_script.py. However, you will have to translate your request into something compatible with HTTP, e.g., a GET or POST request, or maybe just a specially formatted URL.
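As a rough sketch of that approach, the server side could be a small Flask app (assuming Flask is installed; the route name and JSON payload format here are just illustrative, and the token attribute again assumes the old spacy.en API):

from flask import Flask, request, jsonify
from spacy.en import English

app = Flask(__name__)
nlp = English()  # loaded once, when the server process starts

@app.route("/parse", methods=["POST"])
def parse():
    # expects JSON like {"text": "..."}; adapt to whatever your web server sends
    text = request.get_json()["text"]
    doc = nlp(text)
    return jsonify(tokens=[token.orth_ for token in doc])

if __name__ == "__main__":
    app.run(port=5000)

Your existing web server (or my_script.py) would then just POST to http://127.0.0.1:5000/parse instead of opening a raw socket.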

Another option would be to use a standard interprocess communication package such as PyZMQ, redis, mpi4py or maybe zmq_object_exchanger. See this question for some ideas: Efficient Python IPC
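As a concrete illustration, a minimal PyZMQ version of the server loop might look like this (assuming pyzmq is installed; the port and message format are arbitrary choices):

import zmq
from spacy.en import English

nlp = English()
context = zmq.Context()
socket = context.socket(zmq.REP)      # reply socket: one reply per request
socket.bind("tcp://127.0.0.1:17002")

while True:
    request = socket.recv_pyobj()     # receives a pickled Python object
    doc = nlp(request["text"])
    socket.send_pyobj({"tokens": [token.orth_ for token in doc]})

The client side would create a zmq.REQ socket, connect() to the same address, and call send_pyobj()/recv_pyobj() in the opposite order; ZeroMQ then handles the framing that comm.py does by hand.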

Or you may be able to save a copy of the spacy.en object on disk using the dill package (https://pypi.python.org/pypi/dill) and then restore it at the start of my_script.py. That may be faster than importing/reconstructing it each time and simpler than using interprocess communication.
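The dill pattern would look roughly like this; whether the whole pipeline pickles cleanly and loads faster than English() depends on your spaCy version, so treat it as an experiment worth timing:

import dill

# one-time step: build the object and dump it to disk
from spacy.en import English
nlp = English()
with open("nlp.dill", "wb") as f:
    dill.dump(nlp, f)

# at the start of my_script.py: restore it instead of rebuilding it
with open("nlp.dill", "rb") as f:
    nlp = dill.load(f)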

answered by Matthias Fripp

Your goal should be to initialize the spaCy model only once. Use a class and make the nlp object a class attribute; it is created once when the class is defined, and every later use within that process refers to the same instance.

from spacy.en import English

class Spacy():
    nlp = English()
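Usage then looks like this; note that this only helps inside a single long-running process (e.g., your web server), not in a script that exits after each request:

# English() already ran when the class definition was executed,
# so every call below reuses the same loaded model
doc = Spacy.nlp(u"Here is a sentence to parse.")
print [token.orth_ for token in doc]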
answered by DhruvPathak

Since you are using Python, you can set up some sort of worker processes (at some point you will probably need to scale your application anyway) in which this initialisation is done only once. We have tried Gearman for a similar use case and it works well.
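A rough sketch of that idea with the python-gearman package (this assumes a gearmand server is running on localhost:4730, and the task name "parse" is just an example; Gearman payloads are strings, so the data is serialized with json):

import json
import gearman
from spacy.en import English

nlp = English()  # loaded once per worker process

def parse_task(gearman_worker, gearman_job):
    # gearman_job.data is the string payload sent by the client
    text = json.loads(gearman_job.data)["text"]
    doc = nlp(text)
    return json.dumps({"tokens": [token.orth_ for token in doc]})

worker = gearman.GearmanWorker(["localhost:4730"])
worker.register_task("parse", parse_task)
worker.work()  # blocks, handling one job at a time

my_script.py would then be a thin gearman.GearmanClient that calls submit_job("parse", ...) and reads the result, and you can start several workers if you need to handle requests in parallel.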

Cheers

answered by ML_TN