
Running a Tornado Server within a Jupyter Notebook

Taking the standard Tornado demonstration and pushing the IOLoop into a background thread allows querying of the server within a single script. This is useful when the Tornado server is an interactive object (see Dask or similar).

import asyncio
import requests
import tornado.ioloop
import tornado.web

from concurrent.futures import ThreadPoolExecutor

class MainHandler(tornado.web.RequestHandler):
    def get(self):
        self.write("Hello, world")

def make_app():
    return tornado.web.Application([
        (r"/", MainHandler),
    ])

pool = ThreadPoolExecutor(max_workers=2)
loop = tornado.ioloop.IOLoop()

app = make_app()
app.listen(8888)
fut = pool.submit(loop.start)

print(requests.get("http://localhost:8888"))

The above works just fine in a standard Python script (though it is missing a safe shutdown). Jupyter notebooks would be an ideal environment for these interactive Tornado servers. However, in Jupyter this idea breaks down, as there is already an active running loop:

>>> import asyncio
>>> asyncio.get_event_loop()
<_UnixSelectorEventLoop running=True closed=False debug=False>

When the above script runs in a Jupyter notebook, both the server and the request client try to open a connection in the same thread and the code hangs. Building a new asyncio loop and/or Tornado IOLoop does not seem to help, and I suspect I am missing something in Jupyter itself.
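A quick, stdlib-only way to confirm the difference between the two environments (my own sketch, not part of the original question) is to ask asyncio whether a loop is already running in the current thread:

```python
import asyncio

def loop_is_running():
    """Return True if an asyncio event loop is running in this thread."""
    try:
        asyncio.get_running_loop()
        return True
    except RuntimeError:
        return False

print(loop_is_running())  # False in a plain script, True inside a Jupyter cell
```

In a plain `python` script this prints `False`; inside a Jupyter cell it prints `True`, which is exactly why `loop.start()` cannot simply be called again there.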

The question: Is it possible to have a live Tornado server running in the background within a Jupyter notebook, so that standard python requests or similar can connect to it from the primary thread? I would prefer to avoid asyncio in the code presented to users if possible, due to its relative complexity for novice users.

asked Mar 16 '19 by Daniel




3 Answers

Based on my recent PR to streamz, here is something that works, similar to your idea:

class InNotebookServer(object):
    def __init__(self, port):
        self.port = port
        self.loop = get_ioloop()
        self.start()

    def _start_server(self):
        from tornado.web import Application, RequestHandler
        from tornado.httpserver import HTTPServer
        from tornado import gen

        class Handler(RequestHandler):
            source = self

            @gen.coroutine
            def get(self):
                self.write('Hello World')

        application = Application([
            ('/', Handler),
        ])
        self.server = HTTPServer(application)
        self.server.listen(self.port)

    def start(self):
        """Start HTTP server and listen"""
        self.loop.add_callback(self._start_server)


_io_loops = []

def get_ioloop():
    from tornado.ioloop import IOLoop
    import threading
    if not _io_loops:
        loop = IOLoop()
        thread = threading.Thread(target=loop.start)
        thread.daemon = True
        thread.start()
        _io_loops.append(loop)
    return _io_loops[0]

To call it in the notebook:

In [2]: server = InNotebookServer(9005)
In [3]: import requests
        requests.get('http://localhost:9005')
Out[3]: <Response [200]>
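For readers who want to see the core trick in isolation, the same pattern — a singleton event loop running on a daemon thread, queried from the main thread — can be sketched with plain asyncio, which is what a Tornado IOLoop wraps since Tornado 5. The names below are illustrative, not from streamz:

```python
import asyncio
import threading

_loops = []

def get_background_loop():
    """Return a singleton asyncio loop running on a daemon thread,
    mirroring what get_ioloop above does with a Tornado IOLoop."""
    if not _loops:
        loop = asyncio.new_event_loop()
        thread = threading.Thread(target=loop.run_forever, daemon=True)
        thread.start()
        _loops.append(loop)
    return _loops[0]

# Schedule a coroutine on the background loop from the main thread:
loop = get_background_loop()
fut = asyncio.run_coroutine_threadsafe(asyncio.sleep(0, result="done"), loop)
print(fut.result(timeout=5))  # -> done
```

`run_coroutine_threadsafe` is the thread-safe handoff; `loop.add_callback` plays the same role on the Tornado side.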
answered Oct 18 '22 by mdurant


Part 1: Let's get nested tornado(s)

To find the information you need, you would have had to follow a breadcrumb trail: start with the release notes of IPython 7. In particular, they point to more information in the async and await sections of the documentation, and to this discussion, which suggests the use of nest_asyncio.

The crux is the following:

  • A) Either you trick Python into running two nested event loops (which is what nest_asyncio does),
  • B) or you schedule coroutines on the already existing event loop (I'm not sure how to do that with Tornado).

I'm pretty sure you know all that, but I'm sure other readers will appreciate it.

There is unfortunately no way to make it totally transparent to users – well, unless you control the deployment (like on a JupyterHub) and can add these lines to the IPython startup scripts that are loaded automatically. But I think the following is simple enough.

import nest_asyncio
nest_asyncio.apply()


# rest of your tornado setup and start code.

Part 2: Gotcha – synchronous code blocks the event loop.

The previous section only takes care of being able to run the Tornado app. But note that any synchronous code will block the event loop. So when running print(requests.get("http://localhost:8000")), the server will appear not to work: you are blocking the event loop while waiting for a response, and the response cannot arrive until the event loop is free again (understanding this is an exercise left to the reader). You need to either issue print(requests.get("http://localhost:8000")) from another kernel, or use aiohttp.
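The effect can be demonstrated with a small stdlib-only sketch (my addition, not Jupyter-specific): a background task stops making progress for as long as a synchronous sleep holds the loop, exactly as the Tornado server stops serving while requests.get waits:

```python
import asyncio
import time

async def main():
    ticks = []

    async def ticker():
        # Record a tick every 50 ms -- but only while the loop is free.
        while True:
            ticks.append(time.monotonic())
            await asyncio.sleep(0.05)

    task = asyncio.ensure_future(ticker())
    await asyncio.sleep(0.2)       # cooperative wait: the ticker keeps running
    before = len(ticks)
    time.sleep(0.5)                # synchronous call: the whole loop is frozen
    stalled = len(ticks) - before  # ticks recorded while we were blocking
    task.cancel()
    return stalled

print(asyncio.run(main()))  # -> 0: the ticker made no progress while blocked
```

The `time.sleep(0.5)` plays the role of the blocking `requests.get`: nothing else scheduled on the loop runs until it returns.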

Here is how to use aiohttp in a similar way as requests.

import aiohttp
session = aiohttp.ClientSession()
await session.get('http://localhost:8889')

In this case, as aiohttp is non-blocking, things will appear to work properly. Here you can also see some extra IPython magic: async code is autodetected and run on the current event loop.

A cool exercise would be to run requests.get in a loop in another kernel, run sleep(5) in the kernel where Tornado is running, and watch requests stop being processed...

Part 3: Disclaimer and other routes:

This is quite tricky, and I would advise not to use it in production; warn your users that this is not the recommended way of doing things.

That does not completely solve your case: you would need to run things not in the main thread, which I'm not sure is possible.

You can also try to play with other loop runners like trio and curio; they might allow you to do stuff you can't with asyncio by default, like nesting, but here be dragons. I highly recommend trio and the multiple blog posts around its creation, especially if you are teaching async.

Enjoy, hope that helped, and please report bugs, as well as things that did work.

answered Oct 18 '22 by Matt


You can make the tornado server run in background using the %%script --bg magic command. The option --bg tells jupyter to run the code of the current cell in background.

Just create a tornado server in one cell, along with the magic command, and run that cell.

Example:

%%script python --bg

import tornado.ioloop
import tornado.web

class MainHandler(tornado.web.RequestHandler):
    def get(self):
        self.write("Hello, world")

def make_app():
    return tornado.web.Application([
        (r"/", MainHandler),
    ])

loop = tornado.ioloop.IOLoop.current()

app = make_app()
app.listen(8000) # 8888 was being used by jupyter in my case

loop.start()

And then you can use requests in a separate cell to connect to the server:

import requests

print(requests.get("http://localhost:8000"))

# prints <Response [200]>

One thing to note here is that if you stop/interrupt the kernel on any cell, the background script will also stop, so you'll have to run the server cell again to restart it.
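Another caveat not covered in the answer: `%%script --bg` returns immediately, so the server may not be listening yet when the next cell runs. A small stdlib helper (my addition, names are illustrative) can poll until the port accepts connections:

```python
import socket
import time

def wait_for_port(port, host="localhost", timeout=10.0):
    """Poll until a TCP port accepts connections; raise TimeoutError otherwise."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # Success means something is accepting connections on the port.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            time.sleep(0.2)  # not up yet; retry shortly
    raise TimeoutError(f"nothing listening on {host}:{port}")

# wait_for_port(8000)  # call this before the first requests.get(...)
```

Calling `wait_for_port(8000)` before the first `requests.get` avoids a spurious connection error on slow startups.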

answered Oct 18 '22 by xyres