But what if the request actually takes 30 seconds' worth of work to handle?
This is exactly what threads are for. You package up a job that does the 30 seconds of work and responds to the request when it's done, kick that job out of the event loop to run in parallel, and move on to the next event without waiting for it to finish.
The same thing applies to GUI programs. Clicking a button can't stop the interface from responding for 30 seconds. So, if the button means there's 30 seconds of work to do, kick it out of the event loop.
Thread pools
The idea of a thread pool is pretty simple: your code doesn't have to worry about finding a thread to use, or starting a thread and making sure there aren't too many, or anything like that. You just package up a job and tell the pool to run it as soon as it can. If there's an idle thread, it'll run your job immediately; if not, your job goes into a queue to be run in its turn.

You could build a thread pool pretty easily, but you don't have to, because Python has a really simple one in the standard library: concurrent.futures. (If you're using Python 3.1 or earlier, including 2.7, download and install the backport.)
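Using the standard-library pool looks something like this (a minimal sketch; `slow_square` is just a stand-in for some long-running job):

```python
from concurrent.futures import ThreadPoolExecutor

def slow_square(n):
    # Stand-in for a job that might take a long time.
    return n * n

with ThreadPoolExecutor(max_workers=4) as pool:
    # Queued, or run immediately if a worker thread is idle.
    future = pool.submit(slow_square, 7)
    print(future.result())  # blocks until the job finishes; prints 49
```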
Let's say you've built a server that looks like this:
```python
class FooHandler(BaseHandler):
    def ping(self, responder):
        responder('pong')

    def time(self, responder):
        responder(time.time())

    def add(self, responder, x, y):
        responder(x + y)

    def sumrange(self, responder, count):
        responder(sum(range(count)))
```
The problem is that if someone calls sumrange with a count of 1000000000, that may take 30 seconds, and during those 30 seconds, your server is not listening to anyone else. So, how do we put that on a thread?
```python
from concurrent.futures import ThreadPoolExecutor

class FooHandler(BaseHandler):
    def __init__(self):
        self.executor = ThreadPoolExecutor(max_workers=8)

    def ping(self, responder):
        responder('pong')

    def time(self, responder):
        responder(time.time())

    def add(self, responder, x, y):
        responder(x + y)

    def sumrange(self, responder, count):
        self.executor.submit(lambda: responder(sum(range(count))))
```
That's all there is to it. Now, sumrange can take as long as it wants, and nobody else is blocked.
Process pools
What happens if two people call sumrange(1000000000) at the same time? Since each request gets a separate thread, and your machine probably has at least two cores, it should still take around 30 seconds to answer both of them, right?
Not in Python. The GIL (Global Interpreter Lock) prevents two Python threads from doing CPU work at the same time. If your threads are doing more I/O than CPU—which is usually the case for servers—that's not a problem, but sumrange is clearly all CPU work. So, it'll take 60 seconds to answer both of them.
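You can see this for yourself by timing the same CPU-bound job run sequentially and then on two threads (a rough benchmark sketch; exact numbers depend on your machine, but on CPython the threaded run won't be meaningfully faster):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def cpu_job():
    # Pure CPU work: no I/O, so the thread holds the GIL almost the whole time.
    return sum(range(10**7))

# Run the job twice on the main thread.
start = time.perf_counter()
sequential_results = [cpu_job(), cpu_job()]
sequential = time.perf_counter() - start

# Run the same two jobs on two worker threads.
start = time.perf_counter()
with ThreadPoolExecutor(max_workers=2) as pool:
    threaded_results = [f.result() for f in [pool.submit(cpu_job) for _ in range(2)]]
threaded = time.perf_counter() - start

# On CPython, threaded won't be much smaller than sequential:
# the GIL serializes the CPU work.
print("sequential: %.2fs  threaded: %.2fs" % (sequential, threaded))
```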
Not only that, but in Python 2, there were some problems with the GIL that could cause it to take more than twice as long, and even to intermittently block up the main thread.
What's the solution?
Use processes instead of threads. There's more overhead to passing information between processes than between threads, but all we're passing here is an integer and some "responder" object that's probably ultimately just a simple wrapper around a socket handle.
There are also restrictions on what you can pass. Depending on how that responder was coded, you may be able to pass it, you may not. If you can, the change is as easy as this:
```python
    def __init__(self):
        self.executor = ProcessPoolExecutor(max_workers=8)
```
If the responder object wasn't coded in a way that lets you pass it between processes, you'll get an error like this:
```
_pickle.PicklingError: Can't pickle: attribute lookup function failed
```
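One quick way to find out whether an object can make the trip between processes is to try a pickle round-trip (the `check_picklable` helper here is just an illustrative name, not part of any library):

```python
import pickle

def check_picklable(obj):
    """Return True if obj survives a pickle round-trip, False otherwise."""
    try:
        pickle.loads(pickle.dumps(obj))
        return True
    except Exception:
        return False

print(check_picklable(42))          # plain data pickles fine: True
print(check_picklable(lambda: 42))  # a lambda does not: False
```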
Futures
So, what do you do when you get that error? If you can't give the background process the responder object, you need to have it return the value to you, so you can call responder on it yourself. But how do you do that?
That "executor.submit" call actually returns something, called a future. We ignored it before, but it's exactly what we need now. A future represents a result that will exist later. In our case, the job is sum(range(count)), so the result is just the sum that the child process will pass back when it finishes all that work.
What good is that?
Well, you can do four basic things with a future:
First, you can wait for it to finish by calling result(), but that brings you right back to the starting point—you sit around for 30 seconds blocking the event loop.
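In a quick script where blocking is fine, that looks like this (a sketch submitting the builtins `sum` and `range`, which pickle without trouble):

```python
from concurrent.futures import ProcessPoolExecutor

if __name__ == '__main__':
    with ProcessPoolExecutor(max_workers=2) as pool:
        future = pool.submit(sum, range(1000))
        # result() blocks this thread until the child process finishes.
        print(future.result())  # prints 499500
```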
You can wait on a batch of futures at once. That's pretty cool, because it means a single thread could wait on the results of 1000 jobs run across 8 child processes. But you don't want to write that waiting thread yourself if you can avoid it, right? Ideally, your event loop framework would know how to wait for futures the same way it waits for sockets. But unless you're using the PEP 3156 prototype, yours probably doesn't.
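Batch-waiting looks something like this with `concurrent.futures.as_completed` (sketched with a thread pool; the process-pool version has the same shape):

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def sumrange(count):
    return sum(range(count))

with ThreadPoolExecutor(max_workers=8) as pool:
    futures = [pool.submit(sumrange, n) for n in (10, 100, 1000)]
    # as_completed yields each future as soon as it finishes,
    # in completion order, not submission order.
    for f in as_completed(futures):
        print(f.result())
```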
You can poll a future, or a batch of futures, to see if it's done. That means you could just poll your whole list of futures each time through the event loop. But what if some poor user is waiting for his sum, and no other events happen for 10 minutes? So you also need some kind of timeout or on_timer event or equivalent, to make sure you check at least, say, once per second whether there are any finished futures.
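A polling loop along those lines might look like this (a toy stand-in for a real event loop; the sleep plays the role of the timer check):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def sumrange(count):
    return sum(range(count))

with ThreadPoolExecutor(max_workers=2) as pool:
    pending = {pool.submit(sumrange, n): n for n in (10**6, 10**5)}
    results = {}
    while pending:
        # One trip through a pretend event loop: poll every
        # outstanding future and collect any that have finished.
        for future in list(pending):
            if future.done():
                results[pending.pop(future)] = future.result()
        time.sleep(0.01)  # stand-in for the once-per-second timer tick
    print(results)
```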
Finally, you can just attach a callback that will get run whenever the future finishes. Which… doesn't have any downside at all. That's exactly what we want here: to run responder on the result whenever it's ready.
This is a bit tricky to get your head around, but an example shows how simple it actually is:
```python
    def sumrange(self, responder, count):
        # Submit sum and range (both picklable) rather than a lambda,
        # which a process pool couldn't pickle either.
        future = self.executor.submit(sum, range(count))
        future.add_done_callback(lambda f: responder(f.result()))
```
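Here's the whole pattern as a self-contained sketch, with a stand-in responder that just prints (the real responder would write to a socket):

```python
from concurrent.futures import ProcessPoolExecutor

class FooHandler:
    def __init__(self):
        self.executor = ProcessPoolExecutor(max_workers=8)

    def sumrange(self, responder, count):
        # sum and range both pickle fine, so the job can cross the
        # process boundary; the callback runs back in this process,
        # so the responder never has to be pickled at all.
        future = self.executor.submit(sum, range(count))
        future.add_done_callback(lambda f: responder(f.result()))

if __name__ == '__main__':
    handler = FooHandler()
    handler.sumrange(print, 1000)          # stand-in responder: just print
    handler.executor.shutdown(wait=True)   # let the job finish before exiting
```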