A couple years ago, I wrote a blog post on greenlets, threads, and processes. At that time, it was already possible to write things in terms of explicit coroutines (Greg Ewing's original yield from proposal already had a coroutine scheduler as an example, Twisted already had @inlineCallbacks, and asyncio had even been added to the stdlib), but it wasn't in heavy use yet. Things have changed since then, especially with the addition of the async and await keywords to the language (and the popularity of similar constructs in a wide variety of other languages). So, it's time to take a look back (and ahead).
Differences
Automatic waiting
Greenlets are the same thing as coroutines, but greenlet libraries like gevent are not just like coroutine libraries like asyncio. The key difference is that greenlet libraries do the switching magically, while coroutine libraries make you ask for it explicitly.
For example, with gevent, if you want to yield until a socket is ready to read from and then read from the socket when waking up, you write this:
    buf = sock.recv(4096)
To do the same thing with asyncio, you write this:
    buf = await loop.sock_recv(sock, 4096)
Forget (for now) the difference in whether recv is a socket method or a function that takes a socket; the key difference is that await. In asyncio, any time you're going to wait for a value, yielding the processor to other coroutines until you're ready to run, you always do this explicitly, with await. In gevent, you just call one of the functions that automatically does the waiting for you.
In practice, while marking waits explicitly is a little harder to write (especially during quick and dirty prototyping), it seems to be harder to get wrong, and a whole lot easier to debug. And the more complicated things get, the more important this is.
If you miss an await, or try to use one in a non-async function, your code will usually fail hard with an obvious error message, rather than silently doing something undesirable.
Meanwhile, let's say you're using some shared container, and you've got a race on it, or a lock that's being held too long. It's dead simple to tell at a glance whether you have an await between a read and a write to that container, while with automatic waiting, you have to read every line carefully. Being able to follow control flow at a glance is really one of the main reasons people use Python in the first place, and await extends that ability to concurrent code.
Serial-style APIs
Now it's time to come back to the difference between sock.recv and sock_recv(sock). The asyncio library doesn't expose a socket API; it exposes an API that looks sort of similar to the socket API. And, if you look around other languages and frameworks, from JavaScript to C#, you'll see the same thing.
It's hard to argue that the traditional socket API is in any objective sense better, but if you've been doing socket programming for a decade or four, it's certainly more familiar. And there's a lot more language-agnostic documentation on how it works, both tutorial and reference (e.g., if you need to look up the different quirks of a function on Linux vs. *BSD, the closer you are to the core syscall, the easier it will be to find and understand the docs).
In practice, however, the vast majority of code in a nontrivial server is going to work at a higher level of abstraction. Most often, that abstraction will be Streams or Protocols or something similar, and you'll never even see the sockets. If not, you'll probably be building your own abstraction, and only the code on the inside—a tiny fraction of your overall code—will ever see the sockets.
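As a rough illustration of what working at the Streams level looks like, here is a small sketch using asyncio's stream API. The handler name handle_echo, the loopback address, and the uppercase-a-line protocol are all invented for the example; no socket object appears in the application logic (the only socket touched is the one used to look up the OS-assigned port).

```python
import asyncio


async def handle_echo(reader: asyncio.StreamReader,
                      writer: asyncio.StreamWriter) -> None:
    """Read one line from the client and send it back uppercased."""
    data = await reader.readline()
    writer.write(data.upper())
    await writer.drain()
    writer.close()
    await writer.wait_closed()


async def main() -> bytes:
    # Port 0 asks the OS for any free port; we look up which one we got.
    server = await asyncio.start_server(handle_echo, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    async with server:
        reader, writer = await asyncio.open_connection("127.0.0.1", port)
        writer.write(b"hello\n")
        await writer.drain()
        reply = await reader.readline()
        writer.close()
        await writer.wait_closed()
    return reply


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Both sides deal only in readers and writers, so the sock.recv versus sock_recv distinction never comes up in code written at this level.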
One case where using the serial-style APIs really does help, however, is when you've got a mess of already-written code that's either non-concurrent or using threads or processes, and you want to convert it to use coroutines. Rewriting all that code around asyncio (no matter which level you choose) is probably a non-trivial project; with gevent, you just import all the monkeypatches and you're 90% done. (You still need to scan your code, and test the hell out of it, to make sure you're not doing anything that will break or become badly non-optimal, of course, but you don't need to rewrite everything.)
Conclusion
If I were writing the same blog post today, I wouldn't recommend magic greenlets for most massively-concurrent systems; I'd recommend explicit coroutines instead.
There is still a place for gevent. But that place is largely in migrating existing threading-based (or non-concurrent) codebases. If you (and your intended collaborators) are familiar enough with threading and traditional APIs, it may still be worth considering for simpler systems. But otherwise, I'd strongly consider asyncio (or some other explicit coroutine framework) instead.