If you want to write a forking server, one obvious way to do it is to use the multiprocessing module to create a pool of processes, then hand off requests as you get them.

The problem is that handling a request usually involves reading from and writing to a client socket. So, you need to send the socket object as part of the request. And you can't pickle sockets.

There are easy ways around this for some cases:

  • For protocols without much data per request and response, there's an easy solution: the main process does the reads, dispatches a buffer instead of a socket, receives back a buffer, and does the writes. But that won't work for, e.g., streaming large files.
  • If your requests and responses are so big that the cost of process startup/teardown is irrelevant, you can just fork a new process for each job instead of using a pool. Then, as long as you haven't made your sockets non-inheritable, everything just works.
  • If you're using a custom protocol, it's pretty easy to give each child its own listener socket; the main process then just becomes a load balancer, listening on a well-known port, then telling the clients to reconnect to some other port. Then you don't need any socket migration.
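For the fork-per-job case, a minimal POSIX-only sketch might look like this (the `handle` function is a hypothetical stand-in for your real request handler):

```python
import os
import socket

def handle(conn):
    # Hypothetical request handler: echo one message back.
    conn.sendall(conn.recv(1024))

def serve_forever(listener):
    while True:
        conn, addr = listener.accept()
        if os.fork() == 0:       # child: conn was inherited across fork()
            listener.close()     # the child doesn't need the listener
            handle(conn)
            conn.close()
            os._exit(0)
        conn.close()             # parent: drop its copy and keep looping
```

A real server would also want to reap its children (e.g., with a SIGCHLD handler) so they don't pile up as zombies.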

But really, the simplest solution is just sending the socket objects over the queue, right?

Wrong. You can't pickle sockets.

Why can't you pickle sockets?

If you try, the first error you're going to get, on Python 2.x, is something about objects with __slots__ not being picklable. That one's easy to work around: use pickle protocol 2 or higher (which 3.x already does by default), or write your own custom pickler for the socket type.

The next problem is that the socket type is a wrapper around various other objects, some of which are C extension types, which aren't documented. You have to dig them out by introspection or by reading the code and writing picklers for them.

But the biggest problem is that the main thing a socket does is wrap up a file descriptor (on POSIX) or a WinSock handle (on Windows). While there are some minor differences between the two, the basic idea is the same, so I'll just talk about file descriptors until we get to Windows details.

A file descriptor is just a number. It's an index into a table of open files (including sockets, pipes, etc.) that the kernel maintains for your process. See Wikipedia for more details, but that should be enough to explain the problem. If your socket has file descriptor 23, and you send the number 23 to some other process, that's not going to mean anything. If you're lucky, the other process's file table doesn't have a #23, so you just get EBADF errors. If you're unlucky, #23 refers to some completely different file, and you end up with errors that are harder to track down, like sending one client's sensitive data to some random other client, or writing garbage into your config file.
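To see that a descriptor really is just a per-process index, you can poke at one with the `os` module (a POSIX-only sketch):

```python
import os

r, w = os.pipe()      # two new entries in this process's file table
# r and w are just small ints (often 3 and 4 in a fresh process); the
# numbers mean nothing outside this process.
dup = os.dup(r)       # a new index pointing at the same open pipe
os.write(w, b'x')
assert os.read(dup, 1) == b'x'   # dup and r name the same underlying file
```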

Can you send file descriptors?

Yes! But it's not quite as easy as you'd like. And it's different on *nix and Windows. And it's different on different *nix platforms.

Unix sockets

On *nix platforms, the first thing you need is a Unix socket.

Normally you'll use socketpair to create a socket for each child in the pool, and just inherit it across the fork. This is a bit annoying with multiprocessing.Pool, because it doesn't provide a way to hook the process creation, only to specify an initializer that gets run after creation; you basically have to subclass Process and override the start method. But that's not too hard.
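A minimal sketch of that subclassing trick, assuming the fork start method and with hypothetical attribute names like `parent_end` and `child_end`:

```python
import multiprocessing
import socket

class FDPassingProcess(multiprocessing.Process):
    """Process that owns one end of a socketpair created before the fork."""
    def start(self):
        # Create the pair in the parent, before forking, so the child
        # inherits its end; assumes the 'fork' start method.
        self.parent_end, self.child_end = socket.socketpair()
        super().start()
        # The parent keeps parent_end and no longer needs the child's end.
        self.child_end.close()
```

A subclass's run method can then read from and write to `self.child_end`, while the parent talks to `self.parent_end`.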

Alternatively, you can just use a Unix socket with a non-anonymous name: create a filename using the tempfile module, then you can pickle and send that filename, then each side can create a socket(AF_UNIX) and call connect. But be warned that this may not work on all platforms; IIRC, at least one system (AIX?) required some special permission to send file descriptors over a non-anonymous Unix socket.
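A sketch of that rendezvous, with a hypothetical `fdpass.sock` name, and with both sides shown in one process for brevity:

```python
import os
import socket
import tempfile

# Parent: bind a named Unix socket; the *path* (a string) pickles fine.
path = os.path.join(tempfile.mkdtemp(), 'fdpass.sock')
listener = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
listener.bind(path)
listener.listen(1)

# Child (simulated here in-process): connect back using the pickled path.
child_side = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
child_side.connect(path)
parent_side, _ = listener.accept()
# parent_side and child_side can now carry SCM_RIGHTS messages.
```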

sendmsg

The POSIX sendmsg function allows you to send message data plus ancillary data. The message data is just a list of buffers, but the ancillary data is a list of buffers tagged with a socket level and a message type (just like the socket levels and options in setsockopt, which you might be more familiar with). One of the message types, SCM_RIGHTS, is defined by POSIX as "Indicates that the data array contains the access rights to be sent or received."

So, what are "access rights"? Well, it doesn't say anywhere in the standard. But the way almost every *nix system interprets this, it means that if you send an array of fd's with SCM_RIGHTS via sendmsg over a Unix-domain socket, the kernel will make the same files available, with the same access rights, to the receiver. (The kernel may also renumber the fd's on the way, so don't rely on the fact that file #23 on the sender comes out as file #23 on the receiver.)

The code for this is pretty simple:
    import array
    import socket

    def sendfds(sock, *fds):
        # Pack the fds into a native int array; SCM_RIGHTS wants raw ints.
        fda = array.array('i', fds).tobytes()
        sock.sendmsg([b'F'], # we have to send _something_
                     [(socket.SOL_SOCKET, socket.SCM_RIGHTS, fda)])

    def recvfds(sock):
        msg, anc, flags, addr = sock.recvmsg(1, 4096)
        fds = []
        for level, type, data in anc:
            fda = array.array('i')
            # Some platforms can truncate ancillary data mid-int; drop
            # any incomplete trailing item.
            fda.frombytes(data[:len(data) - (len(data) % fda.itemsize)])
            fds.extend(fda)
        return fds
Notice that I went out of my way to send only one array of sockets, but to receive multiple arrays on the other side. There are a lot of fiddly details that are different between different *nix platforms; the usual rule about "be conservative in what you send, be liberal in what you accept" is extra-important here if you want your code to be portable.
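A quick round trip to sanity-check the idea, inlining the same sendmsg/recvmsg calls and migrating a pipe's read end over a socketpair (both ends in one process for brevity):

```python
import array
import os
import socket

a, b = socket.socketpair()        # an AF_UNIX stream pair
r, w = os.pipe()                  # the fd we'll migrate: a pipe's read end

# Send the read end from a to b as SCM_RIGHTS ancillary data.
a.sendmsg([b'F'], [(socket.SOL_SOCKET, socket.SCM_RIGHTS,
                    array.array('i', [r]).tobytes())])

msg, anc, flags, addr = b.recvmsg(1, socket.CMSG_SPACE(array.array('i').itemsize))
fda = array.array('i')
fda.frombytes(anc[0][2])
received = fda[0]                 # a (possibly renumbered) copy of r

os.write(w, b'hi')
assert os.read(received, 2) == b'hi'   # same pipe, maybe a new number
```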

Some platforms have additional message types that (usually together with custom socket options) let you do more than just send file descriptors with sendmsg—you can pass credentials (user IDs, like letting a child sudo to you without needing a password), or verify credentials, or pass capabilities or quota privileges or all kinds of other things. But none of this is cross-platform beyond passing file descriptors with SCM_RIGHTS (and even that is not 100% portable, as mentioned above).

Windows

Windows doesn't have Unix sockets. Instead, it has a function WSADuplicateSocket, which can be used to create a shareable socket, and some opaque data that describes that socket. Unlike Unix, the magic isn't in how you pass the socket handle, it's in the key information embedded in that opaque data. Any process that gets hold of that opaque data can open the same shared socket.

In Python 3.3+, this is dead simple: You call share on a socket, you get back some bytes, you pass those bytes in some way (e.g., pickling them and putting them on the queue), and the child calls socket.fromshare, and that's it:

    import socket

    def sendsock(channel, pid, sock):
        channel.put(sock.share(pid))

    def recvsock(channel):
        return socket.fromshare(channel.get())

If you need this to work on 3.2 or earlier, you can look at the 3.3 source, but the basic idea is pretty simple from the MSDN docs; it's just a matter of using win32api or ctypes to call the functions.

Wrapping it up

So, how are you going to wrap this up so you can just say "put this socket on the queue"?

Well, you can't quite make it that simple. The problem is that you have to know which child is going to pick up the socket before you can pickle it (to get the appropriate pid or Unix socket). Once you know that, it's pretty easy—but of course with a normal pool, you don't know that until someone picks it up.

One way to do this is to not pass the socket itself, but some kind of key that the child can use to request the socket. At that point, it writes back to you (on a pipe, or a separate queue, or whatever) and says, "I'm PID #69105, and I need socket #23", and you respond by doing the appropriate thing. This might be more readable wrapped up in a future-based API, but at that point you're writing your own SocketMigratingProcessPoolExecutor almost from scratch, so it may not be worth it.

With a lot less rewriting, you can probably modify either ProcessPoolExecutor or multiprocessing.Pool to add a short (depth-1?) queue per process and a queue manager thread in the main process that keeps these queues as full as possible. (Whenever a new task comes in, first look for an idle process, then fall back to a process that's not idle but has an empty per-process queue; if you find either, migrate the socket and add the task to the process's queue.)

As you can see, this isn't going to be trivial, but there's no real conceptual difficulty.