1. Back in 2008, Christopher Lenz wrote a great article called The Truth About Unicode in Python, which covers a lot of useful information that you wouldn't get just by reading the Unicode HOWTO, and that you might not even know to ask about unless you were already… well, not a Unicode expert, but more than a novice.

    Unfortunately, that information is out of date, and he hasn't updated it. So, I think it's time for a new version.

    Read his article for the state of Python 2.5, then come back here for the differences in Python 3.4. (If you care about 2.6-2.7, it's somewhere in between the two, so you may need a bit of trial and error to see for yourself.)

    tl;dr

    Case conversion and internal representation are fixed; collation and regular expressions are improved but not entirely fixed; everything else is mostly the same.

    Collation

    It's still true that the basic str comparison methods compare code points lexicographically, rather than collating properly. His example still gives the same results.

    However, for many uses, locale.strcoll, or locale.strxfrm as a key function, does the right thing.

    Of course whenever you see "many", you'll want to know what the limitations are.

    The biggest issue is that many characters compare differently in different locales. The most famous example is that in different European locales, "ä" may sort as a variant of "a" (between "a" and "b"), as a separate letter after "z", or as the two characters "ae". In fact, in some countries, there are two different sorting rules, usually called dictionary and phonebook sorting.

    A second issue is that many strings of codepoints are equivalent, but may not be sorted the same. For example, if you've got a script with string literals stored in NFC, but you take input from the terminal which may come in NFKD, any character that can be decomposed won't match when you try to compare them.
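
    For example, normalizing both strings to the same form before comparing fixes that much (using the stdlib's unicodedata.normalize):

        import unicodedata

        s1 = '\u00e4'    # "ä" as one precomposed code point
        s2 = 'a\u0308'   # "a" plus a combining diaeresis
        print(s1 == s2)  # False: different code point sequences
        print(unicodedata.normalize('NFC', s1) ==
              unicodedata.normalize('NFC', s2))  # True: equivalent after normalizing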

    The Unicode Collation Algorithm specifies an algorithm, based on ISO 14651, plus rules for customizing that algorithm for different locales, and a Common Locale Data Repository that includes the customizations for all common locales. (They handle the dictionary-vs.-phonebook rule by treating the phonebook as a separate locale—e.g., in ICU, you specify "de__phonebook" instead of "de".)

    Python does not come with the CLDR data. It relies on your OS to supply the same data as part of its locale library. But traditional Unix locale data doesn't supply everything needed to replace CLDR, and many *nix systems don't come with a complete set of locales anyway—and Windows has nothing at all. So, if you stick with the stdlib, you're usually just getting plain ISO 14651, which means "ä" is going to come after "z". If you want to sort with German rules, you can explicitly setlocale de_DE, but that may fail—or, worse, it may succeed but not affect strcoll/strxfrm.
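
    For example (a sketch that assumes your system actually has a de_DE.UTF-8 locale installed; as just described, the setlocale call may fail, or succeed without helping):

        import locale

        words = ['Ähre', 'Zebra', 'Apfel']
        print(sorted(words))  # code-point order puts 'Ähre' after 'Zebra'
        locale.setlocale(locale.LC_COLLATE, 'de_DE.UTF-8')  # may raise locale.Error
        print(sorted(words, key=locale.strxfrm))  # dictionary order: Apfel, Ähre, Zebra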

    The Unicode Collation FAQ explains the differences, with links to the full details.

    Case Conversion

    Python 3.3+ supports the full case conversion rules, including the SpecialCasing.txt table that can convert one character into multiple characters. For example, 'ß'.title() == 'Ss'.

    This also includes the casefold method, which can handle case-insensitive comparisons in cases when lower does the wrong thing—'ß'.casefold() == 'ss'.casefold().
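
    For example:

        print('Straße'.lower() == 'STRASSE'.lower())        # False: 'ß' vs. 'ss'
        print('Straße'.casefold() == 'STRASSE'.casefold())  # True: both fold to 'strasse'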

    Regular Expressions

    The long-term intention is (or at least was, at one point) to replace the re module with a new implementation, regex—which, among other things, solves some of the Unicode problems and makes others easier to solve. However, regex wasn't ready for 3.3, or for 3.4, so the re module is still in the stdlib, but has gotten little attention. So, a proper fix for all of these problems may be a few years off.

    First, as Christopher Lenz pointed out, Python's re module doesn't support the TR18 Unicode character classes, like \p{Ll}. This is still true in 3.4, and it's also true in regex, although it should be easier to add support to regex.

    On top of that, while he didn't mention it, case-insensitive searching has always effectively used lower instead of casefold, meaning it won't find "straße" in "Königstrasse". This is fixed in regex, but is not fixed in the 3.4 stdlib module. (When casefold was added, equivalent code didn't appear to be needed in re, because regex was going to be replacing it anyway.)

    It might be nice to add normal-form-independent matching as well, but neither re nor regex attempts that.

    Also, character classes like \w were in ASCII mode by default; they're now UNICODE by default. (Also, the LOCALE mode was broken; it's still broken in 3.4; it's been fixed in regex, but it's been quasi-deprecated for so long that I doubt anyone will care.)

    Finally, the re engine didn't support \u, \U, and \N escapes. Since you normally write regular expressions with raw string literals, which of course also don't process such escapes, this made it hard to write patterns for characters that are hard to type, or more readable as escapes. The \u and \U escapes were added in 3.3 (\N still isn't supported as of 3.4); at any rate, they're definitely working in 3.4: re.search(r'\u0040', '\u0040') returns a match.

    Text Segmentation

    This needs to be broken down a bit.

    First, Python strings are now iterables of Unicode code points, not of UTF-32-or-UTF-16-depending-on-build code units. This means s[3] is now always a "character" in the sense of a code point, never half a surrogate pair. (See Internal Representation for more on this—and note that many other languages use either UTF-16 or UTF-8 for strings, and therefore still have this problem.)

    However, they're still not iterables of grapheme clusters. For example, a single Hangul syllable may be three Unicode code points, but in most use cases people think of it as a single character. So, given the string "각" (Hangul gag, in NFD form), the length is 3 instead of 1, and s[2] is the trailing jamo "ᆨ", which isn't really a separate character.

    But the only way to "fix" this would mean a major tradeoff: in a language where strings are treated as iterables of grapheme clusters (Swift is the only one I know of off-hand), strings aren't randomly accessible. So, s[2] will never give you part of a character, because s[2] will always give you a TypeError instead.

    The way to get the best of both worlds is probably to use the Python representation, plus a generator function so you can iterate graphemes(s). (And maybe some helper functions for the previous and next grapheme after a specific point. Swift handles this by having smart "indexes" that act sort of like bidirectional iterators, so s.index(c) doesn't return a number, but it does return something you can use with s[i] and i++ and i--.)

    Unfortunately, Python doesn't come with such a function. Partly because when it was first suggested long ago, the grapheme rule was so trivial that anyone could write it as a one-liner. (Just read a character, then all continuation characters before the next non-continuation character.) But even if you ignore the extended grapheme rule (which you usually shouldn't), the legacy rule is more complicated than that nowadays—still easy to write, but not easy to come up with on your own. I've seen at least 4 people suggest that Python should add such a function over the past 5 or so years, but none of them have submitted a patch or provided a real-life use case, so it hasn't happened. I'm pretty sure no one has any objection to adding it; it's just that no one has enough motivation to do it.
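
    For reference, here's roughly what that naive legacy rule looks like as a generator, using unicodedata.combining; it handles accents and other combining marks, but not Hangul jamo or the rest of the UAX #29 rules:

        import unicodedata

        def graphemes(s):
            cluster = ''
            for c in s:
                # a new cluster starts at each non-combining character
                if cluster and not unicodedata.combining(c):
                    yield cluster
                    cluster = ''
                cluster += c
            if cluster:
                yield cluster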

    To Unicode, text segmentation also includes word and sentence segmentation, not just characters. But this is pretty clearly outside the scope of the stdlib. For example, as the spec itself mentions, splitting Japanese words is pretty much the same problem as hyphenating English words, and can't be done without a dictionary.

    Bi-directional Text

    I'm not sure what Python could include that would be helpful here. If you're going to write a layout engine that handles bidi text, you need to know a lot more than just the bidi boundaries (like font metrics), and I can't imagine anyone expects any of the other stuff to be part of a string library.

    Python does include the unicodedata.bidirectional function, but that's been there since the early 2.x days, so clearly Christopher Lenz was looking for more than that. I just don't know what, exactly.

    Locale Context

    This is still as much of a problem as ever: locales are global (plus, changing locales isn't thread-safe), which means you can't use the locale module to handle multiple locales in a single app, like a webserver, unless you spawn a subprocess.

    In fact, it may be worse than in 2008. Back then, Python wasn't in much use on resource-limited systems, especially not as web servers; nowadays, a lot more people are running Python web servers on Raspberry Pi devices and micro-VMs and so on, and some of these systems have very incomplete locale installations.

    Internal Representation

    This one was fixed in Python 3.3.

    For 3.2, there was some talk about deprecating narrow builds, but nobody pushed it very hard. Nobody brought up the disk space issue again (it really isn't relevant; normally you don't store raw Python objects on disk…), but people did bring up the memory issue, and some other issues with dealing with Windows/Cocoa/Java/whatever UTF-16 APIs.

    This dragged on until it was too late to fix 3.2, but Martin von Löwis took some of the ideas that came out of that discussion and turned them into PEP 393, which made it into 3.3. Now, strings use 1 byte/character if they're pure Latin-1, 2 bytes if they're pure UCS-2, 4 bytes otherwise. Wider strings can also cache their UTF-8 and/or UTF-16 representation if needed. The old C APIs are still supported (with no significant loss in speed for any extension anyone could find), and there's a set of new, cleaner C APIs that can be used when you want a specific width or as much speed as possible.

    Other Stuff

    There are a few problems that weren't mentioned in the original blog that are still minor problems today.

    A lot of effort over the past couple years has gone into making it easier for people to install, use, and require third-party modules. Python now comes with pip, and many popular packages that need to be compiled are available as pre-compiled wheels. (And the stable API, and the lack of narrow vs. wide builds, makes that a lot less of a nightmare than it would have been to implement in the old days.) 

    And this has brought with it an attitude that many things not only don't have to be in the stdlib, but shouldn't be there, because they need to evolve at a different pace. For example, while the UCD doesn't change that often (Python 3.4.2 is on UCD 6.3.0, and 3.5 should be on 7.0.0, which came out in June 2014), supplemental data like the CLDR change more often (it's up to 26.0.1). If the recommended way of dealing with CLDR-compatible sorting is to use PyICU or PyUCA, Python doesn't have to update multiple times per year.

    Meanwhile, the unicodedata module doesn't contain all of the information in the UCD tables. Basically, it only contains the data from UnicodeData.txt; you can't ask what block a character is in, or the annotations, or any of the extra information (like the different Shift-JIS mappings for each Emoji). And this hasn't changed much since 2008. This is probably reasonable given that the full database is over 3MB, and Python itself is only about 14MB, but people have been annoyed that they can write 90% of what they want with the stdlib only to discover that for the last 10% they need to either get a third-party library or write their own UCD parser.

    Finally, Python 3.0 added a problem that 2.5 didn't have: in many places, it became hard to deal with text in an unknown encoding, or with a mix of ASCII-compatible text and non-text, both of which you often want to search for ASCII-compatible segments. (For example, consider munging HTTP response headers: if you have to use bytes because the body can't be decoded, but there's no bytes.format or bytes.__mod__, this is no fun.) Python 3.3, 3.4, and 3.5 have all added improvements in this area (e.g., surrogateescape means the body can be decoded, and bytes.__mod__ means it doesn't have to be).
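
    For example, surrogateescape lets you round-trip arbitrary bytes through str (a sketch with made-up header data):

        raw = b'Content-Type: text/plain\r\n\r\n\xff\xfenot really text'
        text = raw.decode('ascii', errors='surrogateescape')  # never raises
        # ... search and munge the ASCII-compatible parts as str ...
        assert text.encode('ascii', errors='surrogateescape') == raw  # exact round-trip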

    Conclusions

    Some of the changes that Christopher Lenz wanted 6 years ago have been made in the stdlib. And some don't belong there—I think the right solution for collation beyond what locale can do should be to encourage people to use PyUCA, Babel, or PyICU (although it would still be nice if there were a more Pythonic wrapper around ICU than PyICU, as he called for 6 years ago). But there are at least two problems that I think still need to be fixed.

    Hopefully one day, either the regex module will be ready, or people will give up on it and start improving the re module instead. Either way, it's high time ICU-style character classes were added (and, if the answer is to stick with re, proper casefolding too).

    Some way to iterate a string by grapheme clusters would be a great addition to the stdlib. Someone just needs to write it.

  2. You've probably been told that you can convert any recursive function into an iterative loop just by using an explicit stack.

    Tail-recursive functions

    Whenever you get an example, it's usually something that's trivial, because the function was tail-recursive, so you don't even need a stack:

        def fact(n, value=1):
            if n<2:
                return value
            return fact(n-1, value*n)
    
    That maps directly to:
        def fact(n):
            value = 1
            while True:
                if n<2:
                    return value
                n, value = n-1, value*n
    
    You can merge the while and if, at which point you realize you have a for in disguise. And then you can realize that, multiplication being commutative and associative and all that, you might as well turn the loop around, so you get:
        def fact(n):
            value = 1
            for i in range(2, n+1):
                value *= i
            return value
    
    (You might also notice that this is just functools.reduce(operator.mul, range(2, n+1), 1), but if you're the kind of person who notices that and finds it more readable, you probably also rewrote the tail-recursive version into a recursive fold/reduce function, and all you had to do was find an iterative reduce function to replace it with.)

    Continuation stacks

    Your real program isn't tail-recursive. Either you didn't bother making that transformation because your language doesn't do tail call elimination (Python doesn't), or the whole reason you're switching from recursive to iterative in the first place is that you couldn't figure out a clean way to write your code tail-recursively.

    So, now you need a stack. But what goes on the stack?

    The most general answer to that is that you want continuations on the stack: what the rest of the function does with the result of each recursive call. That may sound scary, and in general it is… but in most practical cases, it's not.

    Let's say you have this:
        def fact(n):
            if n < 2:
                return 1
            return n * fact(n-1)
    
    What's the continuation? It's "return n * _", where that _ is the return value of the recursive call. You can write a function with one argument that does that. (What about the base case? Well, a function of 1 argument can always ignore its argument). So, instead of storing continuations, you can just store functions:
        def fact(n):
            stack = []
            while True:
                if n < 2:
                    stack.append(lambda _: 1)
                    break
                stack.append(lambda _, n=n: _ * n)
                n -= 1
            value = None
            for frame in reversed(stack):
                value = frame(value)
            return value
    
    (Notice the n=n in the second lambda. See the Python FAQ for an explanation, but basically it's to make sure we're building a function that uses the current value of n, instead of one that closes over the variable n.)

    This is undeniably kind of ugly, but we can start simplifying it. If only the base case and the recursive call had the same form, we could factor out the whole function, right? Well, if we start with 1 instead of None, the base case can return _ * 1. And then, yes, we can factor out the whole function, and just store each n value on the stack:
        def fact(n):
            stack = []
            while True:
                if n < 2:
                    stack.append(1)
                    break
                stack.append(n)
                n -= 1
            value = 1
            for frame in reversed(stack):
                value = value * frame
            return value
    
    But once we're doing this, why even store the 1? And, once you take that out, the while loop is obviously a for loop over a range in disguise:
        def fact(n):
            stack = []
            for i in range(n, 1, -1):
                stack.append(i)
            value = 1
            for frame in reversed(stack):
                value *= frame
            return value
    
    Now stack is obviously just list(range(n, 1, -1)), so we can skip the loop entirely:
        def fact(n):
            stack = list(range(n, 1, -1))
            value = 1
            for frame in reversed(stack):
                value *= frame
            return value
    
    Now, we don't really care that it's a list, as long as it's something we can pass to reversed. In fact, why even call reversed on a backward range when we can just write a forward range directly?
        def fact(n):
            value = 1
            for frame in range(2, n+1):
                value *= frame
            return value
    
    Not surprisingly, we ended up with the same function we got from the tail recursive starting point.

    Interpreter stacks

    Is there a way to do this in general without stacking up continuations? Of course there is. After all, an interpreter doesn't have to call itself recursively just to execute your recursive call (even if CPython does, Stackless doesn't…), and your CPU certainly isn't calling itself recursively to execute compiled recursive code.

    Here's what a function call does: The caller pushes the "program counter" and the arguments onto the stack, then it jumps to the callee. The callee pops, computes the result, pushes the result, jumps to the popped counter. The only issue is that the callee can have locals that shadow the caller's; you can handle that by just pushing all of your locals (not the post-transformation locals, which include the stack itself, just the set used by the recursive function) as well.

    This sounds like it might be hard to write without a goto, but you can always simulate goto with a loop around a state machine. So:

        import enum

        State = enum.Enum('State', 'start cont done')
    
        def fact(n):
            state = State.start
            stack = [(State.done, n)]
            while True:
                if state == State.start:
                    pc, n = stack.pop()
                    if n < 2:
                        # return 1
                        stack.append(1)
                        state = pc
                        continue
                    # stash locals
                    stack.append((pc, n))
                    # call recursively
                    stack.append((State.cont, n-1))
                    state = State.start
                    continue
                elif state == State.cont:
                    # get return value
                    retval = stack.pop()
                    # restore locals
                    pc, n = stack.pop()
                    # return n * fact(n-1)
                    stack.append(n * retval)
                    state = pc
                    continue
                elif state == State.done:
                    retval = stack.pop()
                    return retval
    
    Beautiful, right? Well, we can find ways to simplify this. Let's start by using one of the tricks native-code compilers use: in addition to the stack, you've also got registers. As long as you've got enough registers, you can pass arguments in registers instead of on the stack, and you can return values in registers too. And we can just use local variables for the registers. So:
        def fact(n):
            state = State.start
            pc = State.done
            stack = []
            while True:
                if state == State.start:
                    if n < 2:
                        # return 1
                        retval = 1
                        state = pc
                        continue
                    stack.append((pc, n))
                    pc, n, state = State.cont, n-1, State.start
                elif state == State.cont:
                    state, n = stack.pop()
                    retval = n * retval
                elif state == State.done:
                    return retval
    

  3. If you want to write a forking server, one obvious way to do it is to use the multiprocessing module to create a pool of processes, then hand off requests as you get them.

    The problem is that handling a request usually involves reading from and writing to a client socket. So, you need to send the socket object as part of the request. And you can't pickle sockets.

    There are easy ways around this for some cases:

    • For protocols without much data per request and response, there's an easy solution: the main process does the reads, dispatches a buffer instead of a socket, receives back a buffer, and does the writes. But that won't work for, e.g., streaming large files.
    • If your requests and responses are so big that the cost of process startup/teardown is irrelevant, you can just fork a new process for each job instead of using a pool. Then, as long as you haven't made your sockets non-inheritable, everything just works.
    • If you're using a custom protocol, it's pretty easy to give each child its own listener socket; the main process then just becomes a load balancer, listening on a well-known port, then telling the clients to reconnect to some other port. Then you don't need any socket migration.

    But really, the simplest solution is just sending the socket objects over the queue, right?

    Wrong. You can't pickle sockets.

    Why can't you pickle sockets?

    If you try, the first error you're going to get, on Python 2.x, is something about objects with __slots__ not being picklable. That one's easy to work around—use pickle protocol 2 or higher (which already happens by default in 3.x), or write your own custom pickler for the socket type.

    The next problem is that the socket type is a wrapper around various other objects, some of which are C extension types, which aren't documented. You have to dig them out by introspection or by reading the code and writing picklers for them.

    But the biggest problem is that the main thing a socket does is wrap up a file descriptor (on POSIX) or a WinSock handle (on Windows). While there are some minor differences between the two, the basic idea is the same, so I'll just talk about file descriptors until we get to Windows details.

    A file descriptor is just a number. It's an index into a table of open files (including sockets, pipes, etc.) for your process that the kernel maintains. See Wikipedia for more details, but that should be enough to explain the problem. If your socket has file descriptor 23, and you send the number 23 to some other process, that's not going to mean anything. If you're lucky, the other process's file table doesn't have a #23, so you just get EBADFD errors. If you're unlucky, #23 refers to some completely different file, and you end up with errors that are harder to track down, like sending one client's sensitive data to some random other client, or writing garbage into your config file.

    Can you send file descriptors?

    Yes! But it's not quite as easy as you'd like. And it's different on *nix and Windows. And it's different on different *nix platforms.

    Unix sockets

    On *nix platforms, the first thing you need is a Unix socket.

    Normally you'll use socketpair to create a socket for each child in the pool, and just inherit it across the fork. This is a bit annoying with multiprocessing.Pool, because it doesn't provide a way to hook the process creation, only to specify an initializer that gets run after creation; you basically have to subclass Process and override the start method. But that's not too hard.
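
    Here's a rough sketch of the socketpair-per-child setup with a bare Process instead of a Pool (assuming the default fork start method on *nix, so the child's end is simply inherited; sendfds and recvfds are the functions defined in the sendmsg section below):

        import socket
        from multiprocessing import Process

        def worker(child_end):
            # runs in the child; child_end was inherited across the fork
            for fd in recvfds(child_end):
                # assumes the fds are TCP sockets; fromfd dups the descriptor
                conn = socket.fromfd(fd, socket.AF_INET, socket.SOCK_STREAM)
                # ... handle the client on conn ...

        parent_end, child_end = socket.socketpair()
        Process(target=worker, args=(child_end,)).start()
        # later, in the parent:
        #     sendfds(parent_end, client_sock.fileno())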

    Alternatively, you can just use a Unix socket with a non-anonymous name: create a filename using the tempfile module, then you can pickle and send that filename, then each side can create a socket(AF_UNIX) and call connect. But be warned that this may not work on all platforms; IIRC, at least one system (AIX?) required some special permission to send file descriptors over a non-anonymous Unix socket.

    sendmsg

    The POSIX sendmsg function allows you to send message data plus ancillary data. The message data is just a list of buffers, but the ancillary data is a list of buffers tagged with a socket level and a message type (just like the socket levels and options in setsockopt, which you might be more familiar with). One of the message types, SCM_RIGHTS, is defined by POSIX as "Indicates that the data array contains the access rights to be sent or received."

    So, what are "access rights"? Well, it doesn't say anywhere in the standard. But the way almost every *nix system interprets this, it means that if you send an array of fd's with SCM_RIGHTS via sendmsg over a Unix-domain socket, the kernel will make the same files available, with the same access rights, to the receiver. (The kernel may also renumber the fd's on the way, so don't rely on the fact that file #23 on the sender comes out as file #23 on the receiver.)

    The code for this is pretty simple:
        import array
        import socket

        def sendfds(sock, *fds):
            fda = array.array('I', fds).tobytes()
            sock.sendmsg([b'F'], # we have to send _something_
                         [(socket.SOL_SOCKET, socket.SCM_RIGHTS, fda)])

        def recvfds(sock):
            msg, anc, flags, addr = sock.recvmsg(1, 4096)
            fds = []
            for level, type, data in anc:
                fda = array.array('I')
                fda.frombytes(data)
                fds.extend(fda)
            return fds
    
    Notice that I went out of my way to send only one array of sockets, but to receive multiple arrays on the other side. There are a lot of fiddly details that are different between different *nix platforms; the usual rule about "be conservative in what you send, be liberal in what you accept" is extra-important here if you want your code to be portable.

    Some platforms have additional message types that (usually together with custom socket options) let you do more than just send file descriptors with sendmsg—you can pass credentials (user IDs, like letting a child sudo to you without needing a password), or verify credentials, or pass capabilities or quota privileges or all kinds of other things. But none of this is cross-platform beyond passing file descriptors with SCM_RIGHTS (and even that is not 100% portable, as mentioned above).

    Windows

    Windows doesn't have Unix sockets. Instead, it has a function WSADuplicateSocket, which can be used to create a shareable socket, and some opaque data that describes that socket. Unlike Unix, the magic isn't in how you pass the socket handle, it's in the key information embedded in that opaque data. Any process that gets hold of that opaque data can open the same shared socket.

    In Python 3.3+, this is dead simple: You call share on a socket, you get back some bytes, you pass them in some way (e.g., pickling it and posting it on the queue), and the child calls socket.fromshare, and that's it:

        import socket

        def sendsock(channel, pid, sock):
            channel.put(sock.share(pid))

        def recvsock(channel):
            return socket.fromshare(channel.get())

    If you need this to work on 3.2 or earlier, you can look at the 3.3 source, but the basic idea is pretty simple from the MSDN docs; it's just a matter of using win32api or ctypes to call the functions.

    Wrapping it up

    So, how are you going to wrap this up so you can just say "put this socket on the queue"?

    Well, you can't quite make it that simple. The problem is that you have to know which child is going to pick up the socket before you can pickle it (to get the appropriate pid or Unix socket). Once you know that, it's pretty easy—but of course with a normal pool, you don't know that until someone picks it up.

    One way to do this is to not pass the socket itself, but some kind of key that the child can use to request the socket. At that point, it writes back to you (on a pipe, or a separate queue, or whatever) and says, "I'm PID #69105, and I need socket #23", and you respond by doing the appropriate thing. This might be more readable wrapped up in a future-based API, but at that point you're writing your own SocketMigratingProcessPoolExecutor almost from scratch, so it may not be worth it.

    With a lot less rewriting, you can probably modify either ProcessPoolExecutor or multiprocessing.Pool to add a short (depth-1?) queue per process and a queue manager thread in the main process that keeps these queues as full as possible. (Whenever a new task comes in, first look for an idle process, then fall back to a process that's not idle but has an empty per-process queue; if you find either, migrate the socket and add the task to the process's queue.)

    As you can see, this isn't going to be trivial, but there's no real conceptual difficulty.

  4. I was recently trying to optimize some C code using cachegrind, and discovered that branch misprediction in an inner loop was the culprit. I began wondering how much anything similar could affect Python code. After all, especially in CPython, every opcode is going through hundreds of lines of ceval code, with a giant switch and multiple branches, and so on, so, how much difference could a branch at the Python level make?

    To find out, I turned to the famous Stack Overflow question Why is processing a sorted array faster than an unsorted array? That question has a simple C++ test case where the code runs ~6x as fast on x86 and x86_64 platforms with most compilers, and it's been verified to be similar in other compiled languages like Java, C#, and Go.

    So, what about Python?

    The code

    The original question takes a 32K array of bytes, and sums the bytes that are >= 128:
        long long sum = 0;
        /* ... */
        for (unsigned c = 0; c < arraySize; ++c)
        {
            if (data[c] >= 128)
                sum += data[c];
        }
     
    In Python terms:
        total = 0
        for i in data:
            if i >= 128:
                total += i
    
    When run with a random array of bytes, this usually takes about 6x as long as when run with the same array pre-sorted. (Your mileage may vary, because there's a way the compiler can optimize away the loop for you, at least on x86 and x86_64, and some compilers will, depending on your optimization settings, figure that out for you. If you want to know more, read the linked question.)

    So, here's a simple test driver:
        import functools
        import random
        import timeit
    
        def sumbig(data):
            total = 0
            for i in data:
                if i >= 128:
                    total += i
            return total
    
        def test(asize, reps):
            data = bytearray(random.randrange(256) for _ in range(asize))
            t0 = timeit.timeit(functools.partial(sumbig, data), number=reps)
            t1 = timeit.timeit(functools.partial(sumbig, bytearray(sorted(data))), number=reps)
            print(t0, t1)
    
        if __name__ == '__main__':
            import sys
            asize = int(sys.argv[1]) if len(sys.argv) > 1 else 32768
            reps = int(sys.argv[2]) if len(sys.argv) > 2 else 1000
            test(asize, reps)
    
    I tested this on a few different computers, using Apple pre-installed CPython 2.7.6, Python.org CPython 2.7.8 and 3.4.1, Homebrew CPython 3.4.0, and a custom build off trunk, and it pretty consistently saves about 12% to pre-sort the array. Nothing like the 84% savings in C++, but still, a lot more than you'd expect. After all, we're doing roughly 65x as much work in the CPython ceval loop as we were doing in the C++ loop, so you'd think the difference would be lost in the noise, and we're also doing much larger loops, so you'd think branch prediction wouldn't be helping as much in the first place. But if you watch the BR_MISP counters in Instruments, or do the equivalent with cachegrind, you'll see that it's mispredicting a lot more branches for the unsorted case than the sorted case—as in 1.9% of the total conditional branches in the ceval loop instead of 0.1%. Presumably, even though this is still pretty small, and nothing like what you see in C, the cost of each misprediction is also higher? It's hard to be sure…

    You'd expect a much bigger benefit in the other Python implementations. Jython and IronPython compile to JVM bytecode and .NET IL, which then gets the same kind of JIT recompilation as Java and C#, so if those JITs aren't smart enough to optimize away the branch with a conditional move instruction for Java and C#, they won't be for Python code either. PyPy is JIT-compiling directly from Python—plus, its JIT is itself driven by tracing, which can be hurt by the equivalent of branch misprediction at a higher level. And in fact I found a difference of 54% in Jython 2.5.1, 68% in PyPy 2.3.1 (both the 2.7 and 3.2 versions), and 72% in PyPy 2.4.0 (2.7 only).

    C++ optimizations


    There are a number of ways to optimize this in C, but they all amount to the same thing: find some way to do a bit of extra work to avoid the conditional branch—bit-twiddling arithmetic, a lookup table, etc. The lookup table seems like the most likely version to help in Python, so here it is:
        table = bytearray(i if i >= 128 else 0 for i in range(256))
        total = 0
        for i in data:
            total += table[i]
    
    To put this inside a function, you'd want to build the table globally instead of once per call, then copy it to a local inside the function to avoid the slow name lookup inside the inner loop:
        _table = bytearray(i if i >= 128 else 0 for i in range(256))
        def sumbig_t(data):
            table = _table
            total = 0
            for i in data:
                total += table[i]
            return total
    
    The sorted and unsorted arrays are now about the same speed. And for PyPy, that's about 13% faster than with the sorted data in the original version. For CPython, on the other hand, it's 33% slower. I also tried the various bit-twiddling optimizations; they're slightly slower than the lookup table in PyPy, and at least 250% slower in CPython.

    So, what have we learned? Python, even CPython, can be affected by the same kinds of low-level performance problems as compiled languages. The alternative implementations can also handle those problems the same way. CPython can't… but then if this code were a bottleneck in your problem, you'd almost certainly be switching to PyPy or writing a C extension anyway, right?

    Python optimizations

    There's an obvious Python-specific optimization here—which I wouldn't even really call an optimization, since it's also a more Pythonic way of writing the code:
        total = sum(i for i in data if i >= 128)
    
    This does in fact speed things up by about 13% in CPython, although it slows things down by 217% in PyPy. It leaves us with the same original difference between random and sorted arrays, and the same basic effects for applying the table or bit twiddling.

    You'd think that taking two passes, applying the table to the data and then summing the result, would obviously be slower, right? And normally you'd be right; a quick test shows a 31% slowdown. But if you think about it, when we're dealing with bytes, applying the table can be done with bytes.translate. And of course the sum is now just summing a builtin sequence, not a generator. So we effectively have two C loops instead of one Python loop, which may be a win:
        def sumbig_s(data):
            return sum(data.translate(bytearray(_table)))
    
    For CPython, this saves about 83% of the time vs. the original sumbig's speed on unsorted data. And the difference between sorted and unsorted data is very small, so it's still an 81% savings on sorted data. However, for PyPy, it's 4% slower for unsorted data, and you lose the gain for sorted data.

    If you think about it, while we're doing that pass, we might as well just remove the small values instead of mapping them to 0:
        def sumbig_s2(data):
            return sum(data.translate(bytearray(_table), bytearray(range(128))))
    
    That ought to make the first loop a little slower, and affected by branch misprediction, while making the second loop twice as fast, right? And that's exactly what we see, in CPython. For unsorted data, it cuts 90% off the original code, and now sorting it gives us another 36% improvement. That's near PyPy speeds. On the other hand, in PyPy, it's just as slow for sorted data as the previous fix, and twice as slow for unsorted data, so it's even more of a pessimization.

    What about using numpy? The obvious implementation is:
        import numpy as np

        def sumbig_n(data):
            data = np.array(data)
            return data[data>=128].sum()
    
    In CPython, this cuts 90% of the time off the original code for unsorted data, and for sorted data it cuts off another 48%. Either way, it's the fastest solution so far, even faster than using PyPy. But we can do the equivalent of translating the if to arithmetic or bit-twiddling too. I tried tricks like ~((data-128)>>31)&data, but the fastest turned out to be the simplest:
        def sumbig_n2(data):
            data = np.array(data)
            return (data&(data>=128)).sum()
    
    Now it's just as fast for unsorted data as for sorted, cutting 94% of the time off the original.

    Conclusions

    So, what does this mean for you?

    Well, most of the time, nothing. If your code is too slow, and it's doing arithmetic inside a loop, and the algorithm itself can't be improved, the first step should almost always be converting to numpy, PyPy, or C, and often that'll be the last step as well.

    But it's good to know that the same issues that apply to low-level code still affect Python, and in some cases (especially if you're already using numpy or PyPy) the same solutions may help.

  5. A lot of people have questions like this:
    I've got a 100MB CSV file. I read it in to a list, populated with a csv.DictReader, and my computer starts swapping. Why?
    Let's look at what it takes to store a 100MB file as a list of dicts of strings.

    But first, let's look at how to solve this problem, instead of how to understand it.

    Very often, you don't actually need the whole thing in memory; you're just, e.g., reading the rows from one CSV file, processing each one independently, and writing out the result to a different CSV file. You can do that by using the csv.DictReader as an iterator—for each row in the reader, write a row to the writer, without storing it anywhere. Then you only need enough memory for one row at a time.
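
    A sketch of that pattern (the filenames and the price column are made up for illustration):

        import csv

        with open('in.csv', newline='') as fin, \
             open('out.csv', 'w', newline='') as fout:
            reader = csv.DictReader(fin)
            writer = csv.DictWriter(fout, fieldnames=reader.fieldnames)
            writer.writeheader()
            for row in reader:
                # process one row independently...
                row['price'] = format(float(row['price']) * 1.1, '.2f')
                writer.writerow(row)  # ...and never store more than one row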

    Even if you do more global work (where one row's processing may depend on another row far away), do you really need this all in memory, or just in some accessible form? If you read the CSV and use it to iteratively build a database (whether a SQL database, a dbm, or your favorite kind of NoSQL store), you can get reasonably fast random access with bounded memory usage.

    Where does the memory go?

    First, let's make it specific. We've got 200K rows of 25 columns of an average of 20 characters. All of the characters are ASCII. We're using 64-bit CPython 3.4. (I'll explain how changing each of these assumptions can change things as I go along.)

    Reading the file into memory

    If you just open the file and write contents = f.read(), you get a 100MB string. That takes 100MB of memory for the string storage, plus 48 bytes for a str header, plus possibly 1 byte for a NUL terminator. So, still effectively just 100MB.

    How do I know this? Well, you can read the source, but it's easier to just call sys.getsizeof on an empty string.
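
    On a 64-bit CPython 3.4, for example, you should see something like this (48 bytes of header plus the NUL, then one byte per ASCII character):

        >>> import sys
        >>> sys.getsizeof('')
        49
        >>> sys.getsizeof('x' * 100)
        149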

    This only works because we're using CPython 3.3+ with all-ASCII data. Python 3.x strings are Unicode. In CPython 3.0-3.2, that means either 2 or 4 bytes per character, depending on your build, which would mean 200MB or 400MB. In 3.3 and later, strings are one byte per character if they're pure ASCII, two if they're pure BMP, and four otherwise. Of course you can stay with un-decoded bytes objects instead of strings—and in Python 2.7, that's often what you'd do.

    Splitting into lines

    If you read in a list of lines, with contents = list(f), you get 200K strings, each of which has that 48-byte header on top of its 500-byte string data. So, that's an extra 9.2MB.

    Plus, the list itself has a 48-byte header, and it also needs to store references to all of those strings. In CPython, each of those references is a pointer, so in 64-bit-land, that's 8 bytes. Also, lists have extra slack on the end. (Why? Expanding a list requires allocating a new chunk of memory, copying all the old values over, and freeing the old chunk. That's expensive. If you did that for every append, lists would be too slow to use. But if you leave room for extra values on each expansion—with the extra room proportional to the current size—the reallocations and copies amortize out to essentially free. The cost is a bit of wasted space.) Let's assume it's 220K pointers; that's an extra 1.7MB.

    Splitting into columns

    If you split each line into 25 columns, whether by using contents = [line.split(',') for line in f] or by using contents = list(csv.reader(f)), you get 200K * 25 strings, each of which has that 48-byte header, plus 200K lists, each with a 48-byte header plus a bit over 25 pointers, plus of course the original list of 220K pointers. Now we're talking about 281MB.

    Storing in dicts

    If you convert each line into a dict instead of a list, e.g., by using contents = list(csv.DictReader(f)), you've got the same strings. (You don't get a bunch of separate copies of the key strings; each dict just references the same strings.)

    The only difference is that instead of 200K lists, you have 200K dicts. A dict is bigger than a list for three reasons. First, each slot in a dict has to hold a hash value, a key pointer, and a value pointer, instead of just a value pointer. Second, dicts are more complicated, so they have bigger headers. Finally, dicts need more slack than lists (because they're more expensive to copy when they expand, because you have to reprobe every hash instead of just copying a big buffer of memory, and because hash tables work best with specific types of numbers for their capacity).

    So, the cost of the strings themselves is unchanged, but the 200K * (48 + 27 * 8) for the inner lists turns into 200K * (96 + 32 * 24) for the dicts. Now we're talking 624MB.

    Couldn't Python do better?

    One obvious thing Python could do is to store each dict as two pieces: a hash table full of keys and their hashes, which would be shared by all those 200K rows, plus a hash table full of just values, which would be separate for each row. CPython 3.4+ actually has the ability to do this, but it only gets triggered in special cases, like the attribute dictionaries of objects of the same type. (And yes, this means that if you replace that list of dicts with a list of instances of some class, you will actually save some memory.)
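
    A sketch of that class trick, with a hypothetical four-column file (each instance's attribute dict shares a single key table; using __slots__ would save even more):

        import csv

        class Row:
            def __init__(self, spam, eggs, beans, bill):
                self.spam = spam
                self.eggs = eggs
                self.beans = beans
                self.bill = bill

        with open('data.csv', newline='') as f:
            contents = [Row(*fields) for fields in csv.reader(f)]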

    You could take this even farther by using a shared hash table full of keys, hashes, and indices, and then each separate dict would just have an array of values, without the need for that extra slack. PyPy 2.0+ can do this, although I'm not sure exactly where it gets triggered (except that I am sure that the class trick would work here, and work even better than in CPython).

    Beyond that, is there a way we could get rid of the "boxing", the need to store both the strings and the array full of pointers to those strings?

    Well, if all of your column values are about the same length, then you could store the strings directly in the hash table (or array). See fixedhash for a quick&dirty implementation of a hash table with fixed-size keys. But typically, a CSV file has some columns that are only 2 or 3 characters long, and others that are 50 characters or longer, so you'd be wasting more space than you'd be saving.

    Failing that, you could replace the 8-byte strings with small integer indices into a giant string. How small could this be? Well, your string is 100MB long, so that takes 27 bits. If your longest column value is 64 characters long, you need another 7 bits for the length. So that's 34 bits. Accessing arrays by bits is slow and complicated, but by bytes can be reasonable, which means you can use 5 bytes instead of 8 for each string. This cuts 38% out of the cost of one of the smaller but still significant parts of your storage.

    But ultimately, you're talking about having 5 million objects alive. That's going to take a good chunk of memory to store. The only way to avoid that is to not store all those objects. 

  6. There are hundreds of questions on StackOverflow from people who want to know something like, "How do I edit a file in-place, without creating a new file."

    In general, the answer is, "You can't. You really do want to create a new file." (There is an exception, which I'll get to later.)

    Why? Well, files are just streams of bytes—or, for text files, streams of characters. If you try to edit a file by replacing a 60-character line with a 70-character line, you've just overwritten the first 10 characters of the next line.

    Of course you can get around that by shifting the entire rest of the file up by 10 characters (or, equivalently and more simply, by reading the whole rest of the file after your line, then truncating before your line, then writing your new line, then writing the rest of the file), but that's not only complicated, it's very slow, and can require huge amounts of memory if you're dealing with huge files.

    Using temporary files

    But you don't want a new file, you want to change the original file. Your text editor doesn't make you give the file a new name every time you save, after all. So how does it let you insert text in the middle of a file?

    How your text editor works

    When you tell your text editor to save, it creates a new temporary file, then moves that temporary over the original. To the end-user, that looks like the file has been changed in-place. But it does it without all the complexity and slowness of shifting most of the file around for every change.

    If you think about it, the text editor obviously isn't editing the file in place for every change anyway; nothing gets changed on disk until you tell it to save.

    A simple text editor will read the whole file into a big array (like a Python list or string) in memory, and edit that in place. But that's still shifting a bunch of memory around for every insertion or deletion, which can be a problem for big files. A smarter text editor will use a data structure like a rope—a linked list of arrays. When you want to edit in the middle of one of those arrays, it just splits that array in half around the part you're replacing, and inserts a new one for your new text. An even smarter text editor will replace most of the arrays in that linked list with references to disk positions, only reading them into memory as needed. And there are other tricks a text editor can do to be even cleverer.

    But regardless of how it's storing things, when you save, it just writes that array to a temporary file, or walks the linked list and for each chunk writes the array or copies from the specified positions in the original file, etc.

    Besides being simpler and a whole lot faster, this has other advantages. The biggest one is that saving becomes atomic. If you pull the plug in the middle of a save, obviously you've lost any changes that weren't saved yet—but you still have the previous version of the file around, which you wouldn't have if you tried to edit the file in place. This also makes it a lot easier to add features like autosaving, backup files, etc.

    A concrete example

    The most common example people who ask this question have is modifying a CSV file. You have a CSV file with columns for Spam, Eggs, Beans, and Bill, but now your prices have changed, so you want to replace the Bill value in each row with a different one, computed from the new prices. So:
        import csv
        import os
        import tempfile

        with tempfile.NamedTemporaryFile('w', dir='.', delete=False, newline='') as tmp, \
             open('breakfast.csv', newline='') as f:
            r = csv.reader(f)
            w = csv.writer(tmp)
            header = next(r)
            w.writerow(header)
            for row in r:
                bill = 2.39 * int(row[0]) + 1.98 * int(row[1]) + 0.33 * int(row[2])
                row[3] = format(bill, '.2f')
                w.writerow(row)
        os.replace(tmp.name, 'breakfast.csv')
    
    If you're using Python 3.2 or earlier (including 2.x), you don't have os.replace. In that case, see Getting atomic writes right.

    Exception: fixed-length records

    Some people are still going to say, "But it must be possible to edit a file in place. If nothing else, filesystems have to be able to do exactly that, right?"

    And of course that's true. And in very simple cases, it can be worth doing. But as things get more complex, the tradeoff gets worse. So let's start with the simplest case.

    Let's say you have a file made up of records, each one of which is exactly 100 bytes long. In that case, it's pretty easy to edit record #789: just seek to 789*100 bytes in, then write the new 100-byte record over the old one.

    (Note that a text file made up of 100-character lines does not have fixed-length records. 100 characters is only 100 bytes if your characters all happen to fit in 1 byte each (so if you're using UTF-8, if they all happen to be ASCII), and if your platform's newlines are "\n". If you're sure that's the case, then you should open the file as a binary file, and encode/decode your strings with the 'ascii' codec.)
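
    In code, overwriting record #789 looks something like this (a sketch assuming 100-byte binary records):

        def write_record(path, recno, record, reclen=100):
            if len(record) != reclen:
                raise ValueError('record must be exactly %d bytes' % reclen)
            with open(path, 'r+b') as f:  # update in place, don't truncate
                f.seek(recno * reclen)
                f.write(record)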

    But what if you want to insert a record, or delete one? Well, if the records don't have any particular order, this isn't too much of a problem: you can always insert at the end, and delete by copying the last record over the deleted record and then truncating the last 100 bytes off the file. (Or, if you don't want to invalidate indexes or iterators you might have lying around elsewhere in your program, you can just mark the records deleted in some way, then compact at the end.)

    But if the records have some meaningful order (e.g., they're sorted), then you're back to shifting the whole rest of the file up or down by 100 bytes, in which case you might as well do the atomic write-temporary-and-replace thing.

    So, that's where things stop being simple. If you want to know how to get more complex, read on; if not, just skip the whole next section.

    Fun with complex file formats

    Chunks

    What if you make your file format more complex? For example, let's say you leave large gaps in your file (possibly even huge gaps, if your platform supports sparse files), and you use a header every N records that says which records within that chunk are in use and which aren't. (You'll want the chunk size to be an even multiple of the system page size—if you don't know what that is, 4096 is a good number.)

    Then you can insert and delete, and only on the rare occasions where you've inserted so much that you fill an entire chunk do you need to shift anything outside the current chunk. (And even then, if there's room in the next or previous chunk, you can get away with just shifting within two chunks.) You will occasionally have to rewrite most of the file, but for many use cases that will be very rare.

    If you think about it, this is just a bigger version of the simple fixed-record format: each chunk is a record, and you almost never have to insert or delete a chunk in-place.

    And in fact, there's really nothing forcing you to use fixed records within the chunk; as long as the chunks are fixed-length, you can store blocks of text in the chunks just as easily. In fact, many text editors do something similar for autosave files.

    Indexes

    Even if you "almost never" have to insert a chunk in-place, sometimes you'll have to. And since that's very slow, it could affect the responsiveness of your program. Or, even worse, if it happens while the user is trying to quit or shut down, they might just kill your program before you can finish.

    Isn't there some way you can ask the filesystem to just add another block into the middle of your file? Well, technically, yes, just about any filesystem has to have a way to do that. But it's not part of any portable API (and definitely not part of Python). So, unless you want to write some complicated native code for every platform you need to run on, this isn't an option.

    But think about how filesystems must do that: To a filesystem, your file is some kind of list of blocks, ideally but not necessarily contiguous.

    And you can do the same thing yourself. Keep some kind of index—a mapping of chunks to seek positions. So instead of just multiplying 65109 by the chunk size, you look up 65109 in the index. Where do you keep this mapping? Well, you could keep it in a separate file, like "spam.idx" alongside "spam.dat". But you can also just store it at the beginning of the file. What happens when your index gets full? You just move the first data chunk to the end of the file (and update the index), so you have room for another index chunk.
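
    The read path barely changes; it's one extra lookup (a sketch assuming the index is just a dict mapping chunk numbers to byte offsets):

        def read_chunk(f, index, chunkno, chunksize=4096):
            f.seek(index[chunkno])  # instead of f.seek(chunkno * chunksize)
            return f.read(chunksize)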

    Notice that if you do this, adding new chunks is no longer so expensive, so you don't need large gaps anymore.

    If your files are gigantic, you can do two or even more levels of indexing.

    In fact, even if your files aren't gigantic, two levels of indexing may be worth doing. On a hard drive, seeking is much slower than reading sequentially, so if you can keep your reads and writes as local as possible, you can speed things up. (Of course this assumes that your filesystem generally keeps your files defragmented, so that adjacent chunks of a file are on adjacent disk blocks, but that's usually a decent assumption with modern filesystems.)

    If your files are big enough, you also may want to take advantage of the fact that most of your chunks are in order and use run-length encoding or similar alternatives.

    Free lists

    If you do a lot of inserts and deletes, eventually you'll end up with multiple empty chunks, sometimes even runs of empty chunks, just lying around wasting space.

    But you can easily add a free list to your index. When a chunk becomes empty, append it to the free list. Next time you need a new chunk, grab the last chunk off the free list and use that, only creating a new chunk if the free list is empty.

    Defragmentation

    The more inserts and deletes you do, the more your chunk list will be out of order. This means you spend more time seeking on the hard disk (which, again, is very slow), and it also means any run-length encoding or similar index compression you do becomes less useful.

    The solution is the same one filesystems use. You can do an "offline defragmentation" by just reading the chunks in sequence order and writing them out to a new file. This also frees up all the chunks in your free list, shrinking the file. Or you can do "online defragmentation"—any time you notice a badly out-of-order run of chunks, make a note to defrag that part of the index as soon as you have idle time.

    Atomicity

    Remember the big advantage of write-temporary-and-replace was atomicity: at any given time, you've either got the original file, or the new file, not half the new file, or a corrupted file. Can you get the same advantage editing in-place?

    Sure, if you're willing to write more code.

    If you're not using an index, and your chunk size is a filesystem block size, you can generally count on the fact that writing a block is an atomic operation. So, you read in the 4K chunk, modify it, write it out, and either it was written or the old version is there. When you need to insert or delete a chunk, you can use the write-temporary-and-replace trick to make that atomic.

    What if you're using an index? Well, again, when you're just changing a single block, you can do that atomically. When you're inserting blocks, you insert them at the end, then you rewrite the index. If you fail before updating the index, those new blocks aren't referenced by anything. At next startup, you could present them to the user as "auto-recovery" data, or just toss them. As long as your "transactions" are just single values, atomicity and consistency are just a matter of doing things in the right order.

    If the index has grown to more than one block, you might have to write more than one index block. The only way around that is to add multiple levels of indexing, so the top level is never larger than a single block. So you append the new chunk, then append a new version of the lower-level index chunk that refers to that chunk (if you've grown to more than 2 levels, there will be more such appends), and finally rewrite the top-level index.

    A different option is to write to a journal, which is just a file where you record operations in sequential order, so at startup you can roll back (or, occasionally, complete) partial changes that happened before you crashed.
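
    A minimal sketch of such a journal (the one-JSON-record-per-line format is made up for illustration):

        import json, os

        def log_operation(journal, op):
            # Record the operation durably *before* touching the data file,
            # so a crash leaves a replayable record behind.
            journal.write(json.dumps(op) + '\n')
            journal.flush()
            os.fsync(journal.fileno())

        # log_operation(journal, {'op': 'insert', 'chunk': 65109})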

    Finally, there are all kinds of useful ways you can add redundancy, to make it easier to detect and recover from corruption. This obviously isn't as good as avoiding it in the first place, but if you can't be 100% sure you're 100% atomic, a fallback never hurts.

    Of course often you want atomicity (and the rest of ACID) for transactions that affect more than a single record. For that, you need to build a transaction system, which I won't get into here.

    Concurrency

    Writing a block may be atomic, but reading a block, making changes, and writing it back obviously isn't. If you're just worried about surviving crashes, this isn't a problem—either your program did 1 write, or it did 0. But if you have two programs (or two threads in your program, or even two interleaved operations on the same thread) both trying to edit the file, you've got a classic race condition: program 1 reads the index, program 2 reads the index, program 1 writes back its modified index, program 2 writes back its modified index, and now program 1's changes are lost.

    The solution here is the same as anywhere else. The details are different on different platforms, but almost everywhere but Windows, the simplest version uses pessimistic flock:
        import contextlib
        import fcntl
        import struct

        @contextlib.contextmanager
        def flocking(f, op):
            fcntl.flock(f, op)
            try:
                yield
            finally:
                fcntl.flock(f, fcntl.LOCK_UN)

        # atomic read (shared lock: concurrent readers are fine)
        with flocking(f, fcntl.LOCK_SH):
            f.seek(pos)
            buf = f.read(4096)

        # atomic edit (exclusive lock: everyone else waits)
        with flocking(f, fcntl.LOCK_EX):
            f.seek(pos)
            buf = bytearray(f.read(4096))
            buf[off:off+4] = struct.pack('>I', 23)
            f.seek(pos)
            f.write(buf)
    
    This can cause a lot of contention if the changes you need to make may take a while. In that case, you use optimistic locking, or drop to the lower-level lockf (which lets you lock just part of a file instead of the whole thing), or both.
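
    For example, locking just the 4K region being edited, instead of the whole file (lockf takes a length and a start offset):

        fcntl.lockf(f, fcntl.LOCK_EX, 4096, pos)  # lock only [pos, pos+4096)
        try:
            f.seek(pos)
            buf = bytearray(f.read(4096))
            buf[off:off+4] = struct.pack('>I', 23)
            f.seek(pos)
            f.write(buf)
        finally:
            fcntl.lockf(f, fcntl.LOCK_UN, 4096, pos)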

    Do you really want to reinvent the wheel?

    At this point, you've probably realized that you're doing most of the same work a database or a filesystem would do.

    If you really want to continue on, your next step should be to take a look at the data structures and algorithms that databases and filesystems use. The key is to store your index, or possibly even your data itself, in a structure like a B-tree or B+ tree, or even a simple probing hash table. There are well-known algorithms for dealing with these data structures, and nice pre-built implementations (and the trees also give you sorting for free).

    But there's an even simpler idea: Just use a database or a filesystem to do the work for you.

    Using a database should be pretty obvious. You have all kinds of options, depending on what you want to do. Python comes with a simple key-value store (actually, a few variations) in the dbm package, and a complete relational database in the sqlite3 module, but there are plenty of more powerful key-value stores, relational databases, document-oriented databases, persistent graphs, and so on. Whatever you want probably already exists, and probably has a nice Python interface. (If the native interface to whatever you want isn't nice enough, there may be an adapter—e.g., SQLAlchemy can give you a functional wrapper or a complete ORM on top of most relational databases.) See relational database and NoSQL on Wikipedia for some ideas. All kinds of things that are hard to get right, much less optimize for each platform, will be automatically taken care of for you.

    Using a filesystem is also very simple: Just store each record (or each chunk) in a separate file, and use a directory as an index. You can create multiple levels of indices by just creating subdirectories. (Depending on your filesystem, you may need to do this every time you get to a few thousand files, or even less.) An atomic insert is just creating a temporary and moving it in-place; an atomic delete is just os.remove; an atomic edit is a write-temporary-and-replace. When you're working with the filesystem (and the OS cache, etc.), a lot of things get a lot easier.
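
    A quick sketch of that, with one file per record (the fs_* names are made up, and error handling is left out):

        import os
        import tempfile

        def fs_put(dirpath, key, data):
            # Atomic insert or edit: write a temporary in the same
            # directory, then atomically rename it over the real file.
            fd, tmp = tempfile.mkstemp(dir=dirpath)
            with os.fdopen(fd, 'wb') as f:
                f.write(data)
            os.replace(tmp, os.path.join(dirpath, key))

        def fs_get(dirpath, key):
            with open(os.path.join(dirpath, key), 'rb') as f:
                return f.read()

        def fs_delete(dirpath, key):
            os.remove(os.path.join(dirpath, key))  # atomic delete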

  7. Recently, as an aside to the proposal for static type annotations in Python, Guido offhandedly noted another suggested improvement to Python, adding ADTs (Algebraic Data Types). This has also come up in discussions about pattern matching, and about existing run-time and compile-time static-typing libraries.

    What is an ADT?

    Instead of trying to fully define an ADT in formal terms (see Wikipedia, or, better, work through a Haskell tutorial, for that), I'll try to define it in terms of values as you might use them in Python.

    Basically, an ADT is either a simple type, a product of ADTs, or a sum of ADTs. That's what makes it "algebraic".

    Product types

    What is a product of types? It's a tuple: (42, "spam") is of type int * str, or, written more simply, (int, str).*

    Often you want to add names to the values, like (answer: 42, manna: "spam"), which is of type (answer: int) * (manna: str), or (answer: int, manna: str). Note the equivalency to the "implicit namedtuple literals" that people suggest for Python every few months.

    And often you want to add a name to the tuple itself, making this a full "record" type, like ImportantStuff(answer: 42, manna: "spam"), which is of type ImportantStuff:((answer: int) * (manna: str)) or ImportantStuff(answer: int, manna: str). This is exactly like the namedtuple that Python has today. It's also like a C struct (or Python ctypes.Structure).

    Note that none of these is like an OO class, especially Python's rendition of that concept. Besides the fact that classes have methods, they have an arbitrary bag of attributes, not an ordered sequence of them. They may have private attributes that are part of the structure but not part of the interface, and computed attributes that are part of the interface but not part of the structure. And so on. (The classes in weaker OO languages like C++ and Java are closer to record types—which is why C++ just defines "class" and "struct" to mean the same thing, except for the default visibility of their members.)

    * Notice that, since a tuple type is just a product of types, a single-element tuple is the exact same thing as its one value. And an empty tuple is a singleton value of some unit type that's the result of not multiplying anything (equivalent to 1, the empty product, when multiplying integers). Python of course distinguishes single-element tuples from their values: (spam,) != (spam). And Python's closest thing to a unit type is NoneType, whose value, None, is often used inside tuples. So, Python's tuple isn't quite a purist version of an untagged product type. But I think we can ignore that difference here.

    Sum types

    A sum type is a union of two or more types. This is the part Python doesn't really have today.

    Consider a URL-fetching function, which can return either a successful download, an error download, or a failure to even get an error. (Ignore for now the fact that Python would use an exception for the third case.) So, it might return something like:

        ((headers: Sequence[str], body: Iterable[str])
         | (code: int, description: str, headers: Sequence[str], body: Iterable[str])
         | (errno: int,))

    On its own, this isn't very usable; to know which of the three you've gotten, you need to either LBYL the members of the returned record with getattr, or EAFP it by using a try:/except AttributeError:. And consider that, at least with HTTP, successful downloads actually have a code and description, just like failed downloads, meaning the two record types are identical, giving you no way to distinguish them! So often, you will want to tag the values:

        (success: (headers: Sequence[str], body: Iterable[str])
         | failure: (code: int, description: str, headers: Sequence[str], body: Iterable[str])
         | error: (errno: int,))

    But notice that a tagged union of plain (named) tuples is exactly equivalent to an untagged union of records. You could just as well write that type as:

        (Success(headers: Sequence[str], body: Iterable[str])
         | Failure(code: int, description: str, headers: Sequence[str], body: Iterable[str])
         | Error(errno: int,))

    Either way, a language could allow you to dissect unions with the same member syntax used by records. But the latter allows you to dissect them by pattern matching, or sometimes just by comparison, which is often much simpler.

    In practice

    Most languages that make heavy use of ADTs go with a sort of hybrid: the tags are part of the named record type, but named record types can only be defined inside sum types, and therefore it doesn't really matter.

    For example, here's how cons lists are generally defined:

        List(Cons(head: T, tail: List[T]) | Nil())

    Here Cons and Nil are called constructors for List. Whichever way you interpret this, Nil() is an instance of type Nil (or of the Nil kind of type List), and also an instance of type List.* Cons(1, Nil()) is an instance of type Cons[int] (or the Cons kind of type List[int]), and also an instance of type List[int].

    However, this does raise a question of namespacing: is the name Cons available in the same scope as List, or do you have to write List.Cons? (It's not a coincidence that this same question comes up with enum in many languages, which I'll get to later.)

    * It's worth noting that in most ADT-based languages, this Nil() can have a type, like List[int]; the fact that it isn't always inferable just means that sometimes you have to specify it explicitly.

    Optional/Maybe

    One special case that people often propose for Python isn't really special at all.

    A value that can be either an int or None is just int | NoneType.

    A value that can be either an int or None, with the type tagged by a name, is (Just(int) | Nothing(NoneType)).*

    In languages that make sum types very concise, like Haskell,** it's perfectly reasonable to write Maybe int as a type, and Just 42 or Nothing as values of that type, and to pattern match the value as Just n | Nothing.

    Many other languages add syntactic sugar: int? means Maybe[int], n! means the value of n or raise an exception if it's Nothing, n? means nil-chaining (an expression with n? has the value of the expression with the Just value if n is not Nothing, or it has the value Nothing if n is Nothing), there may be shortcut syntax to extract a value (e.g., Swift's "if let n: int = m {} else {}"), and so on. But under the covers, these are all doing things that can be expressed directly in terms of the sum type.

    * Note that the direct mapping of Haskell's Maybe to Python would be (Just(int,) | Nothing()). But because Python's tuples aren't mathematical product types, and NoneType isn't a real unit type, as described earlier, it's more convenient to tag non-record-typed values in the Maybe. And there's really no theoretical reason you can't do that; it just means either treating the tags as part of the union rather than part of a record constructor, or a bit of extra syntactic sugar.

    ** In Haskell, tags are available in the outer namespace, and type parameters and constructors, like functions, don't need parentheses around their arguments.

    Fitting ADTs into Python

    Product types

    As noted above, Python already has tuples and named record types, but not anonymous record types. How do we add those?

    namedtuple

    The existing namedtuple works fine for named record types.

    The proposals for implicit namedtuple literals have, as far as I know, been coherent (at least Christian Tismer's worked out all of the tricky details and showed that they weren't so tricky after all); the only reason they haven't gone anywhere is that no one could explain a real need for them. So let's just add that, and use it for our anonymous tagged records:

        (answer: 42, manna: "spam")

    The type of that is:

        (answer: int, manna: str)

    Alternative: typed __slots__

    A class with __slots__ is also pretty close to a named record type, and in some ways seems like a better fit, assuming we had a syntax to type the slots, which seems pretty easy.

    However, the fact that, unlike a namedtuple, it's not a tuple is at least a theoretical problem, if not a practical one. The fact that there is nothing requiring the constructor arguments to map to the slots is a bigger problem. And it's hard to imagine a way to write these anonymously and inline (short of some kind of horrible expression using the type's constructor).

    Sum types

    Python doesn't have sum types at all, except implicitly. And explicit sum types, especially either tagged sums of anonymous products or sums of named products, are exactly what people are missing when they ask for ADTs in Python.

    So, how could we add them?

    MyPy Union

    MyPy adds a Union type: Union[int, str] (or, more compactly, int | str) is a value that can be either an int or a str.

    While this only gives us untagged unions, as noted above, tagged unions are equivalent to untagged unions of record types. So, if we wanted to define a Haskell-like Maybe, we could do this:

        Just = namedtuple('Just', [('value', T)])
        Nothing = namedtuple('Nothing', [])
        Maybe = Just | Nothing

    Also, without pattern matching, the only way to actually use this is with isinstance checks (or EAFP try/except):

        def spam(eggs: Maybe[int]):
            if isinstance(eggs, Just[int]):
                print('Spam with {} eggs'.format(eggs.value))
            else:
                print('The spam stands alone')

    There's nothing theoretically wrong here, but it certainly is ugly.

    Swift-style Enum

    Consider that a sum type of tagged parameterless type constructors is equivalent to a Python-style Enum, with the constructors mapping to enumeration members:

        Color = (Red | Blue | Green)

        Color = Enum('Color', 'Red Blue Green')

    You can do the same things with each (although with slightly different syntax). The question of whether Red, Blue, and Green leak into the parent scope or are attributes of Color arises in exactly the same way for the two cases, and has the same pro and con arguments.

    So, what if Enum were extended to allow values to be attached to each member? This is exactly what Swift enumerations are, and this is how Swift handles sum types.

    I'm going to assume some extension to the language that allows statically typing variables and attributes, rather than doing it in comments a la MyPy, and I'm going to assume that tuples of types are specified as tuples of types rather than some other syntax, but feel free to translate.

        class Barcode(Enum):
            UPCA: (int, int, int, int) = 1
            QRCode: str = 2

    So now, you can do this:

        >>> barcode = Barcode.UPCA(8, 85909, 51226, 3)
        >>> barcode
        <Barcode.UPCA(8, 85909, 51226, 3): 1>
        >>> barcode.value
        (8, 85909, 51226, 3)

    Or, if you wanted to name the parts of the UPC-A:

        class Barcode(Enum):
            UPCA: (LL: int, L: int, R: int, RR: int) = 1
            QRCode: str = 2

        >>> barcode = Barcode.UPCA(8, 85909, 51226, 3)
        >>> barcode
        <Barcode.UPCA(LL: 8, L: 85909, R: 51226, RR: 3): 1>
        >>> barcode.LL
        8

    Of course you still need isinstance, pattern matching, or EAFP code to distinguish whether you've got a UPCA or a QRCode:

        >>> barcode = Barcode.QRCode('ABCDEFGHIJKLMNOP')
        >>> barcode.LL
        AttributeError: 'Barcode.QRCode' object has no attribute 'LL'
        >>> barcode
        <Barcode.QRCode('ABCDEFGHIJKLMNOP'): 2>

    But still, it looks a lot prettier than using Union.

    What about recursion?

    One of the important features of ADTs is that they can be recursive. See the List example above: the second element of a Cons is a List.

    In Python, this means we either need some kind of explicit forward reference (as used by MyPy), or a way to define sum types and record types in such a way that they can automatically use the type being defined.

    Notice that it's relatively easy to write an "auto-numbered" Enum class that doesn't require you to specify a value; referencing an unknown name just gives it the next number in sequence and binds it. With a slightly smarter/hackier metaclass, I believe you could also write an "auto-recursive" Enum that assumes any unknown name is the class itself, and then checks at construction time that the only unknown name seen was the class name. I haven't actually tried to do this, but I'm going to just naively assume it's possible; if not, something like MyPy's forward references may be necessary.
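
    Here's a minimal sketch of the auto-numbering half of that trick, as a plain metaclass, ignoring the extra work of integrating with enum.EnumMeta (and the auto-recursive variant):

        import itertools

        class AutoNamespace(dict):
            # A class-body namespace where looking up an unknown,
            # non-underscore name binds it to the next number in sequence.
            # (The underscore guard leaves __module__ and friends alone.)
            def __init__(self):
                super().__init__()
                self._counter = itertools.count(1)
            def __missing__(self, key):
                if key.startswith('_'):
                    raise KeyError(key)
                self[key] = next(self._counter)
                return self[key]

        class AutoNumberMeta(type):
            @classmethod
            def __prepare__(mcls, name, bases):
                return AutoNamespace()

        class Color(metaclass=AutoNumberMeta):
            Red; Green; Blue  # each bare name misses, and gets the next number

        assert Color.Red == 1 and Color.Blue == 3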

    So ADTs are just a subset of classes?

    Each namedtuple type is a class, and each Enum type is a class, and I've defined ADTs in terms of (implicit) namedtuples and Enums. This seems at first to go against the spirit of ADTs, which are a completely alternative form of typing to OO classes.

    But note that in Python, tuple is a class, int is a class, NoneType is a class; every type is a class.

    More importantly, ADTs are a strict subset of classes. The "strict subset" part is what makes them useful: they have a static structure, they have a fixed set of alternatives, and their values are nothing more than their constructor arguments, which is exactly what makes them easy to take apart again.

    Does Python really need this?

    The big question is: under what circumstances would you want to use an ADT (whether implemented as an Enum with anonymous namedtuple values, or otherwise) instead of a class?

    If the language had pattern matching that only worked on ADTs, the answer might well be "quite often".

    But without pattern matching, or with pattern matching that worked on classes that chose to opt in (as described in my previous blog post), I think it wouldn't be that often.

    A paradigm case: JSON

    JSON is (quite accidentally) an ADT. You can define it as:

        class JSON(Enum):
            null: NoneType = 0
            boolean: bool = 1
            number: numbers.Number = 2
            string: str = 3
            array: Sequence[JSON] = 4
            object: Mapping[str, JSON] = 5

    So, what would your code look like if json.loads returned one of these, instead of just returning a dict?

    For the quick&dirty case, it would obviously be a whole lot more inconvenient to parse out a JSON than a plain dict. Today, you can already write:

        j = json.loads(s)
        j['aliases'][0]['first']

    But if you want to specify a schema, map it to objects in your hierarchy, and validate it, that's a huge pain if you have to dynamically walk the structure, but comparatively easy if you treat JSON as an ADT:

        class Name((first: str, last: str)):
            def tojson(self):
                return JSON.object({'first': self.first, 'last': self.last})
            @classmethod
            def fromjson(cls, j):
                case j:
                    of JSON.object as o:
                        return cls(o['first'], o['last'])
                raise ValueError('{} is not a name'.format(j))

        class Person((aliases: Sequence[Name], age: float)):
            def tojson(self):
                return JSON.object({'aliases': JSON.array(map(Name.tojson, self.aliases)), 'age': self.age})
            @classmethod
            def fromjson(cls, j):
                case j:
                    of JSON.object as o:
                        case o['aliases']:
                            of JSON.array as a:
                                return cls([Name.fromjson(name) for name in a], o['age'])
                raise ValueError('{} is not a person'.format(j))

    (Of course there's no reason Name and Person couldn't be "pure" ADTs, with free functions instead of methods, but it seems more Pythonic this way.)

    This pattern matching syntax is nowhere near as simple as the ML or Haskell equivalent, which implies that maybe I didn't do as good a job with my straw-man syntax as I'd hoped. But note that a JSON parsing framework could automate all of that pattern matching based on the attributes and types of the class. Assuming I'd written such a thing, or found it on PyPI, or inherited it from the web framework I'm using, I could write the above as:

        class Name(JSONable[(first: str, last: str)]):
            pass

        class Person(JSONable[(aliases: Sequence[Name], age: float)]):
            pass

    So:

        person = Person.fromjson(json.loads(s))
        person.aliases[0].first

    This looks almost as simple as the non-validating case, but it will raise an exception if s doesn't actually specify a Person.

    So, is that a good enough use case to include ADTs in Python?


  8. The idea of a pattern-matching case statement has come up a few times recently (first in a proposal to add a C-style case statement, more recently as part of the proposal to add static type annotations), but the discussion ends as soon as people realize that it's not quite as easy as it appears.

    The core problem is that the key to pattern matching is unpacking constructors of algebraic data types, and Python has arbitrary OO-style classes, not ADTs. Everything else is bikesheddable, but ultimately obvious. And there are two potential solutions to that core problem.

    I'll start by laying out a straw-man proposal.

    (This was originally written 17 August 2014, and then updated 7 April 2015. The major change is the new section at the end, although I also fixed a couple typos and restored some formatting that Blogspot broke because it sucks.)

    (Also see Pattern matching again, written a year and a half later, after I became less ignorant about a few more languages, most notably Scala.)

    Semantics

    You can match any of the following:
    • Any assignment target: Always matches, and the target is bound to the match value.*
    • A constructor call: This is the tricky bit that most of this post is about.
    • Any expression that isn't ambiguously an assignment target or constructor call: Matches iff the match value == the expression value. (Generally this means literal constants.)
    • Any of the above followed by an "as" clause: Matches iff the pattern matches, and if so, binds the as target to the match value. (That is, "(0, spam) as eggs" matches iff (0, spam) matches, and if so binds eggs. This is mainly useful if you need to further decompose a subpattern, but also get a name for the whole thing.)
    • Any comma-separated list of the above: Unpacks the match value just like in an assignment target list, then matches iff each pattern matches the corresponding value.
    • Any of the above followed by a guard clause: Matches iff the pattern list matches and the if expression (identical to the conditional clause in a comprehension) is truthy. The if test can, and usually will, use values bound by the pattern list match. The "and" mentioned is of course short-circuiting.
    * Note that this counts as an assignment for function-locals rules.

    If you've seen a case expression in Haskell or ML, all of this should be familiar. Except for one conspicuous absence:

    Lists

    The very first use case anyone learns for pattern matching in Haskell or ML is matching the head and tail of a list.

    Of course in Haskell and ML, a list is, under the covers, just syntactic sugar for the Cons record type. When you match (x:xs), those x and xs are really the two members of the list object, and (x:xs) is just syntactic sugar for Cons(x, xs). While we could come up with a syntax to match lists similarly in Python, it would be misleading, because a list is not made up of a head and a tail; it's an array made up of N elements.

    On top of that, the main use for pattern-matching lists is to write recursive functions over lists, which is a decidedly non-Pythonic thing to do in the first place.

    So, I've left out list decomposition intentionally.*

    * Note that you can actually match the empty list or tuple, because those are both unambiguously not targets—or, of course, you could match the constructor call list(). And you can match x, *xs against any iterable of 1 or more elements. So, if you really want to recursively decompose a list, you can. But it's a bad idea, so there's no reason to add syntactic sugar to make it easier.

    Non-literal values

    (It's a bit of an oversimplification to say that only literals can be used as values to match, but it's a useful oversimplification for this part of the discussion.)

    In many cases, you're going to want to match the value in, say, spam[3].eggs, but because that's a valid target, you can't. Instead, you have to do it with a guard clause.

    In ML or Haskell, that isn't as much of a problem. For one thing, only identifiers are actually binding targets. For another, each case (or let) is a new scope, and you can't actually rebind variables (although you can shadow them with inner-scope variables of the same name).

    One possibility is to add a special notation that forces something to be a value rather than a target. In the comments, I suggested a hideously ugly straw-man, @target, which draws misleading parallels to not only Haskell pattern matching syntax, but also languages like Perl and its cousins. But I'm sure someone could come up with some better syntax. And that might be worth doing.

    An alternate possibility is to always treat any valid expression as a value, and only treat something as a target if the expression would raise a NameError or LookupError. If you want to rebind an existing variable (or a list member), that would be the special case that requires some special notation. However, I think that would very quickly lead to confusion. Anything that allows "spam" to sometimes mean "bind spam to the value" and sometimes "match the value against the current value of spam" would be hard to read without absorbing a lot more context. Worse, a lot of newcomers to Python would see identifiers being used in case statements for their values and assume it's like a C or Java switch statement, which would be very misleading. And then there's the question of what happens if the expression has a global binding.

    A more extreme, but much simpler, version would be to have no rule at all: there's always special syntax for a value, even if it's a literal--or, alternatively, there's always special syntax for a binding, even if it's a currently-unbound identifier. (Note that libraries that add pattern matching into Python without new syntax seem to go that way—e.g., pypatt uses "quote(x)", rather than just "x", to mean "match anything and bind to x", but they may not have much choice.) But I don't think this is necessary. I think the "anything that looks like a target is a target" rule is easy to understand, remember, teach, and even discover through experimentation.

    Syntax

    Example

        class Breakfast:
            def __init__(self, spam, eggs, beans):
                self.spam, self.eggs, self.beans = spam, eggs, beans

        # ...

        case order:
            of _, Breakfast(0, _, _):
                raise WaitressError("You can't have it without spam")
            of _, Breakfast(_, _, beans) if beans:
                raise WaitressError("Beans are off")
            of _, Breakfast(spam, eggs, _) if eggs > spam:
                raise WaitressError("That'd be more eggs than spam, can't have that")
            of customer, Breakfast(spam, eggs, beans) as breakfast:
                kitchen.fry_spam(spam)
                kitchen.fry_eggs(eggs)
                kitchen.fry_beans(beans)
                waitress.deliver(customer, breakfast)
            else:
                raise WaitressError("I'm confused by that order")
    
    This syntax obviously isn't ideal, what with the double indent and the of and else statements being statements rather than clauses, but this should be enough to discuss the real issue.

    Note that, unlike Haskell or ML, this is a statement, not an expression.*

    Like an if/elif/else chain, this just tries each case until it finds one that matches, then executes its suite. If none of them match and there's no else, you just fall off the statement, having done nothing.**

    All of these cases are 2-element pattern lists (or 2-element pattern lists with a guard clause), so order[0] gets matched against the first pattern, and order[1] against the second.***

    The first thing is always a target (_ or customer)****, so it always succeeds, and binds the target to order[0].

    The second thing in every case is a Breakfast constructor (or a Breakfast constructor with an as clause). That gets (recursively) matched against order[1], and it's the tricky bit that will be examined in detail.

    The as clause just means that, if Breakfast(spam, eggs, beans) matches, breakfast gets bound to the whole Breakfast object it matched.

    The if clauses are checked only after the pattern list is matched, so eggs and spam are the matched values. If the if condition is false, the whole pattern clause fails.

    Note that a failed clause may still end up binding some names. And the case statement doesn't happen in its own scope. If you wanted to do something confusing, like writing a pattern that always fails and then using the names it bound in the suite of the next pattern, well, consenting adults and all that.

    When the Breakfast constructor gets matched, it will in some way (described later) try to match its arguments. Some of these are targets, but not all of them—i.e., the literal constant 0. Note that matching Breakfast(0, _, _) is equivalent to matching Breakfast(spam, _, _) if spam == 0 (except that the latter will bind spam to the value).

    * Of course in those languages, almost everything is an expression. Just as Python has if statements, often used for side effects, instead of if expressions that have a value, Python would presumably have case statements often used for side effects. Note that Python did eventually add a ternary conditional expression. I suppose it might be possible to add a case expression later, but I don't think it would be very useful; the whole point of the conditional expression is to use it in very short statements that would be overly verbose if split into a 4-line statement.

    ** In most languages with pattern matching, either else is required, or falling off a case without matching anything is an error. But again, that's because case is an expression, and must have a value, and the same thing is true for if expressions.

    *** As with target unpacking, order can be any iterable, not just a sequence. I'd have to look at how assignment unpacking works under the covers, but if it's not building a list but just calling next once for each argument, we could still do the same here. Basically it would just need to try to unpack to up to the Nth element each time it first sees a pattern list of length N, and remember all previously unpacked values and whether it's received StopIteration.

    **** Note that (unlike Haskell) _ doesn't have any special meaning here; we're just binding some of the match values to the name "_". If you bind multiple values to the same name, it gets one of them arbitrarily.

    Grammar

    I've actually built a GRAMMAR that adds this syntax, and from some simple tests there don't seem to be any problems parsing any of it. But of course it would need new AST nodes, at least dummy code generation, and a real test suite, and I didn't build any of that, so don't take that as too strong of a promise.

    Matching constructor calls

    The big question is, how can Python know what to do with Breakfast(spam, eggs, beans)?

    In Haskell or ML, a case statement matches by deconstruction. Breakfast in those languages would be akin to a namedtuple('Breakfast', 'spam eggs beans'), so it's really just as easy as tuple unpacking, which Python already does, except with the addition of checking the type.

    But in Python, Breakfast is a class. The fact that its constructor happens to take three arguments, and it happens to assign them to attributes with the same name in the same order (even if that order were recoverable…) is just a lucky accident, not something the language requires, or even encourages.

    Option 1: Matching by construction

    One alternative is to actually call the Breakfast constructor with special values that override __eq__, then trust that Breakfast's __eq__ method will do the right thing. Like this:
        class MatchParam:
            def __eq__(self, other):
                self.matched = other
                return True
    
    So, to match order[1] against, say, Breakfast(0, eggs, _), Python would do this:
        eggs, _ = MatchParam(), MatchParam()
        if Breakfast(0, eggs, _) == order[1]:
            eggs, _ = eggs.matched, _.matched
            # if there's a guard clause, check it here
            # run the suite code here
    
    To give a more complicated example, if Order were a class rather than a tuple, we could match Order(customer, Breakfast(0, eggs, _)) like this:
        eggs, _ = MatchParam(), MatchParam()
        customer = MatchParam()
        if Order(customer, Breakfast(0, eggs, _)) == order:
            # etc.
    
    The "as" clause makes this marginally more complicated, but it should be obvious how to make that work.

    Unfortunately, while this works for many simple classes, there are plenty of classes for which it does the wrong thing. Basically, any class that does anything with the arguments in the constructor beyond storing them will probably fail. (Even static type checking will complain that a MatchParam is not a str… although that could be handled by making MatchParam act like a subclass of any type.)

    Worse, some classes would succeed, but only by doing something dangerous or expensive. For example, I don't think you'd want to create a new FileIO(path, 'w') just to match an existing one!

    So, I think this is ultimately a non-starter. Which is a pity, because it's so simple, and so close to what ML-style languages do…

    Option 2: Matching by match protocol

    Any automatic form of matching is going to have the same "FileIO problem". Really, matching has to be something that classes opt in to. The obvious way to do that is to add a match protocol: a new dunder method that's called to get back the constructor arguments, so they can be checked. Like this:
        class Breakfast:
            def __init__(self, spam, eggs, beans):
                self.spam, self.eggs, self.beans = spam, eggs, beans
            def __match__(self):
                return self.spam, self.eggs, self.beans
    
    Now we can match order[1] against Breakfast(0, eggs, _) like this (showing the semantics by transforming to existing Python in a way that could be implemented programmatically, although of course the actual implementation wouldn't do that--which means there's no need to worry about "macro hygiene" problems with those underscore-prefixed names affecting existing variables):
        if isinstance(order[1], Breakfast):
            _0, eggs, _ = order[1].__match__()
            if _0 == 0:
                # run the suite code here
    
    … except that if _either_ if fails, we want to fall through to the next pattern, not just the outer one. I won't try to show how to write that in any of the examples, just keep it in mind. (Note that if case statements evaluate to a function definition and call, like comprehensions, this is a lot simpler to write. But that has the side effect of not exposing bound variables to the outer scope--which may be something you actually want, but if not, it's not something you want to add incidentally...)

    To match Order(customer, Breakfast(0, eggs, _)):
        if isinstance(order, Order):
            customer, _breakfast = order.__match__()
            if isinstance(_breakfast, Breakfast):
                _0, eggs, _ = _breakfast.__match__()
                if _0 == 0:
                    # etc.
    

    And to match Order(0, Breakfast(0, eggs, _) as breakfast) if eggs:
        if isinstance(order, Order):
            _0, _breakfast = order.__match__()
            if _0 == 0:
                if isinstance(_breakfast, Breakfast):
                    _0, eggs, _ = _breakfast.__match__()
                    if _0 == 0:
                        breakfast = _breakfast
                        if eggs:
                            # suite here
    
    This transformation shows exactly why pattern-matching can be attractive. It's just syntactic sugar, but it's syntactic sugar for code you wouldn't want to write explicitly, and you certainly wouldn't want to read.

    The obvious downside here is that most simple classes won't be matchable. But if matching has to be opt-in, that seems pretty much unavoidable.

    Keyword arguments?

    One obvious problem with __match__ returning a tuple is that it can't match keyword arguments. Which means classes with keyword-only parameters in their constructors can't be matched at all.

    I think this could be solved by returning something like an inspect.ArgSpec instead of a tuple, but I haven't investigated to make sure this is feasible. Anyway, it should be obvious how to fit it into the case statement syntax; it's only the semantics that need to be worked through.

    Can we remove the boilerplate?

    Clearly, we could trivially add __match__ support to namedtuple, which helps in cases where you really do just want an ML-like record type and were already going to use namedtuple for it.

    It would be easy to define a @simpleclass decorator that generates a __match__ automatically by, e.g., inspecting the parameter names to __init__. Or, alternatively, the decorator could just take the attribute names and generate both the __init__ and __match__ methods for you.
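
    For example, the first version might look something like this (a sketch; simpleclass and __match__ are of course hypothetical, and it assumes each parameter really is stored in a same-named attribute):

        import inspect

        def simpleclass(cls):
            # Generate __match__ by assuming each __init__ parameter is
            # stored in an attribute of the same name.
            names = [name for name in inspect.signature(cls.__init__).parameters
                     if name != 'self']
            def __match__(self):
                return tuple(getattr(self, name) for name in names)
            cls.__match__ = __match__
            return cls

        @simpleclass
        class Breakfast:
            def __init__(self, spam, eggs, beans):
                self.spam, self.eggs, self.beans = spam, eggs, beans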

    There could also be special decorators for dealing with __slots__ classes, classes with only @property attributes, etc.

    But somehow, after the first one, each seems increasingly less Pythonic. It might be appropriate for certain frameworks; you might even want to build a framework with base classes (and/or metaclasses) that do this automatically, instead of using a decorator. (For example, a Django model seems like it could define matching in terms of fields—except for the fact that you generally can't actually construct them directly, you have to call a create method… but there's no reason that Model.objects.create couldn't itself be a matchable type, rather than Model.)

    Can we leverage the copy/pickle protocol?

    The match protocol is really doing the same thing as the copy/pickle protocol: it's asking a class instance for its values, in a form that could be used to construct an equal instance. Could this be leveraged?

    The copy/pickle protocol is very flexible. While classes can implement __getnewargs__ or __getnewargs_ex__ to provide constructor arguments, they can also implement __getstate__ and __setstate__, or even __reduce_ex__ and a custom constructor-from-reduction function. And it's not clear how anything other than __getnewargs_ex__ could be used for matching unless the class also implemented __matchstate__ or a custom matcher-from-reduction function.

    Still, this might be a good answer: Make __getnewargs__ and __getnewargs_ex__ automatically provide match information, and provide a way for the other alternatives to optionally specify match information (and if they don't, the class just can't be matched).

    However, there aren't all that many classes that use __getnewargs__. Most classes that are simple enough to not need __getstate__ or __reduce__ just rely on the default pickler in object.__reduce_ex__, which of course stores the __dict__ and a _reconstructor function that calls __new__ but not __init__ and then updates the __dict__, which won't be of any use to matching.

    And encouraging people to implement __getnewargs__ if they want to match means that they will no longer get the default pickler. For a class that does expensive or dangerous work in the constructor, this means copy.copy will now do that expensive or dangerous work—so we're right back to the FileIO problem.

    So, I think that, despite the obvious similarities between the two protocols, they have to remain separate. (Maybe in a new Python-like language with both ideas there from the start, it would be a different story.)

    Beyond case

    In languages that rely heavily on pattern matching, it can often be used in two other places: function definitions, and let expressions. The equivalents of both are feasible in Python.

    Function definitions

    In Haskell, you can define a function as, in effect, a series of separate definitions that use patterns as their arguments.

    Putting it in Python terms:
        def fact(0):
            return 1

        def fact(n):
            return n * fact(n-1)
    
    In Haskell, this is explicitly syntactic sugar for declaring a single function with a case statement. In other words, the compiler turns that into:
        def fact(_n):
            case _n:
                of 0:
                    return 1
                of n:
                    return n * fact(n-1)
    
    Note that this is different from overloading by type in languages like C++ (or in Python's functools.singledispatch or any of the multiple dispatch modules on PyPI), because it's overloading by structure (or value, in the case of constants). But it can be used to overload by type just by matching different type constructors.

    In Python, we would need some way to mark these as pattern overloads, as opposed to redefinitions, but the answer to that is obvious, and already used by singledispatch: use a decorator:
        @pattern
        def fact(0):
            return 1

        @fact.pattern
        def fact(n):
            return n * fact(n-1)
    
    (The specific "@fact.pattern" syntax isn't absolutely necessary; "@pattern" alone could look up __name__ and find the right thing to attach to, and some of the alternative overloading modules on PyPI do this. I've just written it this way to parallel singledispatch.)

    This would need pretty minimal language support: a function definition can take a single case pattern instead of a normal argument list. Normally, such a function is not callable (it always raises a TypeError), but the pattern decorator can inspect it to build something callable out of it. I'll leave the implementation of the pattern function as an exercise for the reader.

    Assignments

    Haskell of course doesn't have assignments; it's a pure (meaning purely-immutable) functional language. And let expressions aren't the same as assignment statements. So in this case, rather than start off with the parallel and then show how it maps to Python, I'll just show the Python:
        Breakfast(spam, eggs, beans) = order[1]
    
    It should be obvious what this means, and how to implement it given the above.

    What if there are constants?
        Breakfast(0, eggs, beans) = order[1]
    
    That matches eggs and beans if order[1].spam == 0, and raises MatchException("0 does not match 42") if order[1].spam == 42.

    There's no ambiguity here. A call is not a valid target today, nor is a literal. A target list is a valid target, but since a pattern list is a superset of a target list, this isn't a problem. (At least I'm pretty sure there aren't any assignments that are valid today but that would have a different meaning if treated as a pattern assignment.)

    And yes, you could even make as clauses and if clauses work here for consistency, although I don't think they'd be useful that often.

    In fact, the easiest way to add pattern-matching assignments is to redefine assignments in terms of case statements, so target_list = expression is equivalent to:
        case expression:
            of target_list:
                pass
            else:
                raise ValueError
    

    What about adding ADTs to Python?

    There are really two parts of ADTs: record types, which are basically tuples with named elements, and sum types, which are tagged unions with named values. The first is exactly the same thing as collections.namedtuple, which Python already has. Adding pattern matching for namedtuples would be trivial (with the match protocol described above, or otherwise). The second, Python doesn't have—but enum.Enum is halfway there. The only part that's missing is the ability to attach a value to an enumeration member (and, maybe, statically specify a type for each member). This is exactly the approach that Swift takes. In Python terms, imagine you could do this:
        class Barcode(enum.Enum):
            UPCA = 1
            QRCode = 2

        >>> barcode = Barcode.UPCA(17, 23, 42, 42)
        >>> barcode
        <Barcode.UPCA: 1 = (17, 23, 42, 42)>
        >>> barcode.value
        (17, 23, 42, 42)
    
    For more on this idea, see my separate post ADTs for Python.

    If we had this, would it be worth adding pattern matching only for Enum and namedtuple? I don't think so. Neither of these types is common enough to even be a builtin; adding new syntax just for them seems silly.

    However, the ability to match on these types in particular--as well as any other types that opt in--might be a good argument for adding a general match protocol, as described above.

    Is any of this a good idea?

    I'm not actually sure it is.

    A big part of the reason pattern matching is so important in functional languages is that it's declarative. Like let, it's an expression that binds values within a subexpression, and has the value of that subexpression.

    There doesn't seem to be any good reason you couldn't use pattern matching to bind names applicatively for the rest of the scope, or to execute side effects, any more than there's a good reason for statements and if statements shouldn't exist. But it's also hard to find a nice paradigm case from a pattern-matching language that isn't better rewritten in Python in a more traditional iterative style. Someone would need to find a good use case that really is more readable as a pattern decomposition.

    It's possible that things like singledispatch and the proposed new static type annotations will gradually change Python into a language where pattern matching is a more obvious win (as the addition of iterators and generators changed it into a language where other features borrowed from Haskell and friends turned out to be a bigger win), but it's hard to anticipate that. Certainly adding features like not-quite-sum union types and not-quite-parametric generic types looks like a step in that direction, but it's way too early to tell how big of a step, and how far Python will go, and what it will look like when it gets there.

    Szymon Pyzalski's proposal

    After reading Szymon Pyzalski's proposal on python-ideas, there are a few ideas worth stealing from him, or at least considering.

    No targets in patterns

    Szymon solves the tricky problem with distinguishing values from targets in patterns by just not having targets. Instead, the "as" clause handles the existing Python tuple decomposition, which handles some simple cases, and when that isn't sufficient, the match object can dump a bunch of variables into the case statement's scope. I don't like that, for multiple reasons, but let's not get into those here. However, it does make some of his other ideas fit in a lot more simply, so keep that in mind.

    "for" instead of "case"

    Trying to pick a good keyword for an addition to Python is always tricky. Adding a new keyword means that any code that used that keyword as an identifier is now illegal, and I'm sure there's code out there that has variables named "case" or "switch" or "match" or anything else you might pick. On the other hand, reusing an existing keyword runs the risk of making the syntax ambiguous to the parser and/or to a human. Szymon uses "for", and I think you can make that unambiguous to the parser... but it looks odd to a human. Especially if the pattern (or the iterable, in an existing loop statement) is long and complex, so you have to hunt for an "in" (and one not inside another expression!) to tell which type of "for" statement it is. I think introducing "case" is, all things considered, less bad, but of the existing keywords, "for" does seem to be the best bet.

    No keyword for subclauses

    In his proposal, there's no "of" or "else" keyword, you just put the pattern followed by a colon. (For the "else" case, you use an ellipsis followed by a colon.) And this isn't just more concise, it looks cleaner.

    I think this means the parser has to actually treat these as clauses of the case statement, instead of (as I did) treating them as separate statements, then having a post-parser rule (equivalent to the "break must be inside a loop" rule) to only allow them inside a case, which would be harder to implement--but then again, the way I implemented it is clearly a hack, so if his way is doable, that's no big loss.

    Matching on types

    There are a lot of cases in Python where you'd like to match by type, without decomposing. In fact, going back to the duality between function overloading and pattern matching, all of the examples for single and multiple dispatch are also potentially examples for type-based pattern matching. Here's the motivating example from Matthew Rocklin's multipledispatch library converted to pattern matching:
        def add(*args):
            case args:
                of (x: int, y: int):
                    return x+y
                of (x, y):
                    return "%s + %s" % (x, y)
    
    Of course this requires another extension to the language, being able to use function-style type annotations in assignment target lists, one that we're not likely to get any time soon. (Note that PEP 484 explicitly puts off even considering a syntax like "x: int = 3" for future consideration.) Then again, we're not likely to get pattern matching any time soon, either, so maybe that's not a problem.

    Without that, by adding the rule that values match pattern values if they're equal or if the pattern value is a type and the value is an instance of that type, you could write it as "of (@int, @int) as x, y:". But that only works in simple cases (again, where tuple decomposition works), and it's more verbose, and it requires the horrible "@" syntax for disambiguating a value from a target (which I hoped would be so rarely needed that it wouldn't matter how ugly it is, but it seems like this case would be pretty common if it were allowed).

    Regular expressions

    Many newer scripting languages have C-style case statements that can match regular expressions. Szymon includes something similar in his proposal, and takes it even farther, by providing a way to decompose the regular expressions: (?P<x>...) named groups act like variable names. For example:
        case xy:
            of re.compile(r'(?P<x>[0-9]+)-(?P<y>[0-9]+)'):
                return range(x, y)
    
    This works with his design of dumping variable names into the scope. But with ML-style decomposition with targets, it could only work if regexp syntax were part of the language syntax--as in, say, Perl.

    Also, I'm not entirely sure this is a good idea anyway. There's a reason regexps aren't embedded in the language in Python as they are in Perl. While it makes a few things harder or more verbose, it also discourages people from using regexp solutions to non-regexp problems. And, in fact, Szymon's motivating example seems to be exactly one of the cases where regexps are an attractive nuisance, not an actual solution: he wants to match two numbers as either a tuple, a dict of values named x and y, or a string, separated by a hyphen. Like this (translated to my case ... of syntax and slightly simplified):
        case point:
            of (Number, Number) as x, y: return x+y
            of {'x': Number, 'y': Number}: return x+y
            of re.compile(r'(?P<x>[0-9]+)-(?P<y>[0-9]+)'): return x+y
    
    That looks nice, except that the third one doesn't actually match two numbers, it matches two Arabic numerals representing nonnegative integers. It won't work for "-5-3", or "3.1-3.5", or "1+2j-1-2j". The first two, on the other hand, work for any Number type--even one I didn't know about (say, if the user has imported a quaternion library).

    It's also worth noting that he presents the first two as parallels to "a tuple (x, y)" and "a mapping {'x': x, 'y': y}" (in his proposal, other objects are also matched by effectively converting themselves into a tuple or a mapping, so this covers more than it appears to), and it's pretty clear that (with the addition of "where x and y are Numbers") the Python pattern is an exact translation of the for-humans pattern. But the third is presented as a parallel to "a string '{x}-{y}'.format(x=x, y=y)", which not only doesn't actually match (as explained above), but is not at all readable as doing so.

    Can this be hacked into Python as a module?

    MacroPy comes with pattern matching as an example of the kinds of things you could add to Python using its macro facility. It essentially uses the "decorator to define both __init__ and __match__" idea above, and only classes defined with that decorator can be matched, and then it (ab)uses the with and if statements, but it seems pretty nice.

    If you don't want a full macro-processing import hook, there are hacks that can accomplish some of the same by, e.g., ignoring the code and inspecting the function's source, or even decompiling the function. Or, if you're willing to give up some niceties, you can write things a bit more clumsily and use a portable (non-hacky) decorator to convert. Or, with even more clunky syntax, you can do it with just a special type with overloaded operators. There are lots of implementations using every one of these tricks; Grant Jenks' PyPatt helpfully provides a quick survey of them (as well as being a good one in itself).

  9. Weak typing

    A language with weak typing is one where programs can escape the type system. Another way to describe it is that values can change types.

    For example, in C, I can create a pointer to characters, then tell the compiler I want to use it as a pointer to integers:
        char sz[] = "abcdefg";
        int *i = (int *)sz;
    
    On a little-endian platform with 32-bit integers, this makes i into an array of the numbers 0x64636261 and 0x00676665. In fact, you can even cast pointers themselves to integers (of the appropriate size), or vice-versa:
        char *spam = (char *)0x12345678;
        spam[0] = 0;
    
    And this means I can overwrite memory anywhere in the system.*

    * Of course modern OS's use virtual memory and page protection so I can only overwrite my own process's memory, but there's nothing about C itself that offers such protection, as anyone who ever coded on, say, Classic Mac OS or Win16 can tell you.

    Traditional Lisp allowed similar kinds of hackery; on some platforms, double-word floats and cons cells were the same type, and you could just pass one to a function expecting the other and it would "work".

    Most languages today aren't quite as weak as C and Lisp were, but many of them are still somewhat leaky. For example, any OO language that has an unchecked "downcast"* has a type leak: you're essentially telling the compiler "I know I didn't give you enough information to know this is safe, but I'm pretty sure it is," when the whole point of a type system is that the compiler always has enough information to know what's safe.

    * A checked downcast doesn't make the language's type system any weaker just because it moves the check to runtime. If it did, then subtype polymorphism (aka virtual or fully-dynamic function calls) would be the same violation of the type system, and I don't think anyone wants to say that.

    Very few "scripting" languages are weak in this sense. Even in Perl or Tcl, you can't take a string and just interpret its bytes as an integer.* But it's worth noting that in CPython (and similarly for many other interpreters for many languages), if you're really persistent, you can use `ctypes` to load up `libpython`, cast an object's `id` to a `POINTER(Py_Object)`, and force the type system to leak. Whether this makes the type system weak or not depends on your use cases—if you're trying to implement an in-language restricted execution sandbox to ensure security, you do have to deal with these kinds of escapes…

    * You can use a function like struct.unpack to read the bytes and build a new int out of "how C would represent these bytes", but that's obviously not leaky; even Haskell allows that.
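
    To make that escape concrete, here's a minimal sketch. It's CPython-only, and it relies on implementation details (the object header layout), not any documented API; the last line shows the non-leaky struct.unpack alternative from the footnote:
        import ctypes
        import struct
        import sys

        x = ['spam']

        # Reinterpret a raw address as an object reference; nothing checks this.
        y = ctypes.cast(id(x), ctypes.py_object).value
        assert y is x

        # Read the refcount straight out of memory: CPython stores it as the
        # first Py_ssize_t of every object (again, an implementation detail).
        print(ctypes.c_ssize_t.from_address(id(x)).value)
        print(sys.getrefcount(x))  # one higher: the call itself holds a reference

        # The non-leaky version: build a new int from how C would represent
        # these bytes.
        print(struct.unpack('<i', b'abcd')[0])  # 1684234849 == 0x64636261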

    Implicit conversion

    Implicit conversion is really a different thing from a weak or leaky type system.

    Every language, even Haskell, has functions to, say, convert an integer to a string or a float. But some languages will do some of those conversions for you automatically—e.g., in C, if you call a function that wants a float, and you pass it an int, it gets converted for you. This can definitely lead to bugs with, e.g., unexpected overflows, but they're not the same kinds of bugs you get from a weak type system.

    C isn't really being any weaker here; you can add an int and a float in Haskell, or even concatenate a float to a string; you just have to do it more explicitly.

    Or, using the alternate definition, the value's type isn't changing from int to float, the program is creating a new float by calling a conversion function on the int, but the original int is still sitting around in whatever variable (or temporary location) it was always in.

    With dynamic languages, the whole idea of implicit conversion is pretty murky. There's no such thing as "a function that wants a float" in Python or Perl. But there are overloaded functions that do different things with different types, and there's a strong intuitive sense that, e.g., adding a string to something else is "a function that wants a string".

    In that sense, Perl, Tcl, and JavaScript appear to do a lot of implicit conversions ("a" + 1 gives you "a1"), while Python does a lot fewer ("a" + 1 raises an exception, but 1.0 + 1 does give you 2.0*).

    * Actually, in modern Python, that can be explained in terms of OO subtyping, since `isinstance(2, numbers.Real)` is true. I don't think there's any sense in which `2` is an instance of the string type in Perl or JavaScript… although in Tcl, it actually is, since _everything_ is an instance of string.
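
    A quick demonstration of the difference (the numbers.Real check is the one from the footnote):
        import numbers

        print(isinstance(2, numbers.Real))  # True: an int already is a Real
        print(1.0 + 1)                      # 2.0, explainable as subtyping

        try:
            'a' + 1
        except TypeError as e:
            print(e)  # Python refuses the conversion Perl/JS would do implicitly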

    But it's pretty hard to put that into anything beyond a loose, intuitive sense. Why shouldn't there be a + function that takes a string and an int? Obviously (string, int)->string is a perfectly valid type for a function to have, since it's the type of, e.g., the indexing operation (indexing a string with an int gives you a string).

    Another meaning for strength

    There's another, completely orthogonal, definition of "strong" vs. "weak" typing, where "strong" means powerful/flexible/expressive.

    For example, Haskell lets you define a type that's a number, a string, a list of this type, or a map from strings to this type, which is a perfect way to represent anything that can be decoded from JSON. There's no way to define such a type in Java. But at least Java has parametric (generic) types, so you can write a function that takes a List of T and know that the elements are of type T; other languages, like early Java, forced you to use a List of Object and downcast. But at least Java lets you create new types with their own methods; C only lets you create structures. And BCPL didn't even have that. And so on down to assembly, where the only types are different bit lengths.

    So, in that sense, Haskell's type system is stronger than modern Java's, which is stronger than earlier Java's, which is stronger than C's, which is stronger than BCPL's.

    So, where does Python fit into that spectrum? Again, dynamic types make this murky.

    In many cases, duck typing allows you to simulate everything you can do in Haskell, and even some things you can't; sure, errors are caught at runtime instead of compile time, but they're still caught. However, there are cases where duck typing isn't sufficient.

    For example, in Haskell, if you have a doubly-optional value (e.g., from calling zip_longest* on a list of optional values), you can distinguish None (filled in by zip_longest) from Just(None) (an actual None value from the original list). In Python, they're both None—that's why zip_longest needs a fillvalue parameter, so you can pass an alternate sentinel.

    * I'm using Python terminology and syntax instead of Haskell here, so people who don't know Haskell can understand. I assume people who do know both languages can figure out what I mean.
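
    Here's what that workaround looks like in practice; a fresh object() sentinel is the standard trick for distinguishing the padding from a real None in the input:
        from itertools import zip_longest

        a = [1, None, 3]
        b = ['x', 'y']
        sentinel = object()
        for left, right in zip_longest(a, b, fillvalue=sentinel):
            if right is sentinel:
                print(left, 'has no partner (padding)')
            else:
                print(left, 'pairs with', right)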

    For another example, in Haskell, the concept of an empty list of integers makes perfect sense; in Python, an empty list is an empty list. So, in Haskell, you could decide that reducing + over that list should return 0*; in Python, you can't.

    * In fact, Haskell doesn't let you do this; if you call the reduce function with no start value on an empty list, you get an error. But its type system is powerful enough that you _could_ make this work, and Python's isn't.
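
    You can see both halves of this in Python's reduce; the only way to get 0 for an empty list is to supply the identity yourself:
        from functools import reduce
        from operator import add

        print(reduce(add, [], 0))  # 0: the identity is supplied explicitly

        try:
            reduce(add, [])
        except TypeError as e:
            print(e)  # Python can't know this was meant as an empty list of ints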

    On top of that, in many cases, it's arguable whether Python really is simulating a stronger type system. For example, in Haskell, if you have a function that returns either a string or nothing, it returns Maybe String; to access the string, you have to unwrap the Just, even after testing for Nothing. Normally, you do this with pattern matching:
        case spam:
            of None:
                print('I got nothing')
            of Just(s):
                print('I got {}'.format(s))
    
    In Python, the same function would return either None, or a str; to access the str, you just access it directly after testing for None, no need to pull it out of a Just. So, does this mean Python is simulating disjunctive types, and its if not None is doing implicit lowering? On the one hand, when spam is None, trying to call, e.g., len(spam) gives you effectively the same error in Python as you'd get in Haskell. On the other hand, if you call, say, __sizeof__ or __lt__ without testing for None, it will work in Python, and the equivalent would definitely not work in Haskell.
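
    A minimal illustration of both hands of that argument:
        spam = None

        try:
            len(spam)
        except TypeError as e:
            print(e)  # roughly the error Haskell's type checker prevents

        print(spam.__sizeof__())  # works: None is a real object with methods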

    I think this is a matter of degrees; the cases where Python gets it "wrong" according to Haskell are just more examples of where duck typing isn't as powerful as Haskell's type system. (And, conversely, the places where Python gets it "right" are examples of where Python is more concise and readable without giving up any power.)

  10. The official tutorial does a great job explaining list comprehensions, iterators, generators, and generator expressions at a high level. Since some people don't want to read even that much, I wrote a post on comprehensions for dummies to summarize it.

    And if you want to know the lowest level nitty gritty of how they work, there's documentation to point you to the right place, and then the code is relatively readable, at least if you understand C well and know the basics of how CPython works.

    But what if you're somewhere between the two extremes? People occasionally ask this on StackOverflow, and the questions invariably get closed because "explain to me how this works" is not a good question for the SO format, but it's a perfectly good question for, say, a blog.

    List comprehensions

    Under the covers, Python compiles the guts of your list comprehension into a function, and then compiles the comprehension itself into a call to that function. I'll oversimplify it a bit, but the basic idea is this:
        [random.choice(string.ascii_lowercase + " ") for i in range(n)]
    
    … turns into:
        def _comprehension(source):
            result = []
            for i in source:
                result.append(random.choice(string.ascii_lowercase + " "))
            return result
        _comprehension(iter(range(n)))
    

    Of course the function isn't defined in the middle of the expression; it's sort of, but not quite, as if it were defined in the previous statement (it is definitely defined in the local scope, of course, so it can access your variables as closure cells). And it doesn't have a visible name like _comprehension. Also, because comprehensions are limited in form, the compiler can optimize things a bit. But that's the basic idea.

    Set and dict comprehensions

    A set comprehension looks exactly like a list comprehension, except it starts with result = set() and does add instead of append.

    A dict comprehension is basically the same, but the method it would call on each colon-separated key-value pair doesn't really exist (dicts have no equivalent of append or add), so just think of it as result[key] = value.
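
    A sketch of both, parallel to the list version above (the names are made up, and the real dict version does the store in bytecode rather than through any method):
        def _set_comprehension(source):
            result = set()
            for i in source:
                result.add(i * 2)
            return result

        def _dict_comprehension(source):
            result = {}
            for i in source:
                result[i] = i * 2
            return result

        assert _set_comprehension(iter(range(5))) == {i * 2 for i in range(5)}
        assert _dict_comprehension(iter(range(5))) == {i: i * 2 for i in range(5)}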

    Generator expressions

    A generator expression looks a lot like a list comprehension, except that it's a generator function instead of a regular function. In other words:
        def _comprehension(source):
            for i in source:
                yield random.choice(string.ascii_lowercase + " ")
        _comprehension(iter(range(n)))
    
    In fact, generator expressions (even though they came later than list comprehensions in Python history) are really the core concept; you can almost build everything else on top of them:
        a = [i*2 for i in range(10)]
        b = list(i*2 for i in range(10))
        assert a == b
        a = {i*2 for i in range(10)}
        b = set(i*2 for i in range(10))
        assert a == b
        a = {i: i*2 for i in range(10)}
        b = dict((i, i*2) for i in range(10))
        assert a == b
    
    But the other comprehensions aren't quite pure syntactic sugar.

    First, they're up to 40% faster (the gap is biggest in the degenerate case where the loop body does almost nothing, so the comprehension's lower overhead dominates).
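
    If you want to check on your own machine, a quick timeit comparison will show the gap (exact numbers vary by Python version and hardware, so treat 40% as a ballpark):
        import timeit

        print(timeit.timeit('[i for i in range(1000)]', number=10000))
        print(timeit.timeit('list(i for i in range(1000))', number=10000))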

    Second, raising StopIteration just ends a generator expression early, but in a list/set/dict comprehension it ends the whole containing expression; you don't get a list up to the point where you raised StopIteration, you get nothing. (This difference is apparently actually a bug, but not one worth fixing; see my post Can you optimize list(genexp) for further details on what it would take to fix it.) So, in the rare cases where you need to handle StopIteration properly (which I personally have only come across once, and it was while writing code to experiment with an idea for changing the language, not real code…), you have to use list(…), not […].
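
    A sketch of the difference, under the 3.4-era semantics this post describes (PEP 479 later changed StopIteration inside a generator to RuntimeError, which makes the genexp line an error too in 3.7+):
        def take(x):
            if x == 3:
                raise StopIteration
            return x

        print(list(take(i) for i in range(10)))  # [0, 1, 2]: the genexp just ends

        try:
            [take(i) for i in range(10)]
        except StopIteration:
            print('no partial list; the exception escaped the comprehension')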

    Finally, if you're writing code that's backward compatible with Python 2.x, generator expressions worked the same way they do in 3.x, but list comprehensions didn't; they were inlined into the current function, meaning they leaked the iteration variable into the outer scope (overwriting any existing variable with the same name). (Set and dict comprehensions, which arrived in 2.7, already used the 3.x-style hidden function.) And if you're still using Python 2.6 or earlier, you don't have set or dict comprehensions at all, so you have no choice but to use set(…) or dict(…).

    Disassembling comprehensions

    If you want to dive in further yourself, you might think about disassembling the comprehension. But if you try that, you're just going to be disassembling the code that builds and calls a function, not the actual internal comprehension function's code.

    Of course you run into that problem with any local function, and you can get around that easily by having the outer function return the local function (or its disassembly, or whatever), but that doesn't work here, because you don't have a name for the local to return (and it's not in locals() or anywhere else introspectable).

    Well, you can always put a comprehension in a module and disassemble everything in the module.

    Or you can use this quick&dirty hack:
        >>> import sys, dis
        >>> frame = [sys._getframe() for _ in range(1)][0]
        >>> dis.dis(frame.f_code)
          1           0 BUILD_LIST               0
                      3 LOAD_FAST                0 (.0)
                >>    6 FOR_ITER                18 (to 27)
                      9 STORE_FAST               1 (_)
                     12 LOAD_GLOBAL              0 (sys)
                     15 LOAD_ATTR                1 (_getframe)
                     18 CALL_FUNCTION            0 (0 positional, 0 keyword pair)
                     21 LIST_APPEND              2
                     24 JUMP_ABSOLUTE            6
                >>   27 RETURN_VALUE
    
    If you know Python bytecode, you can see some of the optimizations I mentioned—there's no GET_ITER before the FOR_ITER, because the source is always passed in as an iterator; it uses special BUILD_LIST and LIST_APPEND bytecodes instead of calling the list constructor and looking up and calling its append method and then having to push the list again; it jumps straight from the loop to the RETURN_VALUE rather than doing the extra POP_BLOCK work and pushing some value, because the list set up in BUILD_LIST is already guaranteed to be on top of the stack.

    You can probably guess how set and dict comprehensions work: they do BUILD_SET and BUILD_MAP, and SET_ADD and MAP_ADD instead of the list equivalents. And MAP_ADD requires two things on the stack instead of one.

    Generator expressions are similar: no BUILD_LIST, a YIELD_VALUE followed by POP_TOP instead of LIST_APPEND, and a LOAD_CONST None at the end because they have nothing to return, but those are the only differences.
