In a recent thread on python-ideas, Stephan Sahm suggested, in effect, changing the method resolution order (MRO) from C3-linearization to a simple depth-first search a la old-school Python or C++. I don't think he realized that's what he was proposing at first, but the key idea is that he wants Mixin to override A in B as well as overriding object in A in this code:

class Mixin: pass
class A(Mixin): pass
class B(Mixin, A): pass

In other words, the MRO should be B-Mixin-A-Mixin-object. (Why not B-Mixin-object-A-Mixin-object? I think he just didn't think that through, but let's not worry about that.) After all, why would he put Mixin before A if he didn't want it to override A in B? And why would he attach Mixin to A if he didn't want it to override object in A?

Well, that doesn't actually work. The whole point of linearization is that each class appears only once in the MRO, and many features of Python--including super, which he wanted to make extensive use of--depend on that. For example, with his MRO, inside Mixin.spam, super().spam() is going to call A.spam, and its super().spam() is going to call Mixin.spam again, and you've obviously got a RecursionError on your hands.

I think ultimately, his problem is that what he wants isn't really a mixin class (in typical Python terms--in general OO programming terms, it's one of the most overloaded words you can find...). For example, a wrapper class factory could do exactly what he wants, like this:

def Wrapper(cls): return cls
class A(Wrapper(object)): pass
class B(Wrapper(A)): pass
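
As a rough sketch of what a less trivial factory could look like (the Wrapped class, the spam method, and the list-returning convention are all made up for illustration, not taken from the thread): each call to Wrapper builds a fresh subclass of its argument, so no class ever appears twice in the MRO and ordinary super keeps working.

```python
def Wrapper(base):
    """Hypothetical factory: returns a new class layered on top of base."""
    class Wrapped(base):
        def spam(self):
            # The wrapper around object has nothing above it that defines
            # spam, so check before delegating.
            parent = getattr(super(), 'spam', None)
            return ['Wrapper'] + (parent() if parent else [])
    return Wrapped

class A(Wrapper(object)):
    def spam(self):
        return ['A'] + super().spam()

class B(Wrapper(A)):
    def spam(self):
        return ['B'] + super().spam()

print(B().spam())
print(len(B.__mro__) == len(set(B.__mro__)))  # no duplicates in the MRO
```

The wrapper behavior runs once above A and once above object, which is the ordering the OP was after, without touching the MRO algorithm.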

And there are other ways to get where he wants.

Anyway, changing the default MRO in Python this way is a non-starter. But if he wants to make that change manually, how hard is it? And could he build a function that lets his classes cooperate using that function instead of super?

Customizing MRO


The first step is to build the custom MRO. This is pretty easy. He wants a depth-first search of all bases, so:

[cls] + list(itertools.chain.from_iterable(base.__mro__ for base in cls.__bases__))

Or, if leaving the extra object out was intentional, that's almost as easy:

[cls] + list(itertools.chain.from_iterable(base.__mro__[:-1] for base in cls.__bases__)) + [object]
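
To see what those two expressions produce, here is a small sketch that evaluates them for a hypothetical class with bases (Mixin, A). The helper name depth_first_tail is an assumption, and it returns only the part that would follow cls itself, since the class B can't be created with the default metaclass.

```python
import itertools

def depth_first_tail(bases, drop_inner_object=False):
    """Everything that would follow cls in the depth-first MRO."""
    if drop_inner_object:
        # Second variant: strip object from each base's MRO, add it once at the end.
        chunks = (base.__mro__[:-1] for base in bases)
        return list(itertools.chain.from_iterable(chunks)) + [object]
    # First variant: concatenate each base's full MRO, duplicates and all.
    return list(itertools.chain.from_iterable(base.__mro__ for base in bases))

class Mixin: pass
class A(Mixin): pass

full = [c.__name__ for c in depth_first_tail((Mixin, A))]
trimmed = [c.__name__ for c in depth_first_tail((Mixin, A), drop_inner_object=True)]
print(full)
print(trimmed)
```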

But now, how do we get that into the class's __mro__ attribute?

It's a read-only property; you can't just set it. And, even if you could, you need type.__new__ to actually return something for you to modify--but if you give it a non-linearizable inheritance tree, it'll raise an exception. And finally, even if you could get it set the way you want, every time __bases__ is changed, __mro__ is automatically rebuilt.
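
A quick check of the first two obstacles, using only the default metaclass (the variable names are just for illustration):

```python
class Mixin: pass
class A(Mixin): pass

# Obstacle 1: __mro__ can't be assigned.
try:
    A.__mro__ = (A, Mixin, Mixin, object)
except AttributeError as exc:
    assign_error = exc
else:
    assign_error = None

# Obstacle 2: type refuses to build a class whose bases can't be linearized.
try:
    class B(Mixin, A): pass
except TypeError as exc:
    create_error = exc
else:
    create_error = None

print(type(assign_error).__name__, type(create_error).__name__)
```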

So, we need to hook the way type builds __mro__.

The library reference only mentions this briefly (under class.mro()), but the answer is pretty easy: the way type builds __mro__ is by calling its mro method. This is treated as a special method (that part definitely isn't documented anywhere), meaning it's looked up on the class (that is, the metaclass of the class being built) rather than the instance (the class being built), doesn't go through __getattribute__, etc., so we have to build a metaclass.

But once you know that, it's all trivial:

import itertools

class MyMRO(type):
    def mro(cls):
        # Drop the trailing object from each base's MRO, then add it back once.
        return ([cls] +
                list(itertools.chain.from_iterable(base.__mro__[:-1] for base in cls.__bases__)) +
                [object])

class Mixin(metaclass=MyMRO): pass
class A(Mixin): pass
class B(Mixin, A): pass

And now, B.__mro__ is B-Mixin-A-Mixin-object, exactly as desired.

For normal method calls, this does what the OP wanted: Mixin gets to override A.
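
A self-contained way to check both claims, assuming CPython accepts a custom MRO with a repeated class (it does in the versions I'm aware of). The spam methods are added purely for the demonstration.

```python
import itertools

class MyMRO(type):
    def mro(cls):
        inherited = itertools.chain.from_iterable(
            base.__mro__[:-1] for base in cls.__bases__)
        return [cls] + list(inherited) + [object]

class Mixin(metaclass=MyMRO):
    def spam(self):
        return 'Mixin.spam'

class A(Mixin):
    def spam(self):
        return 'A.spam'

class B(Mixin, A):
    pass

mro_names = [c.__name__ for c in B.__mro__]
print(mro_names)
print(B().spam())  # the first Mixin in the MRO wins over A
```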

But, as mentioned earlier, it obviously won't enable the kind of super he wants, and there's no way it could. So, we'll have to build our own replacement.
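
Here is a sketch of why plain super can't cope with this MRO. super(Mixin, self) always resumes after the first Mixin it finds, so the second Mixin hands control back to A instead of moving on to object. The outcome variable is just there to record what happened.

```python
import itertools

class MyMRO(type):
    def mro(cls):
        inherited = itertools.chain.from_iterable(
            base.__mro__[:-1] for base in cls.__bases__)
        return [cls] + list(inherited) + [object]

class Mixin(metaclass=MyMRO):
    def spam(self):
        return super().spam()

class A(Mixin):
    def spam(self):
        return super().spam()

class B(Mixin, A):
    def spam(self):
        return super().spam()

try:
    B().spam()
except RecursionError:
    # Mixin.spam -> A.spam -> Mixin.spam -> A.spam -> ...
    outcome = 'RecursionError'
else:
    outcome = 'returned'
print(outcome)
```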

Bizarro Super


If you want to learn how super works, I think the documentation in Invoking Descriptors is complete, but maybe a little terse to serve as a tutorial. I know there's a great tutorial out there, but I don't have a link, so... google it.

Anyway, how super works isn't important; what's important is to define what we want here. Once we actually know exactly what we want, anything is possible as long as you believe, that's what science is all about.

Since we're defining something very different from super but still sort of similar, the obvious name is bizarro.

Now, we want a call to bizarro().spam() inside B.spam to call Mixin.spam, a call inside Mixin.spam to call A.spam, a call inside A.spam to call Mixin.spam, and a call inside Mixin.spam to call object.spam.

The first problem is that calling object.spam is just going to raise an AttributeError. Multiple inheritance uses of super are all about cooperative class hierarchies, and part of that cooperation is usually that the root of your tree knows not to call super. But here, Mixin is the root of our tree, but it also appears in other places on the tree, so that isn't going to work.

Well, since we're designing our own super replacement, there's no reason it can't also cooperate with the classes, instead of trying to be fully general. Just make it return a do-nothing function if the next class is object, or if the next class doesn't have the method, or if the next class has a different metaclass, etc. Pick whatever rule makes sense for your specific use case. Since I don't have a specific use case, and don't know what the OP's was (he wanted to create a "Liftable" mixin that helps convert instances of a base class into instances of a subclass, but he didn't explain how he wanted all of the edge cases to work, and didn't explain enough about why he wanted such a thing for me to try to guess on his behalf), I'll go with the "doesn't have the method".

While we're at it, we can skip over any non-cooperating classes that end up in the MRO (which would obviously be important if we didn't block object from appearing multiple times--but even with the MRO rule above, you'll have the same problem if your root doesn't directly inherit object).

The next problem--the one that's at the root of everything we're trying to work around here--is that we want two different things to happen "inside Mixin.spam", depending on whether it's the first time we're inside or the second. How are we going to do that?

Well, obviously, we need to keep track of whether it's the first time or the second time. One obvious way is to keep track of the index, so it's not A.spam if we're in Mixin.spam or object.spam if we're in Mixin.spam, it's B.__mro__[2] if we're in B.__mro__[1], and B.__mro__[4] if we're in B.__mro__[3]. (After first coding this up, I realized that an iterator might be a lot nicer than an index, so if you actually need to implement this yourself, try it that way. But I don't want to change everything right now.)

How can we keep track of anything? Make the classes cooperate. Part of the protocol for calling bizarro is that you take a bizarro_index parameter and pass it into the bizarro call. Let's make it an optional parameter with a default value of 0, so your users don't have to worry about it, and make it keyword-only, so it doesn't interfere with *args or anything. So:

class Mixin(metaclass=MyMRO):
    def doit(self, *, bizarro_index=0):
        print('Mixin')
        bizarro(Mixin, self, bizarro_index).doit()

class A(Mixin):
    def doit(self, *, bizarro_index=0):
        print('A')
        bizarro(A, self, bizarro_index).doit()

class B(Mixin, A):
    def doit(self, *, bizarro_index=0):
        print('B')
        bizarro(B, self, bizarro_index).doit()

And now, we just have to write bizarro.

The key to writing something like super is that it returns a proxy object whose __getattribute__ looks in the next class on the MRO. If you found that nice tutorial on how super works, you can start with the code from there. We then have to make some changes:

  1. The way we pick the next class has to be based on the index.
  2. Our proxy has to wrap the function up to pass the index along.
  3. Whatever logic we wanted for dealing with non-cooperating classes has to go in there somewhere.

Nothing particularly hard. So:

import functools

def bizarro(cls, inst, idx):
    class proxy:
        def __getattribute__(self, name):
            # Walk the MRO starting just past the position we were called from.
            for superidx, supercls in enumerate(type(inst).__mro__[idx+1:], idx+1):
                try:
                    method = getattr(supercls, name).__get__(inst)
                except AttributeError:
                    continue  # non-cooperating class: skip it
                if not callable(method):
                    return method # so bizarro works for data attributes
                @functools.wraps(method)
                def wrapper(*args, **kw):
                    return method(*args, bizarro_index=superidx, **kw)
                return wrapper
            # Nothing left on the MRO has the attribute: do nothing.
            return lambda *args, **kw: None
    return proxy()

And now, we're done.
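
Putting the pieces so far into one runnable sketch: the metaclass, the explicit three-argument bizarro, and the three classes. The only change from the snippets above is that each doit records its name in a list (calls, a name made up here) instead of printing, so the order is easy to inspect.

```python
import functools
import itertools

class MyMRO(type):
    def mro(cls):
        inherited = itertools.chain.from_iterable(
            base.__mro__[:-1] for base in cls.__bases__)
        return [cls] + list(inherited) + [object]

def bizarro(cls, inst, idx):
    class proxy:
        def __getattribute__(self, name):
            for superidx, supercls in enumerate(type(inst).__mro__[idx+1:], idx+1):
                try:
                    method = getattr(supercls, name).__get__(inst)
                except AttributeError:
                    continue
                if not callable(method):
                    return method
                @functools.wraps(method)
                def wrapper(*args, **kw):
                    return method(*args, bizarro_index=superidx, **kw)
                return wrapper
            return lambda *args, **kw: None
    return proxy()

calls = []

class Mixin(metaclass=MyMRO):
    def doit(self, *, bizarro_index=0):
        calls.append('Mixin')
        bizarro(Mixin, self, bizarro_index).doit()

class A(Mixin):
    def doit(self, *, bizarro_index=0):
        calls.append('A')
        bizarro(A, self, bizarro_index).doit()

class B(Mixin, A):
    def doit(self, *, bizarro_index=0):
        calls.append('B')
        bizarro(B, self, bizarro_index).doit()

B().doit()
print(calls)
```

The second visit to Mixin.doit arrives with bizarro_index=3, so the search resumes at object, finds no doit, and falls through to the do-nothing function.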

Bizarro am very beautiful


In Python 3, super(Mixin, self) was turned into super(). This uses a bit of magic, and we can use the same magic here.

Every method that mentions super or __class__ gets a cell named __class__ that tells you which class it's defined in (a method that mentions neither doesn't get the cell at all). And every method takes its self as the first parameter. So, if we just peek into the caller's frame, we can get those easily. And while we're peeking into the frames, since we know the index has to be the bizarro_index parameter to any function that's going to participate in bizarro super-ing, we can grab that too:

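import sys

def bizarro():
    f = sys._getframe(1)
    # The __class__ cell only exists if the caller mentions super or __class__,
    # so use .get() instead of indexing; cls isn't needed by the code below.
    cls = f.f_locals.get('__class__')
    inst = f.f_locals[f.f_code.co_varnames[0]]
    idx = f.f_locals['bizarro_index']
    # everything else is the same as above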
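
A small sketch of what the frame peek can and can't see, assuming CPython (sys._getframe is an implementation detail). The helper caller_info and the class Spam are hypothetical names for the demonstration.

```python
import sys

def caller_info():
    """Return (defining class or None, first argument) of the calling function."""
    f = sys._getframe(1)
    first_arg = f.f_locals[f.f_code.co_varnames[0]]
    return f.f_locals.get('__class__'), first_arg

class Spam:
    def with_cell(self):
        __class__  # merely naming it makes the compiler create the cell
        return caller_info()

    def without_cell(self):
        # No mention of super or __class__, so there is no cell to find.
        return caller_info()

s = Spam()
print(s.with_cell()[0], s.without_cell()[0])
```

So the class is only recoverable this way from methods that already reference super or __class__, while the instance and any named parameter are always available.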

This is cheating a bit; if you read PEP 3135, the compiler is what recognizes uses of super inside a method and makes the __class__ cell available to it, so super() only needs that cell and the method's first argument instead of digging names out of f_locals the way we do. I'm not sure that's actually less hacky--but it is certainly more portable, because other Python implementations aren't required to provide CPython-style frames and code objects. Also, leaving the magic up to the compiler means that, e.g., PyPy can still apply its no-frames-unless-needed optimization, trading a tiny bit of compile-time work for a small savings in every call.

If you want to do the same here, you can write an import hook that AST-transforms bizarro calls in the same way. But I'm going to stick with the frame hack.

Either way, now you can write this:

class Mixin(metaclass=MyMRO):
    def doit(self, *, bizarro_index=0):
        print('Mixin')
        bizarro().doit()

class A(Mixin):
    def doit(self, *, bizarro_index=0):
        print('A')
        bizarro().doit()

class B(Mixin, A):
    def doit(self, *, bizarro_index=0):
        print('B')
        bizarro().doit()

Meanwhile, notice that we don't actually use cls anywhere anyway, so... half a hack is only 90% as bad, right?

But still, that bizarro_index=0 bit. All that typing. All that reading. There's gotta be a better way.

Well, now you can!

We've already got a metaclass. We're already peeking under the covers. We're already wrapping functions. So, let's have our metaclass peek under the covers of all of our functions and automatically wrap anything that uses bizarro to take that bizarro_index parameter. The only problem is that the value will now be in the calling frame's parent frame (that is, the wrapper), but that's easy to fix too: just look in f_back.f_locals instead of f_locals.

import functools
import itertools
import sys

class BizarroMeta(type):
    def mro(cls):
        return ([cls] +
                list(itertools.chain.from_iterable(base.__mro__[:-1] for base in cls.__bases__)) +
                [object])
    def __new__(mcls, name, bases, attrs):
        def _fix(attr):
            if callable(attr) and 'bizarro' in attr.__code__.co_names:
                @functools.wraps(attr)
                def wrapper(*args, bizarro_index=0, **kw):
                    return attr(*args, **kw)
                return wrapper
            return attr
        attrs = {k: _fix(v) for k, v in attrs.items()}
        return super().__new__(mcls, name, bases, attrs)

def bizarro():
    f = sys._getframe(1)
    inst = f.f_locals[f.f_code.co_varnames[0]]
    idx = f.f_back.f_locals['bizarro_index']
    class proxy: 
        def __getattribute__(self, name): 
            for superidx, supercls in enumerate(type(inst).__mro__[idx+1:], idx+1):
                try:
                    method = getattr(supercls, name).__get__(inst)
                except AttributeError: 
                    continue 
                if not callable(method):
                    return method # so bizarro works for data attributes
                @functools.wraps(method) 
                def wrapper(*args, **kw): 
                    return method(*args, bizarro_index=superidx, **kw)
                return wrapper 
            return lambda *args, **kw: None 
    return proxy() 

class Mixin(metaclass=BizarroMeta):
    def doit(self):
        print('Mixin')
        bizarro().doit()

class A(Mixin):
    def doit(self):
        print('A')
        bizarro().doit()

class B(Mixin, A):
    def doit(self):
        print('B')
        bizarro().doit()

B().doit()

Run this, and it'll print B, then Mixin, then A, then Mixin.

Unless I made a minor typo somewhere, in which case it'll probably crash in some way you can't possibly debug. So you'll probably want to add a bit of error handling in various places. For example, it's perfectly legal for something to be callable but not have a __code__ member--a class, a C extension function, an instance of a class with a custom __call__ method... Whether you want to warn that you can't tell whether Spam.eggs uses bizarro or not because you can't find the code, assume it doesn't and skip it, assume it does and raise a readable exception, or something else, I don't know, but you probably don't want to raise an exception saying that type objects don't have __code__ attributes, or whatever comes out of this mess by default.
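
As one possible take on that error handling, here is a sketch of the "assume it doesn't and skip it" policy. The names uses_bizarro and fix_attr are made up, the bizarro defined here is only a stand-in so the example runs on its own, and the other policies would slot into the same place.

```python
import functools

def uses_bizarro(attr):
    """True only if attr exposes a code object that mentions the name bizarro."""
    code = getattr(attr, '__code__', None)
    return code is not None and 'bizarro' in code.co_names

def fix_attr(attr):
    """Wrap attr to swallow bizarro_index if it uses bizarro; else return it as-is."""
    if not (callable(attr) and uses_bizarro(attr)):
        return attr
    @functools.wraps(attr)
    def wrapper(*args, bizarro_index=0, **kw):
        return attr(*args, **kw)
    return wrapper

def bizarro():
    return None  # stand-in for the real bizarro, just for this demo

def cooperative(self):
    return bizarro()

class Nested:
    pass

print(fix_attr(Nested) is Nested)            # a class has no __code__: left alone
print(fix_attr(len) is len)                  # so is a C function
print(fix_attr(cooperative) is cooperative)  # this one gets wrapped
```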

Anyway, the implementation is pretty small, and not that complicated once you understand all the things we're dealing with, and the API for using it is about as nice as you could want.

I still don't know why you'd ever want to do this, but if you do, go for it.