One common "advanced question" on places like StackOverflow and python-list is "how do I dynamically create a function/method/class/whatever"? The standard answer is: first, some caveats about why you probably don't want to do that, and then an explanation of the various ways to do it when you really do need to.

But really, creating functions, methods, classes, etc. in Python is always already dynamic.
Some cases of "I need a dynamic function" are just "Yeah? And you've already got one". More often, you do need something a little more complicated, but still something Python already gives you. Occasionally, even that isn't enough. But, once you understand how functions, methods, classes, etc. work in Python, it's usually pretty easy to understand how to do what you want. And when you really need to go over the edge, almost anything you can think of, even if it's almost always a bad idea, is probably doable (either because "almost always" isn't "always", or just because almost nobody would ever think to try to do it, so it wasn't worth preventing).

Functions

A normal def statement compiles to code that creates a new function object at runtime.

For example, consider this code:
def spam(x):
    return x+1
Let's say you type that into the REPL (the interactive interpreter). The REPL reads lines until it gets a complete statement, parses and compiles that statement, and then interprets the resulting bytecode. And what does that definition get compiled to? Basically the same thing as this:
spam = types.FunctionType(
    compile('return x+1\n', '__main__', mode='function'),
    globals(),
    'spam',
    (),
    ())
spam.__qualname__ = 'spam'
You can't quite write this, because the public interface for the compile function doesn't expose all of the necessary features--but outside of that compile, the rest is all real Python. (For simple lambda functions, you can use, e.g., compile('1 + 2', '__main__', mode='eval') and the whole thing is real Python, but that doesn't work for def functions. When you really need to create code objects, there are ways to do it, but you very rarely need to, so let's not worry about that.)
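If you do want to see something like this actually run, here's a sketch using only public APIs: compile a def statement in 'exec' mode, fish the body's code object out of the module code's constants, and hand it to types.FunctionType (the filename '<dynamic>' is just a label):
import types

module_code = compile('def spam(x):\n    return x+1\n', '<dynamic>', 'exec')
body_code = next(const for const in module_code.co_consts
                 if isinstance(const, types.CodeType) and const.co_name == 'spam')
spam = types.FunctionType(body_code, globals(), 'spam')
print(spam(3))  # 4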

If you put the same thing in a module instead of typing it at the REPL, the only difference is that the body is compiled ahead of time and stored in a marshaled code object inside the .pyc file, so it never needs to be compiled again. The def statement is still compiled and then interpreted as top-level module code that constructs a function on the fly out of that code constant, every time you import the module.

For a slightly more complicated example, consider this:
def add_one(x: int=0) -> int:
    """Add 1"""
    return x+1
This is equivalent to:
add_one = types.FunctionType(
    compile('return x+1\n', '__main__', mode='function'),
    globals(),
    'add_one',
    (0,), # This is where default values go
    ())
add_one.__qualname__ = 'add_one'
add_one.__doc__ = """Add 1"""
add_one.__annotations__ = {'x': int, 'return': int}
Notice that the default values are passed into that FunctionType constructor. That's why defaults are bound in at the time the def statement is executed, which is how you can do tricks like using a dict crammed into a default value as a persistent cache.
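Here's a toy sketch of that caching trick; the dict is created exactly once, when the def statement executes, and every later call shares it:
def expensive(x, _cache={}):   # the dict is built once, when def executes
    if x not in _cache:
        print('computing', x)
        _cache[x] = x * x      # stand-in for some expensive computation
    return _cache[x]

expensive(3)    # prints "computing 3"
expensive(3)    # silently reuses the cached value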

Closures

The real point of functions always being created on the fly is that this means any function can be a closure--it can capture values from the environment that the function was defined in. The standard Lisp-style example looks like this:
def make_adder(n):
    def adder(x):
        return x+n
    return adder
That's equivalent to:
adder = types.FunctionType(
    compile('return x+n', '__main__', mode='function'),
    globals(),
    'adder',
    (),
    (CellType(locals(), 'n'),)) # tuple of closure cells
adder.__qualname__ = 'make_adder.<locals>.adder'
So every time you call make_adder, you get back a new adder function, created on the fly, referencing the particular n local variable from that particular call to make_adder.

(Unfortunately, I cheated a bit. Unlike function objects, and even code objects, you can't actually manually create closure cells like this. But you rarely want to. And if you ever do need it, you can just do a trivial lambda that captures n and then do a minor frame hack to get at the cell.)
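If you're curious, one version of that trick looks like this; it turns out you can read the cell straight off the lambda's __closure__, no frame hack even needed:
def make_cell(value):
    # Build a throwaway closure over value, then pull its cell back out.
    return (lambda: value).__closure__[0]

cell = make_cell(5)
print(cell.cell_contents)  # 5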

Even if you never want to do anything this Lispy, closures get used all over the place. For example, if you've ever written a Tk GUI, you may have done something like this in your Frame subclass:
def __init__(self):
    Frame.__init__(self)
    self.hello_button = tkinter.Button(
        self, text='Hello',
        command=lambda: self.on_button_click(self.hello_button))
That lambda is creating a new function that captures the local self variable so it can access self.hello_button whenever the button is clicked.

(A lambda compiles in almost the same way as a def, except that it's a value in the middle of an expression rather than a statement, and it doesn't have a name, docstring, etc.).

Another common way to write the same button is with functools.partial:
    self.hello_button = tkinter.Button(
        self, text='Hello',
        command=partial(self.on_button_click, self.hello_button))
If Python didn't come with partial, we could easily write it ourselves:
def partial(func, *args, **kw):
    def wrapped(*more_args, **more_kw):
        return func(*args, *more_args, **kw, **more_kw)
    return wrapped
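Our hand-rolled version works just like the real one from the caller's point of view:
from operator import mul
double = partial(mul, 2)
print(double(21))  # 42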
This is also how decorators work:
def simple_memo(func):
    cache = {}
    def wrapped(*args):
        args = tuple(args)
        if args not in cache:
            cache[args] = func(*args)
        return cache[args]
    return wrapped

@simple_memo
def fib(n):
    if n < 2: return n
    return fib(n-1) + fib(n-2)
I've written a dumb exponentially-recursive Fibonacci function, and that @simple_memo magically turns it into a linear-time function that takes a fraction of a second instead of hours. How does this work? Simple: after the usual fib = types.FunctionType blah blah stuff, it does fib = simple_memo(fib). That's it. Because functions are already always created on the fly, decorators don't need anything complicated.

By the way, if you can follow everything above, you pretty much know all there is to know about dynamic higher-order programming, except for how the theory behind it maps to advanced math. (And that part is simple if you already know the math, but meaningless if you don't.) That's one of those things that sounds scary when functional programmers talk about it, but if you go from using higher-order functions to building them to understanding how they work before the theory, instead of going from theory to implementation to building to using, it's not actually hard.

Fake functions

Sometimes, you can describe what code should run when you call spam, but it's not obvious how to construct a function object that actually runs that code. Or it's easy to write the closure, but hard to think about it when you later come back and read it.

In those cases, you can create a class with a __call__ method, and it acts like a function. For example:
class Adder:
    def __init__(self, n):
        self._n = n
    def __call__(self, x):
        return x+self._n
An Adder(5) object behaves almost identically to a make_adder(5) closure. It's just a matter of which one you find more readable. Even for experienced Python programmers, the answer is different in different cases, which is why you'll find both techniques all over the stdlib and popular third-party modules.
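Side by side, the two are interchangeable:
add5 = make_adder(5)
adder5 = Adder(5)
print(add5(10), adder5(10))  # 15 15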

In fact, functools.partial isn't actually a closure, but a class, like this:
class partial:
    def __init__(self, func, *args, **kw):
        self.func, self.args, self.kw = func, args, kw
    def __call__(self, *more_args, **more_kw):
        return self.func(*self.args, *more_args, **self.kw, **more_kw)
(Actually, the real partial has a lot of bells and whistles. But it's still not that complicated. The docs link to the source code, if you want to see it for yourself.)

Methods

OK, so you can create functions on the fly; what about methods?

Again, they're already always created on the fly, and once you understand how, you can probably do whatever it was you needed.

Let's look at an example:
class Spam:
    def __init__(self, x, y):
        self.x, self.y = x, y
    def eggs(self):
        return self.x + self.y
spam = Spam(2, 3)
print(spam.eggs())
The definition of eggs is compiled and interpreted exactly the same as the definition of any other function. And the result is just stored as a member of the Spam class (see the section on classes to see how that works), so when you write Spam.eggs you just get that function.

This means that if you want to add a new method to a class, there's no special trick, you just do it:
def cheese(self):
    return self.x * self.y
Spam.cheese = cheese
print(spam.cheese())
That's all it takes to add a method to a class dynamically.

But meanwhile, on the instance, spam.eggs is not just a function, it's a bound method. Try print(spam.eggs) from the interactive REPL. A bound method knows which instance it belongs to, so when you call it, that instance can get passed as the self argument.

The details of how Python turns the function Spam.eggs into the bound method spam.eggs are a bit complicated (and I've already written a whole post about them), but we don't need to know that here.

Obviously, bound methods get created dynamically. Every time you do spam.eggs or Spam(2, 3).cheese or string.ascii_letters.find, you're getting a new bound method.

And if you want to create one manually, you can just call types.MethodType(func, obj).

So, what if you want to add a new method beans to just the spam instance, without adding it to the Spam class? Just construct the same bound method that Python would have constructed for you whenever you looked up spam.beans, and store it there:
def beans(self):
    return self.x / self.y
spam.beans = types.MethodType(beans, spam)
And now you know enough to implement Javascript-style object literals, or even prototype inheritance. Not that you should do either, but if you ever run into something that you really do want to do, that requires creating methods on the fly, either on classes or on instances, you can do it. Because creating methods on the fly is what Python always does.
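Just for fun, here's a toy sketch of a JavaScript-ish object literal (the class name obj and the whole design are invented purely for illustration); any callable you pass in gets bound as a method with types.MethodType:
import types

class obj:
    def __init__(self, **attrs):
        # Callable attributes get bound to this instance, so they behave
        # like methods; everything else is stored as plain data.
        for name, value in attrs.items():
            if callable(value):
                value = types.MethodType(value, self)
            setattr(self, name, value)

point = obj(x=3, y=4, norm=lambda self: (self.x**2 + self.y**2) ** 0.5)
print(point.norm())  # 5.0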

Classes

What if we want to create a class dynamically?

Well, I shouldn't have to tell you at this point. You're always creating classes dynamically.

Class definitions work a bit differently from function definitions. Let's start with a simple example again:
class Spam:
    z = 0
    def __init__(self, x, y):
        self.x, self.y = x, y
    def eggs(self):
        return self.x + self.y + self.z
First, Python interprets the class body the same as any other code, but it runs inside a new, empty environment. So, those def statements create new functions named __init__ and eggs in that empty environment, instead of at the global level. Then, it dynamically creates a class object out of that environment, where every function or other value that got created becomes a method or class attribute on the class. The code goes something like this:
_Spam_locals = {}
exec('def __init__(self, x, y):\n    self.x, ... blah blah ...\n',
     globals(), _Spam_locals)
Spam = type('Spam', (object,), _Spam_locals)
This is why you can't access the Spam class object inside the class definition--because there is nothing named Spam until after Python calls type and stores the result in Spam. (But of course you can access Spam inside the methods; by the time those methods get called, it'll exist.)

So, what if you want to create some methods dynamically inside the class? No sweat. By the time it gets to calling type, nobody can tell whether eggs got into the locals dict from an ordinary def eggs(...): statement or from eggs = fancy_higher_order_function(...), so they both do the same thing.
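For example, here's a made-up helper (make_summing_method is purely hypothetical) whose result gets assigned in the class body just as if it had been written with def:
def make_summing_method(*names):
    def method(self):
        return sum(getattr(self, name) for name in names)
    return method

class Spam:
    z = 0
    def __init__(self, x, y):
        self.x, self.y = x, y
    eggs = make_summing_method('x', 'y', 'z')

print(Spam(2, 3).eggs())  # 5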

In fact, one idiom you'll see quite often in the stdlib is this:
    def __add__(self, other):
        blah blah
    __radd__ = __add__
This just makes __radd__ another name for the same method as __add__.

And yes, you can call type manually if you need to, passing it any dict you want as an environment:
def __init__(self, x, y):
    self.x, self.y = x, y
Spam = type('Spam', (object,),
    {'z': 0, '__init__': __init__,
     'eggs': lambda self: self.x + self.y + self.z})
There are a few more details to classes. A slightly more complicated example covers most of them:
@decorate_my_class
class Spam(Base1, Base2, metaclass=MetaSpam):
    """Look, I've got a doc string"""
    def __init__(self, x, y):
        self.x, self.y = x, y
This is equivalent to:
_Spam_locals = {}
exec('def __init__(self, x, y):\n    self.x, ... blah blah ...\n',
     globals(), _Spam_locals)
Spam = MetaSpam('Spam', (Base1, Base2), _Spam_locals)
Spam.__doc__ = """Look, I've got a doc string"""
Spam = decorate_my_class(Spam)
As you can see, if there's a metaclass, it gets called in place of type, and if there are base classes, they get passed in place of object, and docstrings and decorators work the same way as in functions.

There are a few more complexities with qualnames, __slots__, closures (if you define a class inside a function), and the magic to make super() work, but this is almost everything.

Remember from the last section how easy it is to add methods to a class? Often that's simpler than trying to programmatically generate methods from inside the class definition, or customize the class creation. (See functools.total_ordering for a nice example.) But when you really do need a dynamically-created class for some reason, it's easy.
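Here's a tiny sketch in that spirit (Python 3 already derives __ne__ from __eq__ for you, so this particular decorator is purely illustrative):
def with_ne(cls):
    # Add a __ne__ defined in terms of the class's __eq__, after the fact.
    if '__ne__' not in cls.__dict__:
        def __ne__(self, other):
            result = self.__eq__(other)
            return result if result is NotImplemented else not result
        cls.__ne__ = __ne__
    return cls

@with_ne
class Money:
    def __init__(self, cents):
        self.cents = cents
    def __eq__(self, other):
        return self.cents == other.cents

print(Money(100) != Money(99))  # True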

Generating code

Occasionally, no matter how hard you try, you just can't come up with any way to define or modify a function or class dynamically with your details crammed into the right place the right way, at least not readably. In that case, you can always fall back to generating, compiling, and executing source code.

The simplest way to do this is to just build a string and call exec on it. You can find a few examples of this in the stdlib, like collections.namedtuple. (Notice the trick it uses of calling exec in a custom empty namespace, then copying the value out of it. This is a bit cleaner than just executing in your own locals and/or globals.)
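A stripped-down version of the same pattern (make_getter and everything about it is invented here) looks like this:
def make_getter(name, expr):
    # Build the source, exec it in a fresh namespace, and pull the new
    # function out--the same shape of trick collections.namedtuple uses.
    source = 'def {name}(obj):\n    return {expr}\n'.format(name=name, expr=expr)
    namespace = {}
    exec(source, namespace)
    return namespace[name]

first = make_getter('first', 'obj[0]')
print(first('spam'))  # s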

You've probably heard "exec is dangerous". And of course it is. But "dangerous" really just means two things: "powerful" and "hard to control or reason about". When you need the first one badly enough that you can accept the second, the danger is justified. If you don't understand things like how to make sure you're not letting data some user sent to your web service end up inside your exec, don't use it. But if you're still reading at this point, I think you can learn how to reason through the issues.

Sometimes you don't want to exec right now, you want to compile something that you can pass around and exec later. That works fine too.
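For example:
code = compile('print(x * 2)', '<generated>', 'exec')
# ...pass code around to wherever it's needed, then, later:
exec(code, {'x': 21})  # prints 42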

Sometimes, you even want to generate a whole module and save it as a .py file. Of course people do that kind of thing in C all the time, as part of the build process--but in Python, it doesn't matter whether you're generating a .py file during a build or install to be used later, or generating one at normal runtime to be used right now; they're the same case as far as Python is concerned.

Sometimes you want to build an AST (abstract syntax tree) or a stream of tokens instead of source code, and then compile or execute that. (And this is a great time to yet again plug macropy, one of the coolest projects ever.)
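Here's a tiny sketch of the AST route, cheating a bit by starting from ast.parse instead of building every node by hand (and assuming a Python recent enough to spell number literals as ast.Constant):
import ast

tree = ast.parse('def double(x): return x * 0')       # template tree
tree.body[0].body[0].value.right = ast.Constant(2)    # patch the 0 into a 2
ast.fix_missing_locations(tree)
namespace = {}
exec(compile(tree, '<ast>', 'exec'), namespace)
print(namespace['double'](21))  # 42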

Sometimes you even want to do the equivalent of inline assembly (but assembling Python bytecode, not native machine code, of course) with a library like byteplay or cpyasm (or, if you're a real masochist, just assembling it in your head and using struct.pack on the resulting array of 16-bit ints...). Again, unlike C, you can do this at runtime, then wrap that code object up in a function object, and call it right now.

You can even do stuff like marshaling code objects to and from a database to build functions out of later.
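For example, here's the round trip through bytes, the same way you'd stash a code object in a database (keeping in mind that marshal's format is only guaranteed to be readable by the same Python version that wrote it):
import marshal, types

def spam(x):
    return x + 1

blob = marshal.dumps(spam.__code__)        # bytes you could store anywhere
restored = types.FunctionType(marshal.loads(blob), globals(), 'spam')
print(restored(3))  # 4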

Conclusion

Because almost all of this is accessible from within Python itself, and all of it is inherently designed to be executed on the fly, almost anything you can think of is probably doable.

So, if you're thinking "I need to dynamically create X", you need to think through exactly what you need, but whether that turns out to be "just a normal function" or something deeply magical, you'll be able to do it, or at least explain the magic you're looking for in specific enough terms that someone can actually show you how to do it.