Many people—especially people coming from Java—think that using try/except is "inelegant", or "inefficient". Or, slightly less meaninglessly, they think that "exceptions should only be for errors, not for normal flow control".

These people are not going to be happy with Python.
You can try to write Python as if it were Java or C, using Look-Before-You-Leap code instead of Easier-to-Ask-Forgiveness-than-Permission, returning error codes instead of raising exceptions for things that aren't "really" errors, etc. But you're going to end up with non-idiomatic, verbose, and inefficient code that's full of race conditions.

And you're still going to have exceptions all over the place anyway; you're just hiding them from yourself.

Hidden exceptions

Iteration

Even this simple code has a hidden exception:
    for i in range(10):
        print(i)
Under the covers, it's equivalent to:
    it = iter(range(10))
    while True:
        try:
            i = next(it)
        except StopIteration:
            break
        else:
            print(i)
That's how iterables work in Python. An iterable is something you can call iter on and get an iterator. An iterator is something you can call next on repeatedly and get 0 or more values and then a StopIteration.
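To see the protocol from the other side, here's a minimal sketch of a hand-written iterator. The Countdown class is a made-up example, not anything from the stdlib; the point is just that raising StopIteration is the only way an iterator can say "I'm done".

```python
class Countdown:
    """Hypothetical iterator that yields n, n-1, ..., 1 and then stops."""

    def __init__(self, n):
        self.n = n

    def __iter__(self):
        # An iterator is its own iterable.
        return self

    def __next__(self):
        if self.n <= 0:
            # The only way to tell the caller "no more values".
            raise StopIteration
        value = self.n
        self.n -= 1
        return value


# A for loop (or comprehension) catches the StopIteration for us.
collected = [i for i in Countdown(3)]

# Driving the iterator by hand exposes the exception.
it = iter(Countdown(1))
first = next(it)
try:
    next(it)
    exhausted = False
except StopIteration:
    exhausted = True
```

The comprehension never shows you the exception, but it's there every single time the loop ends.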

Of course you can try to avoid that by calling the two-argument form of next, which lets you provide a default value instead of getting an exception. But under the covers, next(iterator, default) is basically implemented like this:
    try:
        return next(iterator)
    except StopIteration:
        return default
So, even when you go out of your way to try to LBYL, you still end up EAFPing.

Operators

This even simpler code also has hidden exception handling:
    print(a+b)
Under the covers, a+b looks something like this:
    def _checkni(ret):
        # Turn the NotImplemented sentinel into an exception.
        if ret is NotImplemented:
            raise NotImplementedError
        return ret

    def add(a, b):
        try:
            if issubclass(type(b), type(a)):
                # A subclass on the right gets the first chance.
                try:
                    return _checkni(type(b).__radd__(b, a))
                except (NotImplementedError, AttributeError):
                    return _checkni(type(a).__add__(a, b))
            else:
                try:
                    return _checkni(type(a).__add__(a, b))
                except (NotImplementedError, AttributeError):
                    return _checkni(type(b).__radd__(b, a))
        except (NotImplementedError, AttributeError):
            raise TypeError("unsupported operand type(s) for +: '{}' and '{}'".format(
                type(a).__name__, type(b).__name__))
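You can watch that fallback happen from ordinary code. This is only a sketch with a made-up Meters class: returning NotImplemented from __add__ hands control to the other operand's __radd__, and when both sides give up you get the TypeError.

```python
class Meters:
    """Hypothetical value type, used only to illustrate operator fallback."""

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        if isinstance(other, Meters):
            return Meters(self.value + other.value)
        # "I don't know how"; let Python try the other operand.
        return NotImplemented

    def __radd__(self, other):
        # Lets sum() start from the int 0.
        if other == 0:
            return self
        return NotImplemented


# 0 + Meters(1): int.__add__ gives up, so Meters.__radd__ is tried.
total = sum([Meters(1), Meters(2)])

# Neither side knows what to do, so the fallback ends in a TypeError.
try:
    Meters(1) + "two"
    message = None
except TypeError as e:
    message = str(e)
```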

Attributes

Even the simple dot syntax in the above examples hides further exception handling. Or, for a simpler example, this code:
    print(spam.eggs)
Under the covers, spam.eggs looks something like this:
    spam.__getattribute__('eggs')
So far, so good. But, assuming you didn't define your own __getattribute__ method, what does the object.__getattribute__ that you inherit do? Something like this:
    def _searchbases(cls, name):
        for c in cls.__mro__:
            try:
                return c.__dict__[name]
            except KeyError:
                pass
        raise KeyError

    def __getattribute__(self, name):
        try:
            return self.__dict__[name]
        except KeyError:
            pass
        try:
            return _searchbases(type(self), name).__get__(self, type(self))
        except KeyError:
            pass
        try:
            getattr = _searchbases(type(self), '__getattr__')
        except KeyError:
            raise AttributeError("'{}' object has no attribute '{}'".format(
                type(self).__name__, name))
        return getattr(self, name)
Of course I cheated by using cls.__mro__, cls.__dict__ and descriptor.__get__ above. Those are recursive calls to __getattribute__. They get handled by base cases for object and type.
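One visible consequence of that lookup order is that __getattr__ only runs after everything else has failed. Here's a small sketch (the Spam class and its calls list are invented for the demonstration) that records when the fallback is actually reached:

```python
class Spam:
    eggs = "class attribute"

    def __init__(self):
        # Stored in the instance __dict__, so found by the normal lookup.
        self.calls = []

    def __getattr__(self, name):
        # Only reached after the normal lookup has raised AttributeError.
        self.calls.append(name)
        if name.startswith("_"):
            raise AttributeError(name)
        return "fallback for " + name


spam = Spam()
found = spam.eggs                  # normal lookup succeeds
calls_after_found = list(spam.calls)
missing = spam.bacon               # normal lookup fails, fallback runs
calls_after_missing = list(spam.calls)
```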

hasattr

Meanwhile, what if you want to make sure a method or value attribute exists before you access it?

Python has a hasattr function for exactly that purpose. How does that work? As the docs say, "This is implemented by calling getattr(object, name) and seeing whether it raises an AttributeError or not."

Again, even when you try to LBYL, you're still raising and handling exceptions.
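A quick way to convince yourself, sketched with a made-up Lazy class: in Python 3, hasattr swallows an AttributeError raised inside a property, and lets any other exception straight through.

```python
class Lazy:
    """Hypothetical class whose properties raise on access."""

    @property
    def value(self):
        raise AttributeError("not computed yet")

    @property
    def broken(self):
        raise ValueError("some other failure")


obj = Lazy()

# The property exists, but hasattr only sees the AttributeError.
has_value = hasattr(obj, "value")

# Anything other than AttributeError propagates out of hasattr.
try:
    hasattr(obj, "broken")
    propagated = False
except ValueError:
    propagated = True
```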

Objections

People who refuse to believe that Python isn't Java always raise the same arguments against EAFP.

Exception handling is slow

No, it isn't. Except in the sense that Python itself is horribly slow, which is a sense that almost never matters (and, in the rare cases when it does, you're not going to use Python, so who cares?).

First, remember that, at least if you're using CPython, every bytecode goes through the interpreter loop, every method call and attribute lookup is dynamically dispatched by name, every function call involves a heavy-duty operation of building a complex frame object and executing multiple bytecodes, and under the covers all the values are boxed up. In other words, Python isn't C++.

But let's do a quick comparison of the simplest possible function, then the same function plus a try statement:
    def spam():
        return 2
    def eggs():
        try:
            return 2
        except Exception:
            return 0
When I time these with %timeit on my laptop, I get 88.9ns and 90.8ns. So, that's 2% overhead. On a more realistic function, the overhead is usually below measurability.

In fact, even in C++, you'll see pretty much the same thing, unless you're using a compiler from the mid-90s. People who say "exceptions are slow" really don't know what they're talking about in any language.

But it's especially true in Python. Compare that 1.9ns cost to the 114ns cost of looking up spam as a global and calling it. If you're looking to optimize something here, the 128% overhead is surely a lot more important than the 2%.

What about when you actually raise an exception? That's a bit more expensive. It costs anywhere from 102ns to 477ns. So, that could almost quintuple the cost of your function! Yes, it could—but only if your function isn't actually doing anything. How many functions do you write that take less than 500ns to run, and which you run often enough that it makes a difference, where optimizing out 477ns is important but optimizing out 114ns isn't? My guess would be none.
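If you want to try this outside IPython, here's a rough sketch using the stdlib timeit module. The ham function is my own addition to cover the raising case, and the numbers you get will depend on your machine and Python version, so don't expect them to match mine.

```python
import timeit


def spam():
    return 2


def eggs():
    try:
        return 2
    except Exception:
        return 0


def ham():
    # Hypothetical third case: actually raise and handle an exception.
    try:
        raise ValueError
    except ValueError:
        return 0


n = 200_000
results = {}
for func in (spam, eggs, ham):
    # Best of three runs, converted to nanoseconds per call.
    best = min(timeit.repeat(func, number=n, repeat=3))
    results[func.__name__] = best / n * 1e9

for name, ns in results.items():
    print(f"{name}: {ns:.1f}ns per call")
```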

And now, go back and look at the for loop from the first section. If you iterate over a million values, you're doing the 1.9ns wasted cost 999,999 times—buried inside a 114ns cost of calling next each time, itself buried in the cost of whatever actual work you do on each element. And then you're doing the 477ns wasted cost 1 time. Who cares?

Exceptions should only be for exceptional cases

Sure, but "exceptional" is a local term.

Within a for loop, reaching the end of the loop is exceptional. To code using the loop, it's not. So the for loop handles the StopIteration locally.

Similarly, in code reading chunks out of a file, reaching the end of the file is exceptional. But in code that reads a whole file, reaching the end is a normal part of reading the whole file. So, you're going to handle the EOFError at a low level, while the higher-level code will just receive an iterator or list of lines or chunks or whatever it needs.

Raise exceptions, and handle them at the level at which they're exceptional—which is also generally going to be the level where you know how to handle them.

Sometimes that's the lowest possible level, in which case there isn't much difference between using exceptions and returning (value, True) or (None, False). But often it's many levels up, in which case using exceptions guarantees that you can't forget to check and percolate the error upward to the point where you're prepared to deal with it. That, in a nutshell, is why exceptions exist.
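Here's a minimal sketch of that "many levels up" case. All the names (ConfigError, parse_port, load_settings, main) are invented for illustration. Notice that the middle layer contains no error-handling code at all, and still can't lose the error.

```python
class ConfigError(Exception):
    """Hypothetical application-level error."""


def parse_port(text):
    try:
        return int(text)
    except ValueError:
        # Handle the low-level error where we know what it means.
        raise ConfigError(f"bad port: {text!r}") from None


def load_settings(raw):
    # No checking here: a ConfigError passes straight through.
    return {"port": parse_port(raw["port"])}


def main(raw):
    # The level that knows what to do about a bad configuration.
    try:
        return load_settings(raw)
    except ConfigError as e:
        return {"error": str(e)}


good = main({"port": "80"})
bad = main({"port": "eighty"})
```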

Exceptions only work if you use them everywhere

That's true. And it's a serious problem in C++ (and even more in ObjC). But it's not a problem in Python—unless you go out of your way to create a problem by fighting against Python. Python uses exceptions everywhere. So does all the idiomatic Python code you're going to be interfacing with. So exceptions work.

C++ wasn't designed around exceptions in the first place. This means:
  • C++ has a mishmash of APIs (many inherited from C), some raising exceptions, others returning error values.
  • C++ doesn't make it easy to wrap up error returns in exceptions. For example, your compiler almost certainly doesn't come with a helper function that wraps up a libc or POSIX function by checking for nonzero return and constructing an exception out of the errno and the name of the function—and, even if it did, that function would be painful to use everywhere.
  • C++ accesses functions from C libraries just like C, meaning none of them raise exceptions. And similarly for accessing Java functions via JNI, or ObjC functions via ObjC++, or even Python functions via the Python C API. Compare that to Python bindings written with ctypes, cffi, Cython, SIP, SWIG, manually-built extension modules, Jython, PyObjC, etc.
  • C++ makes it very easy to design classes that end up in an inconsistent state (or at least leak memory) when an exception is thrown; you have to manually design an RAII class for everything that needs cleanup, manage garbage yourself, etc. to get exception safety.
In short, you can write exception-safe code in C++ if you exercise sufficient discipline, and make sure all of the other code you deal with also exercises such discipline or go out of your way to write boilerplate-filled wrappers for all of it.

By comparison, you can write exception-safe code in Python just by not doing anything stupid.

Exceptions can't be used in an expression

This one is actually true. It might be nice to be able to write:
    process(d[key] except KeyError: None)
Of course that particular example, you can already do with d.get(key), but not every function has exception-raising and default-returning alternatives, and those that do don't all do it the same way (e.g., str.find vs. str.index), and really, doesn't expecting everyone to write two versions of each function seem like a pretty big DRY violation?

This argument is often a bit oversold—it's rarely that important to cram something non-trivial into the middle of an expression (and you can always just wrap it in a function when it is), so it's usually only a handful of special cases where this comes up, all of which have had alternatives for decades by now.
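The "wrap it in a function" escape hatch can be as small as this. The catch helper is a hypothetical name of mine, not something in the stdlib, and it's only a sketch of the idea:

```python
def catch(func, exceptions, default=None):
    """Call func(); return default if it raises one of exceptions."""
    try:
        return func()
    except exceptions:
        return default


d = {"a": 1}
found = catch(lambda: d["a"], KeyError)
missing = catch(lambda: d["b"], KeyError)
index = catch(lambda: "spam".index("z"), ValueError, -1)
```

The lambda is the price you pay for not having an except expression: the lookup has to be delayed so it happens inside the try.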

Still, in a brand-new language an except expression seems like a better choice than d[k] vs. d.get(k) and so on. And it might even be worth adding today (as PEP 463 proposes).

But that's not a reason to avoid exceptions in your code.

What about Maybe types, callback/errback, Null-chaining, Promise.fail, etc.?

What about them? Just like exceptions, these techniques work if used ubiquitously, but not if used sporadically. In Python, you can't use them ubiquitously unless you wrap up every single builtin, stdlib, or third-party idiomatic exception-raising function in a Maybe-returning function.

(I'm ignoring the fact that most of these don't provide any information about the failure beyond that there was a failure, because it's simple to extend most of them so they do. For example, instead of a Maybe a that's Just a or Nothing, use one that's Just a or Error msg, with the same monad rules, and you're done.)

So, if you're using Haskell, use Maybe types; if you're using Node.js, use promises; if you're using Python, use exceptions. Which just takes us back to the original point: if you don't want to use exceptions, don't use Python.

Race conditions

I mentioned at the top that, among other problems, trying to use LBYL everywhere is going to lead to code that's full of race conditions. Many people don't seem to understand this concept.

External resources

Are these two pieces of code functionally equivalent?
    with tempfile.NamedTemporaryFile(dir=os.path.dirname(path), delete=False) as f:
        f.write(stuff)
        if not os.path.isdir(path):
            os.replace(f.name, path)
            return True
        else:
            return False

    with tempfile.NamedTemporaryFile(dir=os.path.dirname(path), delete=False) as f:
        f.write(stuff)
        try:
            os.replace(f.name, path)
            return True
        except IsADirectoryError:
            return False
What if the user renamed a directory to the old path between your isdir check and your replace? You're going to get an IsADirectoryError, one that you almost certainly aren't going to handle properly, because you thought you designed your code to make that impossible. (In fact, if you wrote that code, you probably didn't think to handle any of the other possible errors…)

But you can make this far worse than just an unexpected error. For example, what if you were overwriting a file rather than atomically replacing it, and you used os.access to check that the user is actually allowed to replace the file? Then they can replace the file with a symlink between the check and the open, and get you to overwrite any file they're allowed to symlink, even if they didn't have write access. This may sound like a ridiculously implausible edge case, but it's a real problem that's been used to exploit real servers many times. See time-of-check to time-of-use (TOCTOU) at Wikipedia or at CWE.

Plus, the first one is much less efficient. When the path is a file—which is, generally, the most common and most important case—you're making two Python function calls instead of one, two syscalls instead of one, two filesystem accesses (which could be going out over the network) instead of one. When the path is a directory—which is rare—they'll both take about the same amount of time.
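For reference, here's the EAFP version fleshed out into a self-contained function. It's a sketch, not a battle-tested utility: the save name is made up, I've switched to tempfile.mkstemp so the temp file is closed before the rename, and it assumes POSIX behavior, where os.replace onto a directory raises IsADirectoryError (Windows reports a different error).

```python
import os
import tempfile


def save(path, data):
    """Atomically write data to path; return False if path is a directory."""
    # Create the temp file next to the target so the rename stays atomic.
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        # No check beforehand: just try it, and let the OS decide.
        os.replace(tmp, path)
    except IsADirectoryError:
        os.unlink(tmp)
        return False
    except BaseException:
        # Any other failure: clean up the temp file and pass it on.
        os.unlink(tmp)
        raise
    return True
```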

Concurrency

Even without external resources like files, you can have the same problems if you have any internal concurrency in your code—e.g., because you're using threading or multiprocessing.

Are these the same?
    if q.empty():
        return None
    else:
        return q.get()

    try:
        return q.get(block=False)
    except Empty:
        return None
Again, the two are different, and the first one is the one that's wrong.

In the first one, if another thread gets the last element off the queue between your empty check and your get call, your code will end up blocking (possibly causing a deadlock, or just hanging forever because that was the last-ever element).

In the second one, there is no "between"; you will either get an element immediately, or return None immediately.
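Here's the second version as a complete, runnable sketch using the stdlib queue module. The pop_or_none name is mine, and the demo is single-threaded, so it only shows the non-blocking behavior, not the race itself.

```python
import queue


def pop_or_none(q):
    """Return the next item, or None if the queue is empty right now."""
    try:
        # One atomic operation: no gap between checking and getting.
        return q.get(block=False)
    except queue.Empty:
        return None


q = queue.Queue()
q.put("job")
first = pop_or_none(q)    # an item was available
second = pop_or_none(q)   # the queue is now empty; no blocking
```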

Conclusion

    try:
        use_exceptions()
    except UserError:
        sys.exit("Don't use Python")

It's been more than a decade since Typical Programmer Greg Jorgensen taught the world about Abject-Oriented Programming.

Much of what he said still applies, but other things have changed. Languages in the Abject-Oriented space have been borrowing ideas from another paradigm entirely—and then everyone realized that languages like Python, Ruby, and JavaScript had been doing it for years and just hadn't noticed (because these languages do not require you to declare what you're doing, or even to know what you're doing). Meanwhile, new hybrid languages borrow freely from both paradigms.

This other paradigm—which is actually older, but was largely constrained to university basements until recent years—is called Functional Addiction.

A Functional Addict is someone who regularly gets higher-order—sometimes they may even exhibit dependent types—but still manages to retain a job.

Retaining a job is of course the goal of all programming. This is why some of these new hybrid languages, like Rust, check all borrowing, from both paradigms, so extensively that you can make regular progress for months without ever successfully compiling your code, and your managers will appreciate that progress. After all, once it does compile, it will definitely work.

Closures

It's long been known that Closures are dual to Encapsulation.

As Abject-Oriented Programming explained, Encapsulation involves making all of your variables public, and ideally global, to let the rest of the code decide what should and shouldn't be private.

Closures, by contrast, are a way of referring to variables from outer scopes. And there is no scope more outer than global.

Immutability

One of the reasons Functional Addiction has become popular in recent years is that to truly take advantage of multi-core systems, you need immutable data, sometimes also called persistent data.

Instead of mutating a function to fix a bug, you should always make a new copy of that function. For example:

function getCustName(custID)
{
    custRec = readFromDB("customer", custID);
    fullname = custRec[1] + ' ' + custRec[2];
    return fullname;
}

When you discover that you actually wanted fields 2 and 3 rather than 1 and 2, it might be tempting to mutate the state of this function. But doing so is dangerous. The right answer is to make a copy, and then try to remember to use the copy instead of the original:

function getCustName(custID)
{
    custRec = readFromDB("customer", custID);
    fullname = custRec[1] + ' ' + custRec[2];
    return fullname;
}

function getCustName2(custID)
{
    custRec = readFromDB("customer", custID);
    fullname = custRec[2] + ' ' + custRec[3];
    return fullname;
}

This means anyone still using the original function can continue to reference the old code, but as soon as it's no longer needed, it will be automatically garbage collected. (Automatic garbage collection isn't free, but it can be outsourced cheaply.)

Higher-Order Functions

In traditional Abject-Oriented Programming, you are required to give each function a name. But over time, the name of the function may drift away from what it actually does, making it as misleading as comments. Experience has shown that people will only keep one copy of their information up to date, and the CHANGES.TXT file is the right place for that.

Higher-Order Functions can solve this problem:

function []Functions = [
    lambda(custID) {
        custRec = readFromDB("customer", custID);
        fullname = custRec[1] + ' ' + custRec[2];
        return fullname;
    },
    lambda(custID) {
        custRec = readFromDB("customer", custID);
        fullname = custRec[2] + ' ' + custRec[3];
        return fullname;
    },
]

Now you can refer to these functions by order, so there's no need for names.

Parametric Polymorphism

Traditional languages offer Abject-Oriented Polymorphism and Ad-Hoc Polymorphism (also known as Overloading), but better languages also offer Parametric Polymorphism.

The key to Parametric Polymorphism is that the type of the output can be determined from the type of the inputs via Algebra. For example:

function getCustData(custId, x)
{
    if (x == int(x)) {
        custRec = readFromDB("customer", custId);
        fullname = custRec[1] + ' ' + custRec[2];
        return int(fullname);
    } else if (x.real == 0) {
        custRec = readFromDB("customer", custId);
        fullname = custRec[1] + ' ' + custRec[2];
        return double(fullname);
    } else {
        custRec = readFromDB("customer", custId);
        fullname = custRec[1] + ' ' + custRec[2];
        return complex(fullname);
    }
}

Notice that we've called the variable x. This is how you know you're using Algebraic Data Types. The names y, z, and sometimes w are also Algebraic.

Type Inference

Languages that enable Functional Addiction often feature Type Inference. This means that the compiler can infer your typing without you having to be explicit:


function getCustName(custID)
{
    // WARNING: Make sure the DB is locked here or
    custRec = readFromDB("customer", custID);
    fullname = custRec[1] + ' ' + custRec[2];
    return fullname;
}

We didn't specify what will happen if the DB is not locked. And that's fine, because the compiler will figure it out and insert code that corrupts the data, without us needing to tell it to!

By contrast, most Abject-Oriented languages are either nominally typed—meaning that you give names to all of your types instead of meanings—or dynamically typed—meaning that your variables are all unique individuals that can accomplish anything if they try.

Memoization

Memoization means caching the results of a function call:

function getCustName(custID)
{
    if (custID == 3) { return "John Smith"; }
    custRec = readFromDB("customer", custID);
    fullname = custRec[1] + ' ' + custRec[2];
    return fullname;
}

Non-Strictness

Non-Strictness is often confused with Laziness, but in fact Laziness is just one kind of Non-Strictness. Here's an example that compares two different forms of Non-Strictness:

/****************************************
*
* TO DO:
*
* get tax rate for the customer state
* eventually from some table
*
****************************************/
// function lazyTaxRate(custId) {}

function callByNameTaxRate(custId)
{
    /****************************************
    *
    * TO DO:
    *
    * get tax rate for the customer state
    * eventually from some table
    *
    ****************************************/
}

Both are Non-Strict, but the second one forces the compiler to actually compile the function just so we can Call it By Name. This causes code bloat. The Lazy version will be smaller and faster. Plus, Lazy programming allows us to create infinite recursion without making the program hang:

/****************************************
*
* TO DO:
*
* get tax rate for the customer state
* eventually from some table
*
****************************************/
// function lazyTaxRateRecursive(custId) { lazyTaxRateRecursive(custId); }

Laziness is often combined with Memoization:

function getCustName(custID)
{
    // if (custID == 3) { return "John Smith"; }
    custRec = readFromDB("customer", custID);
    fullname = custRec[1] + ' ' + custRec[2];
    return fullname;
}

Outside the world of Functional Addicts, this same technique is often called Test-Driven Development. If enough tests can be embedded in the code to achieve 100% coverage, or at least a decent amount, your code is guaranteed to be safe. But because the tests are not compiled and executed in the normal run, or indeed ever, they don't affect performance or correctness.

Conclusion

Many people claim that the days of Abject-Oriented Programming are over. But this is pure hype. Functional Addiction and Abject Orientation are not actually at odds with each other, but instead complement each other.