Many people—especially people coming from Java—think that using try/except is "inelegant", or "inefficient". Or, slightly less meaninglessly, they think that "exceptions should only be for errors, not for normal flow control".
These people are not going to be happy with Python.
You can try to write Python as if it were Java or C, using Look-Before-You-Leap code instead of Easier-to-Ask-Forgiveness-than-Permission, returning error codes instead of raising exceptions for things that aren't "really" errors, etc. But you're going to end up with non-idiomatic, verbose, and inefficient code that's full of race conditions.
And you're still going to have exceptions all over the place anyway; you're just hiding them from yourself.
Hidden exceptions
Iteration
Even this simple code has a hidden exception:
    for i in range(10):
        print(i)
Under the covers, it's equivalent to:
    it = iter(range(10))
    while True:
        try:
            i = next(it)
        except StopIteration:
            break
        else:
            print(i)
That's how iterables work in Python. An iterable is something you can call iter on and get an iterator. An iterator is something you can call next on repeatedly and get 0 or more values and then a StopIteration.
Of course you can try to avoid that by calling the two-argument form of next, which lets you provide a default value instead of getting an exception. But under the covers, next(iterator, default) is basically implemented like this:
    try:
        return next(iterator)
    except StopIteration:
        return default
So, even when you go out of your way to try to LBYL, you still end up EAFPing.
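To see the same protocol from the producing side, here is a minimal sketch of a hand-written iterator. The `Countdown` class is a made-up name for illustration; the point is only that "no more values" is signalled by raising, and that both the for machinery and two-argument next consume that exception for you.

```python
class Countdown:
    """Hypothetical iterator that yields n, n-1, ..., 1."""

    def __init__(self, n):
        self.n = n

    def __iter__(self):
        return self

    def __next__(self):
        if self.n <= 0:
            # "No more values" is signalled by raising, not by returning.
            raise StopIteration
        self.n -= 1
        return self.n + 1


# The iteration machinery catches StopIteration for us.
values = [i for i in Countdown(3)]

# The two-argument next() swallows the same exception.
fallback = next(Countdown(0), "done")
```

Neither caller ever writes a try statement, but both depend on one.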
Operators
This even simpler code also has hidden exception handling:
    print(a+b)
Under the covers, a+b looks something like this:
    def _checkni(ret):
        if ret is NotImplemented:
            raise NotImplementedError
        return ret

    def add(a, b):
        try:
            if type(b) is not type(a) and issubclass(type(b), type(a)):
                # A proper subclass on the right gets the first try.
                try:
                    return _checkni(type(b).__radd__(b, a))
                except (NotImplementedError, AttributeError):
                    return _checkni(type(a).__add__(a, b))
            else:
                try:
                    return _checkni(type(a).__add__(a, b))
                except (NotImplementedError, AttributeError):
                    return _checkni(type(b).__radd__(b, a))
        except (NotImplementedError, AttributeError):
            raise TypeError("unsupported operand type(s) for +: '{}' and '{}'".format(
                type(a).__name__, type(b).__name__))
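The fallback is easy to watch from ordinary code. The sketch below uses a made-up `Meters` class (not from any library) whose methods return NotImplemented for operands they don't understand; Python then tries the other operand, and raises TypeError only when both sides decline.

```python
class Meters:
    """Hypothetical value type, used only for illustration."""

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        if isinstance(other, Meters):
            return Meters(self.value + other.value)
        return NotImplemented  # let Python try the other operand

    def __radd__(self, other):
        if other == 0:  # lets sum() start from its default of 0
            return self
        return NotImplemented


# 0 + Meters(1): int declines, so Python falls back to Meters.__radd__.
total = sum([Meters(1), Meters(2)])

# Both sides decline, so Python raises TypeError on our behalf.
try:
    Meters(1) + "two"
except TypeError as e:
    message = str(e)
```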
Attributes
Even the simple dot syntax in the above examples hides further exception handling. Or, for a simpler example, this code:
    print(spam.eggs)
Under the covers, spam.eggs looks something like this:
    spam.__getattribute__('eggs')
So far, so good. But, assuming you didn't define your own __getattribute__ method, what does the object.__getattribute__ that you inherit do? Something like this:
    def _searchbases(cls, name):
        # Look in each class along the MRO, not just in cls itself.
        for c in cls.__mro__:
            try:
                return c.__dict__[name]
            except KeyError:
                pass
        raise KeyError(name)

    def __getattribute__(self, name):
        try:
            return self.__dict__[name]
        except KeyError:
            pass
        try:
            return _searchbases(type(self), name).__get__(self, type(self))
        except KeyError:
            pass
        try:
            getattr = _searchbases(type(self), '__getattr__')
        except KeyError:
            raise AttributeError("'{}' object has no attribute '{}'".format(
                type(self).__name__, name))
        return getattr(self, name)
Of course I cheated by using cls.__mro__, cls.__dict__ and descriptor.__get__ above. Those are recursive calls to __getattribute__. They get handled by base cases for object and type.
hasattr
Meanwhile, what if you want to make sure a method or value attribute exists before you access it? Python has a hasattr function for exactly that purpose. How does that work? As the docs say, "This is implemented by calling getattr(object, name) and seeing whether it raises an AttributeError or not."
Again, even when you try to LBYL, you're still raising and handling
exceptions.
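A rough pure-Python equivalent makes the point concrete. The helper name `has_attribute` and the `Lazy` class are made up for this sketch; it also shows a side effect of the design, namely that an AttributeError raised inside a property makes hasattr report the attribute as missing.

```python
def has_attribute(obj, name):
    """Rough equivalent of hasattr (hypothetical helper name)."""
    try:
        getattr(obj, name)
    except AttributeError:
        return False
    return True


class Lazy:
    """Hypothetical class whose property fails internally."""

    @property
    def value(self):
        # self._cache was never set, so this raises AttributeError.
        return self._cache


obj = Lazy()
builtin_answer = hasattr(obj, "value")
sketch_answer = has_attribute(obj, "value")
```

Both answers are False, even though `value` is clearly defined on the class.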
Objections
People who refuse to believe that Python isn't Java always raise the same arguments against EAFP.
Exception handling is slow
No, it isn't. Except in the sense that Python itself is horribly slow,
which is a sense that almost never matters (and, in the rare cases
when it does, you're not going to use Python, so who cares?).
First, remember that, at least if you're using CPython, every bytecode
goes through the interpreter loop, every method call and attribute
lookup is dynamically dispatched by name, every function call involves
a heavy-duty operation of building a complex frame object and
executing multiple bytecodes, and under the covers all the values are
boxed up. In other words, Python isn't C++.
But let's do a quick comparison of the simplest possible function, then the same function plus a try statement:
    def spam():
        return 2

    def eggs():
        try:
            return 2
        except Exception:
            return 0
When I time these with %timeit on my laptop, I get 88.9ns and 90.8ns. So, that's 2% overhead. On a more realistic function, the overhead is usually below measurability.
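If you want to repeat the measurement outside IPython, something like the following sketch should work with the standard timeit module. The `best_ns` helper and its default counts are assumptions of this example, and the numbers will differ by machine and Python version, so treat the output as a rough comparison only.

```python
import timeit


def spam():
    return 2


def eggs():
    try:
        return 2
    except Exception:
        return 0


def best_ns(func, number=200_000, repeat=5):
    """Best per-call time in nanoseconds over several runs (hypothetical helper)."""
    return min(timeit.repeat(func, number=number, repeat=repeat)) / number * 1e9


spam_ns = best_ns(spam)
eggs_ns = best_ns(eggs)
print(f"spam: {spam_ns:.1f}ns  eggs: {eggs_ns:.1f}ns")
```

Taking the minimum of several repeats is the usual way to reduce noise from whatever else the machine is doing.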
In fact, even in C++, you'll see pretty much the same thing, unless
you're using a compiler from the mid-90s. People who say "exceptions
are slow" really don't know what they're talking about in any
language.
But it's especially true in Python. Compare that 1.9ns cost to the
114ns cost of looking up spam as a global and calling it. If you're
looking to optimize something here, the 128% overhead is surely a lot
more important than the 2%.
What about when you actually raise an exception? That's a bit more
expensive. It costs anywhere from 102ns to 477ns. So, that could
almost quintuple the cost of your function! Yes, it could—but only if
your function isn't actually doing anything. How many functions do you
write that take less than 500ns to run, and which you run often enough
that it makes a difference, where optimizing out 477ns is important
but optimizing out 114ns isn't? My guess would be none.
And now, go back and look at the for loop from the first section. If you iterate over a million values, you're paying the 1.9ns wasted cost a million times, buried inside a 114ns cost of calling next each time, itself buried in the cost of whatever actual work you do on each element. And then you're paying the 477ns wasted cost 1 time. Who cares?
Exceptions should only be for exceptional cases
Sure, but "exceptional" is a local term.
Within a for loop, reaching the end of the loop is exceptional. To code using the loop, it's not. So the for loop handles the StopIteration locally.
Similarly, in code reading chunks out of a file, reaching the end of the file is exceptional. But in code that reads a whole file, reaching the end is a normal part of reading the whole file. So, you're going to handle the EOFError at a low level, while the higher-level code will just receive an iterator or list of lines or chunks or whatever it needs.
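One place in the standard library where this shows up directly is pickle.load, which raises EOFError when the file runs out. The sketch below (the `read_records` name is made up) handles that at the low level and hands the caller a plain iterator.

```python
import io
import pickle


def read_records(f):
    """Yield every pickled object in the binary file f."""
    while True:
        try:
            yield pickle.load(f)
        except EOFError:
            # End of file is exceptional here, so it is handled here.
            return


# Build an in-memory "file" holding three pickled records.
buf = io.BytesIO()
for record in ("spam", "eggs", 42):
    pickle.dump(record, buf)
buf.seek(0)

# The caller just sees an iterator; no EOFError ever reaches it.
records = list(read_records(buf))
```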
Raise exceptions, and handle them at the level at which they're
exceptional—which is also generally going to be the level where you
know how to handle them.
Sometimes that's the lowest possible level, in which case there isn't much difference between using exceptions and returning (value, True) or (None, False). But often it's many levels up, in which case using exceptions guarantees that you can't forget to check and percolate the error upward to the point where you're prepared to deal with it. That, in a nutshell, is why exceptions exist.
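As a small illustration of the "many levels up" case, here is a sketch with three made-up functions. Only the outermost one knows what a sensible fallback is, so the inner two contain no error-handling code at all and failures pass straight through them.

```python
import json


def parse_port(text):
    return int(text)                      # may raise ValueError


def load_settings(raw):
    data = json.loads(raw)                # may raise json.JSONDecodeError
    return {"port": parse_port(data["port"])}   # may raise KeyError


def start(raw, default_port=8080):
    # Only this level knows what to do about a bad configuration.
    try:
        return load_settings(raw)["port"]
    except (ValueError, KeyError):        # JSONDecodeError is a ValueError
        return default_port


good = start('{"port": "9000"}')
bad_value = start('{"port": "nope"}')
bad_json = start("not json")
missing = start("{}")
```

With error returns, both `parse_port` and `load_settings` would need a check after every call just to pass the failure along.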
Exceptions only work if you use them everywhere
That's true. And it's a serious problem in C++ (and even more in
ObjC). But it's not a problem in Python—unless you go out of your way
to create a problem by fighting against Python. Python uses exceptions
everywhere. So does all the idiomatic Python code you're going to be
interfacing with. So exceptions work.
C++ wasn't designed around exceptions in the first place. This means:
- C++ has a mishmash of APIs (many inherited from C), some raising
exceptions, others returning error values.
- C++ doesn't make it easy to wrap up error returns in
exceptions. For example, your compiler almost certainly doesn't
come with a helper function that wraps up a libc or POSIX function
by checking for nonzero return and constructing an exception out
of the errno and the name of the function—and, even if it did,
that function would be painful to use everywhere.
- C++ accesses functions from C libraries just like C, meaning
none of them raise exceptions. And similarly for accessing Java
functions via JNI, or ObjC functions via ObjC++, or even Python
functions via the Python C API. Compare that to Python bindings
written with ctypes, cffi, Cython, SIP, SWIG, manually-built
extension modules, Jython, PyObjC, etc.
- C++ makes it very easy to design classes that end up in an
inconsistent state (or at least leak memory) when an exception is
thrown; you have to manually design an RAII class for everything
that needs cleanup, manage garbage yourself, etc. to get exception
safety.
In short, you can write exception-safe code in C++ if you exercise sufficient discipline, and make sure all of the other code you deal with also exercises such discipline or go out of your way to write boilerplate-filled wrappers for all of it.
By comparison, you can write exception-safe code in Python just by not doing anything stupid.
Exceptions can't be used in an expression
This one is actually true. It might be nice to be able to write:
    process(d[key] except KeyError: None)
Of course, in that particular example you can already do this with d.get(key), but not every function has exception-raising and default-returning alternatives, and those that do don't all do it the same way (e.g., str.find vs. str.index), and really, doesn't expecting everyone to write two versions of each function seem like a pretty big DRY violation?
This argument is often a bit oversold—it's rarely that important to
cram something non-trivial into the middle of an expression (and you
can always just wrap it in a function when it is), so it's usually
only a handful of special cases where this comes up, all of which have
had alternatives for decades by now.
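The "wrap it in a function" escape hatch can be as small as this sketch. The `catch` helper is a made-up name, not something from the standard library; it takes a zero-argument callable so that the risky expression is evaluated inside the try.

```python
def catch(func, exc, default=None):
    """Call func(); return default if it raises exc (hypothetical helper)."""
    try:
        return func()
    except exc:
        return default


d = {"spam": 1}
found = catch(lambda: d["spam"], KeyError)
missing = catch(lambda: d["eggs"], KeyError)
index = catch(lambda: "spam".index("z"), ValueError, -1)
```

The lambda is the price of not having an expression form: without it, `d["eggs"]` would raise before `catch` was ever called.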
Still, in a brand-new language an except expression seems like a better choice than d[k] vs. d.get(k) and so on. And it might even be worth adding today (as PEP 463 proposes).
But that's not a reason to avoid exceptions in your code.
What about Maybe types, callback/errback, Null-chaining, Promise.fail, etc.?
What about them? Just like exceptions, these techniques work if used ubiquitously, but not if used sporadically. In Python, you can't use them ubiquitously unless you wrap up every single builtin, stdlib, or third-party idiomatic exception-raising function in a Maybe-returning function.
(I'm ignoring the fact that most of these don't provide any information about the failure beyond that there was a failure, because it's simple to extend most of them so they do. For example, instead of a Maybe a that's Just a or Nothing, use one that's Just a or Error msg, with the same monad rules, and you're done.)
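To give a feel for what that wrapping would involve, here is a sketch of a decorator that converts raises into a tagged result. The `maybe` decorator and the `("just", value)` / `("error", msg)` tuples are inventions of this example, standing in for a real Maybe-like type.

```python
import functools


def maybe(*exceptions):
    """Decorator: turn raises of the given exceptions into ("error", msg)."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return ("just", func(*args, **kwargs))
            except exceptions as e:
                return ("error", str(e))
        return wrapper
    return decorator


@maybe(ValueError)
def parse_int(text):
    return int(text)


ok = parse_int("42")
failed = parse_int("spam")
```

Note that the wrapper is itself a try statement, and you would need one of these around every exception-raising function you call.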
So, if you're using Haskell, use Maybe types; if you're using Node.js, use promises; if you're using Python, use exceptions. Which just takes us back to the original point: if you don't want to use exceptions, don't use Python.
Race conditions
I mentioned at the top that, among other problems, trying to use LBYL
everywhere is going to lead to code that's full of race
conditions. Many people don't seem to understand this concept.
External resources
Are these two pieces of code functionally equivalent?
    with tempfile.NamedTemporaryFile(dir=os.path.dirname(path), delete=False) as f:
        f.write(stuff)
    if not os.path.isdir(path):
        os.replace(f.name, path)
        return True
    else:
        return False

    with tempfile.NamedTemporaryFile(dir=os.path.dirname(path), delete=False) as f:
        f.write(stuff)
    try:
        os.replace(f.name, path)
        return True
    except IsADirectoryError:
        return False
What if the user renamed a directory to the old path between your isdir check and your replace? You're going to get an IsADirectoryError, one that you almost certainly aren't going to handle properly, because you thought you designed your code to make that impossible. (In fact, if you wrote that code, you probably didn't think to handle any of the other possible errors…)
But you can make this far worse than just an unexpected error. For example, what if you were overwriting a file rather than atomically replacing it, and you used os.access to check that the user is actually allowed to replace the file? Then they can replace the file with a symlink between the check and the open, and get you to overwrite any file they're allowed to symlink, even if they didn't have write access. This may sound like a ridiculously implausible edge case, but it's a real problem that's been used to exploit real servers many times. See time-of-check-to-time-of-use at Wikipedia or at CWE.
Plus, the first one is much less efficient. When the path is a
file—which is, generally, the most common and most important
case—you're making two Python function calls instead of one, two
syscalls instead of one, two filesystem accesses (which could be going
out over the network) instead of one. When the path is a
directory—which is rare—they'll both take about the same amount of
time.
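For completeness, here is the EAFP version as a self-contained function you can run. The `replace_file` name is made up, the temp-file cleanup on failure is an addition of this sketch, and it assumes a POSIX system, where replacing a directory with a file is reported as IsADirectoryError.

```python
import os
import tempfile


def replace_file(path, stuff):
    """Atomically replace path with the bytes in stuff.

    Returns False (after removing the temp file) if path is a directory.
    """
    directory = os.path.dirname(path) or "."
    with tempfile.NamedTemporaryFile(dir=directory, delete=False) as f:
        f.write(stuff)
    try:
        # One call, one syscall, and no window between check and use.
        os.replace(f.name, path)
        return True
    except IsADirectoryError:
        # Assumption: POSIX behavior; other platforms may raise differently.
        os.unlink(f.name)
        return False
```

Any other OSError (permissions, a full disk, a missing parent directory) still propagates to the caller, which is usually what you want.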
Concurrency
Even without external resources like files, you can have the same problems if you have any internal concurrency in your code, e.g., because you're using threading or multiprocessing.
Are these the same?
    if q.empty():
        return None
    else:
        return q.get()

    from queue import Empty  # the exception raised by a non-blocking get
    try:
        return q.get(block=False)
    except Empty:
        return None
Again, the two are different, and the first one is the one that's wrong.
In the first one, if another thread gets the last element off the queue between your empty check and your get call, your code will end up blocking (possibly causing a deadlock, or just hanging forever because that was the last-ever element).
In the second one, there is no "between"; you will either get an element immediately, or return None immediately.
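Here is a runnable sketch of the second version under contention. The `get_or_none` and `drain` names are made up, and the thread and item counts are arbitrary; the point is that several consumers can race for the last element and none of them can end up blocked.

```python
import queue
import threading


def get_or_none(q):
    """Return the next item, or None if the queue is empty right now."""
    try:
        return q.get(block=False)
    except queue.Empty:
        return None


def drain(q, out):
    # Several threads can run this at once without ever blocking.
    while (item := get_or_none(q)) is not None:
        out.append(item)


q = queue.Queue()
for n in range(1000):
    q.put(n)

results = []
workers = [threading.Thread(target=drain, args=(q, results)) for _ in range(4)]
for w in workers:
    w.start()
for w in workers:
    w.join()
```

Every join returns, because a thread that loses the race for the last item simply gets None and exits its loop.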
Conclusion
    try:
        use_exceptions()
    except UserError:
        sys.exit("Don't use Python")