So, why?
Often, you don't really need a list, you just need something you can iterate over. In the case of d.items(), the Py3 version can just use a view of the dictionary and not have to copy anything. In the case of map(), you do end up with the same amount of copying, but you don't have to build the list all at once, so there's less memory allocation going on—and, if you don't consume the whole list, you don't pay for the whole thing.
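Here's a toy illustration (mine, not from the docs) of what "not paying for the whole thing" means:

```python
# In Py3, map() returns a lazy iterator: nothing is computed until you ask for it.
squares = map(lambda n: n * n, range(10**9))

# Only the first ten squares ever get computed; the other 999,999,990 never exist.
first_ten = [next(squares) for _ in range(10)]
```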
But that's not the point. The point is that if you combine iterator-transforming functions, it's semantically different from combining sequence-transforming functions. Your code ends up interleaved into a pipeline, working element-by-element. Some things that would be hard to express, or even impossible, now become trivial.
When Haskell programmers want to prove that their language is cooler than yours without using the word "monad" (which, they've learned, just gives people headaches and makes them stop listening), they use examples from math, and show how easy they are to transform into Haskell. Let's pick one of their favorites: Computing the first n primes. Here's how to do it:
```python
import itertools

def primes():
    numbers = itertools.count(2)
    while True:
        prime = next(numbers)
        yield prime
        def relatively_prime(n, prime=prime):
            return n % prime
        numbers = filter(relatively_prime, numbers)

first_100_primes = itertools.islice(primes(), 100)
```

That's pretty close to the way you'd describe it mathematically. The problem with coding it this way is that you're starting with an infinite list, and at each step you're filtering it into another infinite list, which takes infinite time. The fact that you're later going to throw away all but the first 100 doesn't help.
But if you're dealing with iterators instead of lists, it _does_ help. You start with an infinite iterator. At each step, you're adding a finite filter to the future values of that iterator. So, when you generate the first 100 primes, you only end up stepping through the first 540 numbers in the infinite list, and applying a finite number of filters to each one.
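If you want to check that count (this instrumentation is mine, not part of the original example), you can re-parameterize primes() to take its source and wrap the infinite count in a counter:

```python
import itertools

def counted(iterable, counter):
    # Wrap the infinite source so we can see how many values the pipeline pulls from it.
    for value in iterable:
        counter[0] += 1
        yield value

def primes(numbers):
    while True:
        prime = next(numbers)
        yield prime
        def relatively_prime(n, prime=prime):
            return n % prime
        numbers = filter(relatively_prime, numbers)

pulled = [0]
first_100 = list(itertools.islice(primes(counted(itertools.count(2), pulled)), 100))
print(first_100[-1], pulled[0])  # 541 540 -- the 100th prime, after pulling only 540 numbers
```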
Think about how you'd have to express that if you didn't have iterator-transforming pipelines. Something like this:
```python
first_100_primes = []
hopefully_big_enough = 1000
hopefully_enough_numbers = list(range(2, hopefully_big_enough))
while len(first_100_primes) < 100:
    prime = hopefully_enough_numbers.pop(0)
    first_100_primes.append(prime)
    for n in hopefully_enough_numbers[:]:
        if n % prime == 0:
            hopefully_enough_numbers.remove(n)
    while not hopefully_enough_numbers:
        hopefully_enough_numbers = list(range(hopefully_big_enough, hopefully_big_enough + 1000))
        hopefully_big_enough += 1000
        for prime in first_100_primes:
            for n in hopefully_enough_numbers[:]:
                if n % prime == 0:
                    hopefully_enough_numbers.remove(n)
```
But obviously, you wouldn't do it like that; you'd manually simulate the infinite list, and manually pipeline all of the filters:
```python
first_100_primes = []
next_number = 2

def is_prime(n, primes):
    for prime in primes:
        if n % prime == 0:
            return False
    return True

def get_next_relatively_prime(n, primes):
    while True:
        n += 1
        if is_prime(n, primes):
            return n

while len(first_100_primes) < 100:
    first_100_primes.append(next_number)
    next_number = get_next_relatively_prime(next_number, first_100_primes)
```
The Haskell-y version is obviously a lot easier to read. Plus, it lets you laugh at the smug Haskell types by proving that your language can do everything theirs can, at which point they have to resort to talking about the rich type system and you win, because of Guido's Law, which is like Godwin's Law, but with monads in place of Hitler. (I know, anyone with dual PhDs in type theory and history knows that Hitler is more like an applicative than a monad, but just as practicality beats purity, a heavy klomp beats a PhD.)
But Haskellites aren't the only smug types out there. Coroutines are cool too, and anyone who's still programming with those old-fashioned subroutines is so last century. Python generators give you all the benefits of coroutines, but with a syntax that means you don't even have to notice you're using them when it doesn't matter.
And if you think about your iterator-transforming pipeline, that's what you're doing. At each step, each filter suspends and hands things off to the next filter, until you reach the end of the chain and control comes back to the main loop. You create a new coroutine and add it to that chain just by writing "numbers = filter(relatively_prime, numbers)". Who yields to whom doesn't need to be specified—not because of some deep magic, but because it's obvious.
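If it helps to see the handoff spelled out, here's what each stage of that pipeline amounts to when written as an explicit generator (a sketch of mine, equivalent to the filter() call, not code from the example above):

```python
def relatively_prime_to(numbers, prime):
    # Each stage suspends at "yield" and resumes only when the stage
    # above it asks for the next value -- exactly the coroutine handoff.
    for n in numbers:
        if n % prime:
            yield n

# Adding a stage to the chain is still just one assignment:
# numbers = relatively_prime_to(numbers, prime)
```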
And that leads to even cooler things. The two obvious ways to write async I/O are with threads (green/micro or otherwise), which bring synchronization problems with them, or with callbacks, which force you to think about the flow of control backwards in complex cases, and lead to callback hell even in simple cases. You can improve things a little by using atomic data structures like queue.Queue for all inter-thread communication, or by using promises (deferreds) for chaining callbacks, but you're still adding a lot of mental burden on top of the equivalent synchronous code. But with explicit coroutines as in PEP 3156, you don't have any of that burden. The synchronous semantics translate directly to asynchronous semantics just by putting "yield from" at all suspension points, and you're done. The resulting code is clearly readable and easily writable.
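Here's roughly what that looks like (a toy example of mine, using the generator-based asyncio API from the PEP 3156 era, before async/await existed):

```python
import asyncio

@asyncio.coroutine
def countdown(label, n):
    # Reads exactly like the synchronous version; each "yield from"
    # is an explicit suspension point where other coroutines can run.
    while n:
        print(label, n)
        yield from asyncio.sleep(1)
        n -= 1

loop = asyncio.get_event_loop()
loop.run_until_complete(asyncio.gather(countdown('a', 3), countdown('b', 2)))
```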