The (Updated) Truth About Unicode in Python
Back in 2008, Christopher Lenz wrote a great article called The Truth About Unicode in Python, which covers a lot of useful information that you wouldn't get just by reading the Unicode HOWTO, and that you might not even know to ask about unless you were already… well, not a Unicode expert, but more
How do I make a recursive function iterative?
You've probably been told that you can convert any recursive function into an iterative loop just by using an explicit stack.
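For reference, here is a minimal sketch of the explicit-stack technique in its simplest form. The function names and the nested-list example are made up for illustration, and the example only works this neatly because the order in which items are visited doesn't affect the result.

```python
def sum_nested_recursive(items):
    """Sum every number in an arbitrarily nested list, recursively."""
    total = 0
    for item in items:
        if isinstance(item, list):
            total += sum_nested_recursive(item)
        else:
            total += item
    return total


def sum_nested_iterative(items):
    """Same result, but pending sublists live on an explicit stack."""
    total = 0
    stack = [items]  # each entry is a list that still needs to be walked
    while stack:
        current = stack.pop()
        for item in current:
            if isinstance(item, list):
                stack.append(item)  # defer the sublist instead of recursing
            else:
                total += item
    return total
```

The iterative version visits sublists in a different order than the recursive one, which is harmless for addition but matters for anything order-sensitive.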
Sockets and multiprocessing
If you want to write a forking server, one obvious way to do it is to use the multiprocessing module to create a pool of processes, then hand off requests as you get them.
The problem is that handling a request usually involves reading from and writing to a client socket.
Micro-optimization and Python
I was recently trying to optimize some C code using cachegrind, and discovered that branch misprediction in an inner loop was the culprit. I began wondering how much anything similar could affect Python code.
Why does my 100MB file take 1GB of memory?
A lot of people have questions like this:
I've got a 100MB CSV file. I read it into a list, populated with a csv.DictReader, and my computer starts swapping. Why? Let's look at what it takes to store a 100MB file as a list of dicts of strings.
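One rough way to see the overhead for yourself is to add up sys.getsizeof for the list, each dict, and each string. The sketch below does that on a small made-up CSV; the deep_size helper and the sample data are assumptions for illustration, the numbers are CPython-specific, and the estimate counts the shared key strings once per row, so treat it as a ballpark rather than a measurement.

```python
import csv
import io
import sys


def deep_size(rows):
    """Rough size in bytes of a list of dicts of strings (hypothetical helper)."""
    total = sys.getsizeof(rows)
    for row in rows:
        total += sys.getsizeof(row)
        for key, value in row.items():
            # Keys are typically shared between rows, so this overcounts them.
            total += sys.getsizeof(key) + sys.getsizeof(value)
    return total


# A small stand-in for the 100MB file: 1000 short rows.
raw = "id,name,score\n" + "".join(f"{i},user{i},{i * 7}\n" for i in range(1000))
rows = list(csv.DictReader(io.StringIO(raw)))

raw_bytes = len(raw.encode("utf-8"))
estimate = deep_size(rows)
print(f"on disk: {raw_bytes} bytes, in memory: roughly {estimate} bytes")
```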
How to edit a file in-place
There are hundreds of questions on StackOverflow from people who want to know something like, "How do I edit a file in-place, without creating a new file."
In general, the answer is, "You can't.
ADTs for Python
Recently, as an aside to the proposal for static type annotations in Python, Guido offhandedly noted another suggested improvement to Python, adding ADTs (Algebraic Data Types).
A pattern-matching case statement for Python
The idea of a pattern-matching case statement has come up a few times recently (first in a proposal to add a C-style case statement, more recently as part of the proposal to add static type annotations), but the discussion ends as soon as people realize that it's not quite as easy as it appears.
How strongly typed is Python?
Weak typing
A language with weak typing is one where programs can escape the type system. Another way to describe it is that values can change types.
How do comprehensions work?
The official tutorial does a great job explaining list comprehensions, iterators, generators, and generator expressions at a high level. Since some people don't want to read even that much, I wrote a post on comprehensions for dummies to summarize it.
Reverse dictionary lookup and more, on beyond z
Reverse lookup
Here's a dead simple problem: I have a dictionary, and I want to find the key corresponding to a certain value.
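The two obvious approaches look something like this. The names find_key and invert are made up for this sketch, and the inverted dict only works when the values are hashable and unique.

```python
def find_key(d, value):
    """Linear search: return the first key whose value equals `value`."""
    for key, val in d.items():
        if val == value:
            return key
    raise ValueError(f"no key maps to {value!r}")


def invert(d):
    """Build a value -> key dict; later duplicates overwrite earlier ones."""
    return {val: key for key, val in d.items()}
```

The search is O(N) per lookup but needs no extra memory; the inverted dict costs a second dict up front and then answers each lookup in constant time.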
How to handle exceptions
Earlier today, I saw two different StackOverflow questions that basically looked like this:
Why does this not work? try: [broken code] except: print('[-] exception occurred') Unless someone can read minds or gets lucky or puts a whole lot more work into your question than you have any right to exp
Three ways to read files
There are three commonly useful ways to read files: read the whole thing into memory, iterate over it element by element (usually meaning lines), or iterate over it in chunks.
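As a quick sketch, the three styles might look like this. The function names, the UTF-8 encoding, and the 4096-byte default chunk size are arbitrary choices for illustration.

```python
from functools import partial


def read_whole(path):
    """Read the entire file into one string."""
    with open(path, encoding="utf-8") as f:
        return f.read()


def iter_lines(path):
    """Yield the file one line at a time; only one line is in memory at once."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            yield line


def iter_chunks(path, size=4096):
    """Yield the file as fixed-size byte chunks (the last one may be shorter)."""
    with open(path, "rb") as f:
        # iter(callable, sentinel) keeps calling f.read(size) until it returns b"".
        for chunk in iter(partial(f.read, size), b""):
            yield chunk
```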
Lazy Python lists
Code for this post can be found at https://github.com/abarnert/lazylist.
The same discussion that brought up lazy tuple unpacking also raised the idea of implementing a full lazy list class in Python.
This is easy to do, but isn't as useful as it sounds at first glance.
Lazy cons lists
This post is meant as background to the following post on lazy Python lists. If you already know all about cons lists, triggers, lazy evaluation, tail sharing, etc., you don't need to read it.
Linked lists
What Python calls "list" is actually a dynamic array.
Lazy tuple unpacking
In a recent discussion on python-ideas, Paul Tagliamonte suggested that tuple unpacking could be lazy, using iterators.
Getting atomic writes right
Something that comes up all the time—and not just in Python—is how to write a file atomically. And the solutions given are usually wrong.
tl;dr
You just want some code that makes it easy to do atomic writes? Try fatomic.
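If you'd rather see the basic idea spelled out, here is a minimal sketch of the usual write-to-a-temporary-file-then-rename approach. This is not fatomic's API; atomic_write_text is a made-up name, and the sketch skips details such as preserving the original file's permissions and syncing the containing directory.

```python
import contextlib
import os
import tempfile


def atomic_write_text(path, text, encoding="utf-8"):
    """Replace `path` with `text` so readers see the old or new file, never a partial one."""
    # The temp file must be in the same directory so the rename stays on one filesystem.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w", encoding=encoding) as f:
            f.write(text)
            f.flush()
            os.fsync(f.fileno())  # push the data to disk before the rename
        os.replace(tmp_path, path)  # atomic replacement of the destination
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        with contextlib.suppress(FileNotFoundError):
            os.unlink(tmp_path)
        raise
```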
Suites, scopes, and lifetimes
To my post Why Python doesn't need blocks, julien tayon replied with a comment about a completely different meaning of blocks, which I think leads to some interesting points, even if it's irrelevant to that post.
Swift-style map and filter views
Along the way to an attempt to port itertools from Python to Apple's new language Swift (see my Stupid Swift Ideas blog), I discovered something interesting: Swift's map and filter functions are in some ways better than Python's.
Inline (bytecode) assembly
In a recent discussion on adding an empty set literal to Python, the conversation went as far off-base as usual, and I offhandedly suggested:
Alternatively, it might be nice if there were a way to do "inline bytecode assembly" in CPython, similar to the way you do inline assembly in many C compilers,
Why Python (or any decent language) doesn't need blocks
Ruby programmers love blocks. When Apple added blocks to Objective-C and then to C, it made Cocoa/Cocoa Touch programming a lot easier. So it's only natural that people keep suggesting that blocks should be added to Python, right?
Wrong.
SortedContainers
In Sorted Collections in the stdlib, I went over why I think we want sorted collections in the stdlib, what they should look like, and why we don't have them.
Fixing lambda
In a recent thread on python-ideas, Nick Coghlan said:
The reason I keep beating my head against this particular wall (cf.
Arguments and parameters, under the covers
In my post on arguments and parameters, I explained how arguments get matched up to parameters in Python.
pip, extension modules, and distro packages
On Stack Overflow, a user asked, "Should python-dev be required to install pip?" I think the answer to that is no, at least the way things are split up on most distros today. But there's clearly a potential for confusion for new developers, as the OP pointed out in a comment.
Python doesn't have encapsulation?
Part of the common wisdom among some OO fanatics is that Python isn't a real OO language because it "doesn't have encapsulation." There are a few different things that could mean, but none of them say anything useful.
Grouping into runs of adjacent values
The itertools module in the standard library comes with a nifty groupby function to group runs of equal values together.
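In case you haven't used it, this is what groupby does with runs of equal values. The sample string is just made up for illustration.

```python
from itertools import groupby

# groupby only merges *consecutive* equal items, so the trailing "aa"
# forms its own run instead of joining the leading "aaa".
runs = [(key, list(group)) for key, group in groupby("aaabbcaa")]
print(runs)
# [('a', ['a', 'a', 'a']), ('b', ['b', 'b']), ('c', ['c']), ('a', ['a', 'a'])]
```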
dbm: not just for Unix
Often, you have an algorithm that just screams out to use a dict for storage, but your data set is just too big to hold in memory. Or you need to keep the data persistently, but pickling or JSON-ing takes way too long.
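As a taste of what that looks like, here is a minimal sketch using the stdlib dbm module as a dict-like store on disk. The file name and the key are made up, and which backend you get (and therefore which files actually appear on disk) depends on your platform.

```python
import dbm
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "cache")

    # "c" opens the database for reading and writing, creating it if needed.
    with dbm.open(path, "c") as db:
        db["spam"] = "eggs"  # str keys and values are stored as bytes

    # Reopen read-only: the data persisted after the first handle was closed.
    with dbm.open(path, "r") as db:
        value = db["spam"]

print(value)  # b'eggs'
```

Keys and values always come back as bytes, so anything richer than a string has to be encoded on the way in and decoded on the way out.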