There are two ways to do this: using defaultdict, or using a regular dict with setdefault. As soon as someone posts an answer using one, someone else comments that they should have used the other, and sometimes it even devolves into an argument about which is better.
Compare these two functions:
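    import collections

    def f1(pairs):
        # Plain dict: setdefault inserts an empty list the first time a key is seen.
        d = {}
        for key, value in pairs:
            d.setdefault(key, []).append(value)
        return d

    def f2(pairs):
        # defaultdict: a missing key is filled in by calling list().
        d = collections.defaultdict(list)
        for key, value in pairs:
            d[key].append(value)
        return d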
It's hard to argue that either one is unclear, overly verbose, hard to understand, etc.
And, while one or the other is probably faster, it's probably not enough to make a difference in real-life programs.*
So, how do you decide between them?
The answer is simple: This isn't the relevant code for making the decision. You have to look at how the returned value is going to be used in the code. When you later look up a missing key, do you want an empty list, or a KeyError?
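To make that question concrete, here is a small sketch of what each returned value does on a later lookup of a missing key. The names group_plain, group_default, and the sample pairs are made up for illustration; the two functions are just the two approaches above under different names.

```python
import collections

def group_plain(pairs):
    # setdefault approach: returns an ordinary dict
    d = {}
    for key, value in pairs:
        d.setdefault(key, []).append(value)
    return d

def group_default(pairs):
    # defaultdict approach: returns a defaultdict(list)
    d = collections.defaultdict(list)
    for key, value in pairs:
        d[key].append(value)
    return d

pairs = [("a", 1), ("b", 2), ("a", 3)]  # hypothetical sample data
plain = group_plain(pairs)
default = group_default(pairs)

# The plain dict raises KeyError on a missing key.
try:
    plain["missing"]
    plain_raised = False
except KeyError:
    plain_raised = True

# The defaultdict returns a new empty list, and also stores it under that key.
missing_value = default["missing"]
key_was_inserted = "missing" in default
```

Note the side effect in the last two lines: merely reading a missing key from the defaultdict adds that key to it. If callers should find out about missing keys, that is an argument for returning a plain dict; if they should be able to treat every key as having a (possibly empty) list, it is an argument for the defaultdict.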
* From a quick test, setdefault is about 60% slower, so in the rare cases where it matters and you want a plain dict, it might be worth building with defaultdict anyway, then converting with dict(d) at the end.
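A minimal sketch of that build-then-convert idea, assuming the caller wants KeyError on missing keys. The function name group_then_convert and the sample input are hypothetical.

```python
import collections

def group_then_convert(pairs):
    # Build with defaultdict so the loop avoids setdefault.
    d = collections.defaultdict(list)
    for key, value in pairs:
        d[key].append(value)
    # Hand back a plain dict so later missing-key lookups raise KeyError.
    # dict(d) is a shallow copy: the lists themselves are not duplicated.
    return dict(d)

result = group_then_convert([("a", 1), ("a", 2), ("b", 3)])  # hypothetical input

try:
    result["z"]
    raised = False
except KeyError:
    raised = True
```

Another option, if keeping the defaultdict type is acceptable, is to set d.default_factory = None before returning, which also makes missing-key lookups raise KeyError without copying anything.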