Terminology
Like many words in computing, "block" is overloaded with multiple, contradictory meanings. Ruby has a syntactic feature that it calls "blocks," a way to define special-purpose anonymous callables, which it calls "procs". Objective-C (and Apple-extended C) borrowed that feature, and used the word "block" to refer both to the syntax for defining these things and to the things so defined.
But meanwhile, there's a completely separate concept in most imperative languages: a sequence of statements set off by braces, indentation, begin/end, etc., that can be used syntactically like a single statement. And "block" is a pretty common word to refer to this idea, but I'll use the term "suite", as used by Python's reference documentation, to avoid confusion.
Julien's reply
Scope, what is it good for?
well scopes (not blocks) are for limiting the existence of a variable so that it does not clubber the namespace later and make side effects.
This is only half of what scopes do, and not the most interesting half.
Scopes are closures—the variables defined in a scope are only available inside that scope. But in a language with lexical scoping and first-class functions, this means you can capture those variables in a lexical closure and pass it around. That isn't possible in C, but in Python (and many other languages), you can write this:
def make_adder(addend):
    def adder(augend):
        return augend + addend
    return adder

adder_of_5 = make_adder(5)
adder_of_10 = make_adder(10)
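To see the closure at work, here is a small sketch that repeats the definition above so it runs on its own. The calls and the peek at `__closure__` are additions for illustration, not part of the original example.

```python
def make_adder(addend):
    def adder(augend):
        return augend + addend
    return adder

adder_of_5 = make_adder(5)
adder_of_10 = make_adder(10)

# Each closure remembers the addend from its own call to make_adder.
print(adder_of_5(3))    # 8
print(adder_of_10(3))   # 13

# The captured variable lives in a closure cell attached to the function.
print(adder_of_5.__closure__[0].cell_contents)   # 5
```

The two adders share the same code but carry different captured values, which is exactly what a plain C function pointer can't do.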
(And, at the risk of increasing the confusion again, it's this use of scopes that Ruby's blocks were invented for—unlike Python, Ruby functions are not lexically scoped, so they needed something else that was.)
I don't want to give a full tutorial on closures, how cool they are, and the effects of alternative scoping rules, etc.; if you don't already know all that, you'll find better versions on Google (or probably even Wikipedia).
It is also useful for enforcing the locality of variable (which is very usefull for triggering CPU cache optmization but since python variables are boxed it is not very useful in python).
I'll ignore the micro-optimization stuff here, because it really isn't relevant.
Suites and scopes
Python is lacking of blocks, I miss strictures from Perl.
Python is not lacking either suites (blocks) or scopes.
The difference between Python and the C family of languages is that not every suite defines a scope; basically, it's just function and class definitions.
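A quick way to check this for yourself is the sketch below. The function names `suite_demo` and `scope_demo` are made up for illustration: names bound inside for and if suites stay visible in the enclosing function, while names bound inside a nested function do not.

```python
def suite_demo():
    for i in range(3):
        if i == 2:
            found = i * 10   # bound inside a for suite and an if suite
    # Neither suite created a scope, so both names are still visible here.
    return i, found

def scope_demo():
    def helper():
        hidden = 42          # local to helper's scope
        return hidden
    helper()
    try:
        return hidden        # not defined in scope_demo's scope
    except NameError:
        return "hidden is not visible here"

print(suite_demo())   # (2, 20)
print(scope_demo())   # hidden is not visible here
```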
So, why not have every suite define a scope?
First, there's not much benefit (except in the case of a few unusual languages like C++, which I'll get to in the next section). A function can capture its enclosing scope in a closure and carry it around, but a for statement can't. So, all you get out of a scoped for statement is the ability to (usually accidentally) shadow outer variables. If you look inside the code generated by a C compiler, a function's stack frame has room for all of the variables defined in the deepest scope in the function, just like in Python.
And there is a cost. Besides making the language harder to understand, compile, and interpret, it makes it harder to debug your code. For example, in this function:
int foo() {
    int bar = 1;
    // ...
    if (1) {
        int bar = 2;
        // ...
    }
    // ...
    return bar;
}
… there are two different variables, both named bar, where the inner one shadows the outer. Of course in real life, this would probably be a 120-line function, where the two definitions were 58 lines apart and couldn't even fit on the screen together. This makes it harder for a reader to understand the code. And it means a debugger needs some way to distinguish "not that bar, the other bar in the same function". And it means the debugger's interface needs some way to let you distinguish them in your commands.
In C, there are (at least historical) reasons why long, complex functions are worth having, and occasionally maybe it's even worth reusing the same variable name 58 lines later because it's the locally most-readable name. (That being said, in some cases, compilers will warn you about doing so…)
But in most newer languages—including Python—there's no reason not to break your code up into smaller pieces. If your inner code needs to use the name bar for readability, factor it out into a separate function. You can even keep that function inline if you want (gaining the added advantage over C that you can explicitly control which variables you do and don't want to share with the parent scope):
def foo():
    bar = 1
    # ...
    def inner():
        bar = 2
        # ...
    inner()
    # ...
    return bar
So, you can have a new scope whenever you want it, but you don't get one when you don't need it. You don't accidentally shadow variables, but you can do so when you need to. And so on.
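As a sketch of what "explicitly control which variables you share" can look like, the variation below adds a second variable, `total`, which is not in the original example. The inner function rebinds `total` in the parent scope with `nonlocal`, while its own `bar` stays private.

```python
def foo():
    bar = 1
    total = 0

    def inner():
        nonlocal total   # explicitly shared with foo's scope
        bar = 2          # a new local; foo's bar is untouched
        total += bar

    inner()
    return bar, total

print(foo())   # (1, 2)
```

Anything not declared `nonlocal` and assigned inside `inner` is local to it, so the sharing is opt-in rather than accidental.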
RAII
C++, like Python, is designed to encourage small functions, but it still makes extensive use of suite scopes. In fact, you'll even often see "bare suites" created just for scoping purposes:
void foo(shared_ptr<thing_t> thing) {
    immutable_stuff(thing);
    {
        lock_t lock(thing->mutex);
        mutating_stuff(thing);
    }
    more_immutable_stuff(thing);
}
This is a feature that C++ calls Resource Acquisition Is Initialization.
In C++, local variables, even those of class type, are allocated on the stack by default and destroyed when they go out of scope. If you want something to live longer, you have to use the new operator to allocate it on the heap, use a pointer to it, and delete it when you're done. If you need to allocate it on the heap, but still want it to go away at the end of the scope, you can use a smart pointer (the stdlib comes with several kinds) that itself lives on the stack, goes away at the end of the scope, and deletes the object.
Building on that, each class can have a destructor that gets called right before the object gets destroyed. So, if you acquire a resource in your constructor, and release it in your destructor, you can tie the resource's lifetime to your object's lifetime, which you can tie to the scope.
The problem with this feature is that it relies on manual memory management. In most other current languages, objects are allocated on the heap, and automatically garbage collected after you no longer refer to them. Since the garbage collector isn't tied to any scope, you can't rely on destructors to tie resource lifetimes to scopes. Instead, you have to do it using something like Java's try/finally statement:
void foo(Thing thing) {
    immutable_stuff(thing);
    lock(thing.mutex);  // acquire before the try, so finally only runs once the lock is held
    try {
        mutating_stuff(thing);
    } finally {
        unlock(thing.mutex);
    }
    more_immutable_stuff(thing);
}
In that code, the extra scope created by the try suite is obviously not helping anything. Without RAII, suite scopes aren't useful.
Special-purpose suites
Many C++-family languages have a special kind of statement that wraps up this lock-a-mutex-for-the-duration-of-a-suite use case:
void foo(Thing thing) {
    immutable_stuff(thing);
    synchronized(thing.mutex) {
        mutating_stuff(thing);
    }
    more_immutable_stuff(thing);
}
But that only works for locks; it doesn't work for files or large chunks of memory or anything else, unless the language has a special statement for each kind of resource. RAII effectively exposes try/finally contexts in a way that's accessible at the language level, so you (or the stdlib, or a third-party library) can write any number of "synchronized"-like wrappers for every kind of object you want.
Python has a very similar, but different, solution to the same problem, as Julien acknowledges:
Context managers
And python with context manager has finally began to understand the wisdom in scopes:
with open("/tmp/whatever") as f:
    do some_thing_of(f)
# f does not exists here, and file is closed \o/
I'm not sure what's meant by "finally" here, since context managers have been part of Python since 2.5 (see PEP 343), and were under discussion for 3 years before being implemented.
Also, it's not true that f does not exist after the with statement. It still exists; you can print it out and see that it's a perfectly valid TextIOWrapper around a closed file.
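This is easy to check. The sketch below uses a throwaway temporary file instead of /tmp/whatever (an assumption made only so the example runs anywhere) and inspects `f` after the with statement has finished.

```python
import os
import tempfile

# Create a throwaway file so the example does not depend on /tmp/whatever.
fd, path = tempfile.mkstemp()
os.close(fd)

with open(path) as f:
    f.read()

# The with statement has ended, but the name f is still bound.
print(type(f).__name__)   # TextIOWrapper
print(f.closed)           # True

os.remove(path)
```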
And this is the key. With statements are not about scopes; they're a way of managing resource lifetimes independently of the scopes of any variables that may reference them. You get all of the advantages of RAII without needing the otherwise-useless scoped suites.
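To make the comparison with the "synchronized" statement concrete, here is a minimal sketch of a user-written wrapper built on `contextlib.contextmanager`. The helper name `synchronized` is hypothetical, chosen to echo the earlier example; in real code `threading.Lock` is already a context manager, so you could write `with mutex:` directly.

```python
import threading
from contextlib import contextmanager

@contextmanager
def synchronized(mutex):
    # Hypothetical helper: the try/finally pattern, written once.
    mutex.acquire()
    try:
        yield mutex
    finally:
        mutex.release()

mutex = threading.Lock()
events = []

with synchronized(mutex) as held:
    events.append(held.locked())   # True: lock is held inside the suite

events.append(mutex.locked())      # False: released when the suite ended
print(events)                      # [True, False]
print(held is mutex)               # True: the name outlives the with statement
```

The lock's lifetime is tied to the with statement, while the names `held` and `mutex` follow ordinary function-level scoping.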
(The fact that a with statement has a suite, and so does a function definition or other scoped statement, is really only marginally relevant here; all compound statements have suites, most of them don't have scopes, and most of them have nothing to do with resource lifetime.)