Python facilitates and encourages bundling data together with methods that work on that data, in the exact same way that Smalltalk, C++, and their descendants do (and in a way equivalent to what other OO paradigms, like Self-style prototypes, do). This is really the only useful definition of "encapsulation", and in this sense, Python does have encapsulation.
The fact that Python's idiom doesn't encourage getters and setters is irrelevant, because getters and setters just provide a different way of spelling attribute access, and one which (except in the case of syntactically restricted languages like C++) adds no flexibility, future-proofing, or other benefits.
The fact that Python's _-prefix idiom doesn't actually hide or protect private variables is true, but the same is true for almost every other OO language. So, if you want to define encapsulation in these terms, then no, Python is not a good encapsulating OO language, and neither is Smalltalk, C#, Ruby, JavaScript, …
Hidden internal state
One definition of encapsulation is that the data members of an object should not be visible. But the C++ family (including Java and C#), Objective C, and even Eiffel all have the full list of the data members visible in the source. And in most of those languages, the header file or other "interface" that you distribute with a compiled module includes them as well.
But at least they hide that list at runtime, right? Well, if you look at most of the OO languages with any kind of reflection—Ruby, Java, etc.—no, no they don't. And in languages without reflection, like C++, the methods are just as hidden as the members.
In fact, this kind of "encapsulation" does exist, and is used frequently, in non-OO languages like C and Lisp: Just pass around an opaque, untyped, meaningless handle instead of a reference to an actual object. This can be a void*-cast pointer to an object whose type isn't defined anywhere in the headers you distribute (like a Win32 HANDLE), or it can be a key into some table that you maintain inside your library (like a POSIX file descriptor). And of course you can do this in Java or Eiffel or Python as well (in fact, slightly more easily in Python than in most languages, because it has a built-in mapping type to use for that table), but I don't think that makes it an OO feature.
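Here's a minimal sketch of the table-of-handles approach in Python, using a plain dict as the table. The counter_open, counter_increment, and counter_close names are made up for illustration; they aren't from any real library.

```python
import itertools

# Module-private table mapping opaque integer handles to the real state.
# (All names here are hypothetical, chosen only for this sketch.)
_table = {}
_next_handle = itertools.count(1)

def counter_open(start=0):
    """Create a counter and return an opaque handle to it."""
    handle = next(_next_handle)
    _table[handle] = {"value": start}
    return handle

def counter_increment(handle):
    """Bump the counter behind the handle and return its new value."""
    state = _table[handle]
    state["value"] += 1
    return state["value"]

def counter_close(handle):
    """Forget the counter; the handle is meaningless afterwards."""
    del _table[handle]

h = counter_open(10)
counter_increment(h)
print(counter_increment(h))  # 12
counter_close(h)
```

The caller only ever holds an int, much like a file descriptor. Nothing about it says "object", which is the point: this is hiding without OO.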
(In C++, there's an idiom ("pimpl") that wraps up this kind of handle in an OO interface. And this same idiom works in Python, it's just not very common. But that just puts Python in the same class of languages as Java, Ruby, Eiffel, ObjC, etc.)
Restricted access
So forget about actually hiding information; what about restricting access to it?
In C++ and friends, you can mark an attribute as "private". Python's equivalent is to prefix the attribute name with an "_".
Python's "_" doesn't actually stop you from accessing the attribute from outside, it just discourages you. It's a clear signal to the user of your class that he shouldn't be using this attribute, that it could disappear or change meaning in future versions, etc. (It also keeps the attribute from showing up in various kinds of reflection, but you can always get around that. For example, in IPython or various IDEs, private names aren't offered as completions unless someone first types a _ to see them.) This is pretty closely equivalent to the POSIX notion of "hidden files" with a "." prefix, as opposed to, say, MacOS or Windows actual hidden files.
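A quick sketch of what that looks like in practice. The Account class is a made-up example, not anything from a real library.

```python
class Account:
    """Hypothetical class with one private-by-convention attribute."""

    def __init__(self, balance):
        # The leading underscore is only a signal to readers;
        # the interpreter does not enforce anything.
        self._balance = balance

    def deposit(self, amount):
        self._balance += amount

acct = Account(100)
acct.deposit(50)

# Outside code can still read and write the attribute directly.
print(acct._balance)  # 150
acct._balance = 0
```

Nothing here raises an error. The underscore tells you that you're on your own if a future version renames or removes _balance.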
But then very few other languages actually stop you from accessing the attribute either. This protection effectively comes from static type checking, and almost all statically-typed OO languages either have leaky type systems, or inflexible type systems that need (and have) escape hatches. For example, in C++, you can always cast through void* to char* and get at the structure members. Or, even simpler, define a class with identical but all-public structure and just cast to that. (Of course it's easier to do that to a C++ class from Python via ctypes or Cython than from C++, but that doesn't actually speak well of C++'s "protection" of its private members.) Just as with, say, MacOS or Windows actual hidden files, there are flags to pass to allow access to the hidden files if you want it.
If you build an object system on top of, say, Haskell, it can actually prevent access in ways that these languages can't. But the fact that few if any OO languages have static strong typing, and people who use strongly-typed languages like Haskell tend to see only limited use for OO, implies that this kind of restricted access is not an OO feature, any more than access through opaque tokens is.
Sandboxing
Java (and, to an extent, C#) tries to restrict access further than C++, despite providing more reflection, by effectively making the static protection information available at load time (and attempting to make that secure even for code from different sources, in Java's case) and then running the entire program inside a sandbox that can cover holes in the leaky type system.
If someone really wanted to argue that this means Java (modulo design flaws or JVM implementation bugs) is OO in a way that Eiffel, Ruby, Smalltalk, etc. are not, I suppose that would count as a way that Python isn't OO either. But that doesn't seem like a very useful distinction.
Getters and setters
The standard idiom in most "encapsulated" OO languages is to provide "getter" and "setter" methods for every data attribute. Some, like C# and Eiffel, have ways to automate that for you. Not only does Python have no way to automate this, the idiom explicitly discourages this kind of design.
But using ubiquitous getters and setters means the data members are conceptually not hidden at all. They add absolutely nothing. What Python idiomatically spells as "foo.spam" and "foo.spam = eggs" is exactly the same thing C# idiomatically spells "foo.GetSpam()" and "foo.SetSpam(eggs)". The spam attribute is idiomatically visible in both languages.
Of course getters and setters have an advantage: you can later decide to replace the real attribute with a virtual, computed attribute; just change the getter and setter and your API is unchanged.
But that isn't necessary in Python—or even in C#, ObjC, and similar languages. You can always just replace the real attribute with a @property, and the API is unchanged, but now it's accessing a virtual, computed attribute. (Or, of course, you can always intercept access via __getattr__ and friends…)
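As a rough sketch of that swap: FooV1 and FooV2 below are two hypothetical versions of the same class, and the uppercase internal storage in FooV2 is just an invented stand-in for "the representation changed".

```python
class FooV1:
    """First version: spam is a plain stored attribute."""

    def __init__(self, spam):
        self.spam = spam

class FooV2:
    """Later version: spam is computed, but callers still write foo.spam."""

    def __init__(self, spam):
        # Hypothetical new internal representation.
        self._spam_upper = spam.upper()

    @property
    def spam(self):
        return self._spam_upper.lower()

    @spam.setter
    def spam(self, value):
        self._spam_upper = value.upper()

def client(foo):
    """Client code written against the attribute API only."""
    foo.spam = "eggs"
    return foo.spam

print(client(FooV1("ham")))  # eggs
print(client(FooV2("ham")))  # eggs
```

The client function never changes between versions, which is the only future-proofing that getters and setters were supposed to buy.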
Bundling data and methods
Another common definition of encapsulation is that it facilitates bundling data together with the methods that work on that data.
There's really nothing objectionable about that definition.
And it applies perfectly well to the class notion in C++, Java, C#, Eiffel, Sather, Smalltalk, ObjC, Ruby, etc.—and in Python. (And the prototype notion in Self or JavaScript, etc.)
This is something that you don't get from C or Lisp—you have to build encapsulation manually, and use project-specific naming conventions, header-file layouts, documentation, etc. to expose the API you want to the user—while in OO languages, there's a construct that makes it easy to build and self-documenting.
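To make the contrast concrete, here's a sketch of the same tiny stack written both ways in Python. The first half imitates the manual, C-style approach, where only a naming convention ties the functions to the data; the second half uses the class construct. All names are invented for the example.

```python
# Manual bundling: a bare record plus free functions, tied together
# only by the "stack_" naming convention.
def stack_new():
    return {"items": []}

def stack_push(stack, item):
    stack["items"].append(item)

def stack_pop(stack):
    return stack["items"].pop()

# The class construct: data and the methods that work on it live together.
class Stack:
    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)

    def pop(self):
        return self._items.pop()

s = Stack()
s.push("spam")
s.push("eggs")
print(s.pop())  # eggs
```

Both versions work, but only the second tells the reader, without any extra documentation, which operations belong to which data.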
So, in the most useful sense of the word, Python does have encapsulation. And in every other sense where it doesn't, neither do any of the languages it's compared to.