Tkinter makes slapping together a simple GUI very easy. But unfortunately, many of its features aren't very well documented. The only way to figure out how to do something is often to figure out what Tk objects it's using, look up the Tcl/Tk documentation for those objects, then try to work out how to access them through the Tkinter layer.

One of the places this most often comes up is validating Entry boxes. Novices expect that Entry must have some kind of <Change> event, and if they could just find the right name and bind it, they could handle validation there.

The first obvious thought is to bind <KeyPress> or <KeyRelease>. But that doesn't work, because some key presses don't change anything (like Tab), and there are ways to change the contents without typing anything (like pasting or dragging). If you work hard enough, you can find all the right events to bind and filter out the right cases and get the equivalent of a <Change> event…

But after doing so, you can't do anything useful in the event handler! All of these events get fired before the contents of the Entry have been changed. So all you can validate is whatever used to be there, which isn't very helpful.

There is a way around this, but it's very clunky: your event handler can schedule the real handler to run the next time through the event loop, via after_idle. When that real handler accesses the contents of the Entry, it gets the new contents.
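To make the workaround concrete, here's a minimal sketch of the two-step handler. The names defer_to_idle and demo are made up for illustration; only bind, after_idle, and Entry.get come from Tkinter. It binds just <Key>, so it still misses pastes and drags.

```python
import tkinter as tk


def defer_to_idle(widget, real_handler):
    """Wrap real_handler so it runs on the next idle pass of the event loop."""
    def on_event(event=None):
        # The Entry still holds its old contents here, so don't read it yet;
        # just queue the real work for later.
        widget.after_idle(real_handler)
    return on_event


def demo():
    # Hypothetical demo window: a Label that echoes the Entry as you type.
    root = tk.Tk()
    entry = tk.Entry(root)
    entry.pack()
    echo = tk.Label(root)
    echo.pack()

    def show_contents():
        # Runs after the keystroke has been applied, so get() sees new text.
        echo.config(text=entry.get())

    entry.bind('<Key>', defer_to_idle(entry, show_contents))
    root.mainloop()
```

Call demo() to try it. Nothing in the wrapper is specific to Entry; it only needs an object with an after_idle method.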

Surely there must be a better way.

And there is. In fact, two different ways. And I'll explain both, with links to the Tcl/Tk docs. Hopefully, after reading this, you'll not only know how to validate Entry boxes, but also how to figure out how to do things in Tkinter that aren't explained anywhere.

But first, make sure you've read the first few sections of the Tkinter docs, at least up to the section called Mapping Basic Tk into Tkinter, but ideally the whole chapter.

Example Program

Let's create a dead-simple stupid program (using Python 3; just change the "tkinter" to "Tkinter" and it will work in Python 2):
    from tkinter import *

    class MyFrame(Frame):
        def __init__(self, parent):
            Frame.__init__(self, parent)
            self.text = Label(self, text='Name')
            self.text.pack()
            self.name = Entry(self)
            self.name.pack()
            self.name.focus_set()
            self.submit = Button(self, text='Submit', width=10, 
                                 command=self.callback)
            self.submit.pack()
            self.entered = Label(self, text='You entered: ')
            self.entered.pack()

        def callback(self):
            self.entered.config(text='You entered: ' + self.name.get())
            self.name.delete(0, END)

    root = Tk()
    frame = MyFrame(root)
    frame.pack()
    root.mainloop()
Now, we want to validate this in the simplest way possible: when the Entry is empty, the Button should be disabled.

Validation

Tk Entry boxes have a validation feature. Unfortunately, the Tkinter docs only mention this off-hand in one place, and give no more information than that they "support validate and validatecommand". Unless you were already a Tcl/Tk expert, you'd have no idea what this means. But at least you can Google for "Tk entry validatecommand", which should get you to the docs here.

Reading those docs, we want our command to get called whenever the Entry is edited, which means we want to set the "validate" value to "key". That's easy.

We also want our command to get called with the updated value of the Entry, so that we can tell whether it's empty or not. For this, we want to use the substitution "%P". But how do we do that?

This is the tricky part that you won't find anywhere in the docs. The Tcl/Tk docs say to do it "just as you would in a bind script", but in a Python/Tkinter event binding you pass a callable, and it gets called with some arguments that are specified by Tkinter. That doesn't work here.

Instead, you have to manually do what bind does for you under the covers: What you actually pass is a tuple of function ID and one or more argument strings. To get that function ID, you tell Tkinter to register your callable and return an ID. Then, when Tkinter tries to call the function by ID, Tkinter looks up your registered callable and calls it, with arguments matching your string argument spec.

Your validate method can do whatever it wants, but at the end it has to return True to allow the change, False to reject it, or None to disable itself (so the Entry is no longer validated). If you return False to reject the change, the Entry contents will not be changed (just as if the user hadn't typed/pasted/whatever anything). And, if you've set an invalidcommand, it will get called. Just like the validatecommand, the invalidcommand has to be a function ID and argument strings.
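As a sketch of the rejecting case, assume a hypothetical digits-only field. The function names allow_digits and build_digits_entry are invented here; register, validate, validatecommand, invalidcommand, and bell are the real Tkinter pieces. The invalidcommand takes no substitutions, so it's just the bare function ID.

```python
import tkinter as tk


def allow_digits(proposed):
    """Return True if the proposed contents (%P) are empty or all ASCII digits."""
    return all(ch in '0123456789' for ch in proposed)


def build_digits_entry(parent):
    # register() hands back the ID string that Tk will call later.
    vcmd = parent.register(allow_digits)
    # Rings the bell whenever allow_digits returns False.
    icmd = parent.register(parent.bell)
    return tk.Entry(parent, validate='key',
                    validatecommand=(vcmd, '%P'),
                    invalidcommand=icmd)
```

With this in place, build_digits_entry(root).pack() should give an Entry that refuses anything but digits, whether typed or pasted.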

Sound confusing? Yeah, it is, especially since it's not documented anywhere. But it's not that hard once you get the hang of it.

First, we create a validate method. It's going to take the "%P" argument, so let's call the parameter "P":
    def validate(self, P):
        self.submit.config(state=(NORMAL if P else DISABLED))
        return True
Now, in our constructor, we have to register that method, and just pass that ID along with the "%P" string as the validatecommand (and "key" as the validate):
    def __init__(self, parent):
        # ...
        vcmd = parent.register(self.validate)
        self.name = Entry(self, validate='key', validatecommand=(vcmd, '%P'))
        self.name.pack()
        # …
One last thing: because the validate method doesn't get called until the Entry changes, you'll want to either start the Button off disabled, or manually call the validate method at the end of the constructor (making sure to pass the appropriate value for the P parameter, of course).
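Putting the snippets above together gives something like the following. This is a sketch assembled from the pieces in this post, so it may differ in detail from the Pastebin version; the submit_state helper and the main function are additions for illustration.

```python
from tkinter import *


def submit_state(contents):
    """Button state for the given Entry contents: disabled when empty."""
    return NORMAL if contents else DISABLED


class MyFrame(Frame):
    def __init__(self, parent):
        Frame.__init__(self, parent)
        self.text = Label(self, text='Name')
        self.text.pack()
        vcmd = parent.register(self.validate)
        self.name = Entry(self, validate='key', validatecommand=(vcmd, '%P'))
        self.name.pack()
        self.name.focus_set()
        self.submit = Button(self, text='Submit', width=10,
                             command=self.callback)
        self.submit.pack()
        self.entered = Label(self, text='You entered: ')
        self.entered.pack()
        # No edit has happened yet, so seed the Button state by hand.
        self.validate(self.name.get())

    def validate(self, P):
        self.submit.config(state=submit_state(P))
        return True

    def callback(self):
        self.entered.config(text='You entered: ' + self.name.get())
        self.name.delete(0, END)


def main():
    root = Tk()
    MyFrame(root).pack()
    root.mainloop()
```

Call main() to run it. The Button should start out disabled and come alive as soon as you type a character.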

You can find a complete version of the code at Pastebin.

Actually, one more one last thing—which doesn't come up very often, but will confuse the hell out of you if it does. If your validatecommand (or invalidcommand) modifies the Entry directly or indirectly (e.g., by calling set on its StringVar), the validation will get disabled as soon as your function returns. (This is how Tk prevents an infinite loop of validate triggering another validate.) You have to turn it back on (by calling config). But you can't do that from inside the function, because it gets disabled after your function returns. So you need to do something like this:
    def validate(self, P):
        if P == 'hi':
            self.name.delete(0, END)
            self.name.insert(0, 'hello')
            self.after_idle(lambda: self.name.config(validate='key'))
            return None
        else:
            return True

Variable tracing

There's another way to do this, which existed before Tcl/Tk had validation commands.

Tk lets you attach a variable to an Entry widget; the variable will hold the current contents of the Entry at any time. Of course it has to be a Tcl variable, not a Python variable, but Python/Tkinter lets you create a Tcl variable with StringVar and related classes. (Also see Coupling Widget Variables in the Python docs.)

So far, that doesn't sound useful. But Tcl has another feature called variable tracing. You can attach an observer callback that gets called whenever a variable is read (accessed), or written (assigned a new value), or unset (deleted). The existence of the trace function is documented in Tkinter, but that's as far as it goes; there's just a big "FIXME: describe the mode argument and how the callback should look, and when it is called." However, another page, called A Validating Entry Widget, serves as an example. It still doesn't document the API, but it happens to show exactly what we want to do with tracing.

To find out how trace actually works rather than blindly copy-pasting magic code, you have to turn to the Tcl/Tk docs again. And of course that still doesn't tell you how Tkinter maps between Python and Tcl. So, here's the deal:

To set a trace on a StringVar or other variable, you call its trace method with two arguments: a mode and a callback, just as the docs say. The "r" and "u" modes aren't very useful, but the "w" mode is called whenever the variable is written--which happens every time the Entry you've attached it to changes contents. When your callback is called, it gets three arguments: name1, name2, and mode.

Together, name1 and name2 provide the Tcl name of the variable. For an array or other collection, name1 is the array variable and name2 is the index into the array (as a string, like "2"—everything in Tcl is a string). For a scalar, like a string or integer, name1 is the scalar variable and name2 is an empty string. Since Tkinter doesn't make it easy to create and wrap Tcl arrays, name2 will always be empty. But what's name1? Tkinter creates Tcl variables dynamically, giving them names like PY_VAR0. You can find the name of the Tcl variable underlying any StringVar as its _name attribute. So, if you have 10 identical Entry boxes, and you want to do the same code when any of them change, but be able to tell which one it is, you can use name1 for that. That being said, it's a lot easier to just create 10 separate closures around the same function in Python and not bother with the _name nonsense. So, you'll rarely use these arguments either.

And that's why the sample just declares the callback with *dummy for the parameters.
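Here's a small sketch of both approaches side by side. It assumes Python 3.6 or later, where trace_add('write', ...) is the newer spelling of trace('w', ...), and it uses str(var) to get the Tcl name rather than poking at _name. The function names and the labels are hypothetical.

```python
import tkinter as tk


def watch_by_name(variables, log):
    """One shared callback; name1 says which Tcl variable was written."""
    by_name = {str(var): label for label, var in variables.items()}

    def on_write(name1, name2, mode):
        log.append((by_name[name1], name2, mode))

    for var in variables.values():
        var.trace_add('write', on_write)


def watch_by_closure(variables, log):
    """One closure per variable; the trace arguments are simply ignored."""
    for label, var in variables.items():
        var.trace_add('write', lambda *dummy, label=label: log.append(label))
```

Both take a dict mapping your own labels to StringVars. The closure version never looks at name1 at all, which is why it's usually the less fiddly choice.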

Unlike a validation function, the trace function can't interfere with what's happening. In fact, by the time your function gets called, the Entry has already been modified and the variable has already been updated. If you want to reject (or modify) the change, you have to do that manually, rather than just returning False.

As before, first we'll write our validate method:
    def validate(self, name, index, mode): # or just self, *dummy
        self.submit.config(state=(NORMAL if self.namevar.get() else DISABLED))

And now, we'll hook it up:
    def __init__(self, parent):
        # ...
        self.namevar = StringVar()
        self.namevar.trace('w', self.validate)
        self.name = Entry(self, textvariable=self.namevar)
        self.name.pack()
        # …
And that's all there is to it. Again, complete code is at Pastebin.
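Since a trace can't veto a change, rejecting one means undoing it yourself. A minimal sketch, assuming a made-up rule that the field holds at most max_chars characters (limit_length is a hypothetical helper, and trace_add needs Python 3.6 or later):

```python
import tkinter as tk


def limit_length(var, max_chars):
    """Keep a StringVar at or under max_chars by trimming after each write."""
    def on_write(*dummy):
        value = var.get()
        if len(value) > max_chars:
            # Too late to veto the change, so overwrite it instead.
            var.set(value[:max_chars])
    var.trace_add('write', on_write)
```

You'd call limit_length(self.namevar, 20) before creating the Entry with textvariable=self.namevar. This only shows the variable side; check how the display follows along with your own widget before relying on it.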

You may be wondering about performance. Isn't attaching debugging hooks ridiculously slow? Maybe, but who cares if you waste hundreds of microseconds on every user input, when user inputs take at least 1000x as long as that? If you're really worried about that kind of thing, you shouldn't be using Tkinter—everything it does is Tcl under the hood, and everything Tcl does is building and evaluating string commands in the most wasteful way you can possibly imagine (and that's before you even add all the Tkinter Tcl<->Python bridging on top of it). If your GUI is responsive enough (which it almost certainly is), adding variable tracing will not change that.

Deciding which one to use

In Python, there should be only one obvious way to do it. But here, there are two ways to do it. Which one is the obvious one?

The key thing to note here is that they're not the same thing. They only overlap in functionality for the simplest use cases:

  • Validation gets called before the modification goes through, and you can reject the change. Tracing gets called after the modification goes through.
  • Validation can take a wide variety of parameters, including things like the index into the string at which the change happened; tracing takes no useful parameters (although you can easily fetch the new value from the variable itself).
  • Validation can hook things like focusout instead of key (meaning that instead of constantly telling the user "that's not a valid phone number" after each character until he's finally done, you can let him type whatever he wants and then check it after he tabs to the next field). Tracing hooks all changes, no matter what user event or code triggered them.
  • Validation doesn't require a StringVar, and generally can't take advantage of one usefully; tracing obviously does and can.
  • Validation is intended for validating user input; tracing is intended for debugging.
Basically, if you already wanted a StringVar, tracing is often a good idea; even if you didn't, in simple cases, tracing is simpler. But in general, validation is the right thing.
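To illustrate the focusout point from the list above, here's a sketch that checks a phone number only when the user leaves the field. The seven-digit pattern, the function names, and the status Label are all assumptions for the example, not anything Tk prescribes.

```python
import re
import tkinter as tk

# Hypothetical format for the example: 555-1234.
PHONE_RE = re.compile(r'\d{3}-\d{4}')


def looks_like_phone(text):
    """True if text is exactly three digits, a dash, and four digits."""
    return PHONE_RE.fullmatch(text) is not None


def build_phone_entry(parent, status):
    """Entry that is checked on focus loss; status is a Label for messages."""
    def check(proposed):
        ok = looks_like_phone(proposed)
        status.config(text='' if ok else "That's not a valid phone number")
        return ok

    vcmd = parent.register(check)
    return tk.Entry(parent, validate='focusout', validatecommand=(vcmd, '%P'))
```

The user can type whatever they like, and the message only shows up after they tab away. Doing the same thing with a trace would mean binding <FocusOut> yourself.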

Other widgets

There are other widgets that can take text entry besides Entry. What if you wanted to validate one of them?

Some of them handle validatecommand, or textvariable (or the same thing under the name "variable" or "value"), or both. This often isn't documented. You can always try adding the extra keyword arguments; if the widget doesn't handle validatecommand, it'll tell you that with a pretty simple exception. That works great for, say, the ttk.Entry widget, which (as you'd hope) works as a drop-in replacement for the stock Entry.

In some other cases, you can access the real Tk widgets under the Python or Tcl wrappers—e.g., if you've installed itk and the Tkinter wrappers around it, itk.EntryField effectively has a ttk.Entry underneath it.

And then there's Text.

Text

Text (and things like ScrolledText that wrap or emulate it) doesn't work anything like Entry. There's no validatecommand, textvariable, variable, value… And if you look through the Tk docs for Text, there's nothing that looks even remotely useful from the Tk side.

Well, there's a reason that Text doesn't do validation: it's meant to be a (possibly rich) text editor, not a simple multi-line Entry field. But unfortunately, as long as Tkinter doesn't come with a simple multi-line Entry field, people are going to use Text. In fact, even in Tcl/Tk, people often use Text as a multi-line Entry field, which is why people have come up with different ways to extend it with validation and/or textvariable, as shown on the Tk wiki. There just is no way to validate Text widgets cleanly, but sometimes there's nothing else to use but Text widgets.

If you were using Tcl, it wouldn't be at all hard to extend or wrap the Tk text command (see the wiki link above). But if you're using Python, you have to do that in Tcl/Tk, and then wrap the resulting command in Python/Tkinter. Almost nobody who uses Python wants to learn Tcl and the internal guts of Tkinter. If that's you, this is simply not an option.

If you weren't using Tk, all of the other major cross-platform widget frameworks with Python bindings (Qt, wxWidgets, Gtk+, …) and platform-specific GUI bindings (PyWin32, PyObjC, …) have ways to write multi-line text controls with some way to validate them. Sure, they all have a higher learning curve (and none of them come pre-installed with Python), but if you're banging your head against the wall trying to make Tkinter do things that it can't do, you're probably wasting more effort than you're saving.

If you insist on staying with pure Python/Tkinter, you just have to accept its limitations. And that may be fine. Go back to the hack mentioned in the introduction to this post—if you bind all the relevant events, you can get a handler called before the change goes through, and use after_idle in that handler to get a second handler called after the change goes through, and… is that good enough for your GUI? If so, do it.
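If you do go that route with a Text widget, one refinement is to coalesce bursts of events so the second handler runs only once per trip through the event loop. Here's a sketch; ChangeWatcher and watch_text are hypothetical names, and the list of bound events is deliberately incomplete.

```python
import tkinter as tk


class ChangeWatcher:
    """Call on_change once, at idle time, after any burst of bound events."""

    def __init__(self, widget, on_change):
        self.widget = widget
        self.on_change = on_change
        self.pending = False

    def notify(self, event=None):
        # First handler: runs before the Text has changed, so only schedule.
        if not self.pending:
            self.pending = True
            self.widget.after_idle(self.fire)

    def fire(self):
        # Second handler: by now the change has gone through.
        self.pending = False
        self.on_change()


def watch_text(text, on_change):
    """Bind a few change-like events on a Text widget to one watcher."""
    watcher = ChangeWatcher(text, on_change)
    # An illustrative list; add whatever else can edit your widget.
    for sequence in ('<Key>', '<<Paste>>', '<<Cut>>'):
        text.bind(sequence, watcher.notify, add='+')
    return watcher
```

Inside on_change you'd read the new contents with text.get('1.0', 'end-1c') and do your checking there. It's still a hack, for all the reasons given in the introduction.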

It's been more than a decade since Typical Programmer Greg Jorgensen taught the world about Abject-Oriented Programming.

Much of what he said still applies, but other things have changed. Languages in the Abject-Oriented space have been borrowing ideas from another paradigm entirely—and then everyone realized that languages like Python, Ruby, and JavaScript had been doing it for years and just hadn't noticed (because these languages do not require you to declare what you're doing, or even to know what you're doing). Meanwhile, new hybrid languages borrow freely from both paradigms.

This other paradigm—which is actually older, but was largely constrained to university basements until recent years—is called Functional Addiction.

A Functional Addict is someone who regularly gets higher-order—sometimes they may even exhibit dependent types—but still manages to retain a job.

Retaining a job is of course the goal of all programming. This is why some of these new hybrid languages, like Rust, check all borrowing, from both paradigms, so extensively that you can make regular progress for months without ever successfully compiling your code, and your managers will appreciate that progress. After all, once it does compile, it will definitely work.

Closures

It's long been known that Closures are dual to Encapsulation.

As Abject-Oriented Programming explained, Encapsulation involves making all of your variables public, and ideally global, to let the rest of the code decide what should and shouldn't be private.

Closures, by contrast, are a way of referring to variables from outer scopes. And there is no scope more outer than global.

Immutability

One of the reasons Functional Addiction has become popular in recent years is that to truly take advantage of multi-core systems, you need immutable data, sometimes also called persistent data.

Instead of mutating a function to fix a bug, you should always make a new copy of that function. For example:

function getCustName(custID)
{
    custRec = readFromDB("customer", custID);
    fullname = custRec[1] + ' ' + custRec[2];
    return fullname;
}

When you discover that you actually wanted fields 2 and 3 rather than 1 and 2, it might be tempting to mutate the state of this function. But doing so is dangerous. The right answer is to make a copy, and then try to remember to use the copy instead of the original:

function getCustName(custID)
{
    custRec = readFromDB("customer", custID);
    fullname = custRec[1] + ' ' + custRec[2];
    return fullname;
}

function getCustName2(custID)
{
    custRec = readFromDB("customer", custID);
    fullname = custRec[2] + ' ' + custRec[3];
    return fullname;
}

This means anyone still using the original function can continue to reference the old code, but as soon as it's no longer needed, it will be automatically garbage collected. (Automatic garbage collection isn't free, but it can be outsourced cheaply.)

Higher-Order Functions

In traditional Abject-Oriented Programming, you are required to give each function a name. But over time, the name of the function may drift away from what it actually does, making it as misleading as comments. Experience has shown that people will only keep one copy of their information up to date, and the CHANGES.TXT file is the right place for that.

Higher-Order Functions can solve this problem:

function []Functions = [
    lambda(custID) {
        custRec = readFromDB("customer", custID);
        fullname = custRec[1] + ' ' + custRec[2];
        return fullname;
    },
    lambda(custID) {
        custRec = readFromDB("customer", custID);
        fullname = custRec[2] + ' ' + custRec[3];
        return fullname;
    },
]

Now you can refer to these functions by order, so there's no need for names.

Parametric Polymorphism

Traditional languages offer Abject-Oriented Polymorphism and Ad-Hoc Polymorphism (also known as Overloading), but better languages also offer Parametric Polymorphism.

The key to Parametric Polymorphism is that the type of the output can be determined from the type of the inputs via Algebra. For example:

function getCustData(custId, x)
{
    if (x == int(x)) {
        custRec = readFromDB("customer", custId);
        fullname = custRec[1] + ' ' + custRec[2];
        return int(fullname);
    } else if (x.real == 0) {
        custRec = readFromDB("customer", custId);
        fullname = custRec[1] + ' ' + custRec[2];
        return double(fullname);
    } else {
        custRec = readFromDB("customer", custId);
        fullname = custRec[1] + ' ' + custRec[2];
        return complex(fullname);
    }
}

Notice that we've called the variable x. This is how you know you're using Algebraic Data Types. The names y, z, and sometimes w are also Algebraic.

Type Inference

Languages that enable Functional Addiction often feature Type Inference. This means that the compiler can infer your typing without you having to be explicit:


function getCustName(custID)
{
    // WARNING: Make sure the DB is locked here or
    custRec = readFromDB("customer", custID);
    fullname = custRec[1] + ' ' + custRec[2];
    return fullname;
}

We didn't specify what will happen if the DB is not locked. And that's fine, because the compiler will figure it out and insert code that corrupts the data, without us needing to tell it to!

By contrast, most Abject-Oriented languages are either nominally typed—meaning that you give names to all of your types instead of meanings—or dynamically typed—meaning that your variables are all unique individuals that can accomplish anything if they try.

Memoization

Memoization means caching the results of a function call:

function getCustName(custID)
{
    if (custID == 3) { return "John Smith"; }
    custRec = readFromDB("customer", custID);
    fullname = custRec[1] + ' ' + custRec[2];
    return fullname;
}

Non-Strictness

Non-Strictness is often confused with Laziness, but in fact Laziness is just one kind of Non-Strictness. Here's an example that compares two different forms of Non-Strictness:

/****************************************
*
* TO DO:
*
* get tax rate for the customer state
* eventually from some table
*
****************************************/
// function lazyTaxRate(custId) {}

function callByNameTaxRate(custId)
{
    /****************************************
    *
    * TO DO:
    *
    * get tax rate for the customer state
    * eventually from some table
    *
    ****************************************/
}

Both are Non-Strict, but the second one forces the compiler to actually compile the function just so we can Call it By Name. This causes code bloat. The Lazy version will be smaller and faster. Plus, Lazy programming allows us to create infinite recursion without making the program hang:

/****************************************
*
* TO DO:
*
* get tax rate for the customer state
* eventually from some table
*
****************************************/
// function lazyTaxRateRecursive(custId) { lazyTaxRateRecursive(custId); }

Laziness is often combined with Memoization:

function getCustName(custID)
{
    // if (custID == 3) { return "John Smith"; }
    custRec = readFromDB("customer", custID);
    fullname = custRec[1] + ' ' + custRec[2];
    return fullname;
}

Outside the world of Functional Addicts, this same technique is often called Test-Driven Development. If enough tests can be embedded in the code to achieve 100% coverage, or at least a decent amount, your code is guaranteed to be safe. But because the tests are not compiled and executed in the normal run, or indeed ever, they don't affect performance or correctness.

Conclusion

Many people claim that the days of Abject-Oriented Programming are over. But this is pure hype. Functional Addiction and Abject Orientation are not actually at odds with each other, but instead complement each other.