Resolving the Python TCO Problem
The BDFL of Python has recently stirred up a hornets nest when he fully explained why the so-called ‘tail call optimization’ will never be in Python. Lovers of functional languages have responded, and I think we’re all the wiser for both of their points of view.
Guido’s post mentions the accusation that Python is becoming the “Basic of the Future”, which the author responds might be more of a badge of honor than a derogatory statement. This is because Python was originally conceived as a teaching language, and you can see that much of what Guido does in design work is to strive to make Python as easy to learn and understand as possible. His biggest overall gripe about TCO is that it screws up stack traces, and it is only useful in a handful of circumstances anyway.
In a way, this can be expected. After all, part of the Zen of Python is that “Explicit is better than Implicit”. Behind the scenes optimizations are a good example of the implicit things the BDFL is looking to prevent.
So lets make TCO no longer an optimization. Let’s just simply talk about tail calls, and the optimization is simply a side effect. Much of what Joe Marshall talks about refers to the power of tail calls, mainly via continuation passing. It makes certain ideas very easy to implement. Perhaps the reason Guido believes there is little use for continuation-passing style constructs which Joe brings up is because it is a newer idea, and quite frequently, new ideas are completely useless until we find a use for them. And then they explode into a world in which we can’t remember living without them.
I’d propose a change to the Python language that makes tail calls explicit, backwards compatible and, in my opinion, Pythonic. That is to say, in the same pattern that other interesting function/control flow mechanisms have been implemented. Check out the following example, which I’m not going to give as anything fancy, because it’s pretty obvious and simple what I’m after. Let’s compare a normal recursive function call in our extended Python to a tail call version.
That’s our current vanilla implementation. Nothing really to see here, except that it’s inefficient yet easy to debug. Now the tail call version:
The use of continuations is very explicit, which is in the Zen of Python. But how else is it really Pythonic? Compare a tail call function to a Python generator, which were added to the language more recently:
for elem in lst:
The yield keyword was introduced for generators (and extended to allow for co-routines) as a different mechanism of returning a value from a function with interesting control flow. The language identifies functions that yield values instead of returning them and behind the scenes turns those into generators. Is it implicit? Not really – you can see from the code itself that it’s a generator. It yields. Likewise, the new use of the ‘continue’ keyword would also signify the function is using a ‘continuation’.
Does this work? Can we ‘abuse’ (if you want to call it that!) the ‘continue’ keyword like this? I’m no expert, but in the current Python standard, I believe continue, like break, is meant to be and is always used as a standalone statement in loops. It takes no ‘arguments’, and thus, if we added it to the language in the way I’m proposing, it’d be easy for the parser to tell when we’re continuing in a loop versus continuing from a function (a continuation). If we’re in a loop and we simply continue, we go back to the top of the loop. If we continue with an argument, then we’re using a function as a continuation.
Can there be ambiguous uses? First of all, using recursion in a loop probably doesn’t happen as much as you think because people who think recursively hate loops and people who use loops hate recursion (I keed). But secondly, no, there is no ambiguity. A continuation only makes sense with a function as its argument. There can be no ‘continue’ statement in a recursive call that passes nothing – that doesn’t make any sense. There is a clear deliniation between using continue to return to the top of your loop and using it for continuation passing. If you want a recursive call to simply return nothing, then use a return statment with no arguments. This mixing of continues and returns parallels with how generators mix yields with returns.
What if I write a loop and I mean to call a function as a continuation but I leave the function off? Then, yes, this will use the loop-style continue. But it’s a parse error anyway, and if you weren’t in a loop, the function wouldn’t even make sense. The opposite, somehow accidentally adding an argument to a continue statement you meant to use as a loop-style continue would also be an error, although it would be easier to catch since only a few things would be able to be used as a continuation (any fully evaluated callables, to be specific). We can’t design all potential bugs out of the language, but the potential for ‘continue’ accidents happening is pretty small, and mixing intricate loop logic (continue is pretty rare as it is and only shows up in more complex loops) with intricate functional logic (continuation passing, like any other tool, has specific uses) either means you’re in over your head and doing it wrong anyway, or potentially awesome, in which case you won’t make any errors.
Is this really abuse? I’d argue no. Both the concept and the name flow from the word – continue obviously looking very similar to ‘continuations’, which is what we’re doing, and the idea of ‘continuing’ on to a different function also makes intuitive sense rather than returning the value of that function evaluated, signified with the ‘return’ statement. More importantly, it does not differ that much from the way the keyword is already used.
Just as many claim most recursive calls can be turned into loops, it also seems altogether poetically just that loops and recursion have other similar concepts. In this case, continuing in a loop means return to the top of the loop, continuing in recursion means go back ‘up’ to the next function. If you were to use a ‘trampoline’ similar to what Guido has suggested in his post, continue would practically mean the same thing and have the same effect. Putting the continue statement in a loop that simulates a trampoline does the same thing as a continuation in a function – we pass control onto the next function in the line in either case. Kinda. If you squint at it funny.
Just as some have claimed continuations are the ‘goto’ of functional languages, it’s ironic that the continue statement started out its own life as a replacement for ‘goto’ in loops. They’re parallel constructs and its altogether too fitting to just use the same keyword to do both.
Python is a learning language, but learning procedural code is not where we’re headed. In the future, students and amateurs are going to need to learn functional style as well as imperative style, and as such, Python should be their introduction to both. It already handles the paradigm quite well. The ‘self’ argument to methods is quite functional (in my own opinion), as it allows methods to be passed around like first class objects without being tied to an object. Generator expressions too are rather functional, yet these things are easy to learn and very Pythonic. Why not continuations too? But explicit – it’s better than implicit, and in a style that already fits into the way Python has expanded in the past towards different ways of returning from functions. With the keyword, it suddenly becomes ok to optimize behind the scenes because the user knows its a tail recursive call. The parser too can annotate information in the callstack to make it easier to debug. After all, the yield statement makes debugging slightly more difficult, but is still in.
Don’t make it an optimization, make it a feature, and settle this whole debate.
25 Comments »