The problems with Python as a teaching language

Python was supposedly designed as an easy language for teaching. But there are design decisions that really don’t work well for that purpose. Some of these have been fixed or changed in the latest version of Python, so this criticism applies primarily to Python 2.7, but some things are still the same. Here are some ways it confuses students:

The special case of print vs. normal functions. Teaching people to write things like print "Hello World!" seems easy at first. And print lets you comma-separate an unlimited number of parameters, like so: print "Hello", name, "welcome aboard!". But this kind of syntax is unique to print, which makes it extra confusing when students try to apply the same kind of string manipulation elsewhere. I’ve seen numerous examples similar to this: input("Hello", name, "please enter a number:"), which does not work because input is a normal function and accepts only a single parameter. The print syntax teaches students to use commas to join strings, but that is not the correct way to join strings. Also, print forces a newline unless you end the statement with a trailing comma (print "No newline",), which looks ridiculous and is really strange even to me.
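A minimal sketch of the mix-up, written as Python 3 so it can be run today (there print is an ordinary function, but input still takes a single prompt). The value of name is made up for illustration.

```python
name = "Ada"  # made-up value for illustration

# Joining with + (or a format string) produces one string, which is
# what a normal one-argument function such as input() expects.
prompt = "Hello " + name + ", please enter a number: "


def comma_style_prompt():
    """Call input() the way print-with-commas suggests; return the error text."""
    try:
        # Fails while checking the arguments, before anything is read.
        input("Hello", name, "please enter a number:")
    except TypeError as exc:
        return str(exc)
    return None


error_text = comma_style_prompt()
```

The comma-separated call is rejected with a TypeError, while the concatenated prompt is a single string that input would accept.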
Excessive operator overloading. Most people, for whatever reason, seem able to handle the idea that + is used for both arithmetic and string joining. But the use of % for both modulo and format substitution is endlessly confusing. Perhaps it’s because modulo and format substitution are both new ideas to the students. People new to programming don’t necessarily see the lexical distinctions that we see: where we see two expressions separated by a symbol, they see a jumble of characters. And it doesn’t help that % is not only the operator for modulo and format substitution, but also appears inside the format string syntax itself. I get it: the choice of % makes perfect sense to me, as it evokes the printf-style format string syntax. But newbies see a % here and a % there and get confused, often calling it “modulo” when they mean something else.
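To make the three roles of % concrete, here is a small sketch; the variable names are invented for illustration.

```python
remainder = 17 % 5                          # modulo on numbers
label = "%d items" % 5                      # format substitution on a string
progress = "%d%% done" % 50                 # %% in a format string is a literal %
mixed = "%d %% %d = %d" % (17, 5, 17 % 5)   # all three meanings in one line
```

The last line is the kind of expression a beginner is likely to read as a jumble of percent signs.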
Terrible error messages. This is something that every compiler or interpreter struggles with. Producing good error messages is really difficult and requires a lot of extra work in the parser. But if Python wants to be seen as a teaching language, then it is critical to make the messages clear and direct. One problem is that the error messages are dumped in a large block of difficult-to-read text. A bigger problem is that Python, like many other interpreters, reports the error at the point in the program where it finally gave up, instead of the point where the error really lies. As a long-time programmer, I’m accustomed to dealing with poor-quality error messages. I’ve written parsers and I know how they get lost in poorly formed programs. But for students just learning, it is really confusing and disheartening to read error messages that make little sense and don’t even point at the erroneous line of code.
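One way to see the "wrong line" effect is to compile a snippet with a missing closing parenthesis and ask which line the interpreter blames. This is only a sketch: the snippet and the helper name reported_line are invented, and the reported line depends on the interpreter version. Older interpreters, including 2.7, blame the line after the mistake, while recent Python 3 releases have improved here and may point at the first line.

```python
source = (
    "total = (1 + 2\n"   # the mistake: missing closing parenthesis
    "print(total)\n"     # older interpreters report the error here
)


def reported_line(src):
    """Return the line number the interpreter blames for the syntax error."""
    try:
        compile(src, "<student>", "exec")
    except SyntaxError as exc:
        return exc.lineno
    return None


line = reported_line(source)
```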
IDLE sucks. This one isn’t so much Python’s fault, although IDLE does ship with it. Unfortunately, if you are expecting a good introduction to the Python programming environment, IDLE isn’t it. It’s barely stable, and on Mac OS X it’s even worse. I have students who can’t change the font size in the editing window because IDLE crashes every time they try. IDLE gets stuck in all sorts of weird ways, to the point where I just advise students to restart the program. I’ve had students attempt to save files with the extension .py and end up with .py.py, but when they tried to compensate by leaving off the extension, the file was saved without any extension at all.
Significant whitespace. This one cuts both ways. At least when the program is written correctly, it is easier for me to read. But I’ve lost count of the number of times I’ve looked at a broken program and found a place where the student accidentally hit the spacebar and inserted an extra space. They can’t see it, because they haven’t spent decades obsessively scanning text to ensure that it lines up. Still, if it weren’t whitespace, it would probably be missing braces. This is more of an editor issue than a Python problem: IDLE should be better at making significant whitespace visible and at helping people put together blocks of code that line up properly.
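A small sketch of the stray-space problem and of what "making whitespace visible" could mean. The student snippet and the helper names check and show_indent are invented for illustration; this is not something IDLE provides.

```python
source = (
    "total = 0\n"
    "for n in [1, 2, 3]:\n"
    "    total = total + n\n"
    "     print(total)\n"     # five spaces instead of four
)


def check(src):
    """Return (line number, message) for an indentation error, else None."""
    try:
        compile(src, "<student>", "exec")
    except IndentationError as exc:
        return exc.lineno, exc.msg
    return None


def show_indent(src):
    """Replace leading spaces with dots so a stray space becomes visible."""
    shown = []
    for line in src.splitlines():
        stripped = line.lstrip(" ")
        shown.append("." * (len(line) - len(stripped)) + stripped)
    return shown


problem = check(source)
visible = show_indent(source)
```

With the dots in place, the extra space on the last line is easy to count, which is exactly what a beginner cannot do by eye.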
Dynamic typing. Dynamic typing lets you quickly assemble and run a program that is still half-broken. That’s the benefit, but it’s also the problem. It loses newcomers, who are often not aware of the types of the values they are dealing with because the concept of types is new to them. This gets worse when combined with operator overloading. Here’s an example: Python is flexible enough to permit statements like for x in (1, 10): but this usually happens when someone really intended for x in range(1, 10): so they are puzzled why their program isn’t counting from 1 to 9. The first form loops over a two-element tuple, so x only ever takes the values 1 and 10. The punning is cool and concise for people who know what’s going on, but pure confusion for people who don’t. And the dynamic types make it too easy to keep going without noticing that you have an error.
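The two loops above can be compared side by side. Both run without any complaint, which is the point; the variable names are invented for illustration.

```python
# Iterating over a tuple literal: exactly two values, no error.
tuple_loop = [x for x in (1, 10)]

# What the student usually meant: count from 1 up to, but not including, 10.
range_loop = [x for x in range(1, 10)]
```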
Assignment statements. After teaching some of this class, I think that functional programming advocates have a point when they say that imperative programming is not natural. The idea that a variable can change its value over time gets people mixed up. Maybe worse is the syntax Python uses: a = b. To non-programmers, that’s a mathematical equality, which is a symmetric relation (even if they don’t use fancy words like that). To programmers, it means that a is a variable given some sort of storage and b is where the value comes from. Non-programmers don’t see why it can’t work both ways: if a = 5*6 works, then why not 5*6 = a? And they have difficulty with the same name meaning different things on either side of the equals sign, e.g. total = total + 1. Just because C uses this syntax doesn’t mean it’s a good idea. Using the Pascal-style := or the Haskell-style <- makes the asymmetrical nature of the operator much clearer.
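A short sketch of the asymmetry. The helper is_valid_statement is invented for illustration; it only asks the interpreter whether a line parses as a statement.

```python
total = 0
total = total + 1   # read the old value on the right, then store the new one


def is_valid_statement(src):
    """Report whether the interpreter accepts src as a statement."""
    try:
        compile(src, "<student>", "exec")
    except SyntaxError:
        return False
    return True


forward = is_valid_statement("a = 5*6")    # name on the left: accepted
backward = is_valid_statement("5*6 = a")   # expression on the left: rejected
```

Read as mathematics, the two statements say the same thing; read as Python, only the first one is a program.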
eval is evil. Although we don’t teach students to use eval, there are functions such as input that implicitly call eval on whatever they read from the user. This allows students to type in all sorts of nasty little things without realizing that it’s wrong. Anyone who’s dealt with languages with eval-like functionality knows what can happen. For example, I found one student who thought they had to type pieces of their program into the input prompt. Since input simply evaluated what they typed, they got correct responses and never realized they were doing it wrong. Arguably, we could simply not teach the students about input. But it seems an awfully strange decision to make the most naturally named function the one with the biggest pitfalls.
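The behavior described above can be imitated without a keyboard. In this sketch, python2_style_input is a hypothetical stand-in that evaluates a string the way Python 2.7's input evaluates what the user types (Python 3's input returns the text unchanged instead), and safer_number_input is an equally hypothetical alternative that only accepts integers.

```python
def python2_style_input(typed_text):
    """Hypothetical stand-in for Python 2's input(): evaluate what was typed."""
    return eval(typed_text)


def safer_number_input(typed_text):
    """Accept only text that parses as an integer."""
    return int(typed_text)


as_number = python2_style_input("42")
as_code = python2_style_input("6 * 7")   # a piece of program is accepted too

try:
    safer_number_input("6 * 7")
    rejected = False
except ValueError:
    rejected = True                      # the strict version refuses it
```

Because the evaluated version happily returns a number for "6 * 7", a student typing code at the prompt gets no hint that anything is off.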
Broken module system. Students get confused because Python looks for modules in the program’s own directory before it looks in the standard library. Suppose you write a program to send e-mails, you call it email.py, and you import smtplib. Congratulations, you just broke your program. smtplib imports another module named “email”, and thanks to the poor design of the Python module system, it winds up trying to load your own program again instead of looking for the email module in the Python installation directory.
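The lookup order can be observed without actually breaking anything. This Python 3 sketch creates a throwaway directory containing an email.py and asks the import machinery which file it would pick if that directory were searched first, as the directory of the running program is. The helper name who_wins and the file contents are invented for illustration.

```python
import importlib.machinery
import os
import sys
import tempfile


def who_wins(module_name, script_dir):
    """Return the file that would be loaded if script_dir were searched first."""
    search_path = [script_dir] + sys.path
    spec = importlib.machinery.PathFinder.find_spec(module_name, search_path)
    return spec.origin if spec else None


with tempfile.TemporaryDirectory() as script_dir:
    # The student's program, unluckily named after a standard library module.
    student_file = os.path.join(script_dir, "email.py")
    with open(student_file, "w") as f:
        f.write("print('student program')\n")

    winner = who_wins("email", script_dir)
    shadowed = os.path.samefile(winner, student_file)
```

Here the student's file wins the lookup, so any library that imports email would receive it instead of the standard module.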

Language design is often about compromise, and it takes a lot of heavy lifting to make a language practical. Python has the community and software available to make it useful and worthwhile for many purposes. But I find it really frustrating when an ostensible “teaching language” can cause so many problems for beginners.

October 10th, 2012 | Categories: hacking
