Steaming Java (or: A Nice, Hot Cup of Joel)

First a quick note to my readers from the LJ syndicated feed – please post comments in the original article at http://rant.aprotim.com . I don’t always see comments on the syndicated feed, and they disappear when the articles expire.

So, I’ve been saying it for years, but now that Joel on Software says it, everybody starts talking about it. Well, I’m going to reiterate, because I feel that the article makes very good points but misses a few, and doesn’t elucidate certain facets enough.

This is one of my favorite topics, as some of you may have heard. For the rest of you, pull up a chair. Let me start my argument thusly: I loathe the shift towards Java as the language of choice in teaching intro computer science/programming. (For the 60% of you who are bored already, you’re excused – it just gets worse.) In the interest of full disclosure, I’m not a big fan of Java in any milieu. I’ve become much less rabid, even accepting, but I’m still praying for Ruby or something similar to take on Java in its own niche.1 I will write Java code, and I certainly appreciate how much easier (and thus more bug-free) it can make certain tasks, but I still have irreconcilable differences with the language.

I feel that there are two pedagogically pure ways to teach introductory CS, and that Java is the ugliest in-between ever.

Approach 1: Bottom-up. This is how I learned, and probably the way a great many people learned, before this Java craze swept education. The idea is that you teach students the fundamentals of computers – bits, bytes, math. Then you give them a language like C++, which encompasses (almost) every concept in modern programming, even if it’s slightly ugly. C++ has the advantage that your first program can be three lines long, with each line having a simple explanation that doesn’t require knowledge of higher-level programming concepts. Compare a simple “hello, world” in Java – from the beginning, you’re forced either to explain what a class is (a difficult concept when you don’t even know what a function or a variable is), or to gloss over it and tell the students “we’ll explain that later”. In addition, the fact that Java objects are always handled through references (loosely, “pass by reference”) can be astonishingly confusing to students who have no context for it.

From that first C++ program, each new facet is an iterative growth, and each new concept can be added on top of the last. One can easily write what is essentially “managed” C++ (I know because this is how I started) simply by not learning about pointers, and so never having to worry about memory management, until one is comfortable with everything else. And then the progression becomes simple. We start with the basic structure of a program, the syntax of statements, procedural coding, the declaration and use of variables, the use of functions, the declaration of functions, and so on, all before one even comes near pointers (and memory management), references, object-orientation (and everything that goes with it), or other advanced topics. Thus, each student can gain intimate knowledge of the concepts.

On the other hand, there is:
Approach 2: Top-down. This is the approach I experienced when I took my first AI course in high school. Here, the student starts with pure, mathematical concepts and is gradually brought down into the nitty-gritty. Typically, you start in a nice functional, LISP-like language (I learned using Common LISP, but it seems Scheme is more popular), and thus you start out with something everybody hopefully understands – algebra and functions. With a little coaching, most people can start to adjust to prefix notation, and everybody will be on relatively even footing. From there, one can begin to explain things like side-effects, and start to shift to other languages. I feel, however, that while this technique can make the relationship between the math and the CS clearer, it ends up being an acclimatization tool, and that eventually one has to revert to the bottom-up technique.

However, both techniques have the important characteristic that they imbue the student with a solid framework for looking at all kinds of problems, not just the ones that a particular tool set solves. A student thus armed is well prepared to take on a wide variety of novel tasks in novel languages, which is, after all, what higher education is really about (or should be). In graduating a student from a respected university, we are not trying to give them specific trade skills to do a job – after all, it’s well known that the tools of the trade in education are frequently years behind the tools in industry, a gap that’s simply unacceptable in such a quick-changing field. Rather, we are attempting to provide them with the requisite ability and basis to learn those skills that are necessary to do a job. A student with a basic understanding of how the underlying bits work can quickly and easily learn to program in Java, and may in fact be thankful for the eased burden it provides. However, a student without the slightest clue of how to manage memory will have a long and arduous task in front of him when asked to write a virtual machine, or maintain an OS kernel, or write a new language. In my experience, when we create Java kids, we create students who cannot easily adjust to not having certain things done for them, or to looking at non-OO paradigms.

None of this is by way of saying that knowing Java isn’t a valuable skill, or that writing good Java (or other OOP) is easy. I’m saying that teaching students Java is excellent training, but what universities should be doing is educating. The sad fact is that by teaching students in Java (and especially keeping them in pure Java curricula), we are locking ourselves out of innovation and deeper understanding. The irony is that most of the students formed by such curricula could never create a Java VM.

The last point that I want to address was a minor point in Joel’s article, but one that’s near and dear to my heart. He says that the reason that a CS grad from MIT is more respected than one from Duke is this difference – that the MIT grad comes out prepared to tackle any problem, while the learning curve for the Duke grad is much steeper, and there are no guarantees that he/she can handle it. This problem of reputation is especially important to me, because I’m in what should be considered a top-tier department in what is considered a top-tier university. In four years as an undergrad, however, I worried that a disproportionate amount of time was spent oversimplifying things to make them accessible to everyone, rather than pushing everybody to rise to the challenge of really understanding things. I don’t know whether there’s some drive to have more CS majors enjoy/pass (and thus stay in) their classes – department funding is probably tied to head count, after all. But as an alumnus, it is of interest to me to make sure that, regardless of the number of students who come out of the department, they all be of the absolute highest caliber – that the department has a reputation for creating students who are not one-trick ponies, but who can take on any job. To fail to uphold those standards will only cheapen my degree.


1 Java to me (and to everybody – this is why it was created) is too much unnecessary compromise. In fairness, it did spark the mass movement of using VMs and byte compilation, and it still is the only language to adequately fill its cross-platform niche. But it’s been ad hoc from the beginning and thus is always playing catch-up. It suffers from a lack of design purity, as well as a closedness that compounds the problem. I don’t mean closedness only in that its compilers are closed-source – I mean that Sun Microsystems routinely extends or changes its software (the de facto standard) with only partial documentation. There is no definitive reference for what must and must not be implemented in a compiler or VM, and for a language whose only goal is cross-compatibility, that’s unforgivable.