What Makes a Good Programming Language?

There are plenty of programming languages around. David Chisnall points out the various factors that determine what makes a “good” language. But note his caveat: These principles don’t always apply in any given set of circumstances!

What Is a Programming Language?

Programming languages, like spoken languages, are ways of communicating ideas. When two people who know the same language talk, they’re able to understand each other because they both know the rules that formalize how to translate sounds into meaning and vice versa.

Computers don’t understand human languages. Worse, the languages they do understand—their instruction sets—don’t mesh very well with how most humans speak. Imagine for a moment a French person talking to a Chinese person. Neither of them understands the other’s native language, but if they speak a common second language, such as English, they can still talk to each other. This is more or less the situation when it comes to programming languages.

The English skills of the speakers in that last example also have direct parallels in terms of programming languages. When you have one person speaking and another person listening, comprehension depends on two things:

* The speaker’s ability to translate ideas into the language
* The listener’s ability to translate the spoken words into ideas

The speaker’s ability is equivalent to the programmer’s skill, and the listener’s ability is equivalent to the compiler’s efficiency.

A programming language is a compromise. Translating a language such as English directly into a machine language is very difficult for a machine. Similarly, “speaking” a machine language well is very difficult for a human. A programming language is one that a human can speak reasonably well, and that a computer can translate into a language that it understands.

Language, Framework, and Runtime

The core of most languages is very small; for example, C89 defines only 32 keywords. They declare data and control the flow of a program, but do little else. Writing a useful program using just the C language is almost impossible. Fortunately, the designers recognized this problem. The C language specification also defines some functions that must be available for C programs to call. The majority of these functions, known together as the C Standard Library, are themselves written in C, although a few primitive ones may need to be written in some language that’s yet more primitive.

Most languages have some kind of equivalent to the C Standard Library. Java, for example, has an enormous catalogue of classes in the java.* namespace that must be present for an implementation of the language to be called “Java.” Twenty years ago, Common Lisp had a similarly large collection of functions. In some cases, the language is most commonly used with a collection of libraries, usually referred to as a framework, that are specified separately from the language itself.

Sometimes the line between language and framework is blurry. It’s clear that C++ and the Microsoft Foundation Classes are separate entities, for example, but for Objective-C the distinction is less clear. The Objective-C specification defines very little by way of a standard library. (Although, because Objective-C is a proper superset of C, the C Standard Library is always there.) Objective-C is almost always used in conjunction with the OpenStep API; these days, the common implementations are Apple’s Cocoa and GNUstep.

When you start adding libraries to a language, you lose some of the clarity of what makes the language unique. Writing an OpenGL program in Java has a lot more in common with writing an OpenGL program in C than it does with writing a Swing application in Java.

Another source of confusion is the runtime system. Languages such as Smalltalk and Lisp were traditionally run in a virtual machine or interpreter (although most current Lisp implementations compile to native machine code). This can lead people to perceive the language itself as slow. Interpreted code is almost always slower than compiled code, so code run in a Lisp interpreter will be much slower than compiled Lisp code. That difference is a property of the implementation, not of the language, but it still matters when you choose one.

When judging a language, it’s important to differentiate between characteristics of a language and characteristics of an implementation of the language. Early implementations of Java were very slow; the just-in-time compiler was little better than an interpreter. This led to people calling Java an “interpreted language.” The GNU project’s Java compiler destroyed this myth, although other design decisions in the Java language still prevent it from being run as fast as some other languages.


How do you measure the speed of a language? The obvious way is to write some algorithms in it, run them, and see how long they take to execute. For the most part, this is a good, pragmatic solution. There are a few potential pitfalls, however:


Selecting the correct compiler. If you’re evaluating Lisp, for example, it wouldn’t be fair to benchmark an interpreted version of Lisp against compiled C++ or just-in-time compiled Java. But it would be fair to compare these languages with something like Perl, Ruby, or Python, in which the default (supported) implementation runs interpreted bytecode.

The fastest compiler isn’t always the correct one for your needs. For some time, the Microsoft Java runtime was the fastest implementation available. If you were looking for a language for cross-platform development, however, it would have been a mistake to select Java just because the Microsoft runtime was fast.

Be aware of the platform you’re targeting when you decide on a compiler. If you’re writing cross-platform code, you may want to standardize the compiler (or runtime) across all of your supported platforms, to minimize the problems caused by varying levels of language support across implementations. In this case, you might choose something like GCC, which is likely to produce slower code than a compiler written for a specific architecture. Be sure to take this factor into account when performing a speed evaluation. Just because IBM’s XLC or Intel’s ICC produces faster code from C than you get from whatever other language you’re evaluating doesn’t mean that you can always benefit from this speed.

At the opposite extreme, you might need to support only one platform. In this case, you’re free to choose the best implementation available for that platform. Make sure that you do this for all of the languages you evaluate, however. It’s easy to read benchmarks showing that the Steel Bank Common Lisp compiler produces code that runs faster than C++ code, and miss the fact that it performs somewhat worse on register-starved architectures such as x86. If you’re targeting x86, this is an important factor.

Don’t place too much faith in micro-benchmarks. It’s quite easy to design an algorithm that plays to the strengths of a particular implementation, or to its weaknesses. Something that requires a lot of array accesses, for example, is likely to penalize Java, because the language requires every array access to be bounds-checked. Something that requires a high degree of concurrency is likely to show off Erlang’s strengths. When you look at micro-benchmarks, try to imagine where in your own application you would use algorithms like the ones being measured. If you can’t, then disregard them.

Remember that speed isn’t always important. The CPU usage graph on my machine could almost be described as having two states: one showing 20% usage, and the other showing 100% usage. If you’re writing the kind of application that will contribute to the 20%, then you would be hard-pressed to select a language that the end user would notice was slow. If you’re writing an application that uses as much processing power as you can throw at it, or an embedded application in which processing power is still expensive, speed is a very important consideration.
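The “write some algorithms, run them, and see how long they take” approach described above can be sketched in a few lines. This is a minimal illustration in Python, not a rigorous benchmark harness; the helper time_call and the two workloads are hypothetical names invented for this example. Both workloads compute the same sum, but one indexes every element by hand while the other delegates to the runtime’s built-in loop, so the ratio between them says more about the implementation than about the algorithm.

```python
import time


def time_call(func, *args, repeats=5):
    """Return the best wall-clock time in seconds over `repeats` runs of func(*args)."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    return best


def sum_by_index(values):
    """Array-heavy workload: touch every element through an explicit index."""
    total = 0
    for i in range(len(values)):
        total += values[i]
    return total


def sum_builtin(values):
    """Same result, but the loop runs inside the runtime's built-in sum()."""
    return sum(values)


data = list(range(100_000))
for workload in (sum_by_index, sum_builtin):
    print(f"{workload.__name__}: {time_call(workload, data):.6f} s")
```

The absolute numbers will differ between machines and between Python implementations, which is exactly the pitfall: a single micro-benchmark like this one is only meaningful if your real application spends its time in similar code.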


The Church-Turing thesis is one of the foundation stones of computer science. It tells us that any programming language that can simulate a Turing Machine can be used to implement any algorithm. But this doesn’t tell us much about a language beyond a simple yes or no answer to the question “Is this language Turing-complete?” In almost all cases, the answer is yes; few useful languages are not Turing-complete. One example is Adobe’s Portable Document Format (PDF), which began life as a non-Turing-complete subset of PostScript. This was created so that the time required to render (and hence print) a document was bounded by its size. PostScript contains loop constructs that PDF lacks, and so it’s possible for a PostScript program never to terminate.

Anyone who has criticized C will be familiar with the defense “but you can implement that in C!” This is true, of course. Any Turing-complete language can be implemented in any other. Prolog, for example, can be implemented in about 20 lines of Lisp. If you write a program requiring 1,000 lines of Prolog and include with it a 20-line Prolog interpreter written in Lisp, is it fair to call it a Lisp program?

The same principle applies in many other languages. The Objective-C runtime is usually implemented in C, so obviously you can do anything in C that you can do in Objective-C. The question is whether this practice makes sense. Is it more sensible to write your own dynamic object system, or to use one that other people are working on and constantly optimizing?

Of course, we’re assuming that the features you would need to implement are actually useful. If a language lacks a feature that isn’t useful anyway, that’s not a limit to its expressiveness.

A good question to ask is how many language features you have to throw away to gain a useful feature. In Smalltalk, you send messages to objects. This is the equivalent of calling methods in a language such as C++ or Java. If the object doesn’t have an explicit handler for that message type, then the runtime system delivers this message and its arguments to the object’s doesNotUnderstand: method. This setup allows for a lot of flexibility in programming.

Consider Java’s RMI mechanism. Each class to be exposed through RMI must be run through a preprocessor that generates a proxy class, which passes the name and arguments of each method through to the RMI mechanism. In Smalltalk (or Objective-C, for that matter), you don’t need to do all this. You can just create a single proxy class that implements the doesNotUnderstand: method (forwardInvocation: in Objective-C) and passes the message on to the remote object. This one class can be used as a proxy for an object of any other class.
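The single-proxy idea can be sketched in any language with a catch-all hook for unhandled messages. The example below uses Python, whose __getattr__ method is only a rough analogue of Smalltalk’s doesNotUnderstand:, and the class name ForwardingProxy is made up for illustration. To stay self-contained it forwards to a local object and records each call; a real remote proxy would serialize the method name and arguments at that point instead.

```python
class ForwardingProxy:
    """Forwards any method call it does not define itself to a target object."""

    def __init__(self, target):
        self._target = target
        self.calls = []  # (method name, positional args) for each forwarded call

    def __getattr__(self, name):
        # Python invokes __getattr__ only when normal attribute lookup fails,
        # so this is the "message not understood" path.
        def forward(*args, **kwargs):
            self.calls.append((name, args))
            # A remote proxy would send name and args over the network here;
            # this sketch simply calls the method on the local target.
            return getattr(self._target, name)(*args, **kwargs)

        return forward


# The same proxy class wraps a list and a dict without knowing either interface.
list_proxy = ForwardingProxy([])
list_proxy.append("a")
list_proxy.insert(0, "b")

dict_proxy = ForwardingProxy({"x": 1})
print(dict_proxy.get("x"), list_proxy.calls)
```

Nothing in ForwardingProxy names the methods of the class it wraps, which is the property that a generated, per-class proxy lacks.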

If you wanted to implement something comparable in C++, however, you would need to throw away the C++ method-call mechanism and replace it with your own custom message-passing system. Each C++ class would implement a single handleMessage() method, which would then call the “real” methods. By the time you’ve done this, you’ve thrown away a lot of the convenience of using C++ in the first place.


Sometimes you can write code that gets used for a bit and then thrown away. Most of the time, you aren’t so lucky. Eventually you have to go back and read it, or someone does. If you know you’re leaving the company soon, and you hate your coworkers, you might consider a language like Intercal or C++, with baroque syntax and even more peculiar semantics. On the other hand, if there’s a chance that you might be stuck maintaining the code, or you want to be able to show it to other people without hanging your head in shame, the readability of the language is important.

A lot of factors go into determining whether a language is readable. The most obvious is familiarity. The human mind is very good at adaptation, and often it’s astonishing what the mind will perceive as “normal.” Familiarity only comes from constant exposure, though, which means that languages with relatively simple syntax become familiar more quickly. Lisp is at one extreme, with only one syntactic construct. It’s very easy to become familiar with Lisp, although grasping the large Common Lisp standard library is another matter. C++ is the popular language at the other extreme. Most C++ coders I have encountered use only a relatively small subset of the C++ language. Worse, everyone uses a slightly different subset, so C++ code written by other people may be quite difficult for you to read, even though you’re familiar with the language.

Another aspect of readability is the syntax itself. This is where Smalltalk-like languages tend to win. Since Objective-C allows the use of syntax that works for both C and Smalltalk, it’s a good language for an example. Apple provides a set of objects that are “toll-free bridged” between C and Objective-C. This means that a set of C functions accepts the Objective-C object as an opaque data type, so you can manipulate the Objective-C objects directly without having to translate them into some other form. This approach makes them ideal for a syntax comparison:

[array insertObject:anObject atIndex:12];
CFArrayInsertValueAtIndex(array, 12, anObject);

Both of these lines accomplish the same thing: inserting anObject into array at index 12. To understand the first line, you need to know that the Objective-C message-passing syntax specifies an object and then a list of parameter name:parameter pairs. To understand the second line, you need to know how C function-calling works, that it’s conventional for the first argument in functions of this type to be the data structure being manipulated, and that the author of the function decided that the arguments should be passed in the opposite order to how they occur in the function name.

In this particular example, it’s quite easy to spot that 12 is the index, if you know that arrays are indexed by numbers. Now imagine that both arguments were variable names.
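The same contrast can be reproduced in a language with keyword arguments. The sketch below uses Python; both functions are hypothetical stand-ins for the two calls above, and the variable names slot and count are deliberately unhelpful. With positional arguments, swapping the two variables is accepted without complaint, while the labelled form states at the call site which argument is the index.

```python
def insert_value_at_index(array, index, value):
    """Positional style: the call site does not say which argument is which."""
    array.insert(index, value)


def insert(array, *, value, at_index):
    """Labelled style: both parameters are keyword-only and must be named."""
    array.insert(at_index, value)


slot, count = 1, 7

positional = [10, 20, 30]
insert_value_at_index(positional, slot, count)  # is slot the index or the value?

swapped = [10, 20, 30]
insert_value_at_index(swapped, count, slot)  # arguments reversed, no error raised

labelled = [10, 20, 30]
insert(labelled, value=count, at_index=slot)  # the labels answer the question

print(positional, swapped, labelled)
```

In the swapped call, Python quietly treats 7 as the index and appends the value 1 to the end of the list, so the mistake shows up only in the data.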

Of course, the biggest impact on readability comes not from the language, but from the developer. A poor developer can write illegible code in any language. A good developer? I’ve even seen well-written, readable Visual Basic code. (Once.)


A lot of factors go into determining what makes a good language, and they don’t always apply in any given set of circumstances. A good language for a particular task may be a particularly bad choice in another situation. A good programmer will always pick the right tool for the job.