CPUs speak only one language; they don’t care what language you write in
Programmers tend to make a big deal over the supposed difference between compiled languages and interpreted ones. Or dynamic languages vs. statically typed languages.
The conventional wisdom goes like this: A compiled language is stored in machine code and is executed by the CPU with no delay, an interpreted language is converted to machine language one instruction at a time, which makes it run slowly. Dynamic languages are slower because of the overhead of figuring out type at runtime.
The reality is nothing like this
A long long time ago, when programmers wrote to the bare metal and compilers were unsophisticated, this simplistic view may have been somewhat true. But there are two things that make this this patently false today.
The second thing is that VMs make it easier to easier to observe the program’s execution, which in turn makes JIT (Just in Time) compilation possible.
So how about a real example that illustrates why this matters: Java.
Java is compiled. Or is it? Well… sort of not really. Yes there is a compiler that takes your source and creates Java bytecode, but do you know what the JVM does with that bytecode as soon as it gets it?
First it build a tree. It disassembles the Java bytecode and builds a semantic tree so that it can figure out what the source code was trying to do. In order to accomplish this it must undo all of the snazzy optimizations that the compiler has so carefully figured out. It throws away the compiled code.
That sounds crazy! So why does it do this? The answer to that question is the key to understanding the title of this article.
The best way to understand code is to watch it running
This applies to humans, but it applies just as well to compilers. The reason why the modern JVM undoes the compilation and the optimizations is that “conventionally” compiled Java bytecode runs too slowly on the JVM. To attain the speed of execution for which Java is known these days, the JIT has to throw away the code that was statically compiled (and statically optimized) and “watch” the code running in JVM, and make optimizations based on the code’s actual behaviour at run time**.
Don’t go around saying things like “[insert language name here]is too slow because it is interpreted” because that is simply not true. Languages are just syntax and can have many implementations (for example there are several implementations of Java). You could say “[insert language name here] is slow because the interpreter/VM doesn’t have JIT” and that might be true.
Languages are not fast or slow
C++ is not a faster language. C++ runs fast simply because more hours have been spent on the compiler optimizations than for any other language, so of course they are better. It is worth noting that programs compiled using PGC++ and Clang are regularly 2 or 3 times faster than the same source code compiled using the AOCC compiler. This is proof that it is the compiler and its optimizations — not the language itself — that dramatically affect execution performance.
Java is generally considered next fastest, and that is because it has had more hours invested in its JIT compiler than anything except C/C++.
Framework or Language redux
But it is not all down to the compiler. I have already writtenabout the dangers of unexamined libraries and frameworks. The article could also have been titled Syntax or Library: Do you know which one you are using? I asked a trusted friendto review this article, and brought up the very good point that memory access patterns are the big performance culprit overall. He made the point that C programs benefit from the fact that…
“C only has primitive arrays and structs (but it’s very liberal with pointers so you can pretty much wrangle anything into these primitive structures with enough patience). Working with hashmaps or trees can be very painful, so you avoid those. This has the advantage of nudging you towards data layouts that make effective use of memory and caches. There’s also no closures and dynamically created functions, so your code and data mostly stay in predictable places in memory.
Then look at something like Ruby, where the language design encourages maximum dynamism. Everything tends to be behind a hashmap lookup, even calling a fixed unchanging method on an object (because someone might have replaced that method with another since you last called it). Anything can get moved or wrapped in yet another anonymous closure. This creates memory access patterns with a scattered “mosaic” of little objects scattered all over the place, and the code spends its time hunting each piece of the puzzle from memory which then points to the next one.
In short, C encourages very predictable memory layouts, while Ruby encourages very unpredictable memory layouts. An optimizing compiler can’t do much to fix this.”
I had to agree. Which led me to articulate my point thusly: Programmers who do not understand where syntax stops and libraries begin are doomed to write programs whose execution they do not really understand.
My belief is that it is more difficult to write a truly awful C program because (if it runs at all) it would be too much work to manually reproduce the memory chaos that Ruby so casually produces.
We then had an interesting chat about how a certain large tech company created a “cleaned-up PHP++”. He has some interesting things to say, maybe he will write an article about that.
Thank you for your help Pauli.
So an implicit part of my contention is that modern programming languages lower the bar so that many programmers do not think about about basic computer science (memory structures and computational complexity), and therefore have no basis upon which to understand how their programs will execute.
The other part of my contention is that any Turing complete language could run about as quickly as any other when considered from a pure syntax perspective. For example I believe that it would absolutely possible to create a high performance implementation of Ruby on let’s say the JVM. I readily acknowledge that most current Ruby code would break on that system, but that is as a result of the programming choices made in the standard libraries, not as a fundamental constraint of the language (syntax) itself.
I have say, (as a self-admitted cargo cult programmer) that it is definitely possible that I just don’t understand Ruby syntax and/or the Church–Turing thesis.
CPUs (or VMs) speak only one language
Ruby and its shotgun approach to memory management notwithstanding; any programming language that would have as many hours invested in optimizing its compilation as Java or C, would run just as fast as Java or C and this is because CPUs (and VMs) speak only one language: machine language. No matter what language you write in, sooner or later it gets compiled to machine language and the things that affect performance are how fundamental computer science principles are implemented in the standard libraries, and how effectivethe compilation is, notwhen it happens.
The moral of the story is that programmers should spend less time crushing out on languages and more time understanding how they work under the hood.
I will finish with a quote from the great Sussmann:
“…computers are never large enough or fast enough. Each breakthrough in hardware technology leads to more massive programming enterprises, new organizational principles, and an enrichment of abstract models. Every reader should ask himself periodically ‘‘Toward what end, toward what end?’’ — but do not ask it too often lest you pass up the fun of programming for the constipation of bittersweet philosophy.”
* The original Sun JVM was written in C. The original IBM JVM was written in SmallTalk. Other JVMs have been written in C++.
The Java API (class libraries without which most Java programmers would unable to make even a simple application) are written in Java itself for the most part.
** It is worth noting that this happens again when the machine language of the VM “hits” the hardware CPU, which immediately takes a group of instructions, breaks them apart, looks for patterns it can optimize, and then rebuilds it all so it can pipeline microcode.