This article is part of the sequence The Basics You Won’t Learn in the Basics aimed at eager people striving to gain a deeper understanding of programming and computer science.
When I started programming, I got introduced to C# and I thought it was pretty fun. As I advanced in my studies, I learned other programming languages as well. I learned JavaScript, PHP and Java.
Even though I learned to code in these languages, I didn't understand why there were so many of them. What purpose did they all serve? I was also curious about where all these languages came from. How did they come to be? What is a low-level language, and why does it still exist?
The goal of this article is to help you find answers to some of these questions and to further spark your curiosity about the nature of programming languages and computers. I will walk you through the evolution of modern programming languages: why they came to be and what problems they solved.
The processor and its language
It all comes down to the central unit of a computer – the processor. First, let’s start with what exactly its purpose is.
On a basic level, it is an electronic device just like your boiler. But the boiler has its electronic components wired in such a way that its only purpose is to heat up the water inside it. The processor, on the other hand, serves a general purpose. You can use it to build a device for your car, but you can also use it to play your favorite games on a gaming console.
In order to be a general-purpose electronic device, it has to provide you with some set of abstract instructions it can perform. Instructions like addition, subtraction, multiplication, etc. You can use those to calculate what your paycheck is, but you can use them to calculate the temperature in your room as well.
So, on a basic level, all a processor does is execute instructions in sequential order. As a side note, modern processors can execute multiple instructions at the same time, but that capability solves an optimization problem. The principles behind it are the same.
The processor’s interface to the outside world is a sequence of bits. In order for it to perform the abstract instructions we mentioned, it has to somehow map those instructions into the bits it accepts from the outside world.
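To make the idea of executing instructions one after another more concrete, here is a minimal sketch of a toy processor in Python. The instruction names, the register names and the paycheck numbers are all made up for illustration; no real processor works with Python tuples.

```python
# Toy processor: a hypothetical sketch, not a real instruction set.
def run(program, registers):
    """Execute the instructions one after another, in order."""
    pc = 0  # program counter: index of the next instruction
    while pc < len(program):
        op, dst, src = program[pc]
        if op == "ADD":
            registers[dst] += registers[src]
        elif op == "SUB":
            registers[dst] -= registers[src]
        elif op == "MUL":
            registers[dst] *= registers[src]
        else:
            raise ValueError(f"unknown instruction: {op}")
        pc += 1  # move on to the next instruction
    return registers


# Made-up numbers: hourly rate in R0, hours worked in R1, deductions in R2.
paycheck = run(
    [("MUL", "R0", "R1"), ("SUB", "R0", "R2")],
    {"R0": 20, "R1": 160, "R2": 500},
)
print(paycheck["R0"])  # 20 * 160 - 500 = 2700
```

The same three instructions could just as well compute a room temperature. Only the program and the initial register values would change, which is what makes the device general-purpose.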
Machine Language
The specification which defines what instructions a processor supports and how those instructions are mapped to the bits passed to it is called an Instruction Set Architecture (ISA). Modern processors support many more instructions than the few I mentioned, but the principle is the same.
As a side note, modern processors can vary in their implementation and capabilities while still supporting a common ISA. The benefit is that once we compile a program for a target instruction set, that program can run on every processor that implements it. An example of a modern instruction set is x86.
The binary representation of the instructions a processor supports is called machine language. This is at the core of computers and the stepping stone of the evolution of programming languages. No matter whether you code in Java, C#, C or Python, your code eventually ends up as the machine language of the processor.
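Here is a small sketch of how an instruction could be mapped to bits. The 8-bit layout below (2 bits for the operation, 3 bits for each register number) is an assumption I made up for this example. Real instruction sets such as x86 use far more elaborate encodings.

```python
# Hypothetical 8-bit instruction format: 2 bits opcode, 3 bits per register.
OPCODES = {"ADD": 0b00, "SUB": 0b01, "MUL": 0b10}


def encode(op, dst, src):
    """Pack one instruction into a single byte laid out as: oo ddd sss."""
    return (OPCODES[op] << 6) | (dst << 3) | src


def decode(byte):
    """Unpack the byte back into (mnemonic, dst, src)."""
    names = {code: name for name, code in OPCODES.items()}
    return names[byte >> 6], (byte >> 3) & 0b111, byte & 0b111


word = encode("SUB", 2, 5)
print(f"{word:08b}")  # 01010101
print(decode(word))   # ('SUB', 2, 5)
```

The string of bits printed on the first line is what this toy machine language looks like. The table and the bit layout together play the role of the ISA.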
Assembly
In the early days of computer programming, people had to use a dictionary that told them which numbers the instructions they wanted to execute mapped to on their processor. The process of translating symbolic code (such as ADD, SUBTRACT, MULTIPLY) into machine code (a sequence of bytes) is called assembling.
Originally, this translation was done by people. Nowadays, the job is done by programs called assemblers. They take as input a set of symbols representing the instructions of the CPU and translate them into machine language.
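A very small assembler could look something like the sketch below. The opcode numbers and the one-byte-per-field format are hypothetical; a real assembler also has to handle labels, addressing modes and much more.

```python
# Hypothetical opcode table: the "dictionary" programmers once looked up by hand.
OPCODE_TABLE = {"ADD": 0x01, "SUBTRACT": 0x02, "MULTIPLY": 0x03}


def assemble(source):
    """Translate lines like 'ADD R0, R1' into bytes: opcode, register, register."""
    machine_code = bytearray()
    for line in source.strip().splitlines():
        mnemonic, operands = line.split(maxsplit=1)
        machine_code.append(OPCODE_TABLE[mnemonic])
        for register in operands.split(","):
            # 'R3' -> 3: drop the leading 'R' and keep the register number
            machine_code.append(int(register.strip()[1:]))
    return bytes(machine_code)


program = """
ADD R0, R1
MULTIPLY R0, R2
"""
print(assemble(program).hex(" "))  # 01 00 01 03 00 02
```

Each line of symbols becomes three bytes here, which is exactly the lookup work that used to be done by hand.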
Writing a sequence of symbols instead of numbers is a huge step towards simplifying the process of writing programs. But we are still too close to the underlying hardware, and development time suffers because of that.
I will give you an example of an assembly program so that you can get a feel for what it's like to write in assembly.
Here is a simple program in a high-level language, which swaps the values of two numbers:
    int temp = a;
    a = b;
    b = temp;
Here is the equivalent program, written in assembly:
    STORE R0, R2 // store value of register R0 to register R2
    STORE R1, R0 // store value of register R1 to R0
    STORE R2, R1 // store value of register R2 to R1
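If you want to convince yourself that these three instructions really swap R0 and R1, you can trace them with a small simulation. This is only a sketch: it assumes that STORE copies its first operand into its second, as the comments above describe, and it models the registers as a plain Python dictionary.

```python
def execute(program, registers):
    """Run 'STORE src, dst' instructions one by one on a set of registers."""
    for line in program.strip().splitlines():
        mnemonic, operands = line.split(maxsplit=1)
        if mnemonic != "STORE":
            raise ValueError(f"unknown instruction: {mnemonic}")
        src, dst = (name.strip() for name in operands.split(","))
        registers[dst] = registers[src]  # copy the value of src into dst
    return registers


swap = """
STORE R0, R2
STORE R1, R0
STORE R2, R1
"""
print(execute(swap, {"R0": 1, "R1": 2, "R2": 0}))
# {'R0': 2, 'R1': 1, 'R2': 1}
```

R2 plays the role of the temp variable from the high-level version, and it still holds the old value of R0 when the program ends.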
Problems
As you can see, the big difference between writing in a high-level language and writing in assembly is that in the first case you deal with abstractions such as variables. In assembly, you deal with hardware details such as registers and memory.
The main purpose of all programming languages is to let programmers deal with abstractions, which are easier for them to understand, instead of hardware details, which are easier for the processor to understand.
Another problem with writing in assembly is that the code you write is not portable. If you want to write a program that has to run both on a general-purpose computer and on your phone, you have to write your assembly code twice, once for each of the instruction sets your phone and computer use.
Nowadays, few professional developers write in assembly. Those who do usually work on very specific types of problems, which are rarely encountered. There are also hobbyists who write in assembly for fun.
But I encourage you to expose yourself for a while to writing assembly code in order to get closer to the nature of your computer. If you want to attain such knowledge check this out.
Low-level languages
At some point, the first programming languages started to appear. They allowed you to write code that was more human-readable and therefore easier to maintain. This code goes through a process of compilation and turns into assembly code, which the assembler then translates into machine language.
Not the first, but perhaps the most famous such language is C. The introduction of programming languages solved two major issues:
- Now, programmers can focus more on solving their problem instead of dealing with hardware details. When you write in C, you don’t have to worry about registers and memory addresses. Well, you do, but at least not so explicitly.
- You can write code which is portable. You can write a program in C, and you can compile it into the instruction set of different processors without modifying the code.
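The portability point can be sketched with a toy "compiler" that turns the same source expression into instructions for two different targets. Both targets and their mnemonics are invented for this example and are not meant to match any real instruction set.

```python
# Two made-up targets with different mnemonics; neither is a real ISA.
TARGETS = {
    "desktop": {"load": "MOV {reg}, {var}", "add": "ADD {dst}, {src}"},
    "phone": {"load": "LDR {reg}, {var}", "add": "ADDS {dst}, {dst}, {src}"},
}


def compile_sum(left, right, target):
    """'Compile' the source expression `left + right` for the chosen target."""
    templates = TARGETS[target]
    return [
        templates["load"].format(reg="R0", var=left),
        templates["load"].format(reg="R1", var=right),
        templates["add"].format(dst="R0", src="R1"),
    ]


# The source ("a + b") stays the same; only the target changes.
for target in TARGETS:
    print(target, compile_sum("a", "b", target))
```

The programmer writes the expression once, and the compiler takes care of the differences between the two targets.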
Due to this, development time is reduced drastically and the portability of programs is highly increased. But even so, we are still pretty close to the machine here. We still have to think in terms of pointers and memory. Arrays are not a high-level construct but a low-level abstraction of the actual memory. There is no notion of objects yet. Instead of thinking in terms of Person, Dog and Cat, we still think in terms of variables, structs and pointers.
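To show what "a low-level abstraction of the actual memory" means, here is a sketch that models memory as a flat block of bytes. An array is then nothing more than a starting offset, and element i lives at base + i * size. The 4-byte integer size, the little-endian byte order and the base offset are assumptions chosen for the example.

```python
import struct

INT_SIZE = 4  # assumption: 4-byte integers

# A flat block of bytes stands in for memory; the "array" is just a start offset.
memory = bytearray(64)
base = 16  # hypothetical address where the array begins


def write_int(index, value):
    """Store a 32-bit integer at base + index * INT_SIZE."""
    struct.pack_into("<i", memory, base + index * INT_SIZE, value)


def read_int(index):
    """Load the 32-bit integer stored at base + index * INT_SIZE."""
    return struct.unpack_from("<i", memory, base + index * INT_SIZE)[0]


for i, value in enumerate([10, 20, 30]):
    write_int(i, value)

print(read_int(2))          # 30
print(base + 2 * INT_SIZE)  # 24, the offset where element 2 lives
```

In C, the expression array[2] performs the same address calculation for you, which is why arrays and pointers are so closely related there.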
Applications
High-level languages address the issue of making the code simpler, but that comes at a cost in performance and flexibility. Low-level languages (notably C) are still used in environments with tight resource constraints.
Examples are the toaster in your home, the small device attached to the LCD in your car, or the car's central computer.
This branch of programming is called embedded development. It still relies heavily on low-level languages because of their flexibility. You can't simply use JavaScript in this environment.
If you want to do that, you have to first embed a 2 MB JavaScript interpreter and only afterwards start writing your program. That can be a difficult challenge given that the device in your car has only 8 KB of memory.
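A quick back-of-the-envelope calculation shows how far apart these two numbers are. The sizes are the illustrative figures from the paragraph above, and I am assuming 1 KB = 1024 bytes.

```python
KB = 1024
MB = 1024 * KB

interpreter_size = 2 * MB  # illustrative figure from the text
device_memory = 8 * KB     # illustrative figure from the text

print(interpreter_size // device_memory)  # 256
print(interpreter_size <= device_memory)  # False
```

The interpreter alone would need 256 times the memory the device has, before a single line of your own program is loaded.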
Another interesting area which relies on low-level programming is system programming. Developers in this area focus on developing software systems which support other applications instead of the end user. An example of such a system is your operating system.
High-level languages
With high-level languages, the focus of developers shifts from dealing with hardware details to dealing with abstractions. Their purpose is to allow the developer to focus on solving complex problems, instead of focusing on the underlying machinery.
This shift of focus comes at a performance cost. To free the developer from dealing with pointers and memory, many high-level languages rely on a garbage collector, which handles memory management for you.
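You can watch a garbage collector at work with a few lines of Python. This sketch assumes the standard CPython interpreter and uses its gc and weakref modules. The Node class is a made-up example type.

```python
import gc
import weakref


class Node:
    """A tiny object that can point at another node."""

    def __init__(self):
        self.other = None


def make_cycle():
    a, b = Node(), Node()
    a.other, b.other = b, a  # a and b keep each other alive
    return weakref.ref(a)    # a weak reference does not keep `a` alive


probe = make_cycle()    # no variable refers to the two nodes any more
gc.collect()            # ask the collector to run right now
print(probe() is None)  # True: the unreachable pair was reclaimed
```

Nobody had to free the two nodes explicitly. The price is the extra work the collector does behind the scenes to find them.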
Furthermore, the safety of the system is a high priority. Because of that, high-level languages perform certain checks that low-level languages don't bother with. For example, in a high-level language, if you try to read an array index that is out of bounds, you will get an error. In C, that is undefined behavior: sometimes you might get away with it, and sometimes the system might crash.
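Here is what such a check looks like from the programmer's side in Python, as one example of a high-level language. The list and the helper function are made up for illustration.

```python
values = [10, 20, 30]


def read(index):
    """Checked read: an out-of-range index is reported instead of silently accepted."""
    try:
        return values[index]
    except IndexError as error:
        return f"error: {error}"


print(read(1))  # 20
print(read(3))  # error: list index out of range
```

Every access pays for a small comparison against the length of the list. That comparison is the kind of overhead C skips, at the price of the undefined behavior mentioned above.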
All this causes the software to be slower and use more resources. But the capabilities of modern hardware are constantly increasing, and what might have been considered resource-heavy in the old days is a tiny overhead nowadays.
Modern languages and their applications
Examples of high-level languages are Java, C#, JavaScript, Python, etc. All these differ in their syntax and paradigms, but they are all a major simplification over the intricacies of low-level languages.
An interesting example of a programming language that tries to get the best of both worlds is C++. In it, you can write low-level code, but you can also write at a high level using objects and classes. Its main drawback, though, is its complexity. It is like a multi-purpose Swiss Army knife, which can be inconvenient for a programmer who simply wants to hammer a nail.
The range of applications for all these languages is enormous. You can build websites, mobile applications, games, desktop applications and much more. You can implement all of these applications in all of the languages I mentioned. But even so, each of them appeals to a particular area.
Conclusion
This has been a quick glance at the evolution of programming languages. Every one of them solves different problems and has its purpose. Developing in high-level languages is easy, but there will still be the need for low-level development. That is due to the trade-offs we have to make in terms of performance vs. ease of use.
But having an understanding of what problems the different programming languages solve is important. Even if you can do everything in C#, you should know what its weaknesses are in comparison to C++. Remember, if all you have is a hammer, everything looks like a nail.
Next time, we will explore the path a program takes from source code to an executable file.