Useful Notes: Programming Language

"C isn't that hard: void (*(*f[])())() defines f as an array of unspecified size, of pointers to functions that return pointers to functions that return void"

"C++ is a write-only language, one can write programs in C++, but I can't read any of them"

Computers, for the most part, are dumb. If you were to take computer hardware freshly built off the assembly line, put the components together into a fully assembled device, and try to turn it on, it wouldn't do anything useful (if anything at all). Yes, Windows and Mac OS don't magically appear in the computer right from the factory. But if you give computers something to do, they'll be able to do it really fast! So how do you tell a machine what to do? That's where the programming language comes in. As the name implies, it's the language you use to program the computer to do what you want.

While there are other "languages" that may tell a computer what to do, there are distinctions between them all.
  • Programming languages, according to The Other Wiki, describe programs: basically any task meant for calculating, controlling the behavior of the machine directly, or offering a human-friendly interface.
  • Scripting languages typically control other programs, acting as an extension of them. For example, JavaScript can tell a web browser to do an animation, swap icons on a mouse hover, or do popups. You could not, however, run the JavaScript source by itself; it needs a host program such as the browser to execute it.
  • Markup languages such as HTML and XML describe how a document should be structured and look, akin to "marking up" a paper in editing before finalizing it. This is what you'll see when you hit the "Edit page" button on This Very Wiki.

Concepts

A programming language has four basic elements to it:
  • Symbols to hold data.
  • Operators that modify the data.
  • Conditional statements that control the flow of the program.
  • The ability to jump around in the program at will.
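
As a rough illustration, here is how those four elements might show up in a few lines of Python. The function name `countdown` is made up for this example, and Python has no literal "jump" statement, so the loop and the function call stand in for jumping.

```python
# Illustrative only: a countdown that touches each of the four elements.

def countdown(start):
    """Return the numbers from start down to 1 as a list."""
    remaining = start              # a symbol (variable) holding data
    seen = []
    while remaining > 0:           # a conditional controlling the flow
        seen.append(remaining)
        remaining = remaining - 1  # an operator modifying the data
        # reaching the end of the loop body jumps back to the `while` test
    return seen


# Calling the function jumps into its body and back out again.
print(countdown(3))  # prints [3, 2, 1]
```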

Programs are written into source files, which can be compiled or assembled for later execution, or interpreted for execution right away. If there's something immediately wrong with the source file, the compiler, assembler, or interpreter will complain until it's fixed.

It should also be noted that a computer is more or less a Literal Genie. It's very, very rare a computer makes a mistake because it actually made a mistake. It "makes mistakes" because of how the program was written. Thus, there is an artistic side of programming.
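
A small sketch of the Literal Genie at work, in Python. Both function names are invented for this example. The first function does exactly what it was told, which is not what the programmer meant.

```python
def sum_one_to(n):
    """Intended: add up 1, 2, ..., n. Actual: stops at n - 1."""
    total = 0
    for i in range(1, n):      # bug: range() leaves out its end value
        total += i
    return total


def sum_one_to_fixed(n):
    """Add up 1, 2, ..., n, this time including n itself."""
    total = 0
    for i in range(1, n + 1):  # the end value is now n + 1, so n is included
        total += i
    return total


print(sum_one_to(10))        # prints 45
print(sum_one_to_fixed(10))  # prints 55
```

The computer made no mistake in either case; it followed the instructions to the letter.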

Low Level Languages

At its heart, a computer is a giant calculator that computes arithmetic billions of times per second. All of its constituent parts, from memory to modem to monitor to mouse, have numerical addresses associated with them. The computer uses these addresses to route data throughout itself. Hence, the earliest computer languages evolved to reflect how computers fundamentally worked. An extraordinarily simple instruction might be "Take the number stored at memory address X and subtract it from the number stored at memory address Y, then send it to the printer located at hardware address Z". Low level languages use hardware-specific instructions to talk directly to the computer this way.

A programmer can write low level source code in two ways:

  • Machine code: The most barebones programming ever. This is literally writing the 0's and 1's, or more commonly their hexadecimal equivalents. There are usually two portions to a machine code instruction: the operation code (opcode), which says what to do, and the operand(s), which are the datum or data the instruction acts on. Computers until The Fifties had to be programmed like this.
  • Assembly code: The next step in language development, introduced in The Fifties. Opcodes and some special operands are given more human readable mnemonics, though these are often kept to very abbreviated words. It also allowed features like labels, which give sections of memory a name and get rid of the tedious job of keeping track of where one is in memory. Assembly code is machine specific, and one cannot assume a mnemonic means the exact same operation on another machine.
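
To make the opcode/operand split concrete, here is a toy sketch in Python of a made-up machine running the "subtract X from Y and print it" instruction described earlier. The opcode numbers, the single accumulator, and the memory layout are all hypothetical and don't correspond to any real processor. The raw numbers play the role of machine code, while the names `LOAD`, `SUB` and `PRINT` play the role of assembly mnemonics.

```python
# A made-up three-instruction machine; opcodes and layout are hypothetical.
LOAD, SUB, PRINT = 0x01, 0x02, 0x03


def run(program, memory):
    """Execute a list of (opcode, operand) pairs against a memory list."""
    accumulator = 0
    output = []                      # stands in for the printer
    for opcode, operand in program:
        if opcode == LOAD:           # copy memory[operand] into the accumulator
            accumulator = memory[operand]
        elif opcode == SUB:          # subtract memory[operand] from it
            accumulator -= memory[operand]
        elif opcode == PRINT:        # send the accumulator to the "printer"
            output.append(accumulator)
        else:
            raise ValueError(f"unknown opcode {opcode:#04x}")
    return output


memory = [0] * 8
memory[4] = 10   # "address Y"
memory[5] = 3    # "address X"

program = [
    (LOAD, 4),   # take the number stored at Y
    (SUB, 5),    # subtract the number stored at X
    (PRINT, 0),  # print the result (operand unused here)
]
print(run(program, memory))  # prints [7]
```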

The reason for using low level languages is maximum performance and maximum flexibility. The code that's written talks directly to hardware, and the programmer has full access (barring specific security features) to that hardware. The tradeoff is that it's very easy to write code that breaks the system, and the code is not portable to other hardware.

It's unusual these days to write assembly code by hand, because compilers have gotten so good at optimizing slightly higher-level languages like C. So languages are sometimes termed "low-level" because they give you a lot of control over how the assembly code turns out. In addition, many modern compilers allow programmers to embed assembly "inserts" (inline assembly) in the high-level source code. Some languages (such as Forth) are "multi-level" and allow both low-level and high-level coding to be done in the same syntax, and sometimes in the same program.

High Level Languages

High level languages abstract assembly code into something easily human-readable and automate the more mind-numbing parts. For example, you could write "x = 2" instead of "MOV x, 2", or "for (int x : array)" instead of manually looping through instructions. However, this can sacrifice some performance and (arguably) flexibility, because the source must be parsed and translated into machine code and the programmer no longer controls exactly which instructions come out. More abstracted high level languages go further, replacing, say, "x = 2" with "x is 2". Complex languages are created with earlier ones, "standing on the shoulders of giants"; a compiler is essentially a program that parses source code in one language and translates it into another, usually lower-level, language such as assembly or machine code. Along with readability, high level languages also allow for "portability": the same source can be used on any platform for which a compiler or interpreter exists.

Some high level languages are dubbed mid level languages, meaning the language is closer to assembly. A mid level language can directly manipulate the computer's memory and input/output devices. Higher level languages are instead sealed off from the hardware and must interface with a program (usually the part of an operating system called the kernel) written in a mid level or low level language, which then interacts with the hardware and memory. Consequently, higher level languages like Java cannot be used to write operating systems or hardware drivers unassisted, and they often perform slower than mid level languages like the near-ubiquitous C and C++ (which the standard Java virtual machine is itself written in).

Source files can be executed in three different ways:
  • Ahead Of Time Compiling (AOT): This turns the source code into an executable that can be loaded into memory and run without further processing. This has the fastest execution time, but can limit your software distribution if you need to have it run on other architectures.
  • Just-in-Time Compiling (JIT): Some of the source is compiled into executable code for the computer architecture when it's needed (hence, just-in-time). While it can be as fast as AOT and can run on more platforms (assuming the JIT system is available for it), it's usually more resource intensive.
  • Interpreting: This will turn the source code into executable code essentially line by line. While it's quite slow and resource hungry, it can be run as is.
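
The difference between "translate first, run later" and "translate every time you run" can be loosely sketched in Python with the built-in `compile()` and `eval()`. This is only an analogy: the result of `compile()` is a Python code object that the interpreter still executes, not native machine code. The variable and function names are invented for the example.

```python
source = "price * quantity"

# Translate once, ahead of use: compile() turns the text into a code object.
compiled = compile(source, "<example>", "eval")


def total_precompiled(price, quantity):
    # Only executes the already-translated code; no parsing happens here.
    return eval(compiled, {}, {"price": price, "quantity": quantity})


def total_from_text(price, quantity):
    # Parses and translates the source text on every call, then executes it.
    return eval(source, {}, {"price": price, "quantity": quantity})


print(total_precompiled(3, 4))  # prints 12
print(total_from_text(3, 4))    # prints 12
```

Both give the same answer; the second simply repeats the translation work on every call, which is the cost the interpreting bullet describes.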

Programming can be thought of as making a recipe for a dish. For instance, making a cake:
  1. Preheat the oven to 400F
  2. Put flour, eggs, sugar and milk into a bowl.
  3. Mix the ingredients for a batter.
  4. Put the batter into a pan.
  5. Bake for 30 minutes.
  6. Take the pan out and poke it with a toothpick. Does it come out clean?
    • If not, put the pan back in the oven for five minutes and test again.
    • If so, take the pan out and leave to cool for 10 minutes.
  7. You now have delicious cake to serve!

A program could look like this:
  1. SET oven temperature TO 400F
  2. SET bowl TO flour + eggs + sugar + milk
  3. CALL mix ingredients WITH bowl
  4. CALL pour WITH bowl, pan
  5. CALL bake WITH pan, 30 minutes
  6. CALL toothpick test
  7. WHILE toothpick IS NOT clean
    1. CALL bake WITH pan, 5 minutes
    2. CALL toothpick test
  8. END WHILE
  9. CALL cool WITH pan, 10 minutes
  10. DISPLAY pan
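
For comparison, the same recipe could be sketched in an actual language such as Python. Everything here is made up for illustration: the dictionary standing in for the pan, the helper names, and especially the rule that the toothpick comes out clean after 40 minutes of total baking.

```python
def bake(pan, minutes):
    """Leave the pan in the oven for the given number of minutes."""
    pan["minutes_baked"] += minutes


def toothpick_is_clean(pan):
    # Hypothetical rule: this cake is done after 40 minutes in total.
    return pan["minutes_baked"] >= 40


def make_cake():
    oven_temperature_f = 400                     # SET oven temperature
    bowl = ["flour", "eggs", "sugar", "milk"]    # SET bowl
    batter = "batter of " + ", ".join(bowl)      # CALL mix ingredients
    pan = {                                      # CALL pour
        "contents": batter,
        "oven_f": oven_temperature_f,
        "minutes_baked": 0,
        "minutes_cooled": 0,
    }
    bake(pan, 30)                                # CALL bake
    while not toothpick_is_clean(pan):           # WHILE toothpick IS NOT clean
        bake(pan, 5)
    pan["minutes_cooled"] = 10                   # CALL cool
    return pan


cake = make_cake()
print(cake["minutes_baked"])  # prints 40
```

With the made-up doneness rule, the loop runs twice: 30 minutes, then two extra rounds of 5.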

A quirk with different programming languages is that, like natural languages, different "words" have different meanings, or no meaning at all. If you wanted to display something on your monitor, you may have to type out "Print", "Display", or even the exotic "cout" (C++). There are also different paradigms for how to structure code. For example, procedural programming involves breaking up tasks into subroutines to make things legible. Another one, object-oriented programming, groups variables and tasks into "objects". With so many different ways to write a program or routine, a programming language can be thought of much like any natural language you may learn. Thus, it's important to practice it if you want to get good at it.
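
As a rough sketch of the two paradigms just mentioned, here is the same tiny bank-balance task written both ways in Python. The names `deposit`, `withdraw` and `Account` are invented for the example.

```python
# Procedural style: data is passed to free-standing subroutines.
def deposit(balance, amount):
    return balance + amount


def withdraw(balance, amount):
    if amount > balance:
        raise ValueError("insufficient funds")
    return balance - amount


balance = 100
balance = deposit(balance, 50)
balance = withdraw(balance, 30)
print(balance)  # prints 120


# Object-oriented style: the data and the tasks that act on it live together.
class Account:
    def __init__(self, balance=0):
        self.balance = balance

    def deposit(self, amount):
        self.balance += amount

    def withdraw(self, amount):
        if amount > self.balance:
            raise ValueError("insufficient funds")
        self.balance -= amount


account = Account(100)
account.deposit(50)
account.withdraw(30)
print(account.balance)  # prints 120
```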


Historic Languages

  • BASIC (Beginner's All-purpose Symbolic Instruction Code): A family of languages designed in 1964 to be easy to learn and use. It has undergone many permutations, and its descendants bear no obvious resemblance to it or to each other. Although today it's mostly used in programmable calculators and hobbyist software, historically it was very common in home and school microcomputers, making its use into a trope for creators who grew up in the 70s and 80s. Many of the programming jokes in Futurama, for instance, are in BASIC.
  • FORTRAN: Generally regarded as the first widely used high-level language, though some call its early incarnations little more than a symbolic assembler, as a lot of features that modern programmers now take for granted simply weren't invented yet. Development began at IBM under John Backus in 1954 for scientific calculations, with the first compiler delivered in 1957, and it is still used to this day for the very same goal. Recent versions are actually closer to C than to the original language.
  • Lisp: Originally LISP, as in LISt Processor. Another early language (the second-oldest, in fact), this time at a much higher level than the industry was ready for. Created by John McCarthy in 1958 as a research tool in the abstract algebra field, it later found its use in AI development. Another Long Runner, which, although not as popular per se, influenced basically all modern programming languages, especially scripting ones like Python. It is known for several rather hard-to-bend-the-brain-around concepts like first-class functions and closures, as well as for its idiosyncratic (or, as many say, non-existent) syntax that consists almost entirely of parentheses. It has evolved greatly with time. Popular dialects are Common Lisp and Scheme.
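
Since Python inherited them, the two Lisp concepts named above can be sketched in Python rather than in Lisp itself. The function names are made up for the example. A first-class function is one that can be passed around like any other value; a closure is a function that remembers variables from the place where it was created.

```python
def make_counter():
    """Return a function that remembers its own private count (a closure)."""
    count = 0

    def increment():
        nonlocal count   # refers to the `count` of the enclosing call
        count += 1
        return count

    return increment     # the function itself is returned as a value


def apply_twice(func, value):
    """Take a function as an argument and apply it two times."""
    return func(func(value))


counter = make_counter()
counter()
print(counter())                         # prints 2
print(apply_twice(lambda n: n * 3, 2))   # prints 18
```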

Popular Languages

  • C: One of the most widely used languages in the world, with a compiler available for nearly every modern hardware platform known to man. Works "down to the metal", meaning very close to hardware and thus very fast. Originally used to write the UNIX OS, it's since been used in Windows NT (which includes XP to 8), Linux, and Mac OS, the "big three". Allows for a lot of Mind Screwy tricks, but with great power comes great responsibility; you can easily shoot yourself in the foot, which is why it's often jokingly referred to as a "high-level assembler".
  • C++: An extension of C which adds object oriented programming and differing syntax for some operations. C++ is (mostly) compatible with C functions. It is used in many of the same places as C, mostly in video games and operating system components. It has grown greatly in complexity since its inception, which has brought it a fair share of criticism.
  • C#: Jokingly referred to as Microsoft's version of Java. C# is mainly used in the development of Windows applications, Zune and Windows Phone 7 apps, and Xbox Live games. Like Java, it runs on a virtual machine with a standard set of libraries, specified as the Common Language Infrastructure; on Windows this is the .NET Framework, elsewhere it's Mono. It is just-in-time (JIT) compiled, as opposed to interpreted or precompiled, allowing for speed of execution generally comparable to compiled code with some portability from platform to platform.
  • Java: Developed by Sun in the 90s as an interpreted language, but later extended to be JIT-compiled. It mostly started in web applications, but soon expanded to many platforms that could run the virtual machine. It's still widely used in web applications but also found itself as the platform for Android OS applications.
  • Perl: aka Practical Extraction and Report Language, aka Pathologically Eclectic Rubbish Lister thanks to how unreadable code written in it can be. It was originally a popular language for Common Gateway Interface (CGI) scripts, which connected web pages to other services. Perl is the glue language of choice for UNIX and Linux systems.
  • Objective-C: Apple's (originally NeXT's) cross between C and Smalltalk. Originated in NeXTstep and was briefly offered as a programming tool for Windows in the early Nineties, but didn't really take off, mainly due to performance problems: PCs of the time were a lot weaker than now, and the translators weren't up to the task. It was, however, revitalized by the introduction of Mac OS X and iOS.
  • Python: Another interpreted language, used notably on UNIX and UNIX-like systems, which aims to be readable. Version 3 made several changes to the language that are often incompatible with older code, so the Python Software Foundation, the organization responsible for the official interpreter, kept maintaining version 2.7 alongside it until 2020 for running older code.
  • Ruby: A fairly young interpreted language, but already popular among web developers thanks to its efficiency when used for rapid prototyping. It's commonly used with the Rails framework to build dynamic webpages.

An ordered list of the fifty most popular programming languages (updated monthly) may be found here. This measures popularity based on search engine results, so it may not line up with other definitions (e.g. there may be bias towards languages for which people currently need resources, rather than those being used for production code).

Esoteric Languages

Languages made mostly for fun. Some of them are for testing the limits of a programmer.