Sunday, June 2, 2019

Clojure for non Clojure programmers - Chapter 2, Our first program

In this series, I try to teach how to program in Clojure. My target audience is people who already know how to program in more traditional imperative, procedural, OOP, and garbage-collected languages such as Python, Java, C#, etc. I assume the reader is proficient in at least one of these. I try to be succinct, to the point, and use more familiar terms and comparisons.

Table of Contents

Our first program

Preface

Clojure programs are actually Java programs. That’s something which will be a bit strange if you are coming from a language like Python, Ruby, or C#, which have their own interpreters, compilers and VMs, and where the whole ecosystem around them is designed specifically for them. What that means is that Clojure is a hosted language (a parasitic one, if you will): it needs a host to thrive. I hope you’ll come to realize that this is actually a great strength of Clojure, but I’m sure that at first it will be a pain point for most of you, one I will try to minimize. Just keep in mind that you want to embrace the host as well as Clojure, and that learning Clojure implies learning to leverage the host just as much.
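
To make "leveraging the host" a bit more concrete, here is a minimal sketch of what calling into Java from Clojure looks like. It only uses classes that ship with every JVM. The names shout and jvm-version are made up for this illustration, they are not part of Clojure or of our program.

```clojure
;; A Clojure string is a plain java.lang.String, so we can call
;; its instance methods directly. Here: String's toUpperCase.
(defn shout
  "Hypothetical helper: upper-cases s using the host's String method."
  [s]
  (.toUpperCase s))

;; Static methods are called with the Class/method syntax.
(defn jvm-version
  "Hypothetical helper: asks the host JVM which Java version it is."
  []
  (System/getProperty "java.version"))

(println (shout "hello from the host"))
(println "Running on Java" (jvm-version))
```

The same Class/method syntax is what our timer uses when it calls Thread/sleep.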

A Timer application

Our first Clojure program will be a very simple Timer application. The timer starts at a certain time, in seconds, and counts down to 0.

(def start-time
  "Starting time in seconds, for our timer."
  10)

(defn start-timer
  "Starting from start-time-seconds, loop until the countdown reaches 0,
   printing the count at every second."
  [start-time-seconds]
  (loop [countdown start-time-seconds]
    (println countdown) ; Print the current countdown on its own line.
    (Thread/sleep 1000) ; Sleep for 1000 milliseconds.
    (if (> countdown 1) ; If we're not yet at the last second.
      ;; Decrement the countdown and keep looping.
      (recur (dec countdown))
      ;; Otherwise, print that we're done on its own line.
      (println "Done!"))))

;; Call start-timer with our start-time to start our timer.
(start-timer start-time)

There we have it, our first program. A full implementation of a timer application which counts down from 10 seconds and notifies us when the count reaches 0.

We did quite a few things here:

  1. We wrote a fully runnable Clojure script (not to be confused with ClojureScript)
  2. We defined a global variable: start-time
  3. We defined a custom function: start-timer
  4. We ran a recursive loop using: loop and recur
  5. We used Java interop to make the JVM’s main thread sleep
  6. We made a top-level call to our start-timer function
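
If you want to poke at those building blocks in isolation, here is a small sketch that uses each of them without any sleeping, so it returns immediately. The names greeting, greet and countdown-values are hypothetical, picked only for this example.

```clojure
;; A global variable, with an optional docstring.
(def greeting
  "Hypothetical global used only in this sketch."
  "Hello")

;; A custom function taking one argument.
(defn greet
  "Returns the greeting followed by who we are greeting."
  [who]
  (str greeting ", " who "!"))

;; loop/recur: count down from n, collecting each value into a vector
;; instead of printing it.
(defn countdown-values
  "Returns the values our timer would print when started at n."
  [n]
  (loop [countdown n
         acc []]
    (if (pos? countdown)
      (recur (dec countdown) (conj acc countdown))
      acc)))

;; Top-level calls, which run as the file is loaded.
(println (greet "world"))       ; prints Hello, world!
(println (countdown-values 3))  ; prints [3 2 1]
```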

Now let’s learn how to run our program, so we can see it in action, and so you can start messing with it!

Running our program

In order for you to run our Timer application, you will need to:

  1. Get a working JVM/JDK installed on your box.
  2. Install the Clojure CLI.
  3. Copy our code into a source file.
  4. Run a command using the Clojure CLI that will launch our app.

Getting a working JVM/JDK installed on your box

As I mentioned in the preface, Clojure is a hosted language, and Clojure programs are actually Java programs. It stands to reason, then, that you need to have Java installed to use Clojure. If you’re unfamiliar with the JVM and the JDK, I recommend reading The Definitive Guide to Clojure on the JVM, though I’d suggest you read it after you are done with this chapter.

First, it is possible you already have a Java JDK, so run the following command in your terminal shell to check if you do:

java -version

What you want to see is that the command exists, and that the word JDK (in any casing, such as openjdk) appears somewhere in the printed result, with a version of 1.8 (that is, Java 8) or above, such as:

$ java -version
openjdk version "1.8.0_181"
OpenJDK Runtime Environment (IcedTea 3.9.0) (build 1.8.0_181-b13 suse-1.1-x86_64)
OpenJDK 64-Bit Server VM (build 25.181-b13, mixed mode)

Finally, you need to make sure that the JAVA_HOME environment variable is set, by running:

$ echo $JAVA_HOME
/usr/lib64/jvm/java

Or if you are on Windows with PowerShell: echo $Env:JAVA_HOME or with the command prompt: echo %JAVA_HOME%.

If you have all that, move to Install the Clojure CLI, and skip what follows.

A small note about the JDK and Oracle…

The JDK is a piece of software which contains many applications. The main one is the Java Virtual Machine (JVM), which is used to run Java, and thus Clojure, applications. It also contains things like the Java compiler, profiling tools, debugging tools, documentation generating tools, etc.

It used to be that some of those tools, and certain features of the JVM, were not open source, but starting with JDK 11, released in 2018, Oracle made it all open source under the GPL with the Classpath Exception (code you link to it doesn’t need to be GPL).

Thus, the source code for the JVM and all JDK tools is fully Open Source and free to use under GPL, and maintained at OpenJDK.

That said, as the code is mostly in C and C++, and building it yourself is a huge pain, you’re going to want pre-built binaries. This is where multiple vendors come into the picture, offering pre-built binaries of the OpenJDK source code for various platforms. Not all vendors offer the binaries for free. This is often a source of confusion to people new to the Java ecosystem.

Specifically, Oracle offers binaries known as the OracleJDK, and as of 2019, they are no longer free to use for commercial purposes, but are still free for personal or development use. Some of the guides I link below, to help you install and set up a JDK on your machine, send you to download the OracleJDK binaries. As they are not free for commercial purposes, it’s important for me to warn you about it. If your goal is just to learn for now, go ahead. That said, there are free alternatives, and I list them here, so as you follow the guides for installing and setting up a JDK, I recommend you swap out the download of the OracleJDK for one of these free (for all usage, even commercial) ones instead:

  1. Oracle’s OpenJDK builds

Oracle also offers free (even for commercial use) binaries of OpenJDK, known as Oracle’s OpenJDK. The caveat with these is that older versions don’t get back-ported security fixes or critical bug fixes. Thus to stay up to date with security and bug fixes, you have to upgrade to the latest version. For example, once JDK 12 was released, Oracle stopped publishing updated binary builds of JDK 11. So if you want new fixes, you have to upgrade to JDK 12. A new version is released every 6 months, thus to be up to date on all fixes, you have to upgrade the JDK version every 6 months. In a commercial setting, this might be annoying, and a little too fast, which is why in general, I’d recommend you first try out one of the alternatives such as #2, #3 or #4.

  2. AdoptOpenJDK

The Java community also offers free (for any use) binary builds of OpenJDK, known as AdoptOpenJDK. If you’re used to CPython, Ruby MRI, Haskell GHC, NodeJS, and other mostly community maintained language runtimes, you will feel right at home here, since it’s basically the same idea. The community volunteers do the work of packaging pre-built binaries and offer them for free. Unless #4 is available to you, or you know better, I’d go for this one.

Also, I’d suggest getting an LTS release, with the HotSpot JVM, unless you know better.

  3. Amazon Corretto

Amazon also offers free (for any use including commercial) binary builds of OpenJDK, known as Amazon Corretto. They only offer LTS releases, and have good documentation. The binaries are pretty high quality, as they are certified Java SE compatible, and are used by Amazon in production. This is a solid alternative to #2, especially for production deployments.

  4. OpenJDK from your Linux distro

Many Linux distros offer their own binary builds as well. Search your package manager for openjdk and if you find a version of 8 or above, installing that might be the easiest. To get started, if your distro offers this, it’s the way to go, quick and painless, but might not be updated as regularly depending on your distro.

Whichever you choose, remember that they are all built from the same OpenJDK source code. These are not alternative implementations. The difference is which commits they are built from, whether they cherry-picked some extra security or bug fixes from later releases, how much testing was done on them afterwards, which platforms they are built for, etc. Oftentimes, the biggest difference is how long a vendor keeps offering patched builds for older versions. Otherwise, they are all going to give you the same set of features. Your Clojure code will be identical and will run on all of them.

Installing and setting up Java and the JDK

With that out of the way, here are the guides to help you install a JDK and/or set JAVA_HOME. I recommend you go for either JDK version 8 or 11, as they both offer long term support (or any newer LTS release if you are reading this in the far future). Also, the current Clojure version (1.10 as of writing) requires at least version 8 of the JVM. I will be using version 8 personally as I work through the series.

Installing a JDK
Setting up JAVA_HOME

Install the Clojure CLI

One thing that might be surprising is that Clojure is actually just a Java library. For those of you that know Java, that means Clojure is just a jar. It is published to Maven Central like any other Java library.

This is a bit inconvenient, because it means you always have to launch java with Clojure as a dependency to do anything with Clojure. To address that problem, there is an official Clojure CLI which wraps Java and Clojure together and makes it way easier to interact with Clojure without needing to worry too much about the details of Java. That’s what we’ll be using throughout this series. It is known as the Clojure CLI or tools.deps.
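
Once the CLI is installed, a quick way to convince yourself that Clojure really is just a library sitting on top of Java is to ask a few values for their class. This is only a sketch you could save in any script file; it assumes nothing beyond Clojure itself.

```clojure
;; The version of the Clojure library that got loaded.
(println "Clojure version:" (clojure-version))

;; Strings and whole numbers are plain classes from the JDK.
(println (class "timer"))  ; java.lang.String
(println (class 10))       ; java.lang.Long

;; Clojure's own data structures are Java classes too,
;; they simply ship inside the Clojure jar.
(println (class [1 2 3]))  ; clojure.lang.PersistentVector
```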

To install the Clojure CLI, follow the instructions in the Clojure installer and CLI tools sections of the official getting started guide.

Create a source file and run our program

Clojure code is contained within files that have the .clj or .cljc extension. For more complex programs, the source files have to be organized in a particular folder structure and with a certain naming convention, very similar to Java’s. That said, for our first program, we only wrote a script. A script is a Clojure program which is fully contained within a single file. So for this chapter, we will focus on Clojure scripts only.

  1. Create a file anywhere you want and call it timer.clj.
  2. Copy/paste or hand type into the file our program code.
  3. From your terminal shell, within the same directory as where your source file is, run clj timer.clj.

clj is the Clojure CLI. If you give it a relative or absolute path to a Clojure script file, it will run it for you.
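
Anything you type after the script path is handed to the script as well. As a sketch, assuming you ran something like clj timer.clj 5, the extra arguments show up as a sequence of strings in the *command-line-args* var from clojure.core. The parse-start-time helper below is hypothetical, it is not part of our timer, but it shows one way the start time could be read from the command line.

```clojure
(defn parse-start-time
  "Hypothetical helper: returns the first argument parsed as a whole number
   of seconds, or default-seconds when there is no argument or it is not
   a number."
  [args default-seconds]
  (if-let [arg (first args)]
    (try
      (Long/parseLong arg)
      (catch NumberFormatException _
        default-seconds))
    default-seconds))

;; *command-line-args* is nil when the script is run without extra arguments,
;; in which case we fall back to 10 seconds.
(println "Timer would start at" (parse-start-time *command-line-args* 10))
```

Falling back to a default keeps clj timer.clj working exactly as before when no argument is given.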

Now admire:

$ clj timer.clj 
10
9
8
7
6
5
4
3
2
1
Done!

Understanding the execution model

Clojure is a compiled language. This is unlike most other dynamic languages such as Python, JavaScript, Ruby, etc., which are usually thought of as interpreted languages. As such, Clojure has a compiler which takes Clojure source code and compiles it to Java byte code.

For those who don’t know Java, it runs on a virtual machine called the JVM, and that machine takes its instructions from an assembly-like language known as Java byte code. When it runs, the Java virtual machine (JVM) will take the Java byte code and, for the parts that run often, compile it to machine code for the currently running hardware (this is known as just in time compilation - JIT). Java source code is thus compiled to Java byte code using the Java compiler. The resulting Java byte code is then shipped to end users along with a JVM for their platform. Using the JVM, they can then run the compiled Java byte code on their respective machine.

It’s similar for Clojure, except it takes the place of Java source code. You write Clojure source code, and use the Clojure compiler to compile it to Java byte code. You can then ship the compiled byte code along with a JVM to an end user, and they can run your compiled Clojure program. This is effectively what you’ve done in this chapter. You installed a JVM, which is bundled as part of the JDK you installed. After which, you’ve used the Clojure CLI to launch a JVM process, with Clojure loaded as a dependency, and had it execute your Clojure script. So why didn’t you have to first compile your script you ask?

You see, another unfamiliar thing about Clojure is that the compiler is a part of the Clojure runtime. What this means is that you don’t have to explicitly compile your source code, Clojure will perform the compilation itself as your program launches. When your script is run, Clojure reads your source code one top-level form at a time, compiles each form into Java byte code, loads that byte code into its own process, which is already running inside a JVM instance, and executes it. This is how, even though it is a compiled language, it maintains the feel of an interpreted scripting language.

Let me walk through this once more. The Clojure CLI first starts a JVM process using the java command. It sets it up so that the Clojure runtime (which is just a Java library bundled as a .jar file) is on the classpath (Java parlance to mean it is a loaded dependency). It also sets it up so that clojure.main is run on load. That’s the main Clojure runtime entry point. Once the JVM process is loaded, it thus calls the Clojure runtime’s main method. This receives your script as an argument, at which point, it will compile it into Java byte code, load the byte code into itself, and execute your program.
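
You can see the compiler being available at run time from within a script. The following is a rough sketch using read-string, eval and load-string, all of which are in clojure.core; the names form and double-it are made up for the example.

```clojure
;; The reader turns text into Clojure data. Nothing is compiled yet.
(def form (read-string "(+ 1 2)"))

;; eval hands that data to the compiler, which is available while the
;; program is running, and then executes the result.
(println (eval form)) ; prints 3

;; load-string reads, compiles and runs every form in a string of source
;; code, and returns the value of the last one. This is close to what
;; happens to the content of a script file.
(println (load-string "(defn square [x] (* x x)) (square 4)")) ; prints 16

;; Every compiled function ends up as its own JVM class.
(defn double-it
  "Hypothetical function, only here so we can look at its class."
  [x]
  (* 2 x))

(println (.getName (class double-it)))
```

The last line prints the name of the Java class the compiler generated for double-it while the script was loading.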

Sounds more complicated than it is. Bottom line, if you use the Clojure CLI, Clojure feels like an interpreted language, yet it gives you the performance of a JIT compiled one. This is a great strength of Clojure: it is one of the more performant dynamic programming languages around today, and typically faster than the standard Python and Ruby implementations. In fact, Clojure is often close to Java’s performance, and with some more advanced knowledge, can be made to match it. Now it comes at the expense of start time, since Clojure has to compile the code and dynamically load all of the byte code when the program first starts. Don’t despair though, there are solutions to that as well, such as relying on one of its dialects: ClojureScript, Joker; or doing native image compilation using GraalVM.

Thus, when people refer to Clojure compilation, or Clojure being a compiled language, it doesn’t mean compiled to native machine code, but to JVM machine code, aka, Java byte code, aka, JVM byte code, or just byte code for short.

Postface

Awesome work! You now have Clojure installed on your machine and ready to be used. You’ve already run your first Clojure program, a timer application. You have a better understanding of the JVM, JDK and Java byte code. You have a rough idea of how Clojure source code goes from code to compiled byte code to machine code to running program. Most importantly, you can start messing with Clojure on your own, by writing Clojure scripts, and running them with the Clojure CLI (clj command) to try them out.

In the next chapter, we’re going to revisit our program, because now that we know how to run it, it is time to understand how it works.

Feel free to ask questions in the comments, and be patient, I’m working on the next chapters!