Ashton Kemerling


My Increasing Frustration With Clojure

Edit: TL;DR: This is about how bugs in Clojure are handled by the Clojure Team, not just complaints about specific bugs I’ve seen.

First off, this is not an “I’m quitting in disgust” post. Those are childish and a waste of everyone’s time. But this is a post of frustration as I watch something I really like being slowly allowed to get worse.

Some history first. My first job out of college was in Common Lisp, and I love/hated it. The power it brought and the pain it brought were one and the same. No modern libraries, no modern build tools (this was before QuickLisp). On one hand, I loved working with paredit and Emacs, being able to quickly fly about my code and manipulate it in blocks rather than line by line. On the other, I couldn’t help but be envious of those who could actually ask for help from a functioning open source community.

A few years of Python, Ruby, and Javascript later, I found Clojure. And I thought I’d found the solution to literally all of the things. Paredit works again? Check. A thriving open source community? You got it. Deploy as a Jar rather than CL’s hilarious “dump the state of a running program and call it good” setup? Fuck. Yes.

And beyond the superficial things, there was a lot to love, especially coming from a more recent brush with Ruby on Rails. Clojure makes it very easy to make things referentially transparent, and it tends to favor explicit calling semantics over convention (or more derisively, “magic”). This means that a Clojure code base will require more plumbing code, but that also means it’s possible to navigate to the code that does routing and understand how it works, no more having to search through your framework’s codebase just because they do dynamic method creation and method_missing magic.

As far as I was concerned, the editor was the only weak point for Clojure. Back when I got into Clojure, Cursive was still brand new and Emacs really was the only editor worth using, and even it had some stability and usability issues. But I assumed that continued interest would stabilize Emacs, bring Vim up to speed, and improve Cursive to the point where it would be competitive with Emacs/Vim.

But all was not well, and if I’d paid attention I might have noticed a few places where the core Clojure team’s priorities didn’t seem to make much sense to me. And now that I work in Clojure professionally, I really cannot ignore them or remain silent about them any more.

The core Clojure team prefers green field development over improvements and bug fixes to existing code to a degree that deeply worries me. I no longer trust that any issues I find stand a chance of getting fixed, as all the bugs we’ve posted are either in limbo, or flat out rejected. Multiple members of my team have given up on posting new bugs because they have no faith that it’ll help anyone.

These are pretty heavy and vague accusations, so I’m gonna break this down a bit to make it clearer and easier to digest.

Ignorance or Apathy of Underlying Principles

Programming isn’t math per se, especially in a language that’s not explicitly based on Category or Type theory. That said, a lot of the things that we do are backed or defined by mathematics, and to ignore that is to guarantee bugs. This is most clear in clojure.set, which contains functions that are supposed to mirror the definitions created by Set Theory like union, difference, intersection, etc.

And the namespace is completely riddled with bugs. union can return a collection with duplicates if some of the inputs are lists or vectors instead of sets, depending on their relative sizes. intersection will either return nonsense values or throw a ClassCastException if you provide it anything other than sets, again dependent on the data.
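
To make these failure modes concrete, here is a small REPL-style sketch. The inputs are made up for illustration, and the results in the comments assume the clojure.set implementation as it has shipped in Clojure 1.x releases; it is worth re-running against whatever version you use.

```clojure
(require '[clojure.set :as set])

;; With real sets, union behaves as set theory says it should.
(set/union #{1 2} #{2 3})          ; => #{1 2 3}

;; A vector longer than the set: the set is conj'd onto the vector,
;; so the "union" is a vector containing a duplicate.
(set/union #{1} [1 2])             ; => [1 2 1]

;; A vector shorter than the set: the vector is conj'd onto the set,
;; so the same call shape happens to return a correct set.
(set/union #{1 2 3} [3 4])         ; => #{1 2 3 4}

;; intersection calls contains? on the vector, which checks indices,
;; not values, so 5 is dropped even though it is in both inputs.
(set/intersection #{0 5} [0 5 9])  ; => #{0}

;; Here the vector ends up as the accumulator and disj is called on it.
(try
  (set/intersection #{1 2 3} [1 4])
  (catch ClassCastException e
    (println "ClassCastException:" (.getMessage e))))
```

Whether a call returns a correct set, a wrong answer, or an exception depends only on the sizes and contents of the arguments, which is what makes these bugs so easy to miss in testing.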

On their own, these are no big deal. Bugs happen, and there’s really no point in berating people just because they made a mistake. Instead the bug gets fixed as best and as soon as reality allows and we all move on. In fact, for the above bugs there are two possible fixes: raise an IllegalArgumentException if anything other than a set is provided, or coerce lists and vectors to sets before continuing. Both of these approaches are valid because this is a dynamic language that defaults to immutable collection semantics; which one you pick is then a matter of how you want to affect your downstream users.
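
Either fix is only a few lines. The following is a minimal sketch of both options written as wrappers around clojure.set/union; the names strict-union and coercing-union are hypothetical and are not part of Clojure or of any patch attached to the tickets.

```clojure
(require '[clojure.set :as set])

(defn strict-union
  "Hypothetical wrapper: refuses to run unless every argument is a set."
  [& colls]
  (when-not (every? set? colls)
    (throw (IllegalArgumentException.
            "union expects every argument to be a set")))
  (apply set/union colls))

(defn coercing-union
  "Hypothetical wrapper: turns every argument into a set before the union."
  [& colls]
  (apply set/union (map set colls)))

(coercing-union #{1} [1 2])        ; => #{1 2}

(try
  (strict-union #{1} [1 2])
  (catch IllegalArgumentException e
    (println "Rejected:" (.getMessage e))))
```

The strict version surfaces mistakes at the call site, while the coercing version quietly accepts more inputs at the cost of an extra pass over each non-set argument.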

Oh wait, some of these bugs were filed in 2009, 7 fucking years ago! Here comes the berating. These functions are tiny; a simple implementation of union is one line. And while they’re heavily used, they’re simple in usage and signature; no need to change a lot of call sites to fix this bug. There are only two possible explanations for why these bugs have not been fixed: the Clojure team either does not understand that this is an issue, or does not care.

Actually, their comments on the issues let us know that they do not understand that this is an issue. Rich Hickey said in 2009 “the fact that these functions happen to work when the second argument is not a set is an implementation artifact and not a promise of the interface”. How getting the wrong type with nonsense values counts as “working” is beyond me. Is it just because it doesn’t throw an Exception? Anyone here prefer bad data instead of exceptions when dealing with functions like this? I doubt it.

Inconsistency Between Best Practices and Clojure Implementation

Clojure includes a pretty powerful concept called protocols. Basically a protocol is an interface that can be added to classes after the fact, and lets you dispatch to different behavior silently at run time.

This is pretty neat: it lets you abstract over multiple data types and include Java classes in the fun. For example, the seq abstraction (ISeq, which is a Java interface on the JVM and a protocol in ClojureScript) provides all the methods needed to iterate over a collection, and it works with all the Clojure and Java data types. So you can use Clojure’s map function over its own data types and Java’s because it depends on the seq interface.

As you can imagine, this is the recommended way to work with things. Rather than having to do cond or if logic on various classes, define and use an appropriate protocol and you’re good to go!
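
As a minimal sketch of that recommendation, the protocol below is extended to a Java class, a Java interface, and nil after the fact. Describable and describe are hypothetical names invented for this example.

```clojure
(defprotocol Describable
  (describe [x] "Return a short human-readable description of x."))

;; Extend the protocol to existing types without touching their source.
(extend-protocol Describable
  String
  (describe [s] (str "string of length " (count s)))
  java.util.List
  (describe [l] (str "list-like with " (count l) " items"))
  nil
  (describe [_] "nothing"))

(describe "abc")                          ; => "string of length 3"
(describe [1 2 3])                        ; => "list-like with 3 items"
(describe (java.util.ArrayList. [1 2]))   ; => "list-like with 2 items"
(describe nil)                            ; => "nothing"

;; Ask whether a value participates, rather than checking its class.
(satisfies? Describable "abc")            ; => true
(satisfies? Describable 42)               ; => false
```

Clojure vectors and java.util.ArrayList both implement java.util.List, so one extension covers both without any cond or instance? logic at the call site.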

It sounds like a good theory, doesn’t it? But Clojure itself doesn’t actually do this. clojure.core contains 89 calls to instance? in order to check runtime type, instead of using the helper functions that check for protocol implementation.

Here is a bug found by my colleague that highlights this issue. Lists and vectors are both sequential collections, but into for a map only accepts vectors as entries; a list causes a ClassCastException. This is kind of nuts because an IllegalArgumentException makes more sense, and there’s no practical reason to differentiate between a list of two elements and a vector of two elements. Actually, Clojure considers [1 2] and (list 1 2) to be equal, so this really makes no sense.
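
The behavior is easy to reproduce at a REPL. The sketch below uses made-up values, and the final line is just one possible workaround, not something taken from the ticket.

```clojure
;; A two-element list and a two-element vector compare as equal...
(= [1 2] (list 1 2))                          ; => true

;; ...but only the vector is accepted as a map entry.
(into {} [[1 2]])                             ; => {1 2}

(try
  (into {} [(list 1 2)])
  (catch ClassCastException e
    (println "ClassCastException:" (.getMessage e))))

;; One possible workaround: convert each pair to a vector first.
(into {} (map vec [(list 1 2) (list 3 4)]))   ; => {1 2, 3 4}
```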

Even more obnoxious, it was closed as wontfix. Apparently a single sentence in the docs is good enough for the Clojure team, as well as a paper-thin argument about performance on a 2 element list. So not only is this just broken in a barely documented and very surprising way, Clojure itself ends up programmed in a way that isn’t recommended by the Clojure docs.

This has spread to other projects. Om has a bug where lists aren’t acceptable in its data structures, only maps, sets, and vectors. To say that I was treated pretty shabbily by David Nolen on this issue almost goes without saying. Naturally the intro docs barely call this out, and the docs dedicated to the troubled component do not mention this at all. To be fair, the troubleshooting guide explains this, but in my opinion that’s probably a clue that the bug is common enough that you should find a fix for it.

Show Stopping Bugs Remain Untouched

There are a shocking number of big, bad bugs hiding in the Clojure Jira, some really old.

I could go on, but I feel that I’ve made my point. Bugs, even major ones, are either closed as “wontfix”, or are ignored for years despite the pain felt by users. That’s not even covering the dismissive and distrustful attitude given in some of the replies.

Strange Priorities

The Clojure team appears to be super focused on new features, to the exclusion of existing namespaces. The big highlights from the past year or so have been Transit, Transducers, and Spec.

These are okay, I guess. We use transit a bit, and it’s kinda cool. But we really don’t use 90% of its features, it’s basically JSON for us that can convert numbers to BigDecimals.

We have yet to find a place that Transducers would help us. They’re neat enough, but the built in lazy sequences are working A-OK for us, so we don’t really feel the need to change over.

I’m not holding my breath for Spec. It doesn’t fix anything for me that other libraries aren’t already providing.

Know what hasn’t seen any major improvements in forever? clojure.test. clojure.test is frankly sad. Fixtures are done via some global state, and you can’t even set up fixtures to work across the entire test suite. Need a database to run functional tests? Well, either you need to override the main test runner (good luck running individual tests now!) or you have to set up each namespace to open and close its own database connection (don’t forget, or your DBA will wonder why Emacs has 1000+ database connections). I’m 100% behind the idea that I’ll have to write a bit of glue code, but without anywhere to put that code I’m kind of screwed.
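
For readers who have not hit this, here is a minimal sketch of what per-namespace fixture setup looks like. The namespace example.db-test, the *conn* var, and the with-db fixture are hypothetical, and the "connection" is a plain map standing in for a real database handle.

```clojure
(ns example.db-test
  (:require [clojure.test :refer [deftest is use-fixtures run-tests]]))

;; Stand-in for a real connection; a real suite would hold a database handle.
(def ^:dynamic *conn* nil)

(defn with-db
  "Hypothetical fixture: opens a connection, runs the tests, then cleans up."
  [run-tests-fn]
  (binding [*conn* {:fake-connection true}]
    (try
      (run-tests-fn)
      (finally
        ;; a real fixture would close the connection here
        nil))))

;; :once wraps all tests in THIS namespace only; every other namespace
;; that needs the database has to repeat this registration.
(use-fixtures :once with-db)

(deftest connection-is-available
  (is (map? *conn*)))

(run-tests 'example.db-test)
```

use-fixtures accepts :once or :each, and both apply only to the namespace that calls it, which is why the connection handling ends up copied into every test namespace.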

And then there is the is function. It’s literally the only assertion provided by clojure.test. It’s this fancy little macro that grabs its body, evaluates it, then uses the body to produce a human readable message about the failure.

And it’s garbage. The fact that plugins exist to make this easier on the eyes should tell you everything you need to know. Oh but don’t use that with Emacs/Cider! It’ll crash the Cider plugin, which is trying to parse the default output.

Back when I used Emacs, I had a stash on my box that disabled AOT, pedantic checking, and the humane-test-output plugin from my project.clj in order to use Cider. Without that stash applied Cider wouldn’t start, couldn’t reload code, and would crash when running tests. Now that I use Cursive that’s less of an issue, but it’s still kind of nuts I had to decide between a working editor and readable output when I ran lein test.

Sorry, I didn’t even highlight the craziest bit of that last paragraph, did you catch it? I had to disable humane-test-output from my project.clj. That’s because you install it by injecting some code in project.clj that redefines some multi-methods, because there’s no plugin architecture. How nuts is that?

Now I might hear you say “You don’t have to use clojure.test!”, and you’re right. But clojure.test has clearly won in the Clojure testing namespace. The only real competitors for clojure.test are Speclj and Midje. I’ve literally never met someone in person that’s used Speclj, and Midje is super polarizing because it’s basically a collection of magic macros. The fact that the second entry for Midje is about CircleCI rewriting from Midje to clojure.test should tell you a lot.

So why don’t we have more creature comforts for clojure.test? I’m not really sure. As far as I can tell the change to it was the inclusion of test.check, but that really was nothing more than simple-check getting renamed and transferred to Clojure ownership.

Okay, Now What?

As I stated before, this isn’t an “I’m quitting Clojure!” post. Partly this is because I work in Clojure on a daily basis, and I both like my job and am professional enough to keep working despite my complaints. And partly this is because I do not have a replacement for Clojure in mind for my own personal projects. But off the top of my head, these are the things I’d like to see fixed in Clojure.

  • More love for clojure.test.
  • No tolerance for bugs that result in bad data. Built-in functions should either work, or throw an understandable exception.
  • Friendlier responses in Jira. Someone who has gone to the work to sign up and try to help out should be treated with more respect.
  • Fix underlying compiler bugs before adding features. The other way only codifies bad behavior and guarantees that it cannot be fixed.
  • Understand that if enough people have the same issue, it’s the code’s fault and a FAQ entry is not good enough.

Basically I want Clojure to be a simple to use language backed by a friendly and active community. What I see now is drifting in the wrong direction, and I’d like to see that corrected.

Integrating Test.Check and Javascript

Introduction

I was recently on The Cognicast with Craig Andera where we discussed using Generative Testing on a large non-Clojure(script) codebase, in particular Ruby on Rails and Backbone.js. If you haven’t listened to the show yet, I highly recommend listening to it first.

As I promised on the show, I’d like to share how we used Test.Check to test our Backbone.js code base. Our overall strategy for testing Javascript is going to be:

  1. Compile JS into one file (just like prod).
  2. Compile tests into a single file.
  3. Combine them in a PhantomJS process.
  4. Let the tests do their thing.

While we have been super pleased with the results of Generative Testing, there have been some hurdles for getting it to work for us. In this post I’m going to go over how to set up Test.Check to work with your Javascript app, and how to dodge all the pitfalls I found.

Here are the challenges that lie between us and Generative Testing bliss.

  1. Picking the right library
  2. Setting up Leiningen & Cljsbuild
  3. Dodging PhantomJS issues
  4. Avoiding mangling your app, and defeating dueling dependencies

Picking the Right Library

First of all, there are two libraries that exist, Test.Check and DoubleCheck. Because Test.Check is an official Clojure library it is Clojure (JVM) only, so I recommend DoubleCheck (maintained by Chas Emerick) which is capable of cross compiling to Clojure and Clojurescript.

The only catch with DoubleCheck is that it’s not currently possible to segregate tests via metadata for running in groups. But with the way we will be running these tests that won’t be an issue.

Setting up Leiningen

First step, install Leiningen and create a project.clj wherever your Javascript code is. We’re going to use Cljsbuild to compile our testing code for execution. I put my test code in test/cljs (because I have clj and cljs based tests), and send the compiled output to tmp/tracker-cljs.js. Note: this guide only works for Clojurescript 0.0-2234; I need to figure out why the latest build of Clojurescript doesn’t work.

I highly recommend you send the output of the compilation process to either a temporary or gitignored location. The output will be fairly large, and it will bog down your repository with its size.

I don’t want to duplicate the Cljsbuild how-to, so if you don’t know how to make it work, you should check their docs. Our project.clj is reproduced at the bottom of this post if you have issues.

At this point, we can write the simplest test possible:

(ns tracker-cljs.simple-test
  (:require [clojure.test.check :as core]
            [clojure.test.check.generators :as gen]
            [clojure.test.check.properties :as prop]
            [cemerick.cljs.test :as t])
  (:require-macros [clojure.test.check.clojure-test :refer (defspec)]
                   [clojure.test.check.properties :refer (for-all)]
                   [cemerick.cljs.test :refer (is)]))

(defspec simple-test 10
  (for-all [v (gen/such-that gen/not-empty (gen/vector gen/int))]
    (println v)
    true))
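
The test above only prints its input and returns true, so it can never fail. As a hedged sketch of a property that actually checks something, the following uses the plain test.check API from a JVM Clojure REPL; it assumes test.check is on the classpath, and sort-is-idempotent is a made-up example property rather than anything from our code base.

```clojure
(require '[clojure.test.check :as tc]
         '[clojure.test.check.generators :as gen]
         '[clojure.test.check.properties :as prop])

;; Property: sorting an already sorted vector changes nothing.
(def sort-is-idempotent
  (prop/for-all [v (gen/vector gen/int)]
    (= (sort v) (sort (sort v)))))

;; Run the property against 100 randomly generated vectors.
;; The returned map's :result key is true when every trial passed.
(tc/quick-check 100 sort-is-idempotent)
```

When a property fails, the returned map also carries the failing input and a shrunk version of it, which is usually the quickest route to a reproducible bug report.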

PhantomJS Issues

In order to run the tests, you would typically have a :test-commands entry in :cljsbuild that looks like this:

:test-commands {"unit-tests" ["phantomjs" :runner
                                          "compiled-application.js"
                                          "tmp/compiled-tests.js"]}

This will load our JS into the app, along with the tests, and then run them. But you might notice errors that look like this:

SECURITY_ERR: DOM Exception 18: An attempt was made to break through the security policy of the user agent.

That means that your app code is trying to access local storage, and PhantomJS does not like it when you do that without loading a webpage. The solution for this is to start a server so we have a page to visit, and visit it via PhantomJS. So on Tracker we use the following two bits of code.

generative_runner.js, to visit the actual page:

// reusable PhantomJS script for running clojurescript.test tests
// see http://github.com/cemerick/clojurescript.test for more info

var page = require('webpage').create();
page.onResourceError = function(resourceError) {
    page.reason = resourceError.errorString;
    page.reason_url = resourceError.url;
};
var fs = require('fs');
var sys = require('system');
var success;

page.onConsoleMessage = function (x) {
    var line = x.toString();
    if (line !== "[NEWLINE]") {
        sys.stdout.writeLine(line.replace(/\[NEWLINE\]/g, "\n"));
    }
};

page.open('http://localhost:4500', function(status) {
    if (status !== "success") {
        console.log("Couldn't load page");
        phantom.exit(1);
    }
    for (var i = 1; i < sys.args.length; i++) {
        if (fs.exists(sys.args[i])) {
            if (!page.injectJs(sys.args[i])) throw new Error("Failed to inject " + sys.args[i]);
        } else {
            page.evaluateJavaScript("(function () { " + sys.args[i] + ";" + " })");
        }
    }

    page.evaluate(function () {
        cemerick.cljs.test.set_print_fn_BANG_(function(x) {
            console.log(x.replace(/\n/g, "[NEWLINE]")); // since console.log *itself* adds a newline
        });
    });

    success = page.evaluate(function () {
        var results = cemerick.cljs.test.run_all_tests();
        console.log(results);
        return cemerick.cljs.test.successful_QMARK_(results);
    });
    phantom.exit(success ? 0 : 1);
});

And a rake task to stand up a server and run everything. If you don’t use Rails for your Javascript code you might need to use different commands to compile, but the intention remains. The WEBrick server provides a blank page for us to visit and run our tests on, which prevents PhantomJS from raising security errors.

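require 'fileutils'
require 'webrick'

namespace :test do
  desc "Run ClojureScript generative tests"
  task :generative do
    # Serve the current directory so PhantomJS has a real page to visit.
    server = WEBrick::HTTPServer.new(:Port => 4500, :DocumentRoot => ".", :BindAddress => "0.0.0.0")
    Thread.new { server.start }
    success = false

    begin
      Rake::Task['assets:clean'].invoke
      Rake::Task['assets:precompile'].invoke
      # Kernel#system returns true, false, or nil (when the command cannot run).
      success = system("lein cljsbuild test")
    ensure
      # Always stop the server, even if compilation or the tests blow up.
      server.shutdown
    end
    exit(success ? 0 : 1)
  end
end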

To make all this work together, update project.clj to reference the generative_runner.js file instead of :runner, and use rake test:generative to kick off the run.

Don’t Mangle Your Code

If your application is anything like Tracker, you might use some Google Closure dependencies without using the entire Closure compiler. And even if you don’t use Closure, you certainly have functions and classes in the global namespace that you don’t want mangled.

To get around this, I recommend the following settings:

Add :libs [ "compiled-application.js" ""] to the cljsbuild section in project.clj. This prevents DoubleCheck compiler errors due to classpath issues, and it allows the Closure compiler to see everything that your application provides. So if your tests and applications have overlapping Closure dependencies you won’t get double provide errors.

Secondly, I recommend that you only use the simple compilation mode. This will prevent Closure from mangling global names, which makes debugging easier and keeps your tests able to find the production code. The space saving and code elimination that advanced mode provides is more of a problem than a benefit for testing, so it’s not worth fighting to get advanced to work.

You can fiddle with source maps if you wish, but I haven’t had much luck or use for them; simple compiled Clojurescript is easy to read, and most of the serious errors have come from the 43k application javascript file, not the test file.

Have Fun and Make More Tests

Once you have that going, it should be possible to open up and create increasingly complicated tests. As a teaser and a good example, the following code caught a tricky JS ordering bug.

(ns tracker-cljs.panel-items-test
  (:require [clojure.test.check :as core]
            [clojure.test.check.generators :as gen]
            [clojure.test.check.properties :as prop]
            [cemerick.cljs.test :as t])
  (:require-macros [clojure.test.check.clojure-test :refer (defspec)]
                   [clojure.test.check.properties :refer (for-all)]
                   [cemerick.cljs.test :refer (is)]))

(defn ids [items]
  (if (seq? items)
    (map (fn [item] (.get item :id))
         items)
    (map (fn [item] (.get item :id))
         (.-models items))))

(defn create-models [ids]
  (map (fn [x] (Backbone.Model. (js-obj :id x)))
       ids))


(defspec sort-check 100
  (for-all [v (gen/such-that gen/not-empty (gen/vector gen/int))]
           (let [models (create-models v)
                 sorted (sort v)
                 subject (tracker.PanelItems.)]
             (.reset subject (apply array models))
             (.refresh subject (apply array (sort-by #(.get % :id) models)))
             (is (= sorted
                    (ids subject))))))

And the following is our project.clj, with unnecessary details elided for readability.

(defproject generative-testing "0.0.1-SNAPSHOT"
  :plugins [[lein-cljsbuild "1.0.3"]
            [com.cemerick/clojurescript.test "0.2.2"]]
  :dependencies [[org.clojure/clojurescript "0.0-2234"]
                 [com.cemerick/clojurescript.test "0.3.1"]
                 [com.cemerick/double-check "0.5.8-SNAPSHOT"]
                 [org.clojure/clojure "1.5.1"]]
  :cljsbuild {:builds [{:source-paths ["test/cljs"]
                        :compiler {:output-to "tmp/tracker-cljs.js"
                                   :libs [ "public/web/assets/application.js" ""]
                                   :optimizations :simple
                                   :pretty-print true}}]
              :test-commands {"unit-tests" ["phantomjs" "lib/generative_runner.js"
                                            "public/next/assets/next/next.js"
                                            "tmp/tracker-cljs.js"]}})

Also, the following function is helpful for converting from Cljs data structures to pure Javascript ones. It causes some compiler warnings, but they appear to be harmless.

(defn clj->js
  "Recursively transforms ClojureScript maps into Javascript objects,
   other ClojureScript colls into JavaScript arrays, and ClojureScript
   keywords into JavaScript strings."
  [x]
  (cond
    (string? x) x
    (keyword? x) (name x)
    (map? x) (apply js-obj (flatten (map (fn [[key val]] [(clj->js key) (clj->js val)]) x)))
    (coll? x) (apply array (map clj->js x))
    :else x))

Good luck, and don’t hesitate to reach out to me on Twitter if you have any questions!

Unusual Productivity Hacks

The internet is lousy with productivity ideas, mostly about how to work harder or longer. I personally believe that good productivity is about maximizing per hour results, not working harder. And the fastest way to improve your productivity is to eliminate some of the things slowing you down. So rather than going over the usual suspects, let’s take a look at eliminating some of the low hanging fruit.

1. Conquer Your Diet.

What you eat is the cornerstone of who you are and what you do. The proteins in your muscle, the fats in your cell walls and your brain, and the amino acids used throughout your body must all come from, or be synthesized from, your food. Low quality food products like trans fats have been connected with apathy and depression, and might be related to ADD. In order for your brain and body to perform at peak levels you need to give them high quality food to repair and refuel.

2. Sleep

Sleep is massively underrated. Everyone knows at this point that Americans are typically not getting enough sleep on a nightly basis, but distressingly few people consider it to be an issue. After all, hours spent asleep aren’t hours spent working, right? Unfortunately it isn’t that simple. Insufficient or low quality sleep is one of the fastest ways that we know of to destroy per hour productivity. Get sleep or waste your precious waking hours in a mind fog.

3. Know when to stop.

Athletes regularly destroy their bodies by “over training”; exercising to the point of injury or chronic fatigue. The results of over training are often slow growth or even frustrating setbacks despite hundreds of hours at the gym or on the track.

Mental workers regularly do the exact same thing to their minds by working well past the point of mental burnout. There is very little use in working when you are not going to be putting out at least average results, and you risk ruining your morale after hours of low-results work. Instead of ritualistically working even when you are spent, gain an intuitive sense for when you will accomplish little, then go recharge and try again later.

4. You must come out unharmed 1 time or 1000 times.

No productivity regime is worthwhile if you can only maintain it for 2 weeks at the start of each year. To be successful, you must find a pattern that you can maintain for years. Whether that is a small amount per day or a cycle between intense work and relaxation, you must find a balance between work and play or you will be plagued by failures to meet your goals.

5. Know thyself.

Everyone wants to be successful, but most people overestimate how much money motivates them. Find ways to motivate yourself with something other than riches, or better yet find something that you love to do. Love of the work will do more than abstract future financial rewards to get you up, or home, every day and working.

The Swordsman and the Software Engineer

One of the largest mistakes you can make as a knowledge worker is to focus 100% of your time on your craft. It’s easy to believe that specializing and focusing will make you better than your peers, but I do not think that is the case. Not only will specializing cause you to plateau earlier than your peers, it will cause you to be less happy and healthy than your diversified peers.

A bit of background at this point would probably be helpful. I’m a software engineer and I’ve coded both professionally and on a hobbyist basis for nearly a decade. After a particularly stressful few months at a previous job, my fiancé forced me to join a local gym to de-stress. The gym focused on various European martial arts and I ended up in a class for the Italian longsword circa 1409.

A few years ago I would have suspected that choosing to surrender 2-8 hours a week to swinging steel around instead of programming on hobbyist projects would slow down my growth. Now I believe that I owe a lot of my growth, professionally and as a human being, to this practice. A lot of what I’m going to cover here will probably be old news to anyone who was heavily involved in sports, and it isn’t particularly unique to fencing. That being said, I’m fairly confident that a lot of knowledge workers have lost the involvement with physical activities they might have had, or never were all that “athletic” even in school.

The biggest modern challenge is that humans are not wired for the type of work that we now do. We are a fairly clever species by nature, which is why we have been making art for hundreds of thousands of years. But we are still more or less genetically and mentally hunter gatherers from 100,000+ years ago who largely worked for their survival. Our bodies, genes, and minds are wired to expect a specific ratio of play and physical activity to signal that all is well. Unfortunately we work a lot longer than our ancestors did, and under very different conditions.

This high level of nothing but work, physical or mental, indicates to our bodies that times are tough and that it should release stress hormones to help us survive the coming hard times; these stress hormones tend to have serious detrimental effects on mental performance and long term health. The phrase “all work and no play” may be over-used, but it does have some truth. All work and no play leaves Jack pumped full of cortisol, short on sleep, and low on testosterone.

Secondly, most knowledge workers spend their entire time thinking only with their frontal cortex, or the analytical portion of the brain. This is very helpful if your job involves concentrating on difficult problems all day, but it is incredibly easy to let that portion of your brain become the only driving factor on your day to day life. The brain, like your muscles, should have ample opportunity to exercise all of its faculties and have recovery time between each heavy usage. Expecting it to be able to focus deeply on your work day in and day out without giving it time to relax is simply asking for lowered performance and burnout, and only working on one aspect of your mental performance is akin to only doing curls at the gym; the result is an odd shape with very little practical strength.

Thankfully exercise and hobbies help other parts of the brain. This is one of the things I love about fencing: it isn’t very analytical once you actually start using it. There are a ton of cuts and guards to be memorized, but you never have time to think about it when you are actually fencing. All of the drills are designed so that your mind and body learn to move with instinctual grace from one guard to another. There is very little conscious thought that happens mid-move in a match; there simply is not enough time to stop and think. Instead you learn to have an internalized notion of time and measure, and an ability to make new decisions as the fight progresses quickly and correctly.

And while none of this directly relates to software engineering, it has a positive effect on my daily work. The ability to move with a fight and think with my toes and fingertips has given me a greater appreciation of the importance of gut instinct in more situations. And the constant practice of excluding my analytical mind in a fast moving match has improved my ability to enter a flow state more easily. Combine these things with the general good effects of mental down time and exercise, and I think it’s hard to argue that my time would have been better spent working on hobbyist projects.

Confessions of a Language Snob

I am a language snob. In particular I fall head over heels for most functional languages, especially MLs and Lisps. Show me the latest and greatest Javascript framework and I will just wish I had immutable data types and a saner method dispatch system. I try to keep quiet about it at work with varying degrees of success, but it’s frustrating to work around one language’s problems when you know of other solutions.

The difficult reality that few admit is that every language has a weak point. Ruby is slow and the lack of import semantics and proper namespaces makes it difficult to determine what code will run. Python lacks a good lambda and whitespace sensitivity brings new difficulties. Javascript is just pure insanity. Lisp is poorly standardized and tends to have subpar documentation. The list goes on and on. No language is perfect, but it’s very easy to focus on the high points of one language while working through the low points of another.

The real bummer about being a language snob is that there’s really nothing to be done about it. For any given issue there will always be another language that excels in that area, but it’s almost always insane and impractical to convert your entire company over to it. So even if you think Go would solve every problem that your Python codebase has, and you know that the downsides wouldn’t be insurmountable, the simple reality is that convincing the organization to throw away their perfectly good Python code on a whim is a hard sell at best. And that’s assuming that converting to your language wouldn’t come with downsides worse than the language you are coming from.

Thankfully, there is an upside to being a language snob. Polyglots have a far more flexible understanding of what a program should do, especially when they’re used to a wide range of paradigms. Someone who has done nothing but C or Java programming might have very little context for why mutable state can be so problematic, but someone who knows Clojure or Haskell will know the tradeoffs of mutable vs. immutable state intimately. Each new paradigm a programmer embraces means that they have more internal views on how a particular problem could be solved, a bit like having an experienced team in your head to discuss the merits of various techniques at lightning speed.

While it may be very frustrating to know of better solutions in other languages, there is a benefit to knowing about them. Being a language snob can help you evaluate your choices more effectively and discover solutions and strategies that might not be immediately obvious if you only knew one language. And between being a bit of a snob and attempting to shoehorn every problem into a one-size-fits-all paradigm, I’ll take being a snob.

The Primacy of the Build Tool

No programming language stands alone. Besides the compiler, every programming language includes an ecosystem of libraries, build tools, analyzers, debuggers, and other utilities. Languages often rise and fall depending on the quality of these tools and libraries.

For every language there needs to be one central item upon which every other tool depends. In most languages, this is the compiler or interpreter. Your Rails project is entirely dependent on the version of Ruby provided by the current environment, and similarly Maven depends on the version of javac and java available on the path.

Unfortunately, this makes our code more fragile and dependent on the machine it was first created on. Someone cloning your code from a different machine must take care to ensure that their development environment is close to the original author’s, and deployment must ship the correct compilers and interpreters for production to work well. We have created tools to help enforce the requirements of the code, but they are fragile and make upgrading dependencies a pain, as anyone who has had to fight with RVM can attest.

The one exception to this I have found is Clojure. Clojure inverts the normal order, making the build tool the central item, with the compiler provided as a dependency in the project definition file.

(defproject foo "0.1.0-SNAPSHOT"
  :description "FIXME: write description"
  :url "http://example.com/FIXME"
  :license {:name "All Rights Reserved."}
  :dependencies [[org.clojure/clojure "1.5.1"]
                 [org.clojure/clojure-contrib "1.2.0"]
                 [clj-time "0.6.0"]]
  :profiles {:dev {:dependencies [[midje "1.5.1"]]}})

The beauty of this change is that it makes setup trivial for another developer. All they need is the same build tool, and it will deal with the correct versions of both the compiler and any libraries for the project. Have other projects that depend on different versions of the compiler? The build tool only cares about the dependencies in front of it, and will call the correct version from the correct project.

This also makes upgrading trivial. Want to try Clojure 1.6.0? Change “1.5.1” to “1.6.0” in the above snippet. Want to write a library that supports multiple versions of the compiler? The build tool supports profiles which allow you to swap out compilers trivially because it’s just a dependency.

:profiles {:1.3 {:dependencies [[org.clojure/clojure "1.3.0"]]}
           :1.4 {:dependencies [[org.clojure/clojure "1.4.0-beta1"]]}}

Deployment gets easier as well. If you’re deploying an uberjar, the core libraries you tested against are also shipped to production in the same jar. No need to upgrade your deployment scripts when a new version of Clojure comes out, as everything is included automatically.

There is one catch to this wonderfulness, which is that Clojure depends on the JVM, and the build tool cannot change the JVM around. But Clojure has very simple requirements, Java 1.6 or greater, which makes it simple to deploy anywhere.

Disdain

One of my hobbies is fencing. Not modern Olympic fencing, but 14th century longsword fencing in the Italian school. In every class the instructor reminds us that we should act “like haughty Italian nobles, tall and relaxed” in the way that we stand, move, and handle the weapon.

The word “Sprezzatura” crops up a lot in these discussions. The simplest translation is literally “disdain”, but a more careful translation would be “studied carelessness”. To act with Sprezzatura means to make learned actions look easy and natural.

It turns out there is a really good reason for a fencer to act this way. A relaxed and calm fencer can move more rapidly and adjust their actions depending on whether they are winning or losing the bind. Their actions happen without any tell, catching their opponent by surprise and allowing them to act within their opponent’s tempo. In sword fighting the ultimate goal is to strike your opponent without being hit. Thus the shortest and simplest actions are favored, as large embellished actions only increase personal risk.

There is a similar grace in programming. While programming is not usually a competitive or dangerous hobby, there are practical benefits to acting quickly and gracefully. The most effective engineer completes their task with the minimal amount of time and added complexity. Showing off in the code only increases the risk of regressions and makes the code harder to modify later. The ultimate goal is a maintainable application that meets requirements in the minimal amount of time, and an accomplished engineer will take the shortest route to that goal.

Sword fighting is not about strength. It is never effective to swing a sword with all of your strength. Even if this wasn’t unsafe and an obvious tell, it isn’t a good way to cut with an edged weapon. Swords depend on a cutting edge to do damage, and thus a smooth arc that draws the weapon through the target will always cut more effectively than a ham-fisted baseball swing.

Similarly, the accomplished engineer knows that completing a task is not about the number of hours spent, but the quality. The mind is a tool that can be both sharpened and dulled with both work and rest, and programming is a task of the mind. Thus the quality engineer avoids excessive hours, as they are unhealthy and ineffective. Instead the engineer makes their limited productive hours as effective as possible without excessive strain.

It may not be the connection you expected, but humans haven’t changed in a very long time. Even if our circumstances have changed, there is always something to be learned from even the most esoteric sources.

Managing Is a Craft Too

I’m getting a little tired of seeing posts saying that the best managers must be ex-engineers or current ones. I think coding skill is a very narrow-minded way to judge both a human and a professional, and a terrible way to run a business.

Here’s the simple truth: a manager is a craftsperson just like a designer or engineer. The only difference is that their craft is organizing people, not designs or code. In their trade the best tools are flexibility, communication, empathy, and comprehension. My personal opinion is that a good manager is at least as hard to find as a good engineer, if not harder.

To say that a manager of engineers needs to know how to code and needs to code daily is like saying that you have to be a doctor to manage a doctor’s office, and that you need to be practicing right now. While a bit of knowledge aids in communication, there is a wide range of tasks that need to be performed that are not coding. And to ignore these tasks would be just as disastrous as not writing the software that the company sells.

We’ve all heard the horror stories of non-engineer managers not knowing or caring what engineers do, or the ex-engineer manager thinking that they’re still on the team when their knowledge is 20 years out of date. Clearly both of these people are being ineffective managers, but not because of whether or not they can code. These people are being bad managers because they are not listening to their staff. Anyone, coder or not, will be a terrible manager without the requisite people skills.

So hire and keep managers that listen and communicate well. Managers that manage expectations of those uphill and divert shit rolling downhill. The best manager is the one that helps their team be the best, whether or not there are any commits with their name on it.

Internationalization Golf

Martin Grüner had a fun article about his experience writing an internationalized app. I thought it would be fun to share my own experiences.

My first job out of college was working on a Common Lisp (CL) web application. The application was only a few years younger than me, and had originally been written in CL due to a particularly good HTML/XML library available in CL at the time. Unfortunately in the intervening years the HTML library stopped being state of the art, and the whims of enterprise software engineering had left CL behind for web development, resulting in a serious lack of common programming conveniences.

Right after joining the company, I was informed that the sales person had a potential lead with clients in a Spanish speaking country. The application at this point was English only, but the QA engineer was married to a native Spanish speaker who was willing to help translate the application. All we needed to do was wire the application up for internationalization and localization. I was tasked with picking or creating a library and interspersing it throughout the application. The only criteria were that it both looked good and was capable of displaying different languages to different users depending on their browser’s “accept-language” header. So compiling or packaging up a new application with hard-coded languages was not an option.
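To make the “accept-language” requirement concrete, here is a minimal sketch of per-request language selection. It is written in Python rather than Common Lisp purely for illustration, and the function names `parse_accept_language` and `choose_language`, along with the fallback to English, are my own assumptions, not code from the original application.

```python
def parse_accept_language(header):
    """Return the language tags from an Accept-Language header, most preferred first."""
    ranked = []
    for position, part in enumerate(header.split(",")):
        pieces = part.strip().split(";")
        tag = pieces[0].strip().lower()
        if not tag:
            continue
        quality = 1.0  # a tag without an explicit q-value counts as 1.0
        for param in pieces[1:]:
            name, _, value = param.strip().partition("=")
            if name.strip() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        if quality > 0:  # q=0 means "not acceptable"
            ranked.append((-quality, position, tag))
    return [tag for _, _, tag in sorted(ranked)]


def choose_language(header, supported, default="en"):
    """Pick the first preferred language that we actually have translations for."""
    for tag in parse_accept_language(header):
        if tag in supported:
            return tag
        primary = tag.split("-")[0]  # "es-mx" falls back to plain "es"
        if primary in supported:
            return primary
    return default


print(choose_language("es-MX,es;q=0.8,en;q=0.5", {"en", "es"}))
```

Because the choice is made per request, the same running application can serve English and Spanish users side by side, which is exactly why baking one language in at build time was ruled out.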

I eventually decided that all the existing libraries were insufficient and we needed to make our own. I’m still not sure if that was the right choice or not. CL doesn’t have the strongest library ecosystem around, but I was also a very young engineer and more susceptible to the “Not Invented Here” syndrome than I am now. That said, a quick perusal of the current offerings turns up libraries whose home pages are 404s, libraries that are nothing but FFI bindings to a GNU C library, and those whose list of defects includes “no documentation”, “undocumented code” and “slow PO parser”.

Compounding the issue was CL’s format function. CL has an exceedingly powerful formatter that is capable of unwrapping loops and interspersing the correct combination of “,” and “and” for a list of strings. This function was used with (reckless) abandon throughout the codebase, something for which I deserve some blame. The lack of dedicated template files made matters worse; it’s a lot easier to reach for format when you’re producing HTML in the handler function itself.

There was no way I was going to explain to the (non-technical) translator how to deal with format directives like this: "~#[NONE~;~a~;~a and ~a~:;~a, ~a~]~#[~; and ~a~:;, ~a, etc~].", and I didn’t want to dig through 500kloc and unroll all the directives. So any translation system I made would need to support at least a subset of the CL formatting directives while hiding them for the translator’s sanity.
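For anyone who doesn’t read format directives fluently, here is my reading of what the directive quoted above produces, unrolled into plain Python. The function name `describe_items` is made up for this illustration; each branch corresponds to one clause of the two `~#[ ... ~]` conditionals, which pick a clause based on how many arguments are still unconsumed.

```python
def describe_items(*items):
    """Mimic the output of the format directive above, one branch per clause."""
    items = [str(item) for item in items]
    count = len(items)
    # First ~#[ ... ~]: pick a clause by the number of arguments remaining.
    if count == 0:
        text, rest = "NONE", []
    elif count == 1:
        text, rest = items[0], []
    elif count == 2:
        text, rest = f"{items[0]} and {items[1]}", []
    else:
        text, rest = f"{items[0]}, {items[1]}", items[2:]
    # Second ~#[ ... ~]: again keyed on how many arguments are left over.
    if len(rest) == 1:
        text += f" and {rest[0]}"
    elif len(rest) > 1:
        text += f", {rest[0]}, etc"
    return text + "."


for args in [(), ("a",), ("a", "b"), ("a", "b", "c"), ("a", "b", "c", "d")]:
    print(describe_items(*args))
```

That is roughly fifteen lines of branching packed into one string, with English grammar baked into it, which is why handing it to a translator as-is was never going to work.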

Worse still, CL’s formatter accepts positional arguments only, to my knowledge. Thus there’s no particular way to convince the formatter to modify the order of the parameters if the target language has different language structure than English. So my system would need to deal with that.

The final format I settled on would look something like this. The programmer (me) would change (format nil "~a" var) to (jibberish:format "<~a:variable-name>" "Descriptive sentence for translator" :variable-name var). We could then convince our code to print out a file for a language like this:

####################################
File: filename.cl
####################################

Original: <variable-name>
Translation: Translation goes here
Note for translator: Descriptive sentence for translator
Original Argument Order: [variable-name]

And so on. At runtime the language file would be parsed into an in-memory hash map, which would allow us to replace the format string with the new one, re-order the argument list according to the translator’s needs, and strip the identifiers from the formatting string, leaving the raw format directives.
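The runtime step is easier to see in code. What follows is a rough reconstruction in Python, not the original Common Lisp: the `localize` function, the two regular expressions, and the shape of the `translations` dictionary are all assumptions made for this sketch. It takes the programmer’s tagged string, looks up the translated sentence, and returns a raw format string plus the arguments in the order the translation wants them.

```python
import re

# Programmer-side placeholder, e.g. "<~a:today>": a format directive plus a name.
TAGGED = re.compile(r"<([^<>]+?):([\w-]+)>")
# Translator-side placeholder, e.g. "<today>": the name only.
NAMED = re.compile(r"<([\w-]+)>")


def localize(tagged_format, translations, arguments):
    """Return (raw_format_string, ordered_arguments) for the active language.

    `translations` maps the translator-facing original (names only) to the
    translated sentence; a missing entry falls back to the original.
    """
    directives = {name: directive for directive, name in TAGGED.findall(tagged_format)}
    # What the translator saw: names without the format directives.
    original = TAGGED.sub(lambda match: f"<{match.group(2)}>", tagged_format)
    translated = translations.get(original, original)

    ordered = []

    def to_directive(match):
        name = match.group(1)
        directive = directives[name]     # KeyError if the translator renamed it
        ordered.append(arguments[name])  # arguments follow the translated order
        return directive

    raw = NAMED.sub(to_directive, translated)
    return raw, ordered


spanish = {"It is <today>, <user>": "<user>, hoy es <today>"}
print(localize("It is <~a:today>, <~a:user>", spanish, {"today": "lunes", "user": "Ana"}))
# ('~a, hoy es ~a', ['Ana', 'lunes'])
```

Note that a sketch like this falls over with a `KeyError` as soon as a placeholder name in the translation does not match one in the code, so the whole scheme relies on translators leaving the names inside the angle brackets alone.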

So, how’d I do? Mixed results. The conversion was long and painful, requiring that each format directive and raw string be touched. A huge portion of the “Notes for translator” were either blank due to difficulty and fatigue, or were something along the lines of “Description, part 1/n” due to multiple calls being joined together in HTML. Changes in formatting calls required that the matching translation entry be hunted down in every translation file for the translation to still work.

But technical challenges are usually not a big deal; after all, that’s a fairly large portion of what engineers do. Probably the worst problem with my system was how it worked for non-technical folk. The first attempt at giving this file to a translator resulted in her helpfully translating all the variable names into Spanish, changing “It is <today>” to “que es <hoy>”, which resulted in some rather exciting errors. I think my own design sabotaged me in this particular instance, as it required way too much careful explanation to be usable. It was my first encounter with the major difference between using a program whose internals you are familiar with and explaining its use to someone else who has never seen anything similar.

I think if I had to do it again, I would’ve probably spent more time and modified the way the program generated strings instead. I suspect that writing a custom library to avoid refactoring everything was a false economy, as the time saved up front was lost again in training translators and maintaining overly fragile translation files. I also learned that if you plan on selling an application in another language market, you need to think about that before you start writing code. Internationalization limits some of the choices you can make with your software and design, and it’s a lot easier to work within that restricted set up front than it is to unwind those choices after the fact.

As a final irony, while attempting to write this post in Octopress (which uses Jekyll), it crashed several times because the SASS files in Octopress use unicode, which Octopress appears to hate out of the box. You have to change a few environment variables to convince it that Unicode is ok.

Thoughts on RubyMine

As part of my new job at Pivotal Labs I’ve been pair programming almost every day. The obvious challenge with pair programming, especially in a popular language like Ruby, is in choosing what tools to work with. Vim, Emacs, RubyMine, TextMate: the choices are varied and divisive.

To keep the peace among the engineers and to make provisioning the machines easier, it makes sense to dictate one set of tools. Pivotal has decided to standardize on RubyMine with a dark color scheme and a few custom configurations.

The Good

RubyMine, being set up for Ruby in particular, works very well with navigating, indenting, and colorizing Ruby code. With the exception of setting a variable as the result of an if expression, I have never seen RubyMine indent code incorrectly or get confused on coloration.

It also does a decent job of navigating to Ruby classes and functions, something that is quite hard since Ruby lacks explicit import semantics. It definitely makes hunting down odd test harness functions a breeze, and has saved me in the past. It also is good at identifying the view that corresponds to a controller method, but due to the way that tracker is laid out, I haven’t had a chance to use this feature often.

And as the cherry on top, RubyMine includes a “TextMate like” quick find feature. In large code bases this will save you about a minute trying to find a particular file, so long as you have an idea of what it’s called!

The Bad

If you only have unique controllers and models, you will absolutely love the ability to jump to a class definition. Since we use a lot of similarly named controllers inside namespaces to control API versions, it often gets confused about what version I want. The quick find feature sometimes ignores the path if you provide it, which makes copying files from stacktraces occasionally unreliable.

Javascript and LESS support is mediocre. No real complaints, but nothing to set it apart from other environments in my opinion. Maybe my colleagues who work on the front end would have a more nuanced opinion.

Also sometimes with large files the coloring or error checking can lag behind. This usually kicks in at files longer than 3000 lines, which is not unusual for test files. For the most part this is just an annoyance which doesn’t affect editing in any serious way.

It has git integration, which lags behind the offerings from Emacs, Vim, and Eclipse in my opinion. I end up using the command line instead of the built in tools.

The Ugly

RubyMine is incredibly dim-witted when it comes to parentheses and quotes. If you wish to put an escaped quote at the end of a string, RubyMine will let you escape your ending quote, then insert a matching pair right afterwards when you try to fix the mistake! To add insult to injury, you must move the cursor before fixing the issue lest RubyMine delete both of the extra quotes. Very frustrating.

RubyMine is also one of the most memory-hungry programs I use. At least once a week it will grind my machine to a halt due to memory usage, which is impressive on a machine with 16GB of memory.

Conclusions

Out of the box, RubyMine works very well. I think it is a good compromise for large teams. But if you have time to learn some more complicated tools, I believe you would be way better off learning a more customizable editor like Emacs or Vim. It might take some effort, but these editors can do just as much as their commercial cousins and will continue to do well no matter what language you decide to use in the future.