Trailing Whitespace in Commits

The Problem

You’re opening a pull request to do a code review. There are 34 changed files. You find this distressing, but you dig in and find only three changed files contain changes in logic: the rest are removing whitespace. You examine the three easily and go on to what’s next.

But suppose you had seen there were 34 changed files, cringed, and gone on to another task instead? It would be better if there were a way to quickly distinguish real changes from formatting changes. Better still would be to prevent, in the first place, commits that add whitespace we’re only going to take out later.

Hiding but Not Solving the Problem

git diff, of course, shows you the differences between two code trees.

git diff -w / git diff --ignore-all-space shows you the differences excluding whitespace differences.

You can get GitHub to ignore whitespace in its diffs by appending ?w=1 to the URL (say, of the pull request that you’re reviewing).

This solves the cognitive load problem – being able to tell at a glance that 31 of the 34 changed files are not significant is a huge plus – but all of those extra files have still been changed. Given a small team of great developers, no problem, but if you’ve got multiple people at multiple skill levels working on multiple branches, merges become more complicated and merge conflicts more likely than they need to be.
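To make the idea behind -w concrete, here is a minimal Python sketch that decides whether two versions of a file differ only in whitespace. It is an approximation of what git diff -w does, not Git’s actual algorithm, and the function names are my own, made up for illustration.

```python
def strip_all_whitespace(line):
    """Remove every whitespace character from a single line."""
    return "".join(line.split())


def is_whitespace_only_change(old_text, new_text):
    """Return True if the texts differ, but only in whitespace within lines.

    Hypothetical helper: lines are compared one by one with all whitespace
    removed, so an added or removed line still counts as a real change.
    """
    if old_text == new_text:
        return False  # nothing changed at all
    old_lines = [strip_all_whitespace(line) for line in old_text.splitlines()]
    new_lines = [strip_all_whitespace(line) for line in new_text.splitlines()]
    return old_lines == new_lines
```

Run over the before and after versions of each changed file, a check like this would sort the 34 files above into the 31 that are noise and the three that need a real review.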

Solving the Problem: Going Forward

Any editor any developer on your team is likely to be using (Emacs, Vim, RubyMine, Sublime Text, TextMate) already has the ability to automatically remove trailing whitespace before saving a file. Make sure they all have it switched on. For Emacs, just add this to your config:

(add-hook 'before-save-hook 'delete-trailing-whitespace)

For other editors, there are instructions here.
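Editor settings only help while everyone keeps them switched on, so a check at commit time makes a useful backstop. Git can do this itself: git diff --cached --check reports whitespace errors in staged changes and exits non-zero when it finds any. As a rough illustration of what such a check looks for, here is a minimal Python sketch that scans the text of a git diff for added lines ending in whitespace. The function name and the return format are assumptions of mine, and the parsing is simplified: it expects git-style unified diff output.

```python
import re

# Matches a hunk header such as "@@ -1,3 +1,4 @@" and captures the
# starting line number on the new side of the diff.
HUNK_HEADER = re.compile(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@")


def find_trailing_whitespace(diff_text):
    """Return (path, line_number, text) for each added line ending in whitespace.

    Hypothetical helper; expects the output of something like `git diff --cached`.
    """
    offenders = []
    path = None
    new_lineno = 0
    in_hunk = False
    for line in diff_text.split("\n"):
        if line.startswith("diff --git "):
            in_hunk = False  # a new file section starts here
        elif not in_hunk and line.startswith("+++ "):
            # "+++ b/path" names the new version of the file.
            target = line[4:]
            path = target[2:] if target.startswith("b/") else target
        else:
            header = HUNK_HEADER.match(line)
            if header:
                new_lineno = int(header.group(1))
                in_hunk = True
            elif in_hunk and line.startswith("+"):
                text = line[1:]
                if text != text.rstrip():
                    offenders.append((path, new_lineno, text))
                new_lineno += 1
            elif in_hunk and line.startswith(" "):
                new_lineno += 1  # context line: present in the new file
            # Removed lines ("-") do not advance the new-file line number.
    return offenders
```

In a pre-commit hook you would feed this the staged diff and refuse the commit if the list is not empty. In practice, calling git diff --cached --check from the hook is probably simpler; the sketch is only meant to show what is being checked.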

Solving the Problem: Everything before That

If the project already has three years of commits before you switched that on, there will still be a problem with already-committed trailing whitespace.

What I would do in that case is find the least disruptive moment, grab something like rstrip, run it on the whole project, and commit the result. (rstrip is more flexible than one-liners from the command line because it starts by setting up a config file, so you can decide which file extensions are processed.) This will make you very unpopular with people on outlying branches, hence the importance of finding the least disruptive moment, but after that, you’ve fixed the trailing whitespace problem for good.
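For a sense of what such a one-off cleanup involves, here is a minimal Python sketch that strips trailing spaces and tabs from every file under a directory whose extension is on a list you choose. This is not how rstrip is implemented; the function names, the extension set, and the decision to skip the .git directory and any file that is not valid UTF-8 are all assumptions of mine.

```python
import re
from pathlib import Path

# Spaces and tabs at the end of a line, keeping an optional carriage
# return so that Windows line endings survive untouched.
TRAILING = re.compile(r"[ \t]+(\r?)$", re.MULTILINE)


def strip_trailing_whitespace(text):
    """Return text with trailing spaces and tabs removed from every line."""
    return TRAILING.sub(r"\1", text)


def strip_project(root, extensions):
    """Clean matching files under root in place and return the paths changed.

    `extensions` is a set of suffixes such as {".rb", ".js"} (hypothetical
    stand-in for a config file listing the extensions to process).
    """
    root = Path(root)
    changed = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix not in extensions:
            continue
        if ".git" in path.relative_to(root).parts:
            continue  # never touch Git's own files
        try:
            original = path.read_bytes().decode("utf-8")
        except UnicodeDecodeError:
            continue  # assumption: leave non-UTF-8 files alone
        cleaned = strip_trailing_whitespace(original)
        if cleaned != original:
            # Write bytes so line endings are preserved exactly.
            path.write_bytes(cleaned.encode("utf-8"))
            changed.append(path)
    return changed
```

Calling strip_project(".", {".rb", ".js"}) from the project root and committing the result on its own, with nothing else in the commit, keeps the noisy change isolated and easy to skip over in review.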

Teaser for Another Problem

Trailing whitespace is a single instance of the larger problem of automatically enforcing coding style conventions.

Back in Java this was a solved problem: write code with Eclipse and use the Checkstyle and Jalopy plugins (or their IntelliJ equivalents). Checkstyle creates warnings whenever a coding style convention is breached, and can also stop you committing code until those warnings are resolved. Invoking Jalopy on the offending code can then automatically reformat it to follow the Checkstyle conventions and make it committable.

In Ruby? Thoughtbot has released hound, which automatically comments on pull requests with Ruby / JavaScript / CoffeeScript / SCSS coding style violations, based on Thoughtbot’s style guides.

Bozhidar Batsov (bbatsov) has a Ruby static code analyzer called rubocop which can be run locally and will produce warnings based on his community-driven Ruby style guide.

Yorick Peterse has another Ruby static code analyzer called ruby-lint.

Investigating all these is a large enough topic for another post.

Christopher Alexander Talks Back

Ever wonder how Christopher Alexander felt about being the poster child for software design patterns? In his foreword to Richard Gabriel’s book he answers the question, and poses another. Are our designs as good as Chartres? If not, why not?

In fact, this is so interesting and challenging that I’d like to make a fuller extract, which I hope will be within fair use rules:

In my life as an architect, I find that the single thing which inhibits young professionals, new students most severely, is their acceptance of standards that are too low. If I ask a student whether her design is as good as Chartres, she often smiles tolerantly at me as if to say, “Of course not, that isn’t what I am trying to do. … I could never do that.”

Then I express my disagreement, and tell her: “That standard must be our standard. If you are going to be a builder, no other standard is worthwhile. That is what I expect of myself in my own buildings, and it is what I expect of my students.” Gradually, I show the students that they have a right to ask this of themselves, and must ask this of themselves. Once that level of standard is in their minds, they will be able to figure out, for themselves, how to do better, how to make something that is as profound as that.

Two things emanate from this changed standard. First, the work becomes more fun. It is deeper, it never gets tiresome or boring, because one can never really attain this standard. One’s work becomes a lifelong work, and one keeps trying and trying. So it becomes very fulfilling, to live in the light of a goal like this.

But secondly, it does change what people are trying to do. It takes away from them the everyday, lower-level aspiration that is purely technical in nature, (and which we have come to accept) and replaces it with something deep, which will make a real difference to all of us that inhabit the earth.

I would like, in the spirit of Richard Gabriel’s searching questions, to ask the same of the software people who read this book. But at once I run into a problem. For a programmer, what is a comparable goal? What is the Chartres of programming? What task is at a high enough level to inspire people writing programs to reach for the stars? Can you write a computer program on the same level as Fermat’s last theorem? Can you write a program which has the enabling power of Dr. Johnson’s dictionary? Can you write a program which has the productive power of Watt’s steam engine? Can you write a program which overcomes the gulf between the technical culture of our civilization, and which inserts itself into our human life as deeply as Eliot’s poems of the wasteland or Virginia Woolf’s The Waves?

There’s no question that progress towards finishing a project in a week is more easily and more objectively measurable, and more obviously attractive to the business side. But for the longer-term but still important purpose of getting and keeping the best developers, I think we can’t afford to lose sight of this other axis either.