Trailing Whitespace in Commits

The Problem

You’re opening a pull request to do a code review. There are 34 changed files. You find this distressing, but you dig in and find only three changed files contain changes in logic: the rest are removing whitespace. You examine the three easily and go on to what’s next.

But suppose you had seen there were 34 changed files, cringed, and gone on to another task instead? It would be better if there were a way to quickly distinguish real changes from formatting changes. Better still if we could prevent, in the first place, commits that add whitespace we’re only going to take out later.

Hiding but Not Solving the Problem

git diff, of course, shows you the differences between two code trees.

git diff -w / git diff --ignore-all-space shows you the differences excluding whitespace differences.
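To make concrete what “excluding whitespace differences” means, here is a rough Ruby sketch of the comparison. The helper name whitespace_only_change? is my own invention for illustration, and this only approximates what git does line by line; it is not git’s actual implementation.

```ruby
# Hypothetical helper: true when two versions of a file differ,
# but only in whitespace within their lines.
def whitespace_only_change?(old_text, new_text)
  # Compare line by line with every whitespace character removed.
  normalize = ->(text) { text.lines.map { |line| line.gsub(/\s+/, "") } }
  old_text != new_text && normalize.call(old_text) == normalize.call(new_text)
end

before = "def total\n  a + b  \nend\n"
after  = "def total\n  a + b\nend\n"
puts whitespace_only_change?(before, after)
```

A file for which this returns true is one of the 31 you would rather not have to look at.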

You can get GitHub to ignore whitespace in its diffs by appending ?w=1 to the URL (say, of the pull request that you’re reviewing).

This solves the cognitive load problem – being able to tell at a glance that 31 of the 34 changed files are not significant is a huge plus – but all of those extra files have still been changed. Given a small team of great developers, no problem, but if you’ve got multiple people at multiple skill levels working on multiple branches, merges become more complicated and merge conflicts more likely than they need to be.

Solving the Problem: Going Forward

Any editor any developer on your team is likely to be using (Emacs, Vim, RubyMine, Sublime Text, TextMate) already has the ability to automatically remove trailing whitespace before saving a file. Make sure they all have it switched on. For Emacs, just add this to your config:

(add-hook 'before-save-hook 'delete-trailing-whitespace)

For other editors, there are instructions here.

Solving the Problem: Everything before That

If the project already has three years of commits before you switched that on, there will still be a problem with already-committed trailing whitespace.

What I would do in that case is find the least disruptive moment, grab something like rstrip (which is more flexible than one-liners from the command line because it starts by setting up a config file so you can decide which file extensions are processed), run it on the whole project, and commit the result. This will make you very unpopular with people on outlying branches, hence the importance of finding the least disruptive moment, but after that, you’ve fixed the trailing whitespace problem for good.
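If you’d rather see what such a cleanup pass involves before reaching for a tool, here is a minimal Ruby sketch. It is not how rstrip itself is implemented: the EXTENSIONS list and the names strip_trailing_whitespace and strip_tree are assumptions of mine, and it assumes UTF-8 text files with LF line endings.

```ruby
require "find"

# Hypothetical default list; adjust to the file types in your project.
EXTENSIONS = %w[.rb .erb .js .css .scss .yml .md].freeze

# Remove trailing spaces and tabs from every line.
# In Ruby, $ matches at the end of each line, so newlines are preserved.
def strip_trailing_whitespace(text)
  text.gsub(/[ \t]+$/, "")
end

# Rewrite matching files under root in place; return the paths that changed.
def strip_tree(root, extensions = EXTENSIONS)
  changed = []
  Find.find(root) do |path|
    if File.directory?(path)
      # Never touch git's own bookkeeping.
      Find.prune if File.basename(path) == ".git"
      next
    end
    next unless extensions.include?(File.extname(path))

    original = File.read(path)
    cleaned = strip_trailing_whitespace(original)
    next if cleaned == original

    File.write(path, cleaned)
    changed << path
  end
  changed
end
```

Calling strip_tree(".") from the project root rewrites the matching files and returns the list of paths it touched, which you can review with git diff before committing.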

Teaser for Another Problem

Trailing whitespace is a single instance of the larger problem of automatically enforcing coding style conventions.

Back in Java this was a solved problem: write code with Eclipse, use the Checkstyle and Jalopy plugins (or their IntelliJ equivalents): Checkstyle creates warnings whenever a coding style convention is breached, and can also stop you committing code before resolving those warnings. Invoking Jalopy on Checkstyled code can automatically reformat it to follow the Checkstyle conventions to make it committable.

In Ruby? Thoughtbot has released Hound, which automatically comments on pull requests with Ruby / JavaScript / CoffeeScript / SCSS coding style violations, based on Thoughtbot’s style guides.

Bozhidar Batsov (bbatsov) has a Ruby static code analyzer called rubocop which can be run locally and will produce warnings based on his community-driven Ruby style guide.

Yorick Peterse has another Ruby static code analyzer called ruby-lint.

Investigating all these is a large enough topic for another post.

Notes and Quotes from Conference Talks

What did we do before Confreaks let us catch up on all the conference talks? Here are notes on some that resonated with me recently.

TDD for Your Soul: Virtue and Web Development with Abraham Sangha at Madison+ Ruby

… more philosophy than tech, but thought-provoking.

citing Alasdair MacIntyre, “After Virtue”:

“Who am I?

Who ought I to become?

How ought I to get there?”

citing Ralph Ellison:

“The blues is an impulse to keep the painful details and episodes of a brutal experience alive in one’s aching consciousness, to finger its jagged grain, and to transcend it, not by the consolation of philosophy but by squeezing from it a near-tragic, near-comic lyricism.”

Abraham Sangha:

“We’re inviting criticism of ourselves, we’re inviting evaluation of our weaknesses, by TDDing our soul, by writing these tests, seeing where we fail, and trying to implement habits to address these failures: that can be a shameful process. … If you don’t believe you’re good enough, you’re stuck, then why would you invite more pain into your life through this process, so why not just skip it. … But there’s a possibility that pursuing this will actually give us a sense of buoyancy.”

Alchemy and the Art of Software Development with Coraline Ada Ehmke at Madison+ Ruby

Going back in pattern languages before Christopher Alexander to Gottfried Wilhelm von Leibniz (1646-1716): “every complex idea is composed of sets of simpler ideas.”

“alchemy is about transforming base matter, like the stuff that we’re made of, into something more closely approaching divine perfection. That ‘lead into gold’ nonsense was actually part of the pitch that alchemists made to the VCs of their era. They called them royalty, I’m not quite sure why. And this idea that ‘I can take base metal and transmute it into gold? Here: I’ll give you all this money for your metaphysical experiments. I don’t care.’ So they were pretty smart.”

“The Divine Pymander, ascribed to Hermes Mercurius Trismegistus. … It contains seventeen tracts, which talk about things like divinity, humanity, idealism, even monads.”

“we impose our will on the universe, we create a structure that we want to impose on chaos, and the manifestation of that begins in harmony, but slides towards disharmony, because every object in every system is corruptible. … the code is corruptible.”

“all that is apparent, is what was generated or made. … the system does not contain information about the ideals that led to its creation. Unless we are very deliberate about recording our intention and our design, that information is lost, and all we are left with is a system that we don’t understand any more.”

“Christopher Alexander the architect said that a builder should start with the roughest sketch of a building, and by processes that are known to the brain through the pattern language, execute the construction of the building. All things that are are but imitations of truth. The systems we build are reflections of the world. Software system is not and cannot be a single source of truth.”

“Really, alchemy and software development are about identifying ideals, identifying particulars, creating taxonomies, studying them, creating a system for classifying every single thing in a limited or expansive universe.”

“The Divine Pymander ends with this: ‘The Image shall become thy Guide, because Sight hath this peculiar charm, it holdeth fast and draweth unto it those who succeed in opening their eyes.’ I would translate this as ‘The metaphors guide us. Information is there, and ideas want to be recognized.’ I think that the metaphor of alchemy holds true with what we do. Just like alchemy, software development is an inquiry into the essential nature of reality. We break it down, we solve et coagula, we break down complex things into simpler things, we recombine them in novel ways, and we produce an effect on the world, on ourselves.”

“Let’s … dive into disciplines that we have no idea what they even mean, explore them, mine them, find tenets and metaphors, tools that we can use in problem solving, that we can apply to enrich our art, our science, our own magnum opus. Maybe when we do that, we can find that the raw stuff of digital creation can be transmuted into something that more closely approaches perfection.”

Aesthetics and the Evolution of Code: Coraline Ada Ehmke, Nickel City Ruby 2014

Aesthetics of code: correctness, performance, conciseness, readability.

So how do we measure elegance? How about a graph with four axes spreading from the origin: correctness “north”, performance “south”, conciseness “west”, readability “east”. If you can measure and grade those four aesthetics on a numeric scale, then the most elegant code will cover the largest space on the graph.
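As a back-of-the-envelope illustration (my own arithmetic, not something from the talk): if you plot one point on each axis and join them up, you get a quadrilateral whose diagonals lie along the two axes, so its area is half the product of the diagonal lengths. The method name elegance_area and the 0 to 10 scale are assumptions.

```ruby
# Hypothetical scoring: each aesthetic is graded on an assumed 0..10 scale.
# The four points (0, correctness), (0, -performance), (-conciseness, 0)
# and (readability, 0) form a quadrilateral with perpendicular diagonals,
# so area = 1/2 * vertical diagonal * horizontal diagonal.
def elegance_area(correctness:, performance:, conciseness:, readability:)
  0.5 * (correctness + performance) * (conciseness + readability)
end

balanced = elegance_area(correctness: 10, performance: 10, conciseness: 10, readability: 10)
lopsided = elegance_area(correctness: 10, performance: 10, conciseness: 2, readability: 2)
puts balanced # 200.0
puts lopsided # 40.0
```

Code that is correct and fast but neither concise nor readable covers a fifth of the area of code that scores well on all four.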

Why does it matter? Einstein said:

“After a certain level of technological skill is achieved, science and art tend to coalesce in aesthetic plasticity and form. The greatest scientists are artists as well.”

Eric Stiens’s Nickel City Ruby Conference 2014 talk “How to be an Awesome Junior Developer (and why to hire them)”

On Mentoring:

Don’t hire a junior developer you can’t mentor. It’s not fair to them. It’s not fair to your team. Everybody loses.

Sarah Mei’s Nickel City Ruby Conference 2014 talk “Keynote: Multitudes”

Should you always use attr_reader to access an instance variable rather than accessing it directly, as a good object-oriented principle, because by only accessing it by sending a message you have created a seam which makes it easier to change anything about the implementation later? A good application developer (e.g. Sandi Metz) would very likely say yes. A good library developer (e.g. Jim Weirich) might say no, because adding attr_reader to a class makes that data part of the public interface, and once it’s public it has to be supported forever, because people will use it. (And if you do change it, and you’re using semantic versioning, you have to change the major version.)
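To make the two positions concrete, here is a small sketch of my own (not from the talk); the Invoice and Money classes are hypothetical.

```ruby
# Application style: a public reader is the seam. Even the class itself
# sends the message instead of touching @total.
class Invoice
  attr_reader :total

  def initialize(total)
    @total = total
  end

  def summary
    "Total: #{total}" # goes through the reader, not the instance variable
  end
end

# Library style: no reader at all, so the only public interface is to_s.
# How the amount is stored can change without breaking any caller.
class Money
  def initialize(cents)
    @cents = cents
  end

  def to_s
    format('$%.2f', @cents / 100.0)
  end
end

puts Invoice.new(42).summary          # Total: 42
puts Money.new(1250)                  # $12.50
puts Money.new(1250).respond_to?(:cents) # false
```

Invoice#total is now something callers can depend on, which is exactly what the library developer is wary of. Money exposes only to_s, so replacing @cents with some other representation would not need a major version bump.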

“So people who write gems have developed a system where they have a very minimal interface, with a very rich set of internal abstractions, which they can use while providing this very minimal external interface.

“So these are just two different approaches to programming [application style at one end, gem/library style at the other], and what we’re starting to see here is there is a spectrum of rubyists, and there is a spectrum of projects: people will write different code at different points of the spectrum at different times, and what the spectrum is measuring is the surface area of our interface. And it seems like a fairly simple distinction but it does produce a huge difference in code structure. And what it means is that sometimes someone who is good at one side of this spectrum will not automatically be good at the other side, right away. … Why does this matter? … Having these endpoints helps us define what else there is. For example, there’s a middle here, and that middle is suddenly quite obvious, actually, and that’s external APIs on Rails apps. Many Rails apps now need some kind of programmatic interface that is versioned, and minimal, because once it’s out there, and it’s got users, you’re stuck supporting it, forever. … And the reason people struggle with external APIs for Rails apps is that it is partially application development and it is partially gem-style development, and there aren’t very many people that are good at both.”

Exciting that Sarah Mei and Sandi Metz are writing a book: Practical Rails Programming.

Christopher Alexander Talks Back

Ever wonder how Christopher Alexander felt about being the poster child for software design patterns? In his foreword to Richard Gabriel’s book he answers the question, and poses another. Are our designs as good as Chartres? If not, why not?

In fact, this is so interesting and challenging that I’d like to make a fuller extract, which I hope will be within fair use rules:

In my life as an architect, I find that the single thing which inhibits young professionals, new students most severely, is their acceptance of standards that are too low. If I ask a student whether her design is as good as Chartres, she often smiles tolerantly at me as if to say, “Of course not, that isn’t what I am trying to do. … I could never do that.”

Then I express my disagreement, and tell her: “That standard must be our standard. If you are going to be a builder, no other standard is worthwhile. That is what I expect of myself in my own buildings, and it is what I expect of my students.” Gradually, I show the students that they have a right to ask this of themselves, and must ask this of themselves. Once that level of standard is in their minds, they will be able to figure out, for themselves, how to do better, how to make something that is as profound as that.

Two things emanate from this changed standard. First, the work becomes more fun. It is deeper, it never gets tiresome or boring, because one can never really attain this standard. One’s work becomes a lifelong work, and one keeps trying and trying. So it becomes very fulfilling, to live in the light of a goal like this.

But secondly, it does change what people are trying to do. It takes away from them the everyday, lower-level aspiration that is purely technical in nature, (and which we have come to accept) and replaces it with something deep, which will make a real difference to all of us that inhabit the earth.

I would like, in the spirit of Richard Gabriel’s searching questions, to ask the same of the software people who read this book. But at once I run into a problem. For a programmer, what is a comparable goal? What is the Chartres of programming? What task is at a high enough level to inspire people writing programs to reach for the stars? Can you write a computer program on the same level as Fermat’s last theorem? Can you write a program which has the enabling power of Dr. Johnson’s dictionary? Can you write a program which has the productive power of Watt’s steam engine? Can you write a program which overcomes the gulf between the technical culture of our civilization, and which inserts itself into our human life as deeply as Eliot’s poems of the wasteland or Virginia Woolf’s The Waves?

There’s no question that progress towards finishing a project in a week is more easily and more objectively measurable, and more obviously attractive to the business side. But for the longer-term but still important purpose of getting and keeping the best developers, I think we can’t afford to lose sight of this other axis either.