Emacs Org-mode: Publishing to WordPress

The next step on this emacs org-mode kick? Instead of editing the blog in markdown and publishing it with jekyll, edit in org-mode and publish to WordPress. There’s a package for this, org2blog/wp, and very useful setup instructions here.

The only change I had to make is that using .netrc as described returned an

Wrong type argument: stringp, netrc-get

error. Looking at the org2blog/wp package itself, I discovered that a commit five months after the instructions came out recommended replacing 'netrc with 'auth-source. After making that change, everything worked.

The commit to my .emacs.d including this change is here. The ~/.authinfo file is in the form:

machine thewanderingcoder
  login {my-login}
  password {my-password}
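To check that an entry like this is being picked up, the credentials can be read back through auth-source directly. This is a minimal sketch, not what org2blog/wp does internally: `my-blog-credentials` is a hypothetical helper name, and it assumes the file is listed in `auth-sources`.

```elisp
(require 'auth-source)

(defun my-blog-credentials (host)
  "Return (LOGIN . PASSWORD) for HOST from `auth-sources', or nil.
Hypothetical helper, not part of org2blog/wp."
  (let ((entry (car (auth-source-search :host host :max 1))))
    (when entry
      (let ((secret (plist-get entry :secret)))
        (cons (plist-get entry :user)
              ;; auth-source usually hands back the password wrapped
              ;; in a function, so call it if that is the case.
              (if (functionp secret) (funcall secret) secret))))))
```

Evaluating `(my-blog-credentials "thewanderingcoder")` should then return the login and password as a cons cell, or nil if no matching entry is found.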

One further note: on trying both the native emacs highlighting (org2blog/wp-use-sourcecode-shortcode nil) and the SyntaxHighlighter Evolved plugin highlighting (org2blog/wp-use-sourcecode-shortcode 't), I had to agree that the SyntaxHighlighter Evolved plugin highlighting looked much better.

The one difficulty, particularly for a post about testing, is that it didn’t recognize and highlight ert-deftest. So I edited my copy of shBrushLisp.js and added ert-deftest to the end of the list of macros:

var macros = 'loop when dolist dotimes defun ert-deftest';

and uploaded the modified file to wp-content/plugins/syntaxhighlighter, and now ert-deftest is highlighted accordingly.

Refactoring “Beginning Emacs Lisp”: I: Adding Tests

On Friday I sat down with Sacha Chua for some emacs coaching. We talked about org-mode and about the emacs lisp I’d written for reformatting citations. In this entry I’ll talk about refactoring that emacs lisp code.

Ah, Refactoring. We’ll Need Some Tests…

Refactoring, by definition, is improving the internal structure of code without altering the external behaviour. Equally by definition, before you start you need thorough automated tests, because that’s how you tell that you haven’t altered the external behaviour.

I wrote the reformatting citations emacs lisp as an exploratory spike, looking up commands as I went, and manually testing the results. Time to get more rigorous. What is emacs lisp’s equivalent of JUnit, or MiniTest or RSpec?

ERT: Emacs Lisp Regression Testing

In JUnit you annotate test methods with “@Test”:

@Test
public void formattingRemoved() {

In MiniTest, you start the method definitions with “test_”:

def test_formatting_removed

In ERT, where you would define a normal lisp function with “defun”:

(defun remove-formatting ()
)

you can define a lisp test with “ert-deftest”:

(ert-deftest remove-formatting ()
)

Inside the test, you can call the function under test and compare the actual and expected results with the “should” macro. For instance, given a function that takes a string and removes some formatting (using replace-regexp-in-string):

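(defun remove-formatting (string)
  ;; "\s" in an Emacs Lisp string is a literal space, so the inner call
  ;; strips any run of spaces followed by <br> or <br/>.
  (replace-regexp-in-string "^> " ""
                            (replace-regexp-in-string "\s*<br/?>" "" string)))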

We could write a test that checks that it does what it says:

(ert-deftest remove-formatting ()
  (should (string= (remove-formatting "> Elþeodigra eard<br/>")
                   "Elþeodigra eard")))

To run all the tests, we can type:

M-x ert RET RET

The second RET accepts the default, t, and runs all tests. In this case there’s only one. If we have a large suite and only want to run a subset, say those with “formatting” in the test name, we can give a string selector instead, which ERT treats as a regular expression matched against test names:

M-x ert RET "formatting" RET
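Because a string selector is a regular expression rather than a glob, a bare word is enough to pick out a subset. The sketch below shows which tests a selector would match without running them. The `sketch-` test names and the `sketch-selected-test-names` helper are made up for illustration, and the regexp uses a literal space where the post's version writes `\s`.

```elisp
(require 'ert)

(defun remove-formatting (string)
  "Strip \"> \" at line starts and every <br> or <br/> (with preceding spaces)."
  (replace-regexp-in-string
   "^> " ""
   (replace-regexp-in-string " *<br/?>" "" string)))

(ert-deftest sketch-formatting-removes-br-with-slash ()
  (should (string= (remove-formatting "> Elþeodigra eard<br/>")
                   "Elþeodigra eard")))

(ert-deftest sketch-formatting-removes-br-without-slash ()
  (should (string= (remove-formatting "ADA  <br>") "ADA")))

(ert-deftest sketch-unrelated ()
  (should (= (+ 1 1) 2)))

(defun sketch-selected-test-names (selector)
  "Return the names of all defined tests that SELECTOR picks out."
  (mapcar #'ert-test-name (ert-select-tests selector t)))
```

Here `(sketch-selected-test-names "sketch-formatting")` lists the two formatting tests and leaves out `sketch-unrelated`, which is the same subset `M-x ert` would run for that selector.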

In either case, another buffer is opened up with the results, for instance

Selector: t
Passed:  1
Failed:  0
Skipped: 0
Total:   1/1

Started at:   2015-02-02 16:03:15-0500
Finished.
Finished at:  2015-02-02 16:03:15-0500

.

And yes, as you would expect from other languages, that’s a dot per passing test, and an F for any failing test, so with twelve tests and two failures you might instead see:

..FF........

In the ERT results buffer, with the cursor on any . or F test result you have several options, including:

. ;; jump to that test’s source code
l ;; list the assertions run by the test
h ;; see the description string for the test, if any
b ;; view backtrace
r ;; re-run this test

As Nic Ferrier notes, the runner doesn’t automatically recognize when you delete a test, and you need to delete it in the ERT results buffer. I had trouble with it recognizing added tests as well, and ended up closing and restarting emacs to make sure it picked up the latest list. This is obviously not scalable. There’s a separate tool called ert-runner which fixes this, as I discover here.
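Short of restarting emacs, one way to get back to a known list of tests is to make ERT forget every test it knows about and then reload the test file. This is only a sketch: `my-ert-reload` is a hypothetical helper, and note that `ert-delete-all-tests` drops every test defined in the session, not just your own.

```elisp
(require 'ert)

(defun my-ert-reload (test-file)
  "Forget every defined ERT test, then load TEST-FILE afresh.
Hypothetical helper: TEST-FILE is whatever file holds your
`ert-deftest' forms."
  ;; Called from Lisp, this unbinds all tests without asking.
  (ert-delete-all-tests)
  (load-file test-file))
```

After calling it, `M-x ert RET RET` runs only the tests that the file currently defines.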

Additional Complications

If we were testing code that took a string and reformatted it, like the example above, we could just write (should (string= examples for all the edge cases we could think of using what we already know, and we’d be set. Unfortunately, the code to be put under test also makes changes to the buffer, varies its behaviour depending on whether a region is selected when it is called, and modifies the cursor position. How do we handle that?

The with-temp-buffer macro saves the current buffer, creates an (empty) temporary buffer, marks it current, uses it inside the body of the macro, and on exit switches back to the previous current buffer. You can return the contents of the temporary buffer by using (buffer-string) as the last form, and the position of the cursor within the temporary buffer by using (point) as the last form. This lets us write tests for (begin-end-quote) when a region is not selected like so:

(ert-deftest test-begin-end-quote-new-content ()
  "Tests begin-end-quote without preselected text string"
  (should (string= (with-temp-buffer
                     (begin-end-quote)
                     (buffer-string))
                   "#+begin_quote\n\n#+end_quote\n")))

(ert-deftest test-begin-end-quote-new-point ()
  "Tests begin-end-quote without preselected text cursor position"
  (should (equal (with-temp-buffer
                   (begin-end-quote)
                   (point))
                 (length "#+begin_quote\n\n"))))

There’s a further complication for the case where a region is selected before calling it. We can put text in the temporary buffer before calling begin-end-quote by using insert, then set the mark, either with (set-mark N) at the Nth character or with set-mark-command at the cursor position, and then move the cursor: (goto-char N) selects the region from the mark up to character N, and going to the end of the buffer, with (goto-char (point-max)) or end-of-buffer, selects everything after the mark. So to insert the text “> Dear Sir, your astonishment’s odd;\n” into the temporary buffer and select the whole region, we could do the following:

(insert "> Dear Sir, your astonishment’s odd;\n")
(goto-char (point-min))
(set-mark-command nil)
(goto-char (point-max))
(transient-mark-mode 1)
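Since several tests need the same setup, the selection steps could be pulled into a small helper. A minimal sketch, assuming the helper name `sketch-select-whole-buffer` (not in the original code) and using `set-mark` rather than the interactive `set-mark-command`:

```elisp
(defun sketch-select-whole-buffer ()
  "Make the whole current buffer the active region.
Hypothetical test helper, not part of the original code."
  ;; Turn the mode on explicitly so `use-region-p' can see the region
  ;; even when the tests run non-interactively.
  (transient-mark-mode 1)
  (goto-char (point-min))
  (set-mark (point))
  (goto-char (point-max)))
```

Inside `with-temp-buffer`, calling it after `insert` leaves the cursor at the end of the buffer with everything selected, so `(use-region-p)` returns true for any non-empty buffer.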

With that extra information, the tests of behaviour with a selected region become simple too:

(ert-deftest test-begin-end-quote-region ()
  "Tests begin-end-quote with selected region"
  (should (string= (with-temp-buffer
                     (insert "> Dear Sir, your astonishment’s odd;\n")
                     (goto-char (point-min))
                     (set-mark-command nil)
                     (goto-char (point-max))
                     (transient-mark-mode 1)
                     (begin-end-quote)
                     (buffer-string))
                   "#+begin_quote\n    Dear Sir, your astonishment’s odd;\n#+end_quote\n")))

The commit with the full set of twelve tests is here.

As well as adding tests this commit makes a code change, because in the three weeks since writing it I have discovered that the archive files don’t always have exactly two spaces between the end of the text and the “<br/>”, so I wrote tests to expose that (so that two of them were failing, as in the example above), and then changed the regexp so that they passed.

Now that we’ve got full automated tests, we can start refactoring the code.

Emacs Org-mode: Links and Exported Html

If you have an archive of files in org-mode and you want to link between them, say from today’s entry to the entry of 9 March 2013, you have several options, as laid out here:

http://orgmode.org/manual/Search-options.html

The Simplest Case, for .org

The simplest is to provide the text of the header in the link, like so:

[[file:2013.org::Saturday 9 March 2013][9 March 2013]]

which, on typing the final closing square bracket, will collapse on screen to “9 March 2013”, and when you’re on the link and type “C-c C-o”, it will open the “2013.org” file at the header “Saturday 9 March 2013”. If you find you’ve made an error in the link target or title, typing “C-c C-l” will let you edit it.

http://orgmode.org/manual/Handling-links.html

And this works perfectly, until you export it to html, and then, of course, it doesn’t.

The Extra Step, for .html

To get a link also to work in html, you need to set a custom_id on the header, which you do like this:

*** Saturday 9 March 2013
:PROPERTIES:
:CUSTOM_ID: 20130309
:END:

or, less manually, with the org-set-property command (keyboard shortcut C-c C-x p):

C-c C-x p CUSTOM_ID RET 20130309 RET

This will also attach an id attribute to the header element in the exported html, so in both .org and .html it is recognized as #20130309, and if you then change the link to

[[file:2013.org::#20130309][9 March 2013]]

it will work as before in the *.org file, and also in the exported *.html file.

The Final Step

Having done this a few times, you may get tired of typing “CUSTOM_ID” each time, and build a function that lets you just type the id value:

(defun cid (custom-id)
  (interactive "MCUSTOM_ID: ")
  (org-set-property "CUSTOM_ID" custom-id))

after which you can type, even more briefly:

M-x cid RET 20130309 RET

commit
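To see non-interactively what cid does, it can be called in a scratch org-mode buffer and the property read back with `org-entry-get`. This is a sketch for checking the behaviour, and `sketch-cid-demo` is a made-up name:

```elisp
(require 'org)

(defun cid (custom-id)
  "Set the CUSTOM_ID property of the current Org entry to CUSTOM-ID."
  (interactive "MCUSTOM_ID: ")
  (org-set-property "CUSTOM_ID" custom-id))

(defun sketch-cid-demo ()
  "Return the CUSTOM_ID read back after calling `cid' in a scratch Org buffer."
  (with-temp-buffer
    (org-mode)
    (insert "*** Saturday 9 March 2013\n")
    ;; Put point on the headline so the property lands on this entry.
    (goto-char (point-min))
    (cid "20130309")
    (org-entry-get (point) "CUSTOM_ID")))
```

`(sketch-cid-demo)` returns the id string that a `#20130309` link target would then resolve against.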

Beginning Emacs Lisp

The Problem

A while back I converted an archive of non-code-related files to org-mode. The files had citations in a markdown-like format, so:

> ADA  <br/>
COUNTESS OF LOVELACE  <br/>
1815-1852  <br/>
Pioneer of Computing  <br/>
lived here

which, processed through Calibre, produced a nice readable pdf with blockquotes and linebreaks.

For a while after that, I wrote in org-mode and viewed the files within emacs, so I just indented quotations, like so:

.
    Quandunque i colli fanno più nera ombra,
    Sotto il bel verde la giovane donna
    Gli fa sparir, come pietra sott’ erba.

and then one day I used org-mode’s export-to-html functionality (C-c C-e h o) and I lost all the blockquoting and line-breaks. org-mode needs prose citations surrounded by

#+begin_quote
#+end_quote

to be rendered with <blockquote> in export-to-html, and if you want it to respect line breaks as well, you need to use

#+begin_verse
#+end_verse

So. Going forward I needed to insert begin/end blocks for new citations, and I had a bunch of older citations I would need to reformat. Here’s what I built with emacs lisp to solve the problem.

The Solution

The Simplest Case

In a new file, about to add a quotation, I want a shortcut to add either #+begin_quote / #+end_quote or #+begin_verse / #+end_verse and put the cursor on the empty line between. This sounds like what yasnippet is designed for, but I wasn’t sure how that would work when I got to converting existing quotations, so I broke out my ~/.emacs.d/mods-org.el file and added two new shortcuts:

(global-set-key (kbd "M-s M-q")
                (lambda ()
                  (interactive)
                  (insert "#+begin_quote")
                  (newline)
                  (newline)
                  (insert "#+end_quote")
                  (newline)
                  (previous-line)
                  (previous-line)))

(global-set-key (kbd "M-s M-v")
                (lambda ()
                  (interactive)
                  (insert "#+begin_verse")
                  (newline)
                  (newline)
                  (insert "#+end_verse")
                  (newline)
                  (previous-line)
                  (previous-line)))

This meets my simplest case requirement and is fairly self-explanatory. Going forward, I can type

M-s M-q ;; q for quote

to get

#+begin_quote
_
#+end_quote

and

M-s M-v ;; v for verse

to get

#+begin_verse
_
#+end_verse

commit

Narrowing the Scope

The fact that I put them in ~/.emacs.d/mods-org.el instead of ~/.emacs.d/key-bindings.el foreshadows the next step: they aren’t global key bindings, I’m only going to use them in org-mode, and in a ruby class they’d just be noise. So, since I’ve already got a hook for entering text-mode (which I’m also using for org-mode), let’s change them from global key bindings to key bindings which are added to org-mode specifically.

I’ve got an ~/.emacs.d/mode-hooks.el file which already has a custom text-mode-hook, so I can add two lines to that:

(defun my-text-mode-hook ()

  (define-key org-mode-map (kbd "M-s M-q") 'begin-end-quote)
  (define-key org-mode-map (kbd "M-s M-v") 'begin-end-verse))

(add-hook 'text-mode-hook 'my-text-mode-hook)

And change the functions, in the ~/.emacs.d/mods-org.el file, from anonymous functions in the global-set-key blocks to the named functions we have just referenced:

(defun begin-end-quote ()
  (interactive)
  (insert "#+begin_quote")
  (newline)
  (newline)
  (insert "#+end_quote")
  (newline)
  (previous-line)
  (previous-line))

(defun begin-end-verse ()
  (interactive)
  (insert "#+begin_verse")
  (newline)
  (newline)
  (insert "#+end_verse")
  (newline)
  (previous-line)
  (previous-line))

Now if we’re in org-mode, the key-bindings do what we expect, but if we’re in ruby-mode and type them, or type (C-h k) to get the definition of a key binding and then type (M-s M-v), we get

M-s M-v is undefined

commit
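The two named functions differ only in the block name, so one possible tidy-up is a shared helper that takes the name as an argument. This is a sketch rather than the code in the commit: the `sketch-` names are invented here, and it uses `forward-line` in place of the two `previous-line` calls, which is the usual choice in non-interactive code.

```elisp
(defun sketch-insert-org-block (name)
  "Insert an empty #+begin_NAME / #+end_NAME block and leave point inside it."
  (insert "#+begin_" name "\n\n#+end_" name "\n")
  ;; Step back over the #+end_ line and onto the empty line.
  (forward-line -2))

(defun sketch-begin-end-quote ()
  "Insert an empty quote block at point."
  (interactive)
  (sketch-insert-org-block "quote"))

(defun sketch-begin-end-verse ()
  "Insert an empty verse block at point."
  (interactive)
  (sketch-insert-org-block "verse"))
```

Either command can be bound in org-mode-map exactly as above, and adding another block type becomes a three-line function.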

Reformat Existing

That handles the going forward case, but doesn’t handle the case where I’m in an older file and want to reformat an existing quotation. To handle both, I’d like to check when I use the shortcut to see if I’ve got a selected region or not. If I haven’t, it’s the going forward case, and I should do what I was doing before, but if I have, then presumably I want to put the #+begin and #+end blocks around the selected region.

We can tell this because emacs lisp gives us a function use-region-p which returns true if there is a region selected. The most basic if block in emacs lisp looks like this:

(if (condition)
    (do-true-thing)
  (do-false-thing))

so in our case we have:

(if (use-region-p)
    (begin-end-quote-for-region)
  (begin-end-quote-for-new))

and begin-end-quote-for-new is the old begin-end-quote method, and begin-end-quote-for-region looks like this:

(defun begin-end-quote-for-region ()
  (interactive)
  (insert "#+end_quote")
  (newline)
  (goto-char (region-beginning))
  (insert "#+begin_quote")
  (newline))

An extra bit of inwardness here is that the cursor starts at the end of the selected region, so we can just insert “#+end_quote” and it will show up after the end, and (region-beginning) and (region-end) hold the beginning and end of the selected region, so (goto-char (region-beginning)) gets us back to the beginning so we can insert “#+begin_quote” before it.

commit
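The function above relies on the cursor being at the end of the selection, which holds when the region was selected top to bottom. A variant that records both ends first works whichever way the region was selected. This is a sketch under that assumption, with the hypothetical name `sketch-wrap-region-in-block`:

```elisp
(defun sketch-wrap-region-in-block (name)
  "Put #+begin_NAME before and #+end_NAME after the current region."
  (let ((beg (region-beginning))
        (end (region-end)))
    ;; Insert the end marker first: text added at END does not shift BEG.
    (goto-char end)
    (insert "#+end_" name "\n")
    (goto-char beg)
    (insert "#+begin_" name "\n")))
```

Calling `(sketch-wrap-region-in-block "quote")` with a region selected from the bottom up gives the same result as selecting it from the top down.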

This gets us to the point where if we’d selected the first quotation and hit M-s M-v, we’d end up with

#+begin_verse
> ADA  <br/>
COUNTESS OF LOVELACE  <br/>
1815-1852  <br/>
Pioneer of Computing  <br/>
lived here
#+end_verse

which is a definite improvement, but it still has the old formatting codes. Can we get rid of those?

Remove Old Formatting

First off, we only want to do this in the reformat existing case. That’s fine, those two methods (begin-end-quote-for-region and begin-end-verse-for-region) are already separate, so each of them can call a new (remove-old-formatting) method.

What we want to do in pseudo-code is take the selected region and apply

s/^> //
s/  <br/>$//

to it. (We only need the second transformation for verse, not quotes, and if these got any more complicated we might want two separate methods, but we can leave them in one for now.)

setq defines a variable.

filter-buffer-substring grabs the text from arg1 to arg2 (and we’re using (region-beginning) and (region-end) which return the start and end of the selected region), and with the optional third argument of t deletes the text after copying it.

After that, we’ve got the contents of the selected region in a variable, “in”, and we can run replace-regexp-in-string on it, taking as arguments search-value, replace-value, and the string to search, and using setq to define the variable we’re storing the result in.

Once we’ve made all the changes we need, we use insert to put the final modified string back into the buffer.

(defun remove-old-formatting ()
  (setq in (filter-buffer-substring (region-beginning) (region-end) t))
  (setq out (replace-regexp-in-string "^> " "" in))
  (setq out2 (replace-regexp-in-string "  <br/>$" "" out))
  (insert out2))

commit
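Because setq on an undeclared name creates a global variable, in, out and out2 stay around after the function returns. A variant with `let*` keeps them local. This is a sketch, not the committed code: the name `sketch-remove-old-formatting`, the explicit BEG and END arguments, and the looser regexp (any number of spaces, `<br>` or `<br/>`) are all assumptions made for the example.

```elisp
(defun sketch-remove-old-formatting (beg end)
  "Strip \"> \" line prefixes and trailing <br/> markers between BEG and END."
  (let* ((in (filter-buffer-substring beg end t)) ; t deletes the text as it is copied
         (no-prefix (replace-regexp-in-string "^> " "" in))
         (cleaned (replace-regexp-in-string " *<br/?>$" "" no-prefix)))
    ;; Put the cleaned text back where the original started.
    (goto-char beg)
    (insert cleaned)))
```

From the region-handling functions it would be called as `(sketch-remove-old-formatting (region-beginning) (region-end))`.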

And at the end of that, we get:

#+begin_verse
ADA
COUNTESS OF LOVELACE
1815-1852
Pioneer of Computing
lived here
#+end_verse

Which is very nearly there, but not indented. How about as a last step we indent it, so if we’re viewing it in org-mode it still looks like a blockquote?

Indenting

Let’s put this in our remove-old-formatting method, because again it’s something that we’ll only want in the reformat existing case. Given that it’s no longer just removing old formatting, let’s change the method name to fix-old-formatting, and to keep things at the same level of abstraction, let’s put the old remove-old-formatting lines in a new method called remove-old-formatting-code and add a method indent-if-not-indented. So we have:

(defun fix-old-formatting ()
  (remove-old-formatting-code)
  (indent-if-not-indented))

(defun remove-old-formatting-code ()
  (setq in (filter-buffer-substring (region-beginning) (region-end) t))
  (setq out (replace-regexp-in-string "^> " "" in))
  (setq out2 (replace-regexp-in-string "  <br/>$" "" out))
  (insert out2))

Remember that we have some existing citations in the form

> ADA  <br/>

and some already indented as

.
    Quandunque i colli fanno più nera ombra,

Further, some of the already indented ones have multiple layers of indentation, and setting a single indentation would break that. So while indent-region itself is simple enough, once again using (region-beginning) and (region-end) to give us the selected region, and the third argument for the number of columns to indent:

(indent-region (region-beginning) (region-end) 4)

we need to make it conditional on it not already being indented, so we end up with:

(defun indent-if-not-indented ()
  (setq firstFour (filter-buffer-substring (region-beginning) (+ (region-beginning) 4)))
  (if (not (string= firstFour "    "))
      (indent-region (region-beginning) (region-end) 4)))

using filter-buffer-substring again to grab the first four characters of the region (without the optional third argument so we don’t delete it), and if they aren’t spaces, do the indent. If they are, it’s one of the newer existing quotations and we should leave it as it is, as one of them for instance was pseudo-code

.
    count = 500 
    day.each do
      wrote_words?(count) ? 
        count += 100 : 
        count -= 100
      count = 100 if count < 100
      count = 1500 if count > 1500
    end

and not doing that check would have clobbered all the internal indenting.

commit
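The same check can be written without a global variable by looking at the text at the start of the region. A minimal sketch, assuming the hypothetical name `sketch-indent-if-not-indented` and explicit BEG and END arguments instead of the live region:

```elisp
(defun sketch-indent-if-not-indented (beg end)
  "Indent BEG..END to column 4 unless its first line starts with four spaces."
  (let ((already-indented (save-excursion
                            (goto-char beg)
                            (looking-at-p (make-string 4 ?\s))))
        ;; Indent with spaces, not tabs, whatever the user setting.
        (indent-tabs-mode nil))
    (unless already-indented
      (indent-region beg end 4))))
```

As in the original, only the first four characters are inspected, so a block whose first line is indented is left completely alone, nested indentation included.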

With that, we finally get the desired end result for the older existing quotations:

#+begin_verse
    ADA
    COUNTESS OF LOVELACE
    1815-1852
    Pioneer of Computing
    lived here
#+end_verse

without clobbering existing indentation for the more recent existing quotations:

#+begin_verse
    count = 500
    day.each do
      wrote_words?(count) ?
        count += 100 :
        count -= 100
      count = 100 if count < 100
      count = 1500 if count > 1500
    end
#+end_verse

End and Afterthoughts

The current state of my ~/.emacs.d/ is here. It is very much a work in progress. Also, having just looked at Sacha Chua’s more literate emacs config using org-babel, I’m quite tempted to try that out, instead of composing the two or three additional explanatory/introductory blog posts that occurred to me would be helpful as I was writing this up.