Embracing CommonMark as the One True Markdown

As you may know, CommonMark is a project that aims to create a unified, unambiguous specification of Markdown syntax. I'm in: I want to spread the word and use it on my own blog.

Trouble number one: Jekyll uses kramdown by default. There is a gem for that, jekyll-commonmark, so we install it. Oh hell, we lost syntax highlighting :(

Trouble number two: the CommonMark standard does not cover syntax highlighting, so nothing highlights code on the server side out of the box. That's bad, because I don't want any JavaScript on my static pages. Let's try to wrap the renderer somehow and bring syntax highlighting back.

The strong point of CommonMarker, the Ruby implementation of CommonMark, is its ability to parse a document into an abstract syntax tree. Let's use that to extract the code blocks and highlight them with, for example, Rouge.

require 'commonmarker'
require 'rouge'

# get our abstract syntax tree
ast = CommonMarker.render_doc('some markdown here')

# search code blocks
ast.walk do |node|
  if node.type == :code_block
    next if node.fence_info == '' # skip if we don't know how to highlight

    source = node.string_content  # get node content that's our source code

    # now try to use highlighter
    # I prefer Rouge which is also a default choice in Jekyll

    # `node.fence_info` will hold the language we want to highlight
    # for example,
    #   ```ruby
    #   # code here
    #   ```
    # will have node.fence_info == 'ruby'
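    # find_fancy returns nil for an unknown language, so fall back to plain text
    lexer     = Rouge::Lexer.find_fancy(node.fence_info) || Rouge::Lexers::PlainText.new
    formatter = Rouge::Formatters::HTML.new # get most common formatter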

    # format our html
    html = '<div class="highlighter-rouge">' +
        formatter.format(lexer.lex(source)) +
        '</div>'

    # and render it
    new_node = CommonMarker::Node.new(:html)
    new_node.string_content = html

    # insert our new parsed content and remove original code block
    node.insert_before(new_node)
    node.delete
  end
end

# done! we have a document with highlighted source blocks
puts ast.to_html
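
One detail the walk above glosses over: the info string after the opening fence can hold more than a language name, and the CommonMark spec only says that its first word is typically used as the language. Here is a minimal sketch of two helpers that could sit around the Rouge call. Both `fence_language` and `plain_code_html` are hypothetical names of mine, not part of CommonMarker or Rouge, and they use only the Ruby standard library.

```ruby
require 'cgi'

# Hypothetical helper: take the first word of a fence info string as the
# language name; returns nil when the info string is empty or nil.
def fence_language(fence_info)
  fence_info.to_s.split.first
end

# Hypothetical fallback for blocks we cannot highlight: escape the source
# and wrap it in plain <pre><code> tags.
def plain_code_html(source)
  "<pre><code>#{CGI.escapeHTML(source)}</code></pre>"
end

puts fence_language('ruby linenos') # prints "ruby"
puts plain_code_html('a < b')       # prints "<pre><code>a &lt; b</code></pre>"
```

In the loop above, the result of `fence_language(node.fence_info)` would be passed to the lexer lookup in place of the raw info string, and `plain_code_html` is one option for blocks that have no language at all.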

To avoid this headache in the future, I released this approach as a gem: commonmarker-rouge. Now, just by forking jekyll-commonmark and replacing CommonMarker with CommonMarker::Rouge, I have a blog parsed as CommonMark with syntax highlighting intact.
