As you may know, CommonMark is a project aiming to create a unified and unambiguous Markdown syntax specification. So, I’m in: I want to spread the word and even use it on my own blog.
The strong point of CommonMarker, the Ruby CommonMark implementation, is its ability to parse a document into an abstract syntax tree, so let’s use it to extract our code blocks and highlight them, with Rouge for example.
    require 'commonmarker'
    require 'rouge'

    # get our abstract syntax tree
    ast = CommonMarker.render_doc('some markdown here')

    # search for code blocks
    ast.walk do |node|
      next unless node.type == :code_block
      # skip if we don't know how to highlight
      next if node.fence_info == ''

      # get node content, that's our source code
      source = node.string_content

      # now try to use a highlighter
      # I prefer Rouge, which is also the default choice in Jekyll
      # `node.fence_info` will hold the language we want to highlight
      # for example,
      # ```ruby
      # # code here
      # ```
      # will have node.fence_info == 'ruby'
      # fall back to plain text when Rouge doesn't know the language
      lexer = Rouge::Lexer.find_fancy(node.fence_info) || Rouge::Lexers::PlainText.new
      formatter = Rouge::Formatters::HTML.new # get the most common formatter

      # format our html
      html = '<div class="highlighter-rouge">' + formatter.format(lexer.lex(source)) + '</div>'

      # and render it
      new_node = CommonMarker::Node.new(:html)
      new_node.string_content = html

      # insert our new parsed content and remove the original code block
      node.insert_before(new_node)
      node.delete
    end

    # done! we have a document with highlighted source blocks
    puts ast.to_html
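One caveat with `node.fence_info`: the CommonMark spec lets the info string after the opening fence carry more than the language name, and only its first word is typically treated as the language. Below is a minimal sketch of how that first word could be picked out before it is handed to the highlighter. The helper name `language_from_info` is hypothetical (it is not part of CommonMarker or Rouge), and it uses only plain Ruby string methods.

```ruby
# Hypothetical helper, not part of CommonMarker or Rouge:
# returns the first word of a fenced code block's info string,
# or nil when there is nothing usable as a language name.
def language_from_info(fence_info)
  # the info string may look like "ruby" or "ruby linenos";
  # split on whitespace and keep only the first word
  first_word = fence_info.to_s.strip.split(/\s+/, 2).first

  # an empty or blank info string gives us no language at all
  first_word.nil? || first_word.empty? ? nil : first_word
end

p language_from_info('ruby')          # => "ruby"
p language_from_info('ruby linenos')  # => "ruby"
p language_from_info('')              # => nil
```

With something like this in place, the `next if node.fence_info == ''` check could become `next unless (lang = language_from_info(node.fence_info))`, and `lang` would be what gets passed to the lexer lookup.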
To avoid this headache in the future, I released this recipe as a gem:
commonmarker-rouge. Now, with just a fork of jekyll-commonmark and
CommonMarker::Rouge, I have a blog parsed
as CommonMark with syntax highlighting intact.