
Indenting generated markup in Jekyll/Ruby

This is probably a silly question, but is there any way to make the markup Jekyll generates preserve the indentation of the Liquid tag? The world won't end if it isn't solvable; I'm just curious, since I like my code to look tidy even when compiled. :)

For example, I have these two files:

base.html:

<body>
    <div id="page">
        {{content}}
    </div>
</body>

index.md:

---
layout: base
---
<div id="recent_articles">
    {% for post in site.posts %}
    <div class="article_puff">
        <img src="/resources/images/fancyimage.jpg" alt="" />
        <h2><a href="{{post.url}}">{{post.title}}</a></h2>
        <p>{{post.description}}</p>
        <a href="{{post.url}}" class="read_more">Read more</a>
    </div>
    {% endfor %}    
</div>

The problem is that the content pulled in by the {{content}} tag is rendered without the indentation used above.

So instead of

<body>
    <div id="page">
        <div id="recent_articles">  
            <div class="article_puff">
                <img src="/resources/images/fancyimage.jpg" alt="" />
                <h2><a href="/articles/2012/11/14/gettin-down-with-rwd.html">Gettin' down with responsive web design</a></h2>
                <p>Everyone's talking about it. Your client wants it. You need to code it.</p>
                <a href="/articles/2012/11/14/gettin-down-with-rwd.html" class="read_more">Read more</a>
            </div>
        </div>
    </div>
</body>

I get

<body>
    <div id="page">
        <div id="recent_articles">  
<div class="article_puff">
<img src="/resources/images/fancyimage.jpg" alt="" />
    <h2><a href="/articles/2012/11/14/gettin-down-with-rwd.html">Gettin' down with responsive web design</a></h2>
    <p>Everyone's talking about it. Your client wants it. You need to code it.</p>
    <a href="/articles/2012/11/14/gettin-down-with-rwd.html" class="read_more">Read more</a>
</div>
</div>
    </div>
</body>

It seems that only the first line is indented correctly; the rest start at or near the beginning of the line. So, is there a way to do a multiline Liquid templating import? :)

russinkungen asked Nov 30 '12 14:11


1 Answer

Using a Liquid Filter

I managed to make this work using a Liquid filter. There are a few caveats:

  • Your input must be clean. A few of my files contained curly quotes and non-printable characters that looked like whitespace (copypasta from Word or some such), and Jekyll reported "Invalid byte sequence in UTF-8" as an error.

  • It could break some things. I was using <i class="icon-file"></i> icons from Twitter Bootstrap. The filter replaced the empty tag with <i class="icon-file"/> and Bootstrap did not like that. It also breaks the Octopress {% codeblock %}s in my content; I didn't really look into why.

  • While this will clean the output of a Liquid variable such as {{ content }}, it does not actually solve the problem in the original post, which is to indent the HTML in the context of the surrounding HTML. The filter produces well-formatted HTML, but as a fragment that is not indented relative to the tags above it. If you want to format everything in context, use the Rake task instead of the filter.
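
For the first caveat, one option is to scrub the input before the filter sees it. The following is a minimal sketch, not something from my plugin: `clean_utf8` is a hypothetical helper name, and it assumes the bytes are meant to be UTF-8, so each invalid sequence is replaced with U+FFFD rather than converted from some other encoding.

```ruby
# Hypothetical helper (not part of Jekyll or Liquid): makes a string safe to
# treat as UTF-8. Assumes the bytes were intended to be UTF-8 in the first place.
def clean_utf8(input)
  # dup so a frozen or shared string is not modified in place, then relabel
  # the bytes as UTF-8 without transcoding them
  utf8 = input.dup.force_encoding(Encoding::UTF_8)
  # String#scrub (Ruby 2.1+) replaces every invalid byte sequence
  utf8.scrub("\uFFFD")
end
```

If the files actually came from a Windows-1252 editor, transcoding from that encoding would preserve the curly quotes better than scrubbing; this sketch only stops the error.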


require 'rubygems'
require 'json'
require 'nokogiri'
require 'nokogiri-pretty'

module Jekyll
  module PrettyPrintFilter
    def pretty_print(input)
      #seeing some ASCII-8 come in
      input = input.encode("UTF-8")

      #Parsing with nokogiri first cleans up some things the XSLT can't handle
      content = Nokogiri::HTML::DocumentFragment.parse input
      parsed_content = content.to_html

      #Unfortunately nokogiri-pretty can't use DocumentFragments...
      html = Nokogiri::HTML parsed_content
      pretty = html.human

      #...so now we need to remove the stuff it added to make valid HTML
      output = PrettyPrintFilter.strip_extra_html(pretty)
      output
    end

    def PrettyPrintFilter.strip_extra_html(html)
      #type declaration
      html = html.sub('<?xml version="1.0" encoding="ISO-8859-1"?>','')

      #second <html> tag
      first = true
      html = html.gsub('<html>') do |match|
        if first == true
          first = false
          next
        else
          ''
        end
      end

      #first </html> tag
      html = html.sub('</html>','')

      #second <head> tag
      first = true
      html = html.gsub('<head>') do |match|
        if first == true
          first = false
          next
        else
          ''
        end
      end

      #first </head> tag
      html = html.sub('</head>','')

      #second <body> tag
      first = true
      html = html.gsub('<body>') do |match|
        if first == true
          first = false
          next
        else
          ''
        end
      end

      #first </body> tag
      html = html.sub('</body>','')

      html
    end
  end
end

Liquid::Template.register_filter(Jekyll::PrettyPrintFilter)
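
Because the filter returns a fragment that starts at column zero, a second, much simpler filter could pad it to the depth of the surrounding layout. This is only a sketch under assumptions: `IndentFilter` and `indent` are hypothetical names, the number of spaces has to be supplied by hand, and whitespace-sensitive content such as <pre> blocks would be altered by it.

```ruby
# Hypothetical filter, not part of Jekyll or Liquid.
module IndentFilter
  # Pads every line after the first with `spaces` spaces. The first line is
  # left alone because the tag's own position in the layout already indents it.
  # Blank lines are left untouched to avoid trailing whitespace.
  def indent(input, spaces = 4)
    pad = " " * spaces.to_i
    input.to_s.each_line.with_index.map do |line, index|
      index.zero? || line.strip.empty? ? line : pad + line
    end.join
  end
end

# Register only when Liquid is loaded, as it is inside a Jekyll plugin.
Liquid::Template.register_filter(IndentFilter) if defined?(Liquid)
```

In the layout from the question this would presumably be used as {{ content | indent: 8 }}, matching the eight spaces in front of the tag.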

Using a Rake task

I use a task in my Rakefile to pretty print the output after the Jekyll site has been generated.

require 'nokogiri'
require 'nokogiri-pretty'

desc "Pretty print HTML output from Jekyll"
task :pretty_print do
  #change public to _site or wherever your output goes
  html_files = File.join("**", "public", "**", "*.html")

  Dir.glob html_files do |html_file|
    puts "Cleaning #{html_file}"

    #File.read opens, reads and closes the file in one step
    contents = File.read(html_file)

    begin
      #we're gonna parse it as XML so we can apply an XSLT
      html = Nokogiri::XML(contents)

      #the human() method is from nokogiri-pretty. Just an XSL transform on the XML.
      pretty_html = html.human
    rescue StandardError => msg
      puts "Failed to pretty print #{html_file}: #{msg}"
      #skip this file so a failed parse doesn't overwrite it with nothing
      next
    end

    #Yep, we're overwriting the file. Potentially destructive.
    File.write(html_file, pretty_html)
  end
end
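
Since the task overwrites the generated files, keeping a copy of each one first makes it less destructive. A minimal sketch, assuming a hypothetical helper named `overwrite_with_backup` and a ".bak" suffix that nothing else in the build relies on:

```ruby
require 'fileutils'

# Hypothetical helper: copies the file to "<path>.bak" and only then
# replaces its contents, so the original output can be restored.
def overwrite_with_backup(path, new_contents)
  FileUtils.cp(path, "#{path}.bak")
  File.write(path, new_contents)
end
```

Inside the task, the final write could call this helper instead; the .bak files would then need to be kept out of whatever gets deployed.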

bwest answered Sep 24 '22 00:09