I have a page that lists news articles. To cut down on the page's length, I only want to display a teaser (the first 200 words or 600 characters of the article) followed by a "more..." link that, when clicked, expands the rest of the article with jQuery/JavaScript. I have all that figured out, and I even found the following helper method on some paste page, which makes sure that the news article (a string) is not chopped up right in the middle of a word:
def shorten(string, count = 30)
  if string.length >= count
    shortened = string[0, count]
    splitted = shortened.split(/\s/)
    words = splitted.length
    splitted[0, words - 1].join(" ") + ' ...'
  else
    string
  end
end
The problem is that the news article bodies I get from the DB are formatted HTML. So if I'm unlucky, the above helper will chop my article string right in the middle of an HTML tag and insert the "more..." string there, which will corrupt the HTML on my page.
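To make the failure concrete, here is a small reproduction. The helper is the one above, copied so the snippet runs on its own; the article markup and URL are made up for illustration.

```ruby
# The helper from the question, reproduced so the example is self-contained.
def shorten(string, count = 30)
  if string.length >= count
    shortened = string[0, count]
    splitted = shortened.split(/\s/)
    words = splitted.length
    splitted[0, words - 1].join(" ") + ' ...'
  else
    string
  end
end

# Hypothetical article body (assumption, not taken from a real database).
html = '<p>Read the <a href="/news/42" class="more">full story</a> today.</p>'

# The 30-character cut lands inside the <a ...> tag, so the result ends
# with a dangling "<a" and the <p> element is never closed.
puts shorten(html, 30)
# prints: <p>Read the <a ...
```

The helper only knows about whitespace, so it treats the attribute list of the tag like ordinary words.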
Is there any way around this or is there a plugin out there that I can use to generate excerpts/teasers from an HTML string?
You can use a combination of the Rails view helpers sanitize and truncate.
truncate("And they found that many people were sleeping better.",
         :omission => "... (continued)", :length => 15)
# => And they found... (continued)
I'm doing a similar task where I have blog posts and I just want to show a quick excerpt. So in my view I simply do:
sanitize(truncate(blog_post.body, length: 150))
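That gives me the first 150 characters with disallowed markup removed, and it is handled in the view, so it's MVC friendly. Note that sanitize only removes tags that are not on its allow-list, and truncating first can still cut a tag in half. If you want a plain-text teaser, truncate(strip_tags(blog_post.body), length: 150) strips every tag before truncating.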
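Outside of Rails, the "strip the markup first, then truncate" idea can be sketched in plain Ruby. The helper name plain_text_teaser is hypothetical and the regex-based tag removal is a naive assumption; a real HTML parser or the Rails strip_tags helper is more robust.

```ruby
# Hypothetical helper, not part of Rails: reduce HTML to plain text, then
# cut at the last space that fits within `length` characters.
def plain_text_teaser(html, length = 150, omission = "...")
  # Naive tag removal (assumption): replace each tag with a space, then
  # collapse runs of whitespace.
  text = html.gsub(/<[^>]*>/, " ").gsub(/\s+/, " ").strip
  return text if text.length <= length

  cut = text[0, length]
  # Back up to the last space so the teaser never ends mid-word;
  # keep the raw cut if it contains no space at all.
  last_space = cut.rindex(" ")
  cut = cut[0, last_space] if last_space
  cut + omission
end

puts plain_text_teaser("<p>Many <b>people</b> were sleeping better.</p>", 20)
# prints: Many people were...
```

Because the tags are gone before the cut is made, the teaser cannot end inside a tag; the trade-off is that all formatting is lost.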
Good luck!
My answer here should work. The original question (err, asked by me) was about truncating Markdown, but I ended up converting the Markdown to HTML and then truncating that, so it should apply here too.
Of course, if your site gets much traffic, you should cache the excerpt (perhaps store it in the database when the post is created or updated). This would also mean you could let the user modify or enter their own excerpt.
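A minimal plain-Ruby sketch of that caching idea follows. The Article class, its build_excerpt stand-in, and the 20-character limit are all hypothetical; in a Rails app the same logic would more likely sit in a model callback that writes to an excerpt column.

```ruby
# Hypothetical model (assumption): the excerpt is generated once, whenever
# the body is assigned, instead of on every page view.
class Article
  attr_reader :body
  attr_writer :excerpt

  def initialize(body)
    self.body = body
  end

  # Recompute the cached excerpt only when the body changes.
  def body=(html)
    @body = html
    @generated_excerpt = build_excerpt(html)
  end

  # An author-supplied excerpt wins over the generated one.
  def excerpt
    @excerpt || @generated_excerpt
  end

  private

  # Stand-in for the real truncation step (e.g. an HTML-aware truncator).
  def build_excerpt(html, length = 20)
    text = html.gsub(/<[^>]*>/, "")
    text.length > length ? text[0, length] + "..." : text
  end
end

article = Article.new("<p>A much longer article body here</p>")
puts article.excerpt
# prints: A much longer articl...
```

Assigning article.excerpt = "Custom teaser" afterwards overrides the generated text without touching the body.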
Usage:
>> puts "<p><b><a href=\"hi\">Something</a></p>".truncate_html(5, "...")
=> <p><b><a href="hi">Somet...</a></b></p>
..and the code (adapted from the other answer):
require 'rexml/parsers/pullparser'

class String
  # Truncates the text content to at most len characters while keeping the
  # markup balanced. at_end, if given, is appended before the open tags
  # are closed.
  def truncate_html(len = 30, at_end = nil)
    p = REXML::Parsers::PullParser.new(self)
    tags = []       # stack of currently open element names
    new_len = len   # text characters still allowed
    results = ''
    while p.has_next? && new_len > 0
      p_e = p.pull
      case p_e.event_type
      when :start_element
        tags.push p_e[0]
        results << "<#{tags.last}#{attrs_to_s(p_e[1])}>"
      when :end_element
        results << "</#{tags.pop}>"
      when :text
        # Take at most new_len characters of this text node.
        results << p_e[0][0, new_len]
        new_len -= p_e[0].length
      else
        results << "<!-- #{p_e.inspect} -->"
      end
    end
    # Append the caller's suffix, if any.
    results << at_end if at_end
    # Close every element that is still open, innermost first.
    tags.reverse.each do |tag|
      results << "</#{tag}>"
    end
    results
  end

  private

  # Serializes an attribute hash back into ' name="value"' pairs.
  def attrs_to_s(attrs)
    if attrs.empty?
      ''
    else
      ' ' + attrs.to_a.map { |attr| %{#{attr[0]}="#{attr[1]}"} }.join(' ')
    end
  end
end
Thanks a lot for your answers! However, in the meantime I stumbled upon the jQuery HTML Truncator plugin, which perfectly fits my purposes and shifts the truncation to the client-side. It doesn't get any easier :-)