
Ruby: substring to a certain length and also to last whitespace within substring

I am trying to truncate a long string of text to a certain length, but want to also make sure that the truncated result ends at a whitespace. I am also going to append an ellipsis afterwards.

For example this:

"This is a very long string that has more characters than I want in it."

becomes this:

"This is a very long string that..."

I am starting with this but obviously this doesn't deal with the problem of ending the string on whitespace.

<%= item.description[0..30] %>&hellip;
e_r asked Feb 29 '12 17:02




5 Answers

If you're using Rails 4+ you should just use the built-in truncate helper method, e.g.:

<%= truncate item.description, length: 30, separator: /\s/ %>

The string "..." will be appended to truncated text and counts toward :length; to specify a different string, use the :omission option, e.g. omission: "xxx".

For Rails 3.x the :separator option must be a string. Giving :separator => " " will be fine in many cases, but only catches spaces and not other whitespace. One compromise would be to use String#squish, which replaces all sequences of whitespace with a single space (and also trims leading and trailing whitespace), e.g. "foo\n\tbar ".squish yields "foo bar". It would look like this:

<%= truncate item.description.squish, :length => 30, :separator => " ", :omission => "&hellip;" %>
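
Outside Rails, the squish step can be approximated in plain Ruby. This is a minimal sketch and not the ActiveSupport source; `squish_like` is a hypothetical name used only for illustration:

```ruby
# Hypothetical helper: approximates ActiveSupport's String#squish.
# Collapses every run of whitespace into one space, then trims both ends.
def squish_like(text)
  text.gsub(/[[:space:]]+/, ' ').strip
end

puts squish_like("foo\n\tbar ")  # prints "foo bar"
```

After this normalization, a plain " " separator is enough to find every word break.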
Jordan Running answered Sep 24 '22 19:09


s[0..30].gsub(/\s\w+\s*$/, '...')

The original answer didn't work in the case where the 30 character substring ended on a whitespace character. This solves that.
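
Note that `s[0..30]` is an inclusive range, so it returns 31 characters rather than 30. A small sketch of the difference between the range form and the start/length form, using the same sample string as the session below:

```ruby
desc = "This is some text it is really long"

inclusive = desc[0..30]  # range form: indices 0 through 30, inclusive
counted   = desc[0, 30]  # start/length form: 30 characters from index 0

puts inclusive.length  # prints 31
puts counted.length    # prints 30
```

The extra character is the space at index 30, which is why the slice in this example ends on whitespace.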

>> desc="This is some text it is really long"

>> desc[0..30].gsub(/\s\w+$/,'...')
"This is some text it is really "

>> desc[0..30].gsub(/\s\w+\s*$/,'...')
"This is some text it is..."
evfwcqcg answered Sep 22 '22 19:09


@evfwcqcg's answer is very good, but I found it did not work well when:

  1. The string contains non-whitespace characters that are not alphanumeric.
  2. The string is shorter than the desired length.

Demonstration:

>> s = "How about we put some ruby method Class#Method in our string"
=> "How about we put some ruby method Class#Method in our string"
>> s[0..41].gsub(/\s\w+\s*$/, '...')
=> "How about we put some ruby method Class#Me"
>> s[0..999].gsub(/\s\w+\s*$/, '...')
=> "How about we put some ruby method Class#Method in our..."

This is not what I expected.

Here is what I am using to fix this:

def truncate s, length = 30, ellipsis = '...'
  if s.length > length
    s.to_s[0..length].gsub(/[^\w]\w+\s*$/, ellipsis)
  else
    s
  end
end

When doing tests, here is the output:

>> s = "This is some text it is really long"
=> "This is some text it is really long"
>> truncate s
=> "This is some text it is..."

Still behaves as expected.

>> s = "How about we put some ruby method Class#Method in our string"
=> "How about we put some ruby method Class#Method in our string"
>> truncate s, 41
=> "How about we put some ruby method Class..."
>> truncate s, 999
=> "How about we put some ruby method Class#Method in our string"

This is more like it.
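
A possible variation on the method above, offered as a sketch and not as the answerer's code: it calls `to_s` before the length check so `nil` is tolerated, and it uses `sub` with `\z` so that only the very end of the slice is replaced even if the text contains newlines. The name `truncate_at_boundary` is hypothetical.

```ruby
# Hypothetical variation of the truncate method shown above.
def truncate_at_boundary(text, length = 30, ellipsis = '...')
  str = text.to_s                     # tolerate nil and non-String input
  return str if str.length <= length  # short strings pass through unchanged

  # \W is equivalent to [^\w]; \z anchors at the end of the string
  # (not the end of a line), and sub replaces at most one match.
  str[0..length].sub(/\W\w+\s*\z/, ellipsis)
end

puts truncate_at_boundary("This is some text it is really long")
# prints "This is some text it is..."
```

Like the original, this returns the slice without an ellipsis when it contains no non-word character to cut at, for example a single very long word.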

roychri answered Sep 23 '22 19:09


desc.gsub(/([\w\s]{30}).+/,'\1...')

Expanding on the answer by @evfwcqcg, this is a pure regex that solves the problem of trailing whitespace.

irb(main):031:0> desc="This is some text it is really long"
irb(main):033:0> desc.gsub(/([\w\s]{30}).+/,'\1...')
=> "This is some text it is really..."
irb(main):034:0> desc="This is some text it is really"
=> "This is some text it is really"
irb(main):035:0> desc.gsub(/([\w\s]{30}).+/,'\1...')
=> "This is some text it is really"
irb(main):036:0> desc="This is some text it is real"
=> "This is some text it is real"
irb(main):037:0> desc.gsub(/([\w\s]{30}).+/,'\1...')
=> "This is some text it is real"
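
As a quick check against the sample text from the question (a sketch, using the same pattern as above): the capture group always holds exactly 30 characters, so the cut lands wherever character 30 happens to be, not necessarily on whitespace.

```ruby
question = "This is a very long string that has more characters than I want in it."

# Same pattern as in the answer above, applied to the question's sample text.
result = question.gsub(/([\w\s]{30}).+/, '\1...')

puts result  # prints "This is a very long string tha..."
```

In the sessions above the 30th character happens to fall at the end of a word, which is why those outputs look clean.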
Ethan answered Sep 23 '22 19:09


I am surprised that none of the answers is really correct (or they are limited to using a Rails helper), even though this is a very old question, so here is a solution.

Let's clearly formulate the goal first. We want to truncate string s to 30 characters and also cut off the last word if it cannot fit entirely. We also want to strip trailing spaces from the result and add an ellipsis if the text was shortened.

If the text is longer than the limit, then shortening it is as easy as

s[0,s.rindex(/\s/,30)].rstrip + '...'

If we want the entire result to be at most 30 characters, then it is as simple as subtracting the length of the ellipsis from 30. Because we use three dots (and not a single three-dot character), we need

s[0,s.rindex(/\s/,27)].rstrip + '...'

And the final result (with the test whether we need to truncate at all) is:

if s.length<=30
  s
else
  s[0,s.rindex(/\s/,27)].rstrip + '...'
end

That's it.


Note: There are some edge cases where the desired result is not obvious. Here they are:

  • If the string ends with lots of spaces (s= "Helo word ") but is shorter than 30. Should the spaces be preserved? - Currently they are.
  • The same as above, but the spaces at the end cross the limit of 30. Like in (s= "Twentyseven chars long text ") - Currently all spaces at the end are truncated and the ellipsis is added.
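
One more case worth handling: `rindex` returns `nil` when there is no whitespace at or before the given index, and `s[0, nil]` then raises a TypeError. A minimal sketch that wraps the expression above in a method and falls back to a hard cut in that case; the name `shorten`, its parameters, and the fallback are assumptions, not part of the answer:

```ruby
# Hypothetical wrapper around the rindex approach shown above.
def shorten(s, limit = 30, ellipsis = '...')
  return s if s.length <= limit

  budget = limit - ellipsis.length  # room left for text before the ellipsis
  cut = s.rindex(/\s/, budget)      # last whitespace at or before index `budget`
  cut ||= budget                    # assumed fallback: no whitespace, so hard cut
  s[0, cut].rstrip + ellipsis
end

puts shorten("This is a very long string that has more characters than I want in it.")
# prints "This is a very long string..."
```

Whether a hard cut is the right fallback depends on the use case; returning only the ellipsis or raising an error would be equally defensible.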
gorn answered Sep 23 '22 19:09