Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Ellipsizing a set of names

OK, I'm sure somebody, somewhere must have come up with an algorithm for this already, so I figured I'd ask before I go off to (re)invent it myself.

I have a list of arbitrary (user-entered) non-empty text strings. Each string can be any length (except 0), and they're all unique. I want to display them to the user, but I want to trim them to some fixed length that I decide, and replace part of them with an ellipsis (...). The catch is that I want all of the output strings to be unique.

For example, if I have the strings:

  • Microsoft Internet Explorer 6
  • Microsoft Internet Explorer 7
  • Microsoft Internet Explorer 8
  • Mozilla Firefox 3
  • Mozilla Firefox 4
  • Google Chrome 14

then I wouldn't want to trim the ends of the strings, because that's the unique part (don't want to display "Microsoft Internet ..." 3 times), but it's OK to cut out the middle part:

  • Microsoft...rer 6
  • Microsoft...rer 7
  • Microsoft...rer 8
  • Mozilla Firefox 3
  • Mozilla Firefox 4
  • Google Chrome 14

Other times, the middle part might be unique, and I'd want to trim the end:

  • Minutes of Company Meeting, 5/25/2010 -- Internal use only
  • Minutes of Company Meeting, 6/24/2010 -- Internal use only
  • Minutes of Company Meeting, 7/23/2010 -- Internal use only

could become:

  • Minutes of Company Meeting, 5/25/2010...
  • Minutes of Company Meeting, 6/24/2010...
  • Minutes of Company Meeting, 7/23/2010...

I guess it should probably never ellipsize the very beginning of the strings, even if that would otherwise be allowed, since that would look weird. And I guess it could ellipsize more than one place in the string, but within reason -- maybe 2 times would be OK, but 3 or more seems excessive. Or maybe the number of times isn't as important as the size of the chunks that remain: less than about 5 characters between ellipses would be rather pointless.

The inputs (both number and size) won't be terribly large, so performance is not a major concern (well, as long as the algorithm doesn't try something silly like enumerating all possible strings until it finds a set that works!).

I guess these requirements seem pretty specific, but I'm actually fairly lenient -- I'm just trying to describe what I have in mind.

Has something like this been done before? Is there some existing algorithm or library that does this? I've googled some but found nothing quite like this so far (but maybe I'm just bad at googling). I have to believe somebody somewhere has wanted to solve this problem already!

like image 520
Ken Avatar asked Feb 14 '11 20:02

Ken


1 Answers

It sounds like an application of the longest common substring problem.

Replace the longest substring common to all strings with ellipsis. If the string is still too long and you are allowed to have another ellipsis, repeat.

You have to realize that you might not be able to "ellipsize" a given set of strings enough to meet length requirements.

like image 116
erickson Avatar answered Oct 06 '22 09:10

erickson