Trim a string based on the string length

Tags:

java

string

People also ask

How do you trim the length of a string?

Java String trim(): The trim() method of the Java String class removes leading and trailing whitespace, which it defines as any character whose code is less than or equal to '\u0020' (the space character). It checks both ends of the string and returns the string with that whitespace removed. Note that trim() only strips whitespace; it does not shorten a string to a given length.

How do you trim a string to a specific length in python?

Use syntax string[x:y] to slice a string starting from index x up to but not including the character at index y. If you want only to cut the string to length in python use only string[: length].

What does string's trim () method do?

Java String trim() method: The trim() method removes whitespace from both ends of a string. Note: This method does not change the original string; it returns the trimmed result.

How do I truncate a string in Salesforce?

String str = 'sfdcblog'; String truncatedStr = str.substring(0, 3);


s = s.substring(0, Math.min(s.length(), 10));

Using Math.min like this avoids an exception in the case where the string is already shorter than 10.
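
To make the point about Math.min concrete, here is a minimal sketch. The class name TruncateExample and the helper truncate are made up for illustration and are not part of any library; the helper assumes s is non-null and maxLength is non-negative.

```java
final class TruncateExample {
    // Hypothetical helper: returns at most maxLength chars of s.
    // Assumes s != null and maxLength >= 0.
    static String truncate(String s, int maxLength) {
        return s.substring(0, Math.min(s.length(), maxLength));
    }

    public static void main(String[] args) {
        // Longer than the limit: cut to 10 chars
        System.out.println(truncate("Hello, world!", 10));
        // Shorter than the limit: returned as-is, no exception
        System.out.println(truncate("short", 10));
        try {
            // Without Math.min the end index is past the end of the string
            "short".substring(0, 10);
        } catch (StringIndexOutOfBoundsException e) {
            System.out.println("substring(0, 10) on a 5-char string throws");
        }
    }
}
```

Wrapping the one-liner in a small method like this also gives you a single place to add null or negative-length handling later if you need it.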


Notes:

  1. The above does simple trimming. If you actually want to replace the last characters with three dots if the string is too long, use Apache Commons StringUtils.abbreviate; see @H6's solution. If you want to use the Unicode horizontal ellipsis character, see @Basil's solution.

  2. For typical implementations of String, s.substring(0, s.length()) will return s rather than allocating a new String.

  3. This may behave incorrectly (see footnote 1 below) if your String contains Unicode code points outside of the BMP, e.g. emoji. For a (more complicated) solution that works correctly for all Unicode code points, see @sibnick's solution.


1 - A Unicode codepoint that is not on plane 0 (the BMP) is represented as a "surrogate pair" (i.e. two char values) in the String. By ignoring this, we might trim the string to fewer than 10 code points, or (worse) truncate it in the middle of a surrogate pair. On the other hand, String.length() is not a good measure of Unicode text length, so trimming based on that property may be the wrong thing to do.
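
If you decide the limit should be counted in code points rather than char values, one possible sketch uses String.codePointCount and String.offsetByCodePoints from the standard library. The class and method names here are hypothetical, and this still says nothing about grapheme clusters (combining marks, multi-code-point emoji), so treat it as an illustration rather than a complete answer.

```java
final class CodePointTruncate {
    // Hypothetical helper: keeps at most maxCodePoints code points of s.
    // Assumes s != null; a negative limit is treated as 0.
    static String truncateCodePoints(String s, int maxCodePoints) {
        int total = s.codePointCount(0, s.length());
        if (total <= maxCodePoints) {
            return s; // already short enough
        }
        // Translate "N code points from the start" into a char index,
        // which never lands between the two halves of a surrogate pair
        int end = s.offsetByCodePoints(0, Math.max(maxCodePoints, 0));
        return s.substring(0, end);
    }

    public static void main(String[] args) {
        // 9 BMP chars, then U+1F602 (one code point, two chars), then 5 more
        String s = "abcdafghi\uD83D\uDE02cdefg";
        String cut = truncateCodePoints(s, 10);
        System.out.println(cut);
        System.out.println(cut.length());                        // char count
        System.out.println(cut.codePointCount(0, cut.length())); // code point count
    }
}
```

Note that the result can be longer than 10 char values, so this is the wrong tool if the limit exists to fit a fixed-size char or byte buffer.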


StringUtils.abbreviate from the Apache Commons Lang library could be your friend:

StringUtils.abbreviate("abcdefg", 6) = "abc..."
StringUtils.abbreviate("abcdefg", 7) = "abcdefg"
StringUtils.abbreviate("abcdefg", 8) = "abcdefg"
StringUtils.abbreviate("abcdefg", 4) = "a..."

Commons Lang3 even allows you to set a custom String as the replacement marker. With this you can, for example, use a single-character ellipsis.

StringUtils.abbreviate("abcdefg", "\u2026", 6) = "abcde…"

There is an Apache Commons StringUtils function which does this.

s = StringUtils.left(s, 10);

If len characters are not available, or the String is null, the String will be returned without an exception. An empty String is returned if len is negative.

StringUtils.left(null, *) = null
StringUtils.left(*, -ve) = ""
StringUtils.left("", *) = ""
StringUtils.left("abc", 0) = ""
StringUtils.left("abc", 2) = "ab"
StringUtils.left("abc", 4) = "abc"
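
If you can't add Commons Lang as a dependency, the behaviour described above is easy to approximate in plain Java. This is only a sketch that mirrors the documented cases; LeftExample and left are hypothetical names, and this is not the library's actual source.

```java
final class LeftExample {
    // Hypothetical stand-in mirroring the documented behaviour of StringUtils.left
    static String left(String s, int len) {
        if (s == null) {
            return null;   // null in, null out
        }
        if (len < 0) {
            return "";     // negative length gives an empty String
        }
        // Fewer than len chars available: return the String unchanged
        return s.length() <= len ? s : s.substring(0, len);
    }

    public static void main(String[] args) {
        System.out.println(left(null, 3));
        System.out.println(left("abc", -1).isEmpty());
        System.out.println(left("abc", 2));
        System.out.println(left("abc", 4));
    }
}
```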

StringUtils.left JavaDocs

Courtesy: Steeve McCauley


As usual, nobody cares about UTF-16 surrogate pairs, not even the authors of org.apache.commons/commons-lang3. For background on them, see: What are the most common non-BMP Unicode characters in actual use?

You can see the difference between the correct code and the usual code in this sample:

public static void main(String[] args) {
    // String containing FACE WITH TEARS OF JOY (U+1F602), stored as a surrogate pair
    String s = "abcdafghi\uD83D\uDE02cdefg";
    int maxWidth = 10;
    System.out.println(s);
    // Usual code: ignores surrogate pairs, so it cuts the emoji in half
    System.out.println(s.substring(0, Math.min(s.length(), maxWidth)));
    // Correct code: if the cut would land in the middle of a surrogate pair, back up one char
    if (s.length() > maxWidth) {
        int correctedMaxWidth = (maxWidth > 0 && Character.isLowSurrogate(s.charAt(maxWidth))) ? maxWidth - 1 : maxWidth;
        System.out.println(s.substring(0, correctedMaxWidth));
    } else {
        // Already short enough: nothing to cut
        System.out.println(s);
    }
}

s = s.length() > 10 ? s.substring(0, 10) : s;


Or you can just use this method in case you don't have StringUtils on hand:

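public static String abbreviateString(String input, int maxLength) {
    // Null or already short enough: return unchanged
    if (input == null || input.length() <= maxLength)
        return input;
    // No room for the ".." marker: truncate without it
    if (maxLength < 2)
        return input.substring(0, Math.max(maxLength, 0));
    // Keep maxLength - 2 characters and append the two-dot marker
    return input.substring(0, maxLength - 2) + "..";
}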