
Splitting a string in Java into fixed-length chunks

Tags: java, string, split

I have seven strings in a program, named string1 through string7.

Each of them can hold up to 30 characters.

I will get an input string of unknown length.

I have to split this input string into 30-character substrings, then put the first substring into string1, the second into string2, and so on for as long as input remains. If the input string is longer than 210 characters, the remainder at the end is ignored.

How should I handle the case where the input string is shorter than 210 characters?

For example, with 145 characters, string1 through string4 will be full and string5 will hold the remaining 25 characters.

How can I handle this nicely?

I could read the input character by character, putting the first 30 characters into string1, the next 30 into string2, and so on until all characters are consumed.

But is there a better way to do this?

Vicky asked Jun 28 '12 10:06



5 Answers

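import java.util.ArrayList;
import java.util.Collection;

private static Collection<String> splitStringBySize(String str, int size) {
    ArrayList<String> split = new ArrayList<>();
    // Advance by chunk start index; this avoids adding an empty trailing
    // chunk when str.length() is an exact multiple of size (or zero).
    for (int start = 0; start < str.length(); start += size) {
        split.add(str.substring(start, Math.min(start + size, str.length())));
    }
    return split;
}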
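
To connect a splitter like this back to the seven 30-character fields in the question, here is a minimal sketch. The class name SevenFields, the helper names, and the choice to hold the fields in an array with unused ones set to "" are assumptions made for illustration, not part of the original answer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SevenFields {
    static final int FIELD_SIZE = 30;  // from the question: 30 characters per string
    static final int FIELD_COUNT = 7;  // from the question: string1 through string7

    // Splits input into consecutive chunks of at most `size` characters.
    static List<String> chunks(String input, int size) {
        List<String> out = new ArrayList<>();
        for (int start = 0; start < input.length(); start += size) {
            out.add(input.substring(start, Math.min(start + size, input.length())));
        }
        return out;
    }

    // Returns exactly FIELD_COUNT strings. Fields with no input left are ""
    // (an assumption), and anything past FIELD_COUNT chunks is ignored.
    static String[] toFields(String input) {
        String[] fields = new String[FIELD_COUNT];
        Arrays.fill(fields, "");
        List<String> parts = chunks(input, FIELD_SIZE);
        for (int i = 0; i < Math.min(parts.size(), FIELD_COUNT); i++) {
            fields[i] = parts.get(i);
        }
        return fields;
    }

    public static void main(String[] args) {
        // Build a 145-character input, as in the question's example.
        String input = new String(new char[145]).replace('\0', 'x');
        String[] fields = toFields(input);
        for (int i = 0; i < fields.length; i++) {
            System.out.println("string" + (i + 1) + " length: " + fields[i].length());
        }
    }
}
```

If the seven values really must live in separate variables, they can be read out of the array one by one (fields[0] for string1, and so on).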
Matthias answered Oct 17 '22 00:10


If you can use third-party libraries, with Guava this is just

Iterable<String> chunks = Splitter.fixedLength(30).split(string);

This can be converted to a List<String> with e.g. Lists.newArrayList.

(Disclosure: I contribute to Guava.)

Louis Wasserman answered Oct 17 '22 01:10


I ran into an issue with a specific use of this technique. A user was copying and pasting Microsoft Word content into an HTML field, and that content was later split into multiple database fields using this technique.

The technique broke on the carriage returns and other line-break characters in the Word content. By default the dot (.) in a Java regex does not match line terminators, so the regex split at each carriage return instead of after the specified number of characters. To correct the issue, I modified Michael Besteck's code to the following:

Matcher m = Pattern.compile(".{1,30}", Pattern.DOTALL).matcher(s);
String s1 = m.find() ? s.substring(m.start(), m.end()) : "";
String s2 = m.find() ? s.substring(m.start(), m.end()) : "";
String s3 = m.find() ? s.substring(m.start(), m.end()) : "";
String s4 = m.find() ? s.substring(m.start(), m.end()) : "";
String s5 = m.find() ? s.substring(m.start(), m.end()) : "";
String s6 = m.find() ? s.substring(m.start(), m.end()) : "";
String s7 = m.find() ? s.substring(m.start(), m.end()) : "";

With Pattern.DOTALL the dot also matches line terminators, so carriage returns and newlines are counted as ordinary characters within each chunk.
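
The difference is easy to see on a short input that contains a line break. The following is only an illustrative sketch: the class name DotAllDemo, the helper chunks, and the chunk size of 4 are made up for the demonstration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DotAllDemo {
    // Collects successive matches of .{1,size}, optionally with DOTALL enabled.
    static List<String> chunks(String s, int size, boolean dotAll) {
        Pattern p = Pattern.compile(".{1," + size + "}", dotAll ? Pattern.DOTALL : 0);
        Matcher m = p.matcher(s);
        List<String> out = new ArrayList<>();
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        String s = "abc\r\ndefgh";
        // Without DOTALL the \r\n is skipped and forces an early break: abc, defg, h
        System.out.println(chunks(s, 4, false).size() + " chunks without DOTALL");
        // With DOTALL the line break counts toward the chunk length: "abc\r", "\ndef", "gh"
        System.out.println(chunks(s, 4, true).size() + " chunks with DOTALL");
    }
}
```

Note that without DOTALL the line-break characters are also dropped from the result entirely, which may or may not be what the database fields should contain.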

Justin Bahn answered Oct 07 '22 18:10


Since your strings are not in an array or List, you need to assign them explicitly.

    Matcher m = Pattern.compile(".{1,30}").matcher(s);
    String s1 = m.find() ? s.substring(m.start(), m.end()) : "";
    String s2 = m.find() ? s.substring(m.start(), m.end()) : "";
    String s3 = m.find() ? s.substring(m.start(), m.end()) : "";
    String s4 = m.find() ? s.substring(m.start(), m.end()) : "";
    String s5 = m.find() ? s.substring(m.start(), m.end()) : "";
    String s6 = m.find() ? s.substring(m.start(), m.end()) : "";
    String s7 = m.find() ? s.substring(m.start(), m.end()) : "";
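
If the seven values could be kept in an array instead of separate variables, the repeated lines collapse into a loop. This is a sketch under that assumption; the class name MatcherFields and the method split are hypothetical, and Pattern.DOTALL is added here so that line breaks in the input do not end a chunk early.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatcherFields {
    // Fills `count` fields with successive chunks of up to `size` characters.
    // Fields with no input left stay ""; input beyond count * size is ignored.
    static String[] split(String s, int size, int count) {
        String[] fields = new String[count];
        Arrays.fill(fields, "");
        Matcher m = Pattern.compile(".{1," + size + "}", Pattern.DOTALL).matcher(s);
        for (int i = 0; i < count && m.find(); i++) {
            fields[i] = m.group();  // same text as s.substring(m.start(), m.end())
        }
        return fields;
    }

    public static void main(String[] args) {
        String[] fields = split("abcdefg", 3, 4);
        System.out.println(Arrays.toString(fields));
    }
}
```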
Michael Besteck answered Oct 17 '22 00:10


How about splitting the string through a char array? You could create a general-purpose method that takes the chunk size and the maximum length to consider, and returns a String array:

public class SplitStringIntoFixedSizeChunks {

    public static String[] Split(String text, int chunkSize, int maxLength) { 
        char[] data = text.toCharArray();       
        int len = Math.min(data.length,maxLength);
        String[] result = new String[(len+chunkSize-1)/chunkSize];
        int linha = 0;
        for (int i=0; i < len; i+=chunkSize) {
            result[linha] = new String(data, i, Math.min(chunkSize,len-i));
            linha++;
        }
        return result;
    }

    public static void main(String[] args) { 
        String x = "flskdafsld~fdsakçkçfsda sfdaldsak~çfdskkfadsçlkçfldskçlflçfdskçldksçlkfdslçakafdslçdsklçfdskçlafdskçkdfsçlkfds~çlkfasdçlçfdls~kçlf~dksçlsakdçlkfç";
        System.out.println("x length: "+x.length());
        String[] lines = Split(x, 30, 210);
        for (int i=0; i < lines.length; i++) {
            System.out.println("lines["+i+"]: (len: "+lines[i].length()+") : "+lines[i]);
        }
    }
}

This example prints:

x length: 145
lines[0]: (len: 30) : flskdafsld~fdsakçkçfsda sfdald
lines[1]: (len: 30) : sak~çfdskkfadsçlkçfldskçlflçfd
lines[2]: (len: 30) : skçldksçlkfdslçakafdslçdsklçfd
lines[3]: (len: 30) : skçlafdskçkdfsçlkfds~çlkfasdçl
lines[4]: (len: 25) : çfdls~kçlf~dksçlsakdçlkfç
Jose Tepedino answered Oct 17 '22 00:10