Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to interleave a String with a character sequence

Tags:

java

string

What is the best way to interleave a java String with a given character sequence. The interleave interval should be changeable.

Example:

String s = " .... 0000000000000 ..."; // length random
String b = interleave(s, 3, "-");

Result:

... 000-000-000-000-000 ...

another example:

String s = " .... we all we all we all ...";
String b = interleave(s, 7, "rock ");

Result:

... we all rock we all rock we all rock ...

The function should also work if the string length isn't a multiple of the the interleave distance. Any suggestions? Is there (again) a 'commons' way to do it?

like image 768
Chris Avatar asked Nov 05 '09 11:11

Chris


People also ask

What is interleaved string?

An interleaving of two strings s and t is a configuration where they are divided into non-empty substrings such that: s = s1 + s2 + ...

What is interleaving string in Java?

C is said to be interleaving A and B, if it contains all and only characters of A and B and order of all characters in individual strings is preserved.

What is interleaving in C?

If all characters of C match either with a character of A or a character of B and length of C is sum of lengths of A and B, then C is an interleaving A and B.


1 Answers

Here's quite simple and fairly readable implementation (I call it StringBuilder in the benchmark below):

public static String interleave(String s, int interval, String separator)
{
    StringBuilder sb = new StringBuilder(s);
    for (int pos = (s.length()-1) / interval; pos > 0; pos--)
    {
        sb.insert(pos * interval, separator);
    }
    return sb.toString();
}

If you're concerned with the efficiency of the simple StringBuilder implementation then maybe this implementation would suit your needs better (I call it Arrays in the benchmark below):

public static String interleave(String string, int interval, String separator)
{
    char[] src = string.toCharArray();
    char[] sep = separator.toCharArray();
    int count = (src.length-1)/interval;
    char[] dst = new char[src.length + count * sep.length];
    int srcpos = 0, dstpos = 0;
    for (int i = 0; i < count; i++)
    {
        System.arraycopy(src, srcpos, dst, dstpos, interval);
        srcpos += interval;
        dstpos += interval;
        System.arraycopy(sep, 0, dst, dstpos, sep.length);
        dstpos += sep.length;
    }
    if (dstpos < dst.length)
    {
        System.arraycopy(src, srcpos, dst, dstpos, dst.length - dstpos);
    }
    return String.valueOf(dst);
}

Note: I would probably use this kind of implementation only when under J2ME environment but it should be seriously faster on huge strings. The readability is quite poor though...

There of course always is the RegExp way of doing things which is surprisingly quite fast after you climb past the length where compiling the RegExp itself stops being a problem (you can't pre-compile a RegExp, because it's generated on the fly depending on interval, thanks to Rubens Farias for pointing this out, somehow missed it myself). So here it goes (I call this RegExp in the benchmark below):

public static String interleave(String string, int interval, String separator)
{
    return string.replaceAll("(.{"+interval+"})", "$1"+Matcher.quoteReplacement(separator));
}

Note: This implementation inserts separator at the end if the length of string is at a multiple of interval (whereas the other implementations don't). I don't like RegExps because they are un-readable and not too fast either. Oh and you can easily forget the "quoteReplacement" part and put yourself into a big trouble if the separator contains "$1" or even worse - if it comes from the user.

Benchmark

At this point I've done some benchmarking, so the first implementation at string length 100000 takes 0.002643 seconds, the second one - 0.000010, the third one - 0.000071, but everything depends on string length.

Length    StringBuilder   Arrays       RegExp
  10000     0.000012     0.000001     0.000054
 100000     0.002643     0.000010     0.000071
1000000     0.315413     0.000026     0.000199

It's by no means a serious benchmarking, but it still shows the trends and complexities of the algorithms involved.

Note: Although it is fun to play with these ideas, we're still talking about sub-second improvements when working with strings that are less than 1M in size. So it doesn't matter too much which way you go if you're only working with strings that are up to 1K in size (it will be 0ms vs 0ms). The most important thing is that it should be readable, straightforward and not take too much time to write as I'm sure you have more important problems to solve unless you're writing an universal library for everyone to use in the weirdest cases. Remember - your time is much more valuable than CPU time.

Left- and right-aligned interleaving

I'll take the Arrays implementation for that as it seems easiest to change for this:

public static String interleave(String string, int interval, String separator, boolean fromRight)
{
    char[] src = string.toCharArray();
    char[] sep = separator.toCharArray();
    int count = (src.length-1)/interval;
    char[] dst = new char[src.length + count * sep.length];
    int srcpos = 0, dstpos = 0;
    if (fromRight)
    {
        srcpos = dstpos = src.length - count * interval;
        if (srcpos > 0) System.arraycopy(src, 0, dst, 0, srcpos);
        if (count > 0)
        {
            System.arraycopy(sep, 0, dst, dstpos, sep.length);
            dstpos += sep.length;
            count--;
        }
    }
    for (int i = 0; i < count; i++)
    {
        System.arraycopy(src, srcpos, dst, dstpos, interval);
        srcpos += interval;
        dstpos += interval;
        System.arraycopy(sep, 0, dst, dstpos, sep.length);
        dstpos += sep.length;
    }
    if (dstpos < dst.length)
    {
        System.arraycopy(src, srcpos, dst, dstpos, dst.length - dstpos);
    }
    return String.valueOf(dst);
}
like image 181
inkredibl Avatar answered Sep 21 '22 20:09

inkredibl