
How to implement a LIMITLESS String and StringBuilder

Tags:

java

Java's String and StringBuilder are limited to a length of Integer.MAX_VALUE. In most use cases this is more than adequate, but I have just encountered a use case in which I need to handle and return a String longer than 2,684,354,560 characters.

This is required for capturing an incoming stream of characters, in which I do not have control over the size of the stream, nor do I have the option of re-architecting the solution. What I can do at most is replace a method in an existing module, or introduce a new class that replaces String and StringBuilder in that method.

As a temporary workaround, to prevent the OutOfMemoryError thrown when the StringBuilder length exceeds Integer.MAX_VALUE, I implemented the following safeAppend():

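    // Appends current to ret. If the result would exceed Integer.MAX_VALUE
    // characters, leading (oldest) characters of ret are discarded first.
    private void safeAppend(StringBuilder ret, String current) {
        // Widen to long so the sum cannot overflow int.
        if ((long)ret.length() + current.length() > Integer.MAX_VALUE) {
            String truncateLeadingPart;
            if (current.length() < ret.length()) {
                // Drop as many leading characters as are about to be appended.
                truncateLeadingPart = ret.substring(current.length());
            }
            else {
                // Drop only the overflowing number of leading characters.
                int startIndex = (int)((long)ret.length()+current.length()-Integer.MAX_VALUE);
                truncateLeadingPart = ret.substring(Math.min(ret.length(), startIndex));
            }
            ret.setLength(0);
            ret.append(truncateLeadingPart);
        }
        ret.append(current);
    }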

This method truncates the leading part and keeps at most the trailing 2,147,483,647 characters. However, this workaround/safeguard proved to be inadequate for the task at hand because we cannot afford to lose any data captured from the stream.

What is a recommended approach to implementing a String and StringBuilder that are NOT limited by an int max size?

A limit of Long.MAX_VALUE would be sufficient. A single LimitlessString class that can be appended to efficiently, like StringBuilder, would also be adequate.

asked Jun 12 '26 15:06 by datsb


2 Answers

You won't be able to use String, StringBuilder, or StringBuffer, as the 32-bit length is baked into the interface (CharSequence.length() returns an int). That's also true of arrays and NIO buffers, unfortunately (there have been proposals to fix this, but nothing at the time of writing).

Obviously streaming or using random file access would be a good solution if that is possible.
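As a rough sketch of the streaming option, assuming the replaced method is allowed to hand back a file (or a Reader over it) rather than a String: the class name `StreamSpooler` and the method `spool` are hypothetical, and UTF-8 is an arbitrary choice of encoding.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: copies an incoming character stream to a file
// in fixed-size pieces, so memory use does not grow with the stream.
final class StreamSpooler {
    /** Copies every character from in to target; returns the number of chars copied. */
    static long spool(Reader in, Path target) throws IOException {
        long total = 0;                 // a long, so it can exceed Integer.MAX_VALUE
        char[] buf = new char[8192];
        try (Writer out = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
                total += n;
            }
        }
        return total;
    }
}
```

The caller can then reopen the file with `Files.newBufferedReader` and process it incrementally, without ever holding the whole content in one object.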

You are left with implementing something else. Ropes use a binary tree to represent composition of string parts. More common is to use an array of arrays, or for better GC an array of directly allocated (or memory-mapped file) NIO buffers. Someone remarked a few years ago that this area of Computer Science still has scope for more PhDs.
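To make the rope idea concrete, here is a minimal sketch rather than a production implementation. The class name `Rope` and its methods are hypothetical, and the tree is never rebalanced, which a real rope implementation would need to do to keep lookups fast.

```java
// Hypothetical immutable rope: a binary tree whose leaves hold ordinary Strings.
// Lengths and indexes are long, so the total can exceed Integer.MAX_VALUE.
abstract class Rope {
    abstract long length();
    abstract char charAt(long index);

    static Rope of(String s) { return new Leaf(s); }

    /** Concatenation creates one new node; no characters are copied. */
    Rope concat(Rope right) {
        if (right.length() == 0) return this;
        if (this.length() == 0) return right;
        return new Concat(this, right);
    }

    private static final class Leaf extends Rope {
        private final String text;
        Leaf(String text) { this.text = text; }
        @Override long length() { return text.length(); }
        @Override char charAt(long index) {
            if (index < 0 || index >= text.length()) {
                throw new IndexOutOfBoundsException("index " + index);
            }
            return text.charAt((int) index);
        }
    }

    private static final class Concat extends Rope {
        private final Rope left;
        private final Rope right;
        private final long length;
        Concat(Rope left, Rope right) {
            this.left = left;
            this.right = right;
            this.length = left.length() + right.length();
        }
        @Override long length() { return length; }
        @Override char charAt(long index) {
            if (index < 0 || index >= length) {
                throw new IndexOutOfBoundsException("index " + index);
            }
            // Walk down iteratively so a deep (unbalanced) tree cannot overflow the stack.
            Rope node = this;
            while (node instanceof Concat) {
                Concat c = (Concat) node;
                long leftLen = c.left.length();
                if (index < leftLen) {
                    node = c.left;
                } else {
                    index -= leftLen;
                    node = c.right;
                }
            }
            return node.charAt(index);
        }
    }
}
```

Usage would look like `Rope r = Rope.of("abc").concat(Rope.of("def"));`, after which `r.length()` and `r.charAt(i)` take and return `long` positions. Appending many small pieces in sequence produces a left-deep tree, so lookups degrade to linear time unless the tree is rebalanced.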

answered Jun 15 '26 07:06 by Tom Hawtin - tackline


Well, if you really, really need to extend the String/StringBuilder classes in this way, you have to either create a new class that does not extend String/StringBuilder (they are marked final), or change the JRE binaries to make String/StringBuilder non-final. Either way, both solutions suck: they will lead to a huge support effort and generate a lot of WTFs in the future.
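For the first option, one possible shape of such a standalone class is sketched below. It stores the text as a list of fixed-size char[] chunks and tracks the length as a long. Everything here (`LongStringBuilder`, `chunkSize`, `writeTo`) is a hypothetical name chosen for illustration, not an existing API, and the sketch covers append and read access only.

```java
import java.io.IOException;
import java.io.Writer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical StringBuilder-like class that does not extend String/StringBuilder.
// Text lives in fixed-size chunks, so no single array has to hold everything.
final class LongStringBuilder {
    private final int chunkSize;
    private final List<char[]> chunks = new ArrayList<>();
    private long length = 0;

    LongStringBuilder(int chunkSize) {
        if (chunkSize <= 0) throw new IllegalArgumentException("chunkSize must be positive");
        this.chunkSize = chunkSize;
    }

    LongStringBuilder append(CharSequence s) {
        int copied = 0;
        while (copied < s.length()) {
            int offset = (int) (length % chunkSize);
            // offset == 0 means there is no chunk yet or the last one is full.
            if (offset == 0) chunks.add(new char[chunkSize]);
            char[] last = chunks.get(chunks.size() - 1);
            int n = Math.min(chunkSize - offset, s.length() - copied);
            for (int i = 0; i < n; i++) last[offset + i] = s.charAt(copied + i);
            copied += n;
            length += n;
        }
        return this;
    }

    long length() { return length; }

    char charAt(long index) {
        if (index < 0 || index >= length) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return chunks.get((int) (index / chunkSize))[(int) (index % chunkSize)];
    }

    /** Streams the content out chunk by chunk, without building one giant String. */
    void writeTo(Writer out) throws IOException {
        long remaining = length;
        for (char[] chunk : chunks) {
            int n = (int) Math.min(chunkSize, remaining);
            out.write(chunk, 0, n);
            remaining -= n;
        }
    }
}
```

Note that there is deliberately no `toString()` returning the full content, since that would hit the same int limit again; callers have to consume the data through `charAt(long)` or `writeTo`. The whole content still has to fit in the heap, so this only moves the limit from the array size to available memory.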

answered Jun 15 '26 06:06 by DzianisH


