I'm trying to create a data structure that holds all the possible substring combinations that add up to the original string. For example, if the string is "java" the valid results would be "j", "ava", "ja", "v", "a", an invalid result would be "ja", "a" or "a", "jav"
I had it very easy in finding all the possible substrings
String string = "java";
List<String> substrings = new ArrayList<>();
for( int c = 0 ; c < string.length() ; c++ )
{
for( int i = 1 ; i <= string.length() - c ; i++ )
{
String sub = string.substring(c, c+i);
substrings.add(sub);
}
}
System.out.println(substrings);
and now I'm trying to construct a structure that holds only the valid substrings. But its not nearly as easy. I'm in the mist of a very ugly code, fiddling around with the indexes, and no where near of finishing, most likely on a wrong path completely. Any hints?
Here's one approach:
static List<List<String>> substrings(String input) {
// Base case: There's only one way to split up a single character
// string, and that is ["x"] where x is the character.
if (input.length() == 1)
return Collections.singletonList(Collections.singletonList(input));
// To hold the result
List<List<String>> result = new ArrayList<>();
// Recurse (since you tagged the question with recursion ;)
for (List<String> subresult : substrings(input.substring(1))) {
// Case: Don't split
List<String> l2 = new ArrayList<>(subresult);
l2.set(0, input.charAt(0) + l2.get(0));
result.add(l2);
// Case: Split
List<String> l = new ArrayList<>(subresult);
l.add(0, input.substring(0, 1));
result.add(l);
}
return result;
}
Output:
[java]
[j, ava]
[ja, va]
[j, a, va]
[jav, a]
[j, av, a]
[ja, v, a]
[j, a, v, a]
It seems like this is the problem of finding the compositions of the length of the string, and using those compositions to make substrings. So there are 2^n-1 compositions of a number n, which could make it a bit time-consuming for long strings...
Probably somebody would like another solution which is non-recursive and takes no memory to hold a list:
public static List<List<String>> substrings(final String input) {
if(input.isEmpty())
return Collections.emptyList();
final int size = 1 << (input.length()-1);
return new AbstractList<List<String>>() {
@Override
public List<String> get(int index) {
List<String> entry = new ArrayList<>();
int last = 0;
while(true) {
int next = Integer.numberOfTrailingZeros(index >> last)+last+1;
if(next == last+33)
break;
entry.add(input.substring(last, next));
last = next;
}
entry.add(input.substring(last));
return entry;
}
@Override
public int size() {
return size;
}
};
}
public static void main(String[] args) {
System.out.println(substrings("java"));
}
Output:
[[java], [j, ava], [ja, va], [j, a, va], [jav, a], [j, av, a], [ja, v, a], [j, a, v, a]]
It just calculates the next combination based on its index.
Just in case someone will look for the same algorithm in python, here is implementation in Python:
from itertools import combinations
def compositions(s):
n = len(s)
for k in range(n):
for c in combinations(range(1, n), k):
yield tuple(s[i:j] for i, j in zip((0,) + c, c + (n,)))
Example how it works:
>>> for x in compositions('abcd'):
... print(x)
('abcd',)
('a', 'bcd')
('ab', 'cd')
('abc', 'd')
('a', 'b', 'cd')
('a', 'bc', 'd')
('ab', 'c', 'd')
('a', 'b', 'c', 'd')
With a small modification you can generate compositions in different order:
def compositions(s):
n = len(s)
for k in range(n):
for c in itertools.combinations(range(n - 1, 0, -1), k):
yield tuple(s[i:j] for i, j in zip((0,) + c[::-1], c[::-1] + (n,)))
It will give you this:
>>> for x in compositions('abcd'):
... print(x)
('abcd',)
('abc', 'd')
('ab', 'cd')
('a', 'bcd')
('ab', 'c', 'd')
('a', 'bc', 'd')
('a', 'b', 'cd')
('a', 'b', 'c', 'd')
And with another small addition, you can generate only specified number of splits:
def compositions(s, r=None):
n = len(s)
r = range(n) if r is None else [r - 1]
for k in r:
for c in itertools.combinations(range(n - 1, 0, -1), k):
yield tuple(s[i:j] for i, j in zip((0,) + c[::-1], c[::-1] + (n,)))
>>> for x in compositions('abcd', 3):
... print(x)
('ab', 'c', 'd')
('a', 'bc', 'd')
('a', 'b', 'cd')
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With