I have the following string: A:B:1111;domain:80;a;b
The A is optional, so B:1111;domain:80;a;b is also valid input.
The :80 is optional as well, so B:1111;domain;a;b or :1111;domain;a;b are also valid input.
What I want is to end up with a String[] that has:
s[0] = "A";
s[1] = "B";
s[2] = "1111";
s[3] = "domain:80"
s[4] = "a"
s[5] = "b"
I did this as follows:
List<String> tokens = new ArrayList<String>();
String[] values = s.split(";");
String[] actions = values[0].split(":");
for (String a : actions) {
    tokens.add(a);
}
// Start from 1 to skip A:B:1111
for (int i = 1; i < values.length; i++) {
    tokens.add(values[i]);
}
// toArray() without an argument returns Object[], so pass a typed array
String[] finalResult = tokens.toArray(new String[0]);
Is there a better way to do this? How else could I do this more efficiently?
There are not many efficiency concerns here; everything I see is linear in the length of the input.
Anyway, you could either use a regular expression or a manual tokenizer.
You can avoid the list. You know the lengths of values and actions, so you can do:
String[] values = s.split(";");
String[] actions = values[0].split(":");
String[] result = new String[actions.length + values.length - 1];
System.arraycopy(actions, 0, result, 0, actions.length);
System.arraycopy(values, 1, result, actions.length, values.length - 1);
return result;
It should be reasonably efficient, unless you insist on implementing split yourself.
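For reference, here is the same arraycopy idea wrapped into a small runnable class. This is only a sketch: the class name SplitDemo and the method name tokenize are made up for illustration, and it assumes the input always follows the format described in the question.

```java
import java.util.Arrays;

public class SplitDemo {
    // Hypothetical helper: splits on ';', then splits only the first segment on ':'.
    static String[] tokenize(String s) {
        String[] values = s.split(";");
        String[] actions = values[0].split(":");
        String[] result = new String[actions.length + values.length - 1];
        // Copy the ':'-separated head, then every ';'-separated part after it.
        System.arraycopy(actions, 0, result, 0, actions.length);
        System.arraycopy(values, 1, result, actions.length, values.length - 1);
        return result;
    }

    public static void main(String[] args) {
        // prints [A, B, 1111, domain:80, a, b]
        System.out.println(Arrays.toString(tokenize("A:B:1111;domain:80;a;b")));
        // prints [, 1111, domain, a, b]
        System.out.println(Arrays.toString(tokenize(":1111;domain;a;b")));
    }
}
```

Note that for :1111;domain;a;b the first element is an empty string, because split keeps leading empty tokens. Whether that is what you want for a missing B is something to decide in the calling code.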
Untested low-level approach (make sure to unit test and benchmark before use):
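// Separator characters, as char, not String.
static final char s1 = ':';
static final char s2 = ';';
// ':' only counts as a separator before the first ';'.
int firstSemi = s.indexOf(s2);
int headEnd = firstSemi > -1 ? firstSemi : s.length();
// Compute required size:
int components = 1;
for (int p = s.indexOf(s1); p > -1 && p < headEnd; p = s.indexOf(s1, p + 1)) {
    components++;
}
for (int p = firstSemi; p > -1; p = s.indexOf(s2, p + 1)) {
    components++;
}
String[] result = new String[components];
// Build result: first the ':' separators in the head, then every ';'.
int in = 0, i = 0;
for (int out = s.indexOf(s1); out > -1 && out < headEnd; out = s.indexOf(s1, in)) {
    result[i++] = s.substring(in, out);
    in = out + 1;
}
for (int out = firstSemi; out > -1; out = s.indexOf(s2, in)) {
    result[i++] = s.substring(in, out);
    in = out + 1;
}
assert i == result.length - 1;
result[i] = s.substring(in);
return result;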
Note: this code is optimized, in a slightly crazy way, so that it considers a : only in the first component. Handling the last component is a bit tricky, as out will have the value -1 at that point.
I would usually not use this last approach unless performance and memory are extremely crucial. Most likely there are still some bugs in it, and the code is fairly unreadable, in particular compared to the one above.
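If you do want a manual tokenizer but care more about readability than about avoiding the list, a plain single pass over the characters is an option. The following is a sketch only: CharTokenizer and tokenize are hypothetical names, and it assumes that : is a separator only before the first ; as in the question.

```java
import java.util.ArrayList;
import java.util.List;

public class CharTokenizer {
    // Hypothetical helper: one pass over the input, no regular expressions.
    static String[] tokenize(String s) {
        List<String> out = new ArrayList<>();
        boolean inHead = true; // true until the first ';' has been seen
        int start = 0;         // start index of the current token
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            // ';' always separates; ':' separates only in the first component.
            if (c == ';' || (c == ':' && inHead)) {
                out.add(s.substring(start, i));
                start = i + 1;
                if (c == ';') {
                    inHead = false;
                }
            }
        }
        out.add(s.substring(start)); // last token, after the final separator
        return out.toArray(new String[0]);
    }
}
```

Unlike String.split, this variant keeps trailing empty tokens, so an input ending in ; yields an empty last element.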
With some assumptions about acceptable characters, this regex provides validation as well as splitting into the groups you desire.
Pattern p = Pattern.compile("^((.+):)?(.+):(\\d+);(.+):(\\d+);(.+);(.+)$");
Matcher m = p.matcher("A:B:1111;domain:80;a;b");
if (m.matches())
{
    for (int i = 0; i <= m.groupCount(); i++)
        System.out.println(m.group(i));
}
m = p.matcher("B:1111;domain:80;a;b");
if (m.matches())
{
    for (int i = 0; i <= m.groupCount(); i++)
        System.out.println(m.group(i));
}
Gives:
A:B:1111;domain:80;a;b // ignore this
A: // ignore this
A // This is the optional A, check for null
B
1111
domain
80
a
b
And
B:1111;domain:80;a;b // ignore this
null // ignore this
null // This is the optional A, check for null
B
1111
domain
80
a
b
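If you want the String[] from the question rather than separate groups, a variant of the pattern can keep domain:80 together and make the port optional. This is a sketch under assumptions that go beyond the regex above: fields contain no : or ; characters, ports are digits only, and the names RegexTokens and tokenize are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexTokens {
    // Assumed grammar: [A:]B:digits;domain[:digits];a;b
    private static final Pattern P = Pattern.compile(
        "^(?:([^:;]*):)?([^:;]*):(\\d+);([^:;]+(?::\\d+)?);([^:;]+);([^:;]+)$");

    // Hypothetical helper: validates the input and returns the matched fields.
    static String[] tokenize(String s) {
        Matcher m = P.matcher(s);
        if (!m.matches()) {
            throw new IllegalArgumentException("Unexpected format: " + s);
        }
        List<String> out = new ArrayList<>();
        for (int i = 1; i <= m.groupCount(); i++) {
            // The optional A group is null when absent, so skip it.
            if (m.group(i) != null) {
                out.add(m.group(i));
            }
        }
        return out.toArray(new String[0]);
    }
}
```

With this pattern, A:B:1111;domain:80;a;b yields six elements and B:1111;domain;a;b yields five, while input that does not fit the assumed grammar is rejected with an exception instead of being split silently.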