I would like to get the words around a certain position in a string, for example the two words before it and the two words after it.
Consider the string:
String str = "Hello my name is John and I like to go fishing and hiking I have two sisters and one brother.";
String find = "I";
for (int index = str.indexOf(find); index >= 0; index = str.indexOf(find, index + 1))
{
    System.out.println(index);
}
This writes out the index of where the word "I" is. But I want to be able to get a substring of the words around these positions.
I want to be able to print out "John and I like to" and "and hiking I have two".
The search term may also be more than one word: searching for "John and" should return "name is John and I like".
Is there any neat, smart way of doing this?
You can achieve that using String's split() method. This solution is O(n).
public static void main(String[] args) {
    String str = "Hello my name is John and I like to go fishing and " +
                 "hiking I have two sisters and one brother.";
    String find = "I";
    String[] sp = str.split(" +"); // " +" tolerates multiple spaces
    // start at 0 so that a match in the first two words is not skipped
    for (int i = 0; i < sp.length; i++) {
        if (sp[i].equals(find)) {
            // guard every neighbour index to avoid ArrayIndexOutOfBoundsException
            String surr = (i-2 >= 0 ? sp[i-2]+" " : "") +
                          (i-1 >= 0 ? sp[i-1]+" " : "") +
                          sp[i] +
                          (i+1 < sp.length ? " "+sp[i+1] : "") +
                          (i+2 < sp.length ? " "+sp[i+2] : "");
            System.out.println(surr);
        }
    }
}
Output:
John and I like to
and hiking I have two
Regex is a great and clean solution for the case when find is multi-word. Due to its nature, though, it misses the cases where the words around a match also match find (see the example below).
The algorithm below covers all cases (the whole solution space). Bear in mind that, due to the nature of the problem, this solution is O(n*m) in the worst case (with n being the length of str and m being the length of find).
public static void main(String[] args) {
    String str = "Hello my name is John and John and I like to go...";
    String find = "John and";
    String[] sp = str.split(" +"); // " +" tolerates multiple spaces
    String[] spMulti = find.split(" +"); // " +" tolerates multiple spaces
    // start at 0 so that a match in the first two words is not skipped
    for (int i = 0; i < sp.length; i++) {
        // count how many words of find match, starting at position i
        int j = 0;
        while (j < spMulti.length && i+j < sp.length
                && sp[i+j].equals(spMulti[j])) {
            j++;
        }
        if (j == spMulti.length) { // found spMulti entirely
            StringBuilder surr = new StringBuilder();
            // up to two words before the match
            if (i-2 >= 0) { surr.append(sp[i-2]); surr.append(" "); }
            if (i-1 >= 0) { surr.append(sp[i-1]); surr.append(" "); }
            // the matched words themselves
            for (int k = 0; k < spMulti.length; k++) {
                if (k > 0) { surr.append(" "); }
                surr.append(sp[i+k]);
            }
            // up to two words after the match
            if (i+spMulti.length < sp.length) {
                surr.append(" ");
                surr.append(sp[i+spMulti.length]);
            }
            if (i+spMulti.length+1 < sp.length) {
                surr.append(" ");
                surr.append(sp[i+spMulti.length+1]);
            }
            System.out.println(surr.toString());
        }
    }
}
Output:
name is John and John and
John and John and I like
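The same scan can be written more compactly with list views. The following is only a sketch: the class name PhraseContext, the method around and the radius parameter are made up for illustration and are not part of the answer above. It assumes words are separated by whitespace and compares each window of the text against the words of the search phrase.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PhraseContext {
    // Hypothetical helper: every occurrence of `phrase` in `text`,
    // with up to `radius` words of context on each side.
    static List<String> around(String text, String phrase, int radius) {
        List<String> words = Arrays.asList(text.trim().split("\\s+"));
        List<String> target = Arrays.asList(phrase.trim().split("\\s+"));
        List<String> result = new ArrayList<>();
        for (int i = 0; i + target.size() <= words.size(); i++) {
            // compare the window starting at i with the phrase, word by word
            if (words.subList(i, i + target.size()).equals(target)) {
                // clamp the context window to the bounds of the word list
                int from = Math.max(0, i - radius);
                int to = Math.min(words.size(), i + target.size() + radius);
                result.add(String.join(" ", words.subList(from, to)));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String str = "Hello my name is John and John and I like to go...";
        around(str, "John and", 2).forEach(System.out::println);
    }
}
```

Because the window is clamped with Math.max and Math.min, a phrase at the very start or end of the text simply gets less context instead of throwing an exception.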
Here is another way I found out using Regex:
String str = "Hello my name is John and I like to go fishing and hiking I have two sisters and one brother.";
String find = "I";
// needs java.util.regex.Pattern and java.util.regex.Matcher; Pattern.quote keeps find literal
Pattern pattern = Pattern.compile("(\\S+\\s+\\S+)\\s+" + Pattern.quote(find) + "\\s+(\\S+\\s+\\S+)");
Matcher matcher = pattern.matcher(str);
while (matcher.find())
{
System.out.println(matcher.group(1));
System.out.println(matcher.group(2));
}
Output:
John and
like to
and hiking
have two
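The pattern above consumes the surrounding words, so an occurrence that falls inside the context of a previous match is skipped, and a match with fewer than two words on either side is not found at all. One possible workaround is to match only the search term and then pull the context out of the text on either side of it. This is a sketch, not a definitive implementation: the class RegexContext, the method around and the radius parameter are hypothetical names, and the term is assumed to be non-empty and delimited by whitespace.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexContext {
    // Hypothetical helper: every occurrence of `find` as a whole
    // whitespace-delimited term, with up to `radius` words on each side.
    static List<String> around(String text, String find, int radius) {
        // the term itself, not preceded or followed by a non-space character
        Pattern hit = Pattern.compile("(?<!\\S)" + Pattern.quote(find) + "(?!\\S)");
        // up to `radius` words at the very end of the text before the term
        Pattern before = Pattern.compile("(?:\\S+\\s+){0," + radius + "}\\z");
        // up to `radius` words at the very start of the text after the term
        Pattern after = Pattern.compile("\\A(?:\\s+\\S+){0," + radius + "}");

        List<String> result = new ArrayList<>();
        Matcher m = hit.matcher(text);
        int from = 0;
        while (from <= text.length() && m.find(from)) {
            Matcher b = before.matcher(text.substring(0, m.start()));
            Matcher a = after.matcher(text.substring(m.end()));
            String left = b.find() ? b.group() : "";
            String right = a.find() ? a.group() : "";
            result.add(left + m.group() + right);
            // restart just past the start of this hit so later hits are not skipped
            from = m.start() + 1;
        }
        return result;
    }

    public static void main(String[] args) {
        String str = "Hello my name is John and John and I like to go...";
        around(str, "John and", 2).forEach(System.out::println);
    }
}
```

The substring calls copy the text for every hit, so for large inputs the split-based approach is likely the better choice. The sketch keeps the original spacing between words rather than normalising it.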
Use String.split() to split the text into words. Then search for "I" and concatenate the words back together:
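String[] parts = str.split(" ");
for (int i = 0; i < parts.length; i++) {
    if (parts[i].equals("I")) {
        // no bounds checks here: i-2 and i+2 must be valid indexes
        String out = parts[i-2] + " " + parts[i-1] + " " + parts[i] + " " + parts[i+1] + " " + parts[i+2];
        System.out.println(out);
    }
}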
Of course, you need to check that i-2 is a valid index (and likewise i+2), and using a StringBuffer (or StringBuilder) would be handy performance-wise if you have a lot of data.
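The index check mentioned above can be folded into a small helper that clamps the window to the array bounds and builds the result with a StringBuilder. This is a minimal sketch: the class ClampedWindow, the method window and the radius parameter are illustrative names, not something from the answer.

```java
public class ClampedWindow {
    // Hypothetical helper: joins parts[i - radius .. i + radius],
    // clamped so that the window never leaves the array.
    static String window(String[] parts, int i, int radius) {
        int from = Math.max(0, i - radius);
        int to = Math.min(parts.length - 1, i + radius); // inclusive
        StringBuilder out = new StringBuilder();
        for (int k = from; k <= to; k++) {
            if (k > from) {
                out.append(' ');
            }
            out.append(parts[k]);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String[] parts = "I like to go fishing and hiking I".split(" ");
        for (int i = 0; i < parts.length; i++) {
            if (parts[i].equals("I")) {
                System.out.println(window(parts, i, 2));
            }
        }
    }
}
```

With the sample array in main, the first "I" has no words before it and the last has none after it, so the two printed windows are shorter than five words.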