I want to search a DOM for a specific keyword, and when it is found, I want to know which Node in the tree it is from.
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import org.w3c.dom.Node;

static void search(String segment, String keyword) {
    if (segment == null)
        return;
    Pattern p = Pattern.compile(keyword, Pattern.CASE_INSENSITIVE);
    Matcher matcher = p.matcher(segment);
    // find() has to run first; hitEnd() says nothing before a match attempt
    if (matcher.find()) {
        total++; // static counter, assumed to be declared elsewhere in the class
        // what to do here to get the node?
    }
}

public static void traverse(Node node) {
    if (node == null || node.getNodeName() == null)
        return;
    search(node.getNodeValue(), "java");
    traverse(node.getFirstChild()); // recurse into the children first
    System.out.println(node.getNodeValue() != null &&
            node.getNodeValue().trim().length() == 0 ? "" : node);
    traverse(node.getNextSibling()); // then continue with the siblings
}
Consider using XPath (the javax.xml.xpath API):
import java.io.StringReader;
import javax.xml.namespace.QName;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import javax.xml.xpath.XPathVariableResolver;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// the XML & search term
String xml = "<foo>" + "<bar>" + "xml java xpath" + "</bar>" + "</foo>";
InputSource src = new InputSource(new StringReader(xml));
final String term = "java";
// search expression and term variable resolver
String expression = "//*[contains(text(),$term)]";
final QName termVariableName = new QName("term");
class TermResolver implements XPathVariableResolver {
    @Override
    public Object resolveVariable(QName variableName) {
        return termVariableName.equals(variableName) ? term : null;
    }
}
// perform the search (evaluate throws XPathExpressionException)
XPath xpath = XPathFactory.newInstance().newXPath();
xpath.setXPathVariableResolver(new TermResolver());
Node node = (Node) xpath.evaluate(expression, src, XPathConstants.NODE);
If you want to do more complex matching via regular expressions, you can provide your own function resolver.
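As a rough sketch of what such a function resolver could look like: the class name RegexXPathSearch, the namespace URI urn:example:ext, the ext prefix and the function name matches are all made up for this illustration and are not part of any standard API. Custom XPath functions have to be namespace-qualified, which is why a NamespaceContext is registered as well.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import java.util.regex.Pattern;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.namespace.QName;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import javax.xml.xpath.XPathFunction;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class RegexXPathSearch {
    // Hypothetical namespace and function name, chosen only for this sketch.
    static final String EXT_NS = "urn:example:ext";
    static final QName MATCHES = new QName(EXT_NS, "matches");

    /** Returns the name of the first element whose text matches regex, or null. */
    static String firstMatchingElement(String xml, String regex) {
        // Custom function ext:matches(text, regex): case-insensitive regex search.
        XPathFunction matches = args -> Pattern
                .compile(String.valueOf(args.get(1)), Pattern.CASE_INSENSITIVE)
                .matcher(String.valueOf(args.get(0)))
                .find();

        XPath xpath = XPathFactory.newInstance().newXPath();
        // Map the "ext" prefix used in the expression to the sketch namespace.
        xpath.setNamespaceContext(new NamespaceContext() {
            @Override public String getNamespaceURI(String prefix) {
                return "ext".equals(prefix) ? EXT_NS : XMLConstants.NULL_NS_URI;
            }
            @Override public String getPrefix(String uri) {
                return EXT_NS.equals(uri) ? "ext" : null;
            }
            @Override public Iterator<String> getPrefixes(String uri) {
                return EXT_NS.equals(uri)
                        ? Collections.singletonList("ext").iterator()
                        : Collections.emptyIterator();
            }
        });
        // Hand out the custom function only for ext:matches with two arguments.
        xpath.setXPathFunctionResolver((name, arity) ->
                MATCHES.equals(name) && arity == 2 ? matches : null);
        // The regex is passed as a variable, not concatenated into the expression.
        xpath.setXPathVariableResolver(name ->
                "regex".equals(name.getLocalPart()) ? regex : null);
        try {
            Node node = (Node) xpath.evaluate(
                    "//*[ext:matches(string(text()), $regex)]",
                    new InputSource(new StringReader(xml)),
                    XPathConstants.NODE);
            return node == null ? null : node.getNodeName();
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<foo><bar>xml java xpath</bar></foo>";
        System.out.println(firstMatchingElement(xml, "J[a-z]+A\\b"));
    }
}
```

The string(...) wrapper in the expression is a deliberate choice here: it makes the function receive plain text instead of a node list, so the Java side only has to deal with strings.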
Breakdown of the XPath expression //*[contains(text(),$term)]:
//* the asterisk selects any element; the double-slash means at any depth in the document
[contains(text(),$term)] is a predicate that matches the text
  text() is a function that gets the element's text (its text node children; contains() uses the first one)
  $term is a variable; this can be used to resolve the term "java" via the variable resolver; a resolver is preferred to string concatenation to prevent injection attacks (similar to SQL injection issues)
  contains(arg1,arg2) is a function that returns true if arg1 contains arg2
XPathConstants.NODE tells the API to select a single node; you could use NODESET to get all matches as a NodeList.
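To illustrate the NODESET variant mentioned above, here is a minimal sketch that collects every matching element instead of only the first. The class name AllMatches, the method name and the sample document are invented for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class AllMatches {
    /** Returns the names of all elements whose text contains term. */
    static List<String> matchingElementNames(String xml, String term) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // Resolve $term through a variable resolver, as in the answer above.
        xpath.setXPathVariableResolver(name ->
                "term".equals(name.getLocalPart()) ? term : null);
        try {
            // NODESET asks for every match; the result is a NodeList.
            NodeList nodes = (NodeList) xpath.evaluate(
                    "//*[contains(text(),$term)]",
                    new InputSource(new StringReader(xml)),
                    XPathConstants.NODESET);
            List<String> names = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                names.add(nodes.item(i).getNodeName());
            }
            return names;
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<foo><bar>xml java xpath</bar>"
                + "<baz>no match</baz><qux>java again</qux></foo>";
        System.out.println(matchingElementNames(xml, "java")); // [bar, qux]
    }
}
```

Each item in the NodeList is the DOM Node itself, so nodes.item(i) can be used directly wherever the node is needed. Note that contains() is case-sensitive, unlike the CASE_INSENSITIVE pattern in the question.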