Return HTML tag value in Java

Tags:

java

html

regex

I am trying to write Java code that returns the value inside an HTML tag. Below is the method I have been trying to get working. Can someone please help me out?

import java.util.regex.Matcher;
import java.util.regex.Pattern;

import com.seoreport.exceptions.DataNotFoundException;

public class utils {

    public String tagValue(String inHTML, String tag) throws DataNotFoundException
    {
        String value = null;

        String searchFor = "/<" + tag + ">(.*?)<\\/" + tag + "\\>/";

        Pattern pattern = Pattern.compile(searchFor);
        Matcher matcher = pattern.matcher(inHTML);

        return value;

    }

}
Johnathan Smith asked Mar 20 '26 05:03


1 Answer

Why don't you try using an XML parser and selecting the element with XPath? You could do something like:

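// Parse the XML file and build the Document object in memory
DocumentBuilder docBuilder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
Document doc = docBuilder.parse(new File(fileName));

// Normalise the text representation:
// collapses adjacent text nodes into one node.
doc.getDocumentElement().normalize();

// Select every element named yourTag
// (XPathAPI here is org.apache.xpath.XPathAPI from Apache Xalan)
String xpath = "//" + yourTag;
NodeList content = XPathAPI.selectNodeList(doc, xpath);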

This way, all the matching nodes end up in the content variable.

You can read the text of the first match using:

content.item(0).getTextContent();
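
For reference, here is a minimal, self-contained sketch of both routes, offered as an illustration rather than a definitive implementation. It uses the JDK's built-in javax.xml.xpath instead of Xalan's XPathAPI, and it also shows the regex from the question with the surrounding "/" delimiters removed (Java patterns do not use them) and with matcher.find() actually called. The class name TagValueSketch and both method names are hypothetical. It assumes the input is well-formed XML for the XPath route, and that the tag is a plain element name with no attributes for the regex route.

```java
import java.io.StringReader;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class TagValueSketch {

    // XPath route: returns the text of the first <tag> element, or null if none exists.
    // Assumes inHTML is well-formed XML; otherwise parse() throws a SAXException.
    static String tagValueXPath(String inHTML, String tag) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(inHTML)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate("//" + tag, doc, XPathConstants.NODESET);

        return nodes.getLength() > 0 ? nodes.item(0).getTextContent() : null;
    }

    // Regex route: the pattern from the question without the "/" delimiters.
    // Only matches tags written without attributes, e.g. <title>...</title>.
    static String tagValueRegex(String inHTML, String tag) {
        String quoted = Pattern.quote(tag);
        Pattern pattern = Pattern.compile("<" + quoted + ">(.*?)</" + quoted + ">", Pattern.DOTALL);
        Matcher matcher = pattern.matcher(inHTML);

        // find() must be called before group(); it returns false when nothing matches.
        return matcher.find() ? matcher.group(1) : null;
    }

    public static void main(String[] args) throws Exception {
        String html = "<html><head><title>Hello</title></head><body><p>text</p></body></html>";
        System.out.println(tagValueXPath(html, "title")); // prints "Hello"
        System.out.println(tagValueRegex(html, "p"));     // prints "text"
    }
}
```

Returning null for a missing tag is a choice made for this sketch; the method in the question could throw its own DataNotFoundException at that point instead.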
Stefano answered Mar 21 '26 19:03


