How to read a text from a web page with Java?

I want to read the text from a web page. I don't want to get the web page's HTML code. I found this code:

    try {
        // Create a URL for the desired page
        URL url = new URL("http://www.uefa.com/uefa/aboutuefa/organisation/congress/news/newsid=1772321.html#uefa+moving+with+tide+history");       

        // Read all the text returned by the server
        BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream()));
        String str;
        while ((str = in.readLine()) != null) {
            str = in.readLine().toString();
            System.out.println(str);
            // str is one line of text; readLine() strips the newline character(s)
        }
        in.close();
    } catch (MalformedURLException e) {
    } catch (IOException e) {
    }

but this code gives me the HTML code of the web page. I want to get the whole text inside this page. How can I do this with Java?

Can you read websites HTML with Java?

Java has built-in tools and third-party libraries for reading/downloading web pages. In the examples, we use HttpClient, URL, JSoup, HtmlCleaner, Apache HttpClient, Jetty HttpClient, and HtmlUnit. In the following examples, we download HTML source from the webcode.me tiny web page.

You may want to have a look at jsoup for this:

String html = "<p>An <a href='http://example.com/'><b>example</b></a> link.</p>";
Document doc = Jsoup.parse(html); 
String text = doc.body().text(); // "An example link"

This example is an extract from one on their site.

Use JSoup.

You will be able to parse the content using css style selectors.

In this example you can try

Document doc = Jsoup.connect("http://www.uefa.com/uefa/aboutuefa/organisation/congress/news/newsid=1772321.html#uefa+moving+with+tide+history").get(); 
String textContents = doc.select(".newsText").first().text();

How to read a text from a web page with Java?

Tags:

java

Rigor Mortis

People also ask

2 Answers

Fabian Barney

Nitzan Volman

Recent Activity

Donate For Us

How to read a text from a web page with Java?

Tags:

java

Rigor Mortis

People also ask

2 Answers

Fabian Barney

Nitzan Volman

Related questions

Recent Activity

Donate For Us