
Removing URLs from text using Java

Tags:

java

regex

How can I remove the URLs from a piece of text using a regular expression? For example:

String str="Fear psychosis after #AssamRiots - http://www.google.com/LdEbWTgD http://www.yahoo.com/mksVZKBz";

I want to remove all the URLs in the text, but my code is not working:

String pattern = "(http(.*?)\\s)";
Pattern pt = Pattern.compile(pattern);
Matcher namemacher = pt.matcher(input);
if (namemacher.find()) {
  str=input.replace(namemacher.group(0), "");
}
NLP JAVA asked Sep 11 '12 09:09



1 Answer

Pass in the string that contains the URLs:

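import java.util.regex.Matcher;
import java.util.regex.Pattern;

private String removeUrl(String commentstr)
    {
        // A scheme, then "//" or backslashes, then the run of URL characters that follows.
        // The hyphen sits last in the character class so it is read as a literal, not a range.
        String urlPattern = "((https?|ftp|gopher|telnet|file):((//)|(\\\\))+[\\w:#@%/;$()~?+=\\\\.&-]*)";
        Pattern p = Pattern.compile(urlPattern, Pattern.CASE_INSENSITIVE);
        Matcher m = p.matcher(commentstr);
        // Remove every match in a single pass, then trim whitespace left at the ends.
        return m.replaceAll("").trim();
    }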
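
For comparison, the attempt in the question misses URLs for two reasons: its pattern requires whitespace after each URL, so a URL at the very end of the string never matches, and the single `if` handles only the first match. Below is a minimal, self-contained sketch of a looser approach, run against the sample string from the question. The class name `UrlRemover`, the method `removeUrls`, and the simplified pattern (a scheme, `://`, then any run of non-whitespace) are assumptions made for illustration, not part of the answer above.

```java
import java.util.regex.Pattern;

public class UrlRemover {
    // Hypothetical simplified pattern: http, https or ftp, then "://",
    // then everything up to the next whitespace character.
    private static final Pattern URL = Pattern.compile("(?i)\\b(?:https?|ftp)://\\S+");

    static String removeUrls(String text) {
        // Remove every URL, then collapse the whitespace runs left behind.
        return URL.matcher(text).replaceAll("").replaceAll("\\s+", " ").trim();
    }

    public static void main(String[] args) {
        String str = "Fear psychosis after #AssamRiots - http://www.google.com/LdEbWTgD http://www.yahoo.com/mksVZKBz";
        System.out.println(removeUrls(str)); // prints "Fear psychosis after #AssamRiots -"
    }
}
```

Because `\S+` stops only at whitespace, this sketch also swallows punctuation that directly follows a URL, such as a trailing period or closing parenthesis. A stricter character class, like the one in the method above, may be preferable when that matters.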
NLP JAVA answered Oct 20 '22 05:10