How can I get a string inside double quotes using a regular expression?
I have the following string:
<img src="http://yahoo.com/img1.jpg" alt="">
I want to extract the string http://yahoo.com/img1.jpg alt="" from it.
How can I do this using a regular expression?
I don't know why you want the alt attribute as well, but this regexp does what you want: group 1 is the URL and group 2 is the alt attribute. I would modify the regexp a bit if there can be several spaces between img and src, or spaces around '='.
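import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Group 1 captures the src URL; group 2 captures the whole alt="..." attribute.
Pattern p = Pattern.compile("<img src=\"([^\"]*)\" (alt=\"[^\"]*\")>");
Matcher m = p.matcher(
        "<img src=\"http://yahoo.com/img1.jpg\" alt=\"\"> " +
        "<img src=\"http://yahoo.com/img2.jpg\" alt=\"\">");
// find() advances to the next <img> tag on each call.
while (m.find()) {
    System.out.println(m.group(1) + " " + m.group(2));
}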
Output:
http://yahoo.com/img1.jpg alt=""
http://yahoo.com/img2.jpg alt=""
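The whitespace caveat mentioned above could be handled roughly as follows. This is only a sketch: the class name TolerantImgRegex and the method extract are made up for illustration, and it still assumes src comes first, alt comes second, and both values are double-quoted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
public class TolerantImgRegex {
    // \s+ requires at least one whitespace character between tokens;
    // \s* permits optional whitespace around '=' and before '>'.
    private static final Pattern IMG = Pattern.compile(
            "<img\\s+src\\s*=\\s*\"([^\"]*)\"\\s+(alt\\s*=\\s*\"[^\"]*\")\\s*>");

    // Returns one "url altAttribute" string per matching <img> tag.
    static List<String> extract(String html) {
        List<String> results = new ArrayList<>();
        Matcher m = IMG.matcher(html);
        while (m.find()) {
            results.add(m.group(1) + " " + m.group(2));
        }
        return results;
    }

    public static void main(String[] args) {
        String html = "<img   src = \"http://yahoo.com/img1.jpg\"  alt = \"\" >";
        extract(html).forEach(System.out::println);
    }
}
```

Note that group 2 keeps the alt attribute exactly as written in the input, including any spaces around '='.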
You can do it like this:
// (.*?) lazily captures everything up to the next double quote, i.e. the src value.
Pattern p = Pattern.compile("<img src=\"(.*?)\".*?>");
Matcher m = p.matcher("<img src=\"http://yahoo.com/img1.jpg\" alt=\"\">");
if (m.find()) {
    System.out.println(m.group(1));
}
However, if you're parsing HTML, consider using a library: regular expressions are not a good tool for parsing HTML. I have had good experiences with jsoup. Here's an example:
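import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

String fragment = "<img src=\"http://yahoo.com/img1.jpg\" alt=\"\">";
Document doc = Jsoup.parseBodyFragment(fragment);
// first() returns null when no <img> element is found.
Element img = doc.select("img").first();
String src = img.attr("src");
System.out.println(src);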