Let's say you're given http://nytimes.com. How would you pull out the "main" image?
I'm asking because Flipboard is able to grab the main image from a website using just the URL.
You could parse out all the image tags. But then what?
I don't believe there's a standard method. You could start by looking for an Open Graph Protocol image tag. Facebook uses these to select images for URLs posted in status updates and comments.
<meta property="og:image" content="http://ia.media-imdb.com/rock.jpg"/>
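As a rough sketch of that approach, the Python below looks for an og:image tag first and, if none is present, falls back to the first <img> on the page. The fallback rule and the names `find_main_image` and `_ImageFinder` are my own assumptions for illustration, not a standard and not necessarily what Flipboard does.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class _ImageFinder(HTMLParser):
    """Records the first og:image value and every <img src> in the page."""

    def __init__(self):
        super().__init__()
        self.og_image = None
        self.img_srcs = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and attrs.get("property") == "og:image":
            # Keep only the first og:image that has a non-empty content value.
            if self.og_image is None and attrs.get("content"):
                self.og_image = attrs["content"]
        elif tag == "img" and attrs.get("src"):
            self.img_srcs.append(attrs["src"])


def find_main_image(html, base_url):
    """Return an absolute image URL for the page, or None if nothing is found.

    Assumed heuristic: prefer og:image, otherwise take the first <img>.
    """
    finder = _ImageFinder()
    finder.feed(html)
    candidate = finder.og_image or (finder.img_srcs[0] if finder.img_srcs else None)
    # Resolve relative paths such as "/images/lead.jpg" against the page URL.
    return urljoin(base_url, candidate) if candidate else None
```

You would pass in the page's HTML however you fetch it (for example with urllib.request.urlopen) together with the URL it came from. Taking the first <img> is a crude fallback; picking the largest image by its dimensions would likely do better, but that means downloading or inspecting each candidate.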
If you're prepared to use a third party, Embedly offers this as a paid service.
Embedly provides a powerful API to convert standard URLs into embedded videos, images, and rich article previews from 218 leading providers.