
Getting title and meta tags from external website

I'm trying to figure out how to get the following tags from an external page:

<title>A common title</title>
<meta name="keywords" content="Keywords blabla" />
<meta name="description" content="This is the description" />

The tags may appear in any order. I've heard of the PHP Simple HTML DOM Parser, but I don't really want to use it. Is there a solution that doesn't rely on the PHP Simple HTML DOM Parser?

Will preg_match be unable to do it if the HTML is invalid?

Can cURL do something like this with preg_match?

Facebook does something like this, but it probably does so using tags like:

<meta property="og:description" content="Description blabla" /> 

I want something similar: when someone posts a link, it should retrieve the title and the meta tags. If there are no meta tags, they are ignored, or the user can set them manually (but I'll do that part later myself).

MacMac asked Sep 14 '10 17:09




2 Answers

This is the way it should be:

function file_get_contents_curl($url)
{
    $ch = curl_init();

    curl_setopt($ch, CURLOPT_HEADER, 0);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);

    $data = curl_exec($ch);
    curl_close($ch);

    return $data;
}

$html = file_get_contents_curl("http://example.com/");

// curl_exec() returns false on failure; stop before parsing nothing.
if ($html === false || $html === '') {
    exit('Could not fetch the page.');
}

// Parsing begins here. The @ suppresses warnings about malformed HTML.
$doc = new DOMDocument();
@$doc->loadHTML($html);
$nodes = $doc->getElementsByTagName('title');

// Get and display what you need. Defaults cover pages that lack a tag.
$title = $nodes->length > 0 ? $nodes->item(0)->nodeValue : '';
$description = '';
$keywords = '';

$metas = $doc->getElementsByTagName('meta');

for ($i = 0; $i < $metas->length; $i++) {
    $meta = $metas->item($i);
    if ($meta->getAttribute('name') == 'description') {
        $description = $meta->getAttribute('content');
    }
    if ($meta->getAttribute('name') == 'keywords') {
        $keywords = $meta->getAttribute('content');
    }
}

echo "Title: $title" . '<br/><br/>';
echo "Description: $description" . '<br/><br/>';
echo "Keywords: $keywords";
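
The question also mentions Open Graph tags such as og:description, which use a property attribute rather than name, so the loop above would skip them. Below is a minimal sketch of one way to collect both kinds of tag, assuming the HTML has already been fetched into a string. The helper name extract_page_info and the sample markup are made up for illustration. It uses libxml_use_internal_errors() in place of @ to keep warnings about malformed markup quiet.

```php
<?php
// Hypothetical helper (not part of the answer above): collects the <title>,
// name="..." metas and property="..." metas from an HTML string.
function extract_page_info(string $html): array
{
    $info = ['title' => null, 'meta' => []];
    if (trim($html) === '') {
        return $info; // nothing to parse
    }

    // Collect libxml warnings internally instead of emitting them,
    // so malformed markup does not produce PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $titles = $doc->getElementsByTagName('title');
    if ($titles->length > 0) {
        $info['title'] = trim($titles->item(0)->textContent);
    }

    foreach ($doc->getElementsByTagName('meta') as $meta) {
        // Standard metas use name="...", Open Graph uses property="...".
        $key = $meta->getAttribute('name') ?: $meta->getAttribute('property');
        if ($key !== '' && $meta->hasAttribute('content')) {
            $info['meta'][strtolower($key)] = $meta->getAttribute('content');
        }
    }

    return $info;
}

// Sample markup assembled from the snippets in the question, with the tags
// out of order and an unclosed <p> to mimic sloppy HTML.
$sample = '<html><head>'
    . '<meta name="keywords" content="Keywords blabla" />'
    . '<meta property="og:description" content="Description blabla" />'
    . '<title>A common title</title>'
    . '</head><body><p>Unclosed paragraph</body></html>';

$info = extract_page_info($sample);
echo $info['title'], "\n";
echo $info['meta']['og:description'] ?? '(none)', "\n";
echo $info['meta']['description'] ?? '(none)', "\n";
```

Missing tags simply do not appear in the meta array, so the caller can fall back to user-supplied values with the ?? operator, as the last line does.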
shamittomar answered Sep 19 '22 20:09


<?php
// Assuming the above tags are at www.example.com
$tags = get_meta_tags('http://www.example.com/');

// Notice how the keys are all lowercase now, and
// how . was replaced by _ in the key.
echo $tags['author'];       // name
echo $tags['keywords'];     // php documentation
echo $tags['description'];  // a php manual
echo $tags['geo_position']; // 49.33;-86.59
?>
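
Note that get_meta_tags() only returns meta tags that carry a name attribute, so it yields neither the title nor Open Graph tags written with property. The sketch below illustrates this against a local temporary file instead of a remote URL. The markup is invented for the example and is not the page the snippet above assumes.

```php
<?php
// Write a small page to a temporary file. The markup is made up for
// illustration; get_meta_tags() accepts a local path as well as a URL.
$path = tempnam(sys_get_temp_dir(), 'meta');
file_put_contents(
    $path,
    '<html><head>'
    . '<title>A common title</title>'
    . '<meta name="Keywords" content="Keywords blabla">'
    . '<meta name="geo.position" content="49.33;-86.59">'
    . '<meta property="og:description" content="Description blabla">'
    . '</head><body></body></html>'
);

$tags = get_meta_tags($path);
unlink($path);

// Keys are lowercased, and "." in a name becomes "_".
echo $tags['keywords'] ?? '(none)', "\n";
echo $tags['geo_position'] ?? '(none)', "\n";

// Tags without a name attribute, and the <title>, are not returned.
echo $tags['og:description'] ?? '(none)', "\n";
echo $tags['title'] ?? '(none)', "\n";
```

If the title or Open Graph data is needed as well, a DOM-based approach is the more complete option.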
Bob Jeey answered Sep 17 '22 20:09