
Get content of <script type="application/ld+json"> using PHP

I can't find a Vine API that returns a page's title, description and image. The JSON is in the body of the page itself, in a script tag: <script type="application/ld+json">. How do I get the content (the JSON) of this script tag using PHP so it can be parsed?

Vine page:

https://vine.co/v/igO3EbIXDlI

From page source

<script type="application/ld+json">
            {
              "@context": "http://schema.org",
              "@type": "SocialMediaPosting",
              "url": "https://vine.co/v/igO3EbIXDlI",
              "datePublished": "2016-03-01T00:58:35",
              "author": {
                "@type": "Person",
                "name": "MotorAddicts\u2122",
                "image": "https://v.cdn.vine.co/r/avatars/39FEFED72B1242718633613316096_pic-r-1439261422661708f3e9755.jpg.jpg?versionId=LPjQUQ4KmTIPLu3iDbXw4FipgjEpC6fw",
                "url": "https://vine.co/u/989736283540746240"
              },
              "articleBody": "Mmm...  Black black blaaaaack!! \ud83d\ude0d ( Drift \u53d1 )",
              "image": "https://v.cdn.vine.co/r/videos/98C3799A811316254965085667328_SW_WEBM_14567938452154dc600dbde.webm.jpg?versionId=wPuaQvDxnpwF7KjSGao21hoddooc3eCl",
              "interactionCount": [{
                "@type": "UserInteraction",
                "userInteractionType": "http://schema.org/UserLikes",
                "value": "1382"
              }, {
                "@type": "UserInteraction",
                "userInteractionType": "http://schema.org/UserShares",
                "value": "368"
              }, {
                "@type": "UserInteraction",
                "userInteractionType": "http://schema.org/UserComments",
                "value": "41"
              }, {
                "@type": "UserInteraction",
                "userInteractionType": "http://schema.org/UserViews",
                "value": "80575"
              }],

              "sharedContent": {
                "@type": "VideoObject",
                "name" : "Mmm...  Black black blaaaaack!! \ud83d\ude0d ( Drift \u53d1 )",
                "description" : "",
                "thumbnailUrl" : "https://v.cdn.vine.co/r/videos/98C3799A811316254965085667328_SW_WEBM_14567938452154dc600dbde.webm.jpg?versionId=wPuaQvDxnpwF7KjSGao21hoddooc3eCl",
                "uploadDate" : "2016-03-01T00:58:35",
                "contentUrl" : "https://v.cdn.vine.co/r/videos_h264high/98C3799A811316254965085667328_SW_WEBM_14567938452154dc600dbde.mp4?versionId=w7ugLPYtj5LWeVUsXaH1bt2VuK8QE0qv",
                "embedUrl" : "https://vine.co/v/igO3EbIXDlI/embed/simple",
                "interactionCount" : "82366"
              }
            }
          </script>

What to do after this?

$url = 'https://vine.co/v/igO3EbIXDlI';
$html = file_get_contents($url); // loadHTML() expects markup, not a URL
$dom = new DOMDocument;
$dom->loadHTML($html);

Update:

I found the instructions for the Vine API here:

https://dev.twitter.com/web/vine/oembed

To query the Vine API for JSON, send a GET request to:

https://vine.co/oembed.json?url=https%3A%2F%2Fvine.co%2Fv%2F[videoid]

Example:

https://vine.co/oembed.json?url=https%3A%2F%2Fvine.co%2Fv%2FMl16lZVTTxe
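
As a minimal sketch of building that request URL in PHP, the snippet below percent-encodes the video page URL with `http_build_query`. The helper name `vineOembedUrl` is hypothetical, and the endpoint is simply the one quoted above; whether it still responds is not something this sketch checks.

```php
<?php
// Hypothetical helper: builds the oEmbed request URL for a Vine video ID.
// The endpoint and URL shape are taken from the question, not verified here.
function vineOembedUrl(string $videoId): string
{
    // Page URL of the video, e.g. https://vine.co/v/Ml16lZVTTxe
    $videoUrl = 'https://vine.co/v/' . rawurlencode($videoId);

    // http_build_query() percent-encodes the value (":" -> %3A, "/" -> %2F).
    return 'https://vine.co/oembed.json?' . http_build_query(['url' => $videoUrl]);
}

echo vineOembedUrl('Ml16lZVTTxe'), "\n";
```

The resulting string can then be passed to `file_get_contents()` and the response to `json_decode()`.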
Patriotec asked Mar 13 '16 19:03


1 Answer

You can use DOMDocument and DOMXpath for this:

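$url  = 'https://vine.co/v/igO3EbIXDlI';
$html = file_get_contents( $url );            // fetch the page markup
$dom  = new DOMDocument();
libxml_use_internal_errors( true );           // suppress warnings from imperfect HTML
$dom->loadHTML( $html );
$xpath = new DOMXPath( $dom );
// select every <script type="application/ld+json"> node
$jsonScripts = $xpath->query( '//script[@type="application/ld+json"]' );
$json = trim( $jsonScripts->item(0)->nodeValue );

$data = json_decode( $json );                 // stdClass object, or null on invalid JSON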


This XPath pattern matches all <script> nodes whose type attribute is “application/ld+json”:

//                              Match nodes anywhere in the document, at any depth
script                          <script> elements
[@type="application/ld+json"]   whose “type” attribute is “application/ld+json”

You then retrieve the JSON string by reading ->nodeValue of the first <script> node returned.
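
To make this concrete without a network request, here is a minimal sketch that runs the same XPath query against an inline HTML string and reads a few fields from the decoded JSON. The helper name `extractLdJson` and the trimmed-down sample markup are assumptions for illustration; the field names (`url`, `articleBody`, `image`) follow the JSON in the question.

```php
<?php
// Hypothetical helper: returns the first ld+json block as an associative
// array, or null if no such block exists or it is not valid JSON.
function extractLdJson(string $html): ?array
{
    if ($html === '') {
        return null; // loadHTML() rejects an empty string
    }

    $dom = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // silence markup warnings
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $nodes = $xpath->query('//script[@type="application/ld+json"]');
    if ($nodes === false || $nodes->length < 1) {
        return null;
    }

    // Passing true makes json_decode() return arrays instead of objects.
    $data = json_decode(trim($nodes->item(0)->nodeValue), true);
    return is_array($data) ? $data : null;
}

// Trimmed-down sample modelled on the markup in the question;
// the image URL is a placeholder.
$html = <<<'HTML'
<html><head>
<script type="application/ld+json">
{
  "@type": "SocialMediaPosting",
  "url": "https://vine.co/v/igO3EbIXDlI",
  "articleBody": "Mmm...  Black black blaaaaack!!",
  "image": "https://example.com/thumb.jpg"
}
</script>
</head><body></body></html>
HTML;

$post = extractLdJson($html);
if ($post !== null) {
    echo $post['url'], "\n";
    echo $post['articleBody'] ?? '', "\n";
    echo $post['image'] ?? '', "\n";
}
```

Decoding to an associative array keeps keys such as `@type` easy to read (`$post['@type']`); with the object form they need the `$data->{'@type'}` syntax.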

If you don't know in advance whether the node exists or where it appears, use this:

$jsonScripts = $xpath->query( '//script[@type="application/ld+json"]' );
if( $jsonScripts->length < 1 )
{
    die( "Error: No script node found" );
}
else
{
    foreach( $jsonScripts as $node )
    {
        $json = json_decode( $node->nodeValue );

        // your stuff with JSON ...
    }
}
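
Note that the loop above does not check the result of `json_decode`, which returns null for invalid JSON. A minimal sketch of adding that check with `json_last_error` follows; the helper name `decodeJsonBlocks` and the sample strings are hypothetical, and the input stands in for the `nodeValue` strings collected from the script nodes.

```php
<?php
// Hypothetical helper: decodes each raw string and separates the valid
// results from the error messages of the blocks that failed to parse.
function decodeJsonBlocks(array $rawBlocks): array
{
    $decoded = [];
    $errors  = [];

    foreach ($rawBlocks as $index => $raw) {
        $data = json_decode(trim($raw), true);

        // json_last_error() reports on the most recent json_decode() call.
        if (json_last_error() !== JSON_ERROR_NONE) {
            $errors[$index] = json_last_error_msg();
            continue;
        }
        $decoded[] = $data;
    }

    return ['decoded' => $decoded, 'errors' => $errors];
}

// Sample input: one valid block and one malformed block.
$result = decodeJsonBlocks([
    '{"@type": "VideoObject", "name": "demo"}',
    '{ this is not valid JSON }',
]);

echo count($result['decoded']), " valid, ", count($result['errors']), " invalid\n";
```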
fusion3k answered Nov 03 '22 10:11