Using the cURL output

I'm planning to record live tweets on a particular topic. To do this, I'm using the Twitter streaming API with cURL in PHP.

Here is the code:

<?php

$username = "xxxxx";
$password = "xxxxx";


$ch = curl_init();

curl_setopt($ch, CURLOPT_URL, 'https://stream.twitter.com/1/statuses/filter.json?track=SEARCH_PARAMETER');
curl_setopt($ch, CURLOPT_USERPWD, $username . ":" . $password);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // return the response instead of echoing it
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
$result = curl_exec($ch);
$jsonOBJ = json_decode($result);

curl_close($ch);

print_r($jsonOBJ);
?>

My problem is, if I set CURLOPT_RETURNTRANSFER to 0, I can see the tweets in the terminal, but I'm not able to store the result in the variable $jsonOBJ and print it.

Please Help!

asked Apr 26 '12 by Gooner

3 Answers

UPDATE: See the new code at the end of this answer. This was actually pretty easy to do with cURL; I just went about it incorrectly the first time.

I wasn't able to get the Twitter stream API to work using cURL in conjunction with a CURLOPT_READFUNCTION, but have had success using fsockopen() and fread(). I'm not sure why the read function wasn't working, as I have used it before with success; it must have something to do with the fact that the response data is "streaming" and is not sent using HTTP chunked encoding. Essentially, my read function never got called, so I couldn't process the data.

The method I used that is working now:

  • Connect using fsockopen to ssl://stream.twitter.com
  • Issue the basic HTTP request for stream data using fputs
  • Consume the HTTP response headers and make sure there were no errors
  • Read an amount data using fread in an infinite loop
  • Each time a chunk of data is read, I call an internal buffer function
  • The buffer function appends the new data to a buffer
  • Buffer function then tries to process all messages in the buffer (if we have 1 or more complete messages)
  • As it processes each message, the buffer is reduced until it is empty, then the function returns and more data is read (a minimal sketch of this loop follows below)
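
Here's a minimal sketch of that loop (the credentials, the query, and the processMessage() callback are illustrative placeholders, not my production code):

<?php

// Sketch of the fsockopen() approach described above. Using HTTP/1.0
// keeps the response un-chunked, so we only deal with the raw stream.
$user = 'xxxxx';
$pass = 'xxxxx';
$post = 'track=SEARCH_PARAMETER';
$auth = base64_encode("$user:$pass");

$fp = fsockopen('ssl://stream.twitter.com', 443, $errno, $errstr, 30);
if (!$fp) {
    die("Connection failed: $errstr ($errno)\n");
}

$request  = "POST /1/statuses/filter.json HTTP/1.0\r\n";
$request .= "Host: stream.twitter.com\r\n";
$request .= "Authorization: Basic $auth\r\n";
$request .= "Content-Type: application/x-www-form-urlencoded\r\n";
$request .= "Content-Length: " . strlen($post) . "\r\n\r\n";
$request .= $post;
fputs($fp, $request);

// Consume the response headers (everything up to the blank line)
while (($line = fgets($fp)) !== false && trim($line) !== '') {
    // a real implementation should check the status line for errors here
}

// Read chunks forever, buffering until complete newline-delimited
// messages are available, then hand each one to the callback
$buffer = '';
while (!feof($fp)) {
    $buffer .= fread($fp, 4096);
    while (($pos = strpos($buffer, "\n")) !== false) {
        $message = substr($buffer, 0, $pos);  // one complete JSON message
        $buffer  = substr($buffer, $pos + 1); // reduce the buffer
        if (trim($message) !== '') {
            processMessage(json_decode($message)); // user-defined callback
        }
    }
}

fclose($fp);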

I've had it running for a couple of hours now without a dropped connection, and it has processed over 30,000 messages with no errors yet.

Basically I implemented a callback system so that each time a full message is read from the buffer, it calls the user-defined callback with the json message so the application can do whatever it needs to do with the message (e.g. insert to database).

I don't have any short snippets to post here yet, but if you want, message me via the contact form on the website listed on my profile and I'd be happy to share. Maybe we can work together if anyone is interested. I only did this for fun; I have no interest in Twitter and am not using it for financial reasons. I'll put it on GitHub eventually, perhaps.

EDIT:

Here is some cURL code that will connect to the streaming API and pass the JSON messages to a callback function as they are available. This example uses gzip encoding to save bandwidth.

<?php

$USERNAME = 'youruser';
$PASSWORD = 'yourpass';
$QUERY    = 'nike';

/**
 * Called every time a chunk of data is read, this will be a json encoded message
 * 
 * @param resource $handle The curl handle
 * @param string   $data   The data chunk (json message)
 */
function writeCallback($handle, $data)
{
    /*
    echo "-----------------------------------------------------------\n";
    echo $data;
    echo "-----------------------------------------------------------\n";
    */

    $json = json_decode($data);
    if (isset($json->user) && isset($json->text)) {
        echo "@{$json->user->screen_name}: {$json->text}\n\n";
    }

    return strlen($data); // tell cURL the whole chunk was handled; any other value aborts the transfer
}

$ch = curl_init();

curl_setopt($ch, CURLOPT_URL, 'https://stream.twitter.com/1/statuses/filter.json?track=' . urlencode($QUERY));
curl_setopt($ch, CURLOPT_USERPWD, "$USERNAME:$PASSWORD");
curl_setopt($ch, CURLOPT_WRITEFUNCTION, 'writeCallback');
curl_setopt($ch, CURLOPT_TIMEOUT, 20); // disconnect after 20 seconds for testing
curl_setopt($ch, CURLOPT_VERBOSE, 1);  // debugging
curl_setopt($ch, CURLOPT_ENCODING, 'gzip, deflate'); // req'd to get gzip
curl_setopt($ch, CURLOPT_USERAGENT, 'tstreamer/1.0'); // req'd to get gzip

curl_exec($ch); // commence streaming

$info = curl_getinfo($ch);

var_dump($info);
answered by drew010

I'm working on the same kind of thing as well :) The problem is that when you run it in the terminal, it's a stream, so the connection stays alive until you kill it (i.e. curl_exec() never returns, so the code after it is never reached).

Try looking at CURLOPT_PROGRESSFUNCTION and CURLOPT_READFUNCTION. They might give you some hints.
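
One thing to keep in mind: CURLOPT_READFUNCTION supplies the request body, so for the response stream it's CURLOPT_WRITEFUNCTION that fires as each chunk arrives. Here is a minimal sketch of stopping a stream from inside that callback (the URL, credentials, and the 10-chunk cutoff are placeholders):

<?php

$received = 0;

$ch = curl_init('https://stream.twitter.com/1/statuses/sample.json');
curl_setopt($ch, CURLOPT_USERPWD, 'user:pass');
curl_setopt($ch, CURLOPT_WRITEFUNCTION, function ($ch, $data) use (&$received) {
    echo $data; // process the chunk as it arrives
    if (++$received >= 10) {
        return -1; // any value other than strlen($data) aborts the stream
    }
    return strlen($data); // tell cURL the whole chunk was consumed
});

curl_exec($ch); // returns once the callback aborts or the server disconnects
curl_close($ch);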

answered by Reza S


@Reza Sanaie and others who may find this helpful.

I used the Twitter Search API and got the live tweets as well, so this could be helpful. Here is the code:

<?php
$query    = "SEARCH_PARAMETER";
$request  = "http://search.twitter.com/search.json?q=" . urlencode($query);
$response = file_get_contents($request);
$jsonobj  = json_decode($response);
print_r($jsonobj);
?>

I also have the MySQL connection setup to push it into the database and this script is added in the crontab to automate the whole process.
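
For the database side, here is a rough sketch of that push-to-MySQL step using PDO (the tweets table, its columns, and the connection details are assumptions; adjust to your own schema):

<?php

$query    = "SEARCH_PARAMETER";
$request  = "http://search.twitter.com/search.json?q=" . urlencode($query);
$response = file_get_contents($request);
$jsonobj  = json_decode($response);

// Hypothetical schema: tweets(tweet_id UNIQUE, screen_name, text, created_at)
$db   = new PDO('mysql:host=localhost;dbname=twitter', 'dbuser', 'dbpass');
$stmt = $db->prepare(
    'INSERT IGNORE INTO tweets (tweet_id, screen_name, text, created_at)
     VALUES (:id, :name, :text, :created)'
);

foreach ($jsonobj->results as $tweet) {
    $stmt->execute(array(
        ':id'      => $tweet->id_str,
        ':name'    => $tweet->from_user,
        ':text'    => $tweet->text,
        ':created' => date('Y-m-d H:i:s', strtotime($tweet->created_at)),
    ));
}
?>

With a unique key on tweet_id, INSERT IGNORE keeps repeated cron runs from inserting the same tweet twice.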

answered by Gooner