cURL request in a loop sometimes returning nothing at all

The issue:

I'm working with PHP, cURL and a public API to fetch json strings. These json strings are formatted like this (simplified, average size is around 50-60 kB):

{
   "data": {},
   "previous": "url",
   "next": "url"
}

What I am trying to do is fetch every JSON string, starting from the first one, by following the "next" attribute. I have a while loop, and as long as there is a "next" attribute, I fetch the next URL.

The problem is that the loop sometimes stops before the end, seemingly at random, and after many tests I cannot figure out why.

I say randomly because sometimes the loop runs to the end without any problem, and sometimes it crashes after N iterations.

So far I haven't been able to extract any information that would help me debug it.

I'm using PHP 7.3.0 and launching my code from the CLI.

What I tried so far:

Check the headers:

No headers are returned. Nothing at all.

Use curl_errno() and curl_error():

I tried the following code right after executing the request (curl_exec($ch)) but it never triggers.

if(curl_errno($ch)) {
   echo 'curl error ' . curl_error($ch) . PHP_EOL;
   echo 'response received from curl error :' . PHP_EOL;
   var_dump($response); // the json string I should get from the server.
}

Check if the response came back null:

if(is_null($response))

or if my json string has an error:

if(json_last_error() !== JSON_ERROR_NONE)

Though I think this check is useless, because the JSON will never be valid if the cURL response is null or empty. When this code triggers, the JSON error code is 3 (JSON_ERROR_CTRL_CHAR).

The problematic code:

function apiCall($url) {
   ...
   $ch = curl_init();
   curl_setopt($ch, CURLOPT_URL, $url);
   curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
   curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
   $response = curl_exec($ch);
   return $response;
}
$inc = 0;
$url = 'https://api.example.com/' . $id;
$jsonString = apiCall($url);

if(!is_null($jsonString)) {
    // parenthesized: in PHP 7, '.' and '+' share precedence and evaluate left to right
    file_put_contents('pathToDirectory/' . ($id + $inc), $jsonString);
    $nextUrl = getNextUrl($jsonString);

    while ($nextUrl) {
        $jsonString = apiCall($url . '?page=' . $nextUrl);

        if(!is_null($jsonString)) {
            $inc++;
            file_put_contents('pathToDirectory/' . ($id + $inc), $jsonString);
            $nextUrl = getNextUrl($jsonString);
        }
    }
}

What I'm expecting my code to do:

Not stop randomly, or at least give me a clear error code.

KeksimusTotalus asked Mar 12 '26 23:03


1 Answer

The problem is that your API could be returning an empty response, malformed JSON, or even a status code other than 200, and your loop would stop immediately.

Since you do not control the API responses, you know that it can fail randomly, and you do not have access to the API server logs (because you don't, do you?), you need to build some resilience into your consumer.

Something very simple (you'd need to tune it up) could be:

function apiCall( $url, $attempts = 3 ) {
    // ..., including setting "$headers"
    $ch = curl_init();
    curl_setopt( $ch, CURLOPT_URL, $url );
    curl_setopt( $ch, CURLOPT_HTTPHEADER, $headers );
    curl_setopt( $ch, CURLOPT_RETURNTRANSFER, true );

    for ( $i = 0; $i < $attempts; $i ++ ) {
        $response  = curl_exec( $ch );
        $curl_info = curl_getinfo( $ch );

        if ( curl_errno( $ch ) ) {
            // log your error & try again
            continue;
        }

        // I'm only accepting 200 as a status code. Check with your API if there could be other possible "good" responses
        if ( $curl_info['http_code'] != 200 ) {
            // log your error & try again
            continue;
        }

        // everything seems fine, but the response is empty? not good.
        if ( empty( $response ) ) {
            // log your error & try again
            continue;
        }

        return $response;
    }

    return null;
}
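If the failures are transient (rate limiting, a briefly overloaded server), retrying immediately may hit the same failure again. One possible tweak, sketched here under my own assumptions, is to pause a little longer after each failed attempt. The helper name retryWithBackoff() and the delay values are hypothetical and not something the API requires; the only convention it relies on is "null means failure", which apiCall() above already follows.

```php
<?php
/**
 * Hypothetical helper (illustration only): calls $attempt up to $maxAttempts
 * times and sleeps longer after each failure.
 * $attempt must return null to signal "failed, try again".
 */
function retryWithBackoff(callable $attempt, int $maxAttempts = 3, int $baseDelayMs = 200)
{
    for ($i = 0; $i < $maxAttempts; $i++) {
        $result = $attempt();

        if ($result !== null) {
            return $result; // success, stop retrying
        }

        if ($i < $maxAttempts - 1) {
            // Wait 1x, 2x, 4x ... the base delay before the next attempt.
            usleep($baseDelayMs * 1000 * (2 ** $i));
        }
    }

    return null; // every attempt failed
}
```

You could wrap a single-attempt call, such as apiCall($url, 1), in a closure and pass it as $attempt, so the pause happens between requests rather than inside the cURL code.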

With apiCall() returning null once all attempts fail, your loop could look something like this (adapted from your code):

do {
    $jsonString = apiCall($url . '?page=' . $nextUrl);
    $nextUrl    = false;

    if(!is_null($jsonString)) {
        $inc++;
        // parenthesized: in PHP 7, '.' and '+' share precedence and evaluate left to right
        file_put_contents('pathToDirectory/' . ($id + $inc), $jsonString);
        $nextUrl = getNextUrl($jsonString);
    }
}
while ($nextUrl);
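To make the control flow easier to test without hitting the real API, the same loop can be sketched as a function that takes the fetcher as a parameter. This is only an illustration: fetchAllPages() and the $maxPages cap are names I made up, and I am assuming that "next" holds a full URL, as in the sample JSON in the question (your code passes it as a page parameter instead, so adjust accordingly).

```php
<?php
/**
 * Sketch only: follows "next" links until there are none left.
 * $fetch is any callable that takes a URL and returns the body string,
 * or null on failure (the same convention as apiCall()).
 * Assumes "next" holds a full URL.
 */
function fetchAllPages(callable $fetch, string $firstUrl, int $maxPages = 1000): array
{
    $pages = [];
    $url   = $firstUrl;

    // The cap guards against an API that keeps returning a "next" link forever.
    while ($url !== null && count($pages) < $maxPages) {
        $body = $fetch($url);

        if ($body === null) {
            break; // fetch failed: stop instead of repeating the same request
        }

        $pages[] = $body;

        // Continue only if the body decodes and has a non-empty string "next".
        $decoded = json_decode($body, true);
        $next    = is_array($decoded) ? ($decoded['next'] ?? null) : null;
        $url     = (is_string($next) && $next !== '') ? $next : null;
    }

    return $pages;
}
```

The returned array holds the raw bodies in the order they were fetched, so writing them to disk can happen in a separate loop after the download finishes or fails.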

One case is still not covered by apiCall(): the response is non-empty, there was no connection error, the status is 200, and yet the body is invalid JSON.

You may want to check for that as well, depending on how brittle the API you are consuming is.
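A minimal sketch of such a JSON check, assuming each page should decode to an array with "data" and "next" keys as in the question. The function name decodePage() is mine; json_decode(), json_last_error() and json_last_error_msg() are standard PHP functions.

```php
<?php
/**
 * Hypothetical helper: returns the decoded page as an array,
 * or null if the body is not valid JSON or does not decode to an array.
 */
function decodePage(string $body): ?array
{
    $decoded = json_decode($body, true);

    if (json_last_error() !== JSON_ERROR_NONE) {
        // json_last_error_msg() gives a readable reason worth logging.
        error_log('Invalid JSON: ' . json_last_error_msg());
        return null;
    }

    // Valid JSON can still be a scalar such as 123 or "text"; reject those.
    return is_array($decoded) ? $decoded : null;
}
```

Treating a null result from decodePage() the same way as a failed request would let one retry mechanism cover truncated or garbled bodies too.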

yivi answered Mar 15 '26 11:03