
curl_multi_wakeup doesn't seem to wakeup the associated curl_multi_poll - Android (but may not be limited to)

Tags: c++, c, android, curl

Curl version: 7.71.0 with c-ares

Background

We are building a library that's being integrated into mobile apps. We are targeting both iOS and Android. Curl initialisation happens in a static block inside the library.

The iOS version of the library is bundled into a framework, which is loaded at app startup, if I'm not mistaken. The Android version of the library is bundled in a module, which is lazily loaded. (I know this is an issue, especially since we link against OpenSSL, but it's probably important for context.)

We built a small HTTP client with curl that allows us to download data blobs from trusted servers.

Quick architecture review

The HTTP client runs on its own thread. It holds a curl_multi_handle; starting a transfer adds a curl_easy_handle to it and returns a handle to a Response, which contains a buffer to read the received bytes from and is used to control the transfer if needed.

Since cURL handles are not thread safe, any action (referred to as Tasks from now on) to the handle is dispatched to the HTTP client's thread, and a boost::shared_future is returned (we might want to block or not depending on the use case).

Here is a rough idea of how the main loop is structured:

while (!done) {
    deal_with_transfer();
    check_transfer_status();
    cleanup_any_orphan_transfer();
    execute_all_queue_tasks();
    curl_multi_poll(multi, nullptr, 0, very_large_number, nullptr);
}

Appending to the task queue also performs a curl_multi_wakeup(multi) to make sure that task is executed (e.g. adding a new download is also a dispatched task).
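To make the ordering concrete, here is a minimal sketch of the enqueue-then-wake pattern, not our actual code. TaskQueue and its member names are hypothetical, and the wake callback stands in for curl_multi_wakeup(multi) so the snippet compiles without libcurl. Completion signalling (the boost::shared_future mentioned above) is left out for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <utility>

// Hypothetical task queue; all names here are illustrative only.
class TaskQueue {
public:
    // `wake` stands in for a call such as curl_multi_wakeup(multi).
    explicit TaskQueue(std::function<void()> wake) : wake_(std::move(wake)) {
        assert(wake_ && "a wake callback is required");
    }

    // Called from any thread: queue the task first, then wake the poller.
    void post(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            tasks_.push(std::move(task));
        }
        wake_();  // after the push, so a woken loop is guaranteed to see the task
    }

    // Called from the HTTP client's thread: run everything queued so far
    // and report how many tasks were executed.
    std::size_t run_all() {
        std::queue<std::function<void()>> pending;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            pending.swap(tasks_);
        }
        const std::size_t count = pending.size();
        while (!pending.empty()) {
            pending.front()();  // run outside the lock so tasks may post new tasks
            pending.pop();
        }
        return count;
    }

    // Number of tasks currently waiting to be executed.
    std::size_t size() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return tasks_.size();
    }

private:
    std::function<void()> wake_;
    mutable std::mutex mutex_;
    std::queue<std::function<void()>> tasks_;
};
```

In this sketch the loop's execute_all_queue_tasks() step would correspond to run_all(), and the wake callback would be something like [multi] { curl_multi_wakeup(multi); }.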

The issue

So far we've only tested on Android, and we've seen that blocking HTTP client tasks sometimes never return.

Logs and stack traces show that we are waiting on a task that should be executed by the HTTP client, but the client is still polling. Everything seems to indicate that it wasn't woken up when the task was appended.

I can't seem to replicate the issue locally on a device, but it happens often enough to be a blocker.

I'm a bit at a loss here, and I don't really know where to start looking to find a way to reproduce the issue, let alone fix it.

I hope I gave enough context to start making educated guesses, or even to find the source of the error!

Thanks for reading!

Pulo asked Mar 05 '21 06:03


1 Answer

Limitations on network activities for background processes

Mobile operating systems such as Android and iOS use different scheduling strategies for background processes than traditional *nix operating systems. The mobile OS tends to starve background processes in order to preserve battery life. This is known as background process optimization, and it is applied to the application's processes and threads the moment the application enters the background.

As of Android 7 (API level 24), apps targeting that level no longer receive CONNECTIVITY_ACTION broadcasts through a receiver declared in the manifest; they only receive them through a receiver registered at runtime with Context.registerReceiver(), and only while that context is still valid.

Although libcurl is used from Android native code, the threads created by the native library are subject to the permissions that the application declares in the manifest (and which need to be granted).

Workaround to try

I know how frustrating a blocking issue can be, so I can offer you a quick workaround to try until the problem is resolved.

curl_multi_poll() takes a timeout, which in your code is set to very_large_number. In addition, the last parameter of the function call is a pointer to an integer, numfds, that will be populated with the number of file descriptors on which an event occurred while curl_multi_poll() was polling.

You can use this in your favor to construct a workaround in the following way:

  1. Replace very_large_number with a reasonably_small_number
  2. Replace the nullptr with &numfds
  3. Surround the curl_multi_poll call with a do ... while loop

So you will have something like this:

int numfds;
while (!done) {
    deal_with_transfer();
    check_transfer_status();
    cleanup_any_orphan_transfer();
    execute_all_queue_tasks();
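    numfds = 0;
    do {
        // Poll in short slices instead of one very long wait.
        curl_multi_poll(multi, nullptr, 0, reasonably_small_number, &numfds);
        // Also leave the loop if something else (e.g. a queued task) needs attention.
        numfds += check_for_other_conditions();
    } while (numfds == 0);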
}

Choose a reasonable timeout (e.g. 1s, 10s, 60s ...) that lets you break out of the polling forcefully without draining the battery.

I added check_for_other_conditions() so you can use it to check additional conditions, for example the size of the task queue. Assuming there are situations in which curl_multi_poll() can miss an event even though the event occurred, this extra check can help to break the loop and start executing the tasks.
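As a rough sketch of how the sliced wait and the extra check could fit together, under a few assumptions: wait_for_work, poll_once, pending_tasks and max_slices are hypothetical names I am introducing for illustration. poll_once(timeout_ms) stands in for curl_multi_poll(multi, nullptr, 0, timeout_ms, &numfds) and returns numfds, and pending_tasks() plays the role of check_for_other_conditions(). The max_slices cap is my own addition so that the outer loop still gets a turn from time to time even when nothing happens.

```cpp
#include <cassert>
#include <functional>

// Hypothetical helper: poll in short slices until there is something to do.
// Returns the number of "reasons to wake up" that were found, or 0 if
// max_slices slices elapsed without any socket event or queued task.
int wait_for_work(const std::function<int(int)>& poll_once,
                  const std::function<int()>& pending_tasks,
                  int slice_ms,
                  int max_slices) {
    for (int i = 0; i < max_slices; ++i) {
        // Stand-in for curl_multi_poll(); yields the reported numfds.
        int ready = poll_once(slice_ms);
        // Extra condition: tasks that were queued while we were polling.
        ready += pending_tasks();
        if (ready > 0) {
            return ready;
        }
    }
    return 0;
}
```

As far as I know, a wakeup triggered by curl_multi_wakeup() is not necessarily reflected in numfds, which is one more reason to look at the task queue directly; that detail is worth verifying against the libcurl version you ship.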

jordanvrtanoski answered Nov 18 '22 18:11