Google Fit data pattern changed since the Google Fit app update, implementation apparently broken

We have identified in our user base that since the last Google Fit app update there has been a dramatic drop in data, and since it began we have tried to identify the issue in our code. Given the timing, we thought the SDK version we were using (18.0 at the time) was the problem. Upgrading to SDK 20.0 did not improve the results, but it stopped the data from stalling. Currently we estimate that 50-60% of the users connected to Google Fit through the SDK are no longer correctly retrieving data with the (previously working) implementation. They are not lost, and they still send some bits here and there, but it is no longer what it used to be.

This graph shows the timeline of events that led us to the conclusion that one of the sides must be doing something wrong.

google fit data loss graph

The code examples below have been stripped of most data processing code for readability, but it is there.

Our Fitness client requests FitnessOptions.ACCESS_READ for all the types mentioned below, plus others depending on the app, every time it is initialised, either in the foreground or in the background, making sure we only request those accepted by the user.

We can confirm that the following data types no longer return any value when requesting the daily total or the local device daily total, but do return data chunks for the same period when requested in a non-aggregated read:

DataType.TYPE_STEP_COUNT_DELTA
DataType.TYPE_CALORIES_EXPENDED
DataType.TYPE_HEART_RATE_BPM

We also tried switching those that have one to their aggregate counterparts, to no avail:

DataType.AGGREGATE_CALORIES_EXPENDED
DataType.AGGREGATE_STEP_COUNT_DELTA

This is our current getDailyTotal implementation, which worked before the update and is written exactly as the examples on the developer site show:

    Fitness.getHistoryClient(context, account)
        .readDailyTotal(type)
        .addOnSuccessListener {
            Logger.i("${type.name}::DailyTotal::Success")
            onResponse(it)
        }

This currently returns 0 no matter what time of day it is requested.

Then we have our complementary code, which emulates what getDailyTotal does internally, also as per the developer site examples:

from: day start at 00:00:00, UTC+1
to: day end at 23:59:59, UTC+1
type: any DataType

    val readRequest = DataReadRequest.Builder()
        .enableServerQueries()
        .aggregate(type)
        .bucketByTime(1, TimeUnit.DAYS)
        .setTimeRange(from.time, to.time, TimeUnit.MILLISECONDS)
        .build()
    val account = GoogleSignIn
        .getAccountForExtension(context, fitnessOptions!!)
    GFitClient.request(context, account, readRequest) {
        if (it == null) {
            aggregatedRequestError(type)
        } else {
            Logger.i(TAG, "Aggregated ${type.name} received.")
        }
    }

The common result here is one of: 1) a null or empty result, 2) actually getting the result (this sometimes happens for DataType.TYPE_STEP_COUNT_DELTA), or 3) an ApiException with code 5012: this data type can't be aggregated.

We use the single-argument aggregate because the two-argument version, called as (type, type.aggregate), has been deprecated for a couple of versions already, although some developer site examples still use it.

Using (or not using) .enableServerQueries() does not change the final result.
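
For reference, the from and to bounds of the aggregated request (day start at 00:00:00 and day end at 23:59:59, UTC+1) could be derived as in the minimal sketch below. DayRange and dayRange are hypothetical names that are not part of the Fit SDK, and the fixed +1 offset is an assumption taken from the description above; a real app would more likely use the device's time zone.

```kotlin
import java.time.LocalDate
import java.time.LocalTime
import java.time.ZoneOffset

// Hypothetical helper (not part of the Fit SDK): the epoch-millisecond
// bounds of one calendar day at a fixed UTC offset.
data class DayRange(val fromMillis: Long, val toMillis: Long)

fun dayRange(day: LocalDate, offset: ZoneOffset = ZoneOffset.ofHours(1)): DayRange {
    // 00:00:00 local time on the given day.
    val from = day.atStartOfDay().toInstant(offset).toEpochMilli()
    // 23:59:59 local time on the same day.
    val to = day.atTime(LocalTime.of(23, 59, 59)).toInstant(offset).toEpochMilli()
    return DayRange(from, to)
}
```

The two values can then be passed to setTimeRange(..., TimeUnit.MILLISECONDS) in place of from.time and to.time.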

Finally, we assume the worst: we request everything for that day no matter what and then aggregate manually. This usually returns results even when the other approaches did not. Sadly, those results are never conclusive enough for us to feel comfortable.

    val readRequest = DataReadRequest.Builder()
        .enableServerQueries()
        .read(type)
        .bucketByTime(1, TimeUnit.DAYS)
        .setTimeRange(from.time, to.time, TimeUnit.MILLISECONDS)
        .build()
    val account = GoogleSignIn
        .getAccountForExtension(context, fitnessOptions!!)

This tends to work, but processing the data manually is complex given the intricately nested structure of buckets and datasets.
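
The manual aggregation step can be illustrated with a simplified model. In the sketch below, Point, PointSet and TimeBucket are hypothetical stand-ins that only mirror the nesting of the response (bucket, then dataset, then data point); they are not the SDK classes. It also assumes points from different sources do not overlap, which may not hold for merged data.

```kotlin
// Hypothetical stand-ins for the response nesting (bucket -> dataset -> point).
// These are NOT the Fit SDK classes; they only mirror the shape of the data.
data class Point(val startMillis: Long, val endMillis: Long, val value: Float)
data class PointSet(val points: List<Point>)
data class TimeBucket(val sets: List<PointSet>)

// Flattens every bucket and dataset into a single list of points.
fun flatten(buckets: List<TimeBucket>): List<Point> =
    buckets.flatMap { it.sets }.flatMap { it.points }

// Cumulative types (steps, calories): the daily total is the sum of all deltas.
fun dailySum(buckets: List<TimeBucket>): Float =
    flatten(buckets).sumOf { it.value.toDouble() }.toFloat()

// Sampled types (heart rate): a plain mean of the samples, or null when empty.
fun dailyAverage(buckets: List<TimeBucket>): Float? {
    val points = flatten(buckets)
    return if (points.isEmpty()) null else points.map { it.value }.average().toFloat()
}
```

Returning null for an empty average keeps "no samples" distinguishable from a real value, which matters when a 0 result is itself one of the symptoms being investigated.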

We have also noticed mismatches between data that is clearly visible in the Fit app and what the SDK returns. For example, Huawei Health activities appear in the app while the SDK returns only a subset of them. The reverse also happens: the SDK returns a whole night's worth of sleep sessions (light, REM, deep...), while the Fit app shows that same sleep as a single Sleep block without any sessions.

Sleep session as shown in a third party app, with the same data the SDK returns us: third party sleep session

The same sleep session shown in the Google fit app: google fit sleep session

As far as the documentation says:

For the Android APIs, read by data type and the Fit platform will return the merged stream by default. This automatically includes all data available to your app, including data written by other apps. You won't be able to see a list of which apps or devices the data came from with the Android APIs.

We believe that the merged stream is not behaving properly. The data is not available in real time (which could be explained by a delay between the app showing the data directly from the backend and the SDK not yet having that data written), but it also fails to appear after minutes or hours, and sometimes it never shows up.

To explain how we retrieve this data: we have a background WorkManager coroutine job that runs every once in a while. It runs when the system allows it, given Doze mode restrictions, although what we would prefer (and ask for via the WorkManager configuration) is once every hour or couple of hours, to keep the data up to date with what the fitness app displays. On each run we request data from the last update up to the end of the previous day and/or we request today's daily total (or data up to the current time, depending on how far down the "doesn't work" funnel we go, and on the date of the last update).
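
The windowing logic of one background run can be sketched as follows. SyncStep and planSync are hypothetical names, not our production code or anything from the Fit SDK, and the fixed UTC+1 offset is an assumption carried over from the time range described earlier.

```kotlin
import java.time.Instant
import java.time.LocalDate
import java.time.ZoneOffset

// Hypothetical model of one sync run; none of these names come from the Fit SDK.
sealed class SyncStep {
    // Read everything in [fromMillis, toMillis]: the time since the last update
    // up to the end of the previous day.
    data class Backfill(val fromMillis: Long, val toMillis: Long) : SyncStep()
    // Ask for the running total of the current day.
    data class TodayTotal(val day: LocalDate) : SyncStep()
}

fun planSync(
    lastUpdate: Instant,
    now: Instant,
    offset: ZoneOffset = ZoneOffset.ofHours(1)
): List<SyncStep> {
    val today = now.atOffset(offset).toLocalDate()
    val startOfToday = today.atStartOfDay().toInstant(offset)
    val steps = mutableListOf<SyncStep>()
    // Backfill only if the last update happened before today started.
    if (lastUpdate.isBefore(startOfToday)) {
        steps += SyncStep.Backfill(lastUpdate.toEpochMilli(), startOfToday.toEpochMilli() - 1)
    }
    steps += SyncStep.TodayTotal(today)
    return steps
}
```

Keeping the plan as plain data makes it possible to check the windows independently of the worker that executes them.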

  • Is there anything wrong in our implementation?
  • Has Google Fit changed the way it reports its data to connected apps?
  • Can we somehow get more truthful data?
  • Is there any way to request the same data differently or more efficiently? We are mostly interested in daily summaries, totals and averages, rather than time buckets / sessions. We request both, but they go to different data funnels covering different use cases.

CptEric asked Nov 06 '22 03:11

1 Answer

There is no answer yet.

Our solution has ended up being a messy succession of data checks: on every failure we try a different way.
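
That succession of checks can be pictured as a fallback chain. The sketch below is illustrative only: Strategy, Resolved and firstUsable are hypothetical names, each strategy is assumed to return a daily value or null, and treating 0 as "no data" is an assumption that would also hide genuinely empty days.

```kotlin
// Hypothetical sketch: each strategy returns a daily value, or null when it
// has no data. The names and the Double result type are illustrative only.
class Strategy(val name: String, val fetch: () -> Double?)

data class Resolved(val strategy: String, val value: Double)

fun firstUsable(strategies: List<Strategy>): Resolved? {
    for (s in strategies) {
        // A thrown exception (for example an aggregation error) counts as a failure.
        val value = runCatching { s.fetch() }.getOrNull()
        // Null and 0 are both treated as "no data" (assumption, see above).
        if (value != null && value > 0.0) return Resolved(s.name, value)
    }
    return null
}
```

The strategies would be listed from cheapest to most expensive, for example daily total first, then the aggregated read, then the raw read with manual aggregation.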

CptEric answered Nov 15 '22 04:11
