Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Firebase BigQuery server offset time

Background: I'm having the Firebase analytics data exported to BigQuery. And I'm using cron jobs to crunch data in BigQuery for getting insight.

Problem: To be able to only crunch delta data i.e. the data that has arrived since last time I ran my cron job I need a way to figure out the time when the data arrived at server, since the event_timestamp is generated at client and can be cached at client before sent.

Insights: I have laborated with event_server_timestamp_offset (offset) which I thought I could use together with event_timestamp. But I was expecting the offset to only be positive but it can also be negative. And when I look at the MAX and MIN for the offset in the entire exported Firebase analytics dataset and re-calculate it to years instead of microseconds I can get more than 18 years offset.

Query:

SELECT
  MAX(event_server_timestamp_offset)/(1000000*60*60*24) max_days, 
  MIN(event_server_timestamp_offset)/(1000000*60*60*24) min_days
FROM
  `analytics_<project_id>.events_*`

Result: max_days=6784.485790436655, min_days=-106.95833052104166

Question:

  • How can I figure out the server arrival time for my Firebase exported BigQuery data so I can run cron jobs crunching only delta data?
  • Can I use event_server_timestamp_offset together with event_timestamp? If so, how?

Best regards, Daniel

like image 715
Daniel Petersson Avatar asked Nov 10 '18 12:11

Daniel Petersson


1 Answers

Surprisingly enough, this question not having a clear answer for almost 2 years, I am leaving here the answers I got from the Firebase support team. The format is - question asked followed by the answer of the support staff.

Q1. event_date - The date on which the event was logged (YYYYMMDD format in the registered timezone of your app). Does it mean that the event occurred on that date, or that it was actually collected on that date?

A1. Per documentation, event_date refers to the date on which the event is logged/occurred. Note that event_date is based on the Analytics timezone setting of your Firebase Project.

Q2. event_timestamp - The time (in microseconds, UTC) at which the event was logged on the client. Is it safe to assume that this is the exact timestamp the event occurred on client side (in the app timezone of course)?

A2. Yes, this is based on the device timezone setting. However, event_timestamp may be skewed if the device time is incorrect.

Q3. event_server_timestamp_offset - Timestamp offset between collection time and upload time in micros. This is the main field that causes all the misunderstandings - in our BigQuery table for the year 2020 this field takes values in a range between 5 days and -2 days. I mean how can the colleciton time be 2 days ahead?

A3. The event_server_timestamp_offset field in the export schema is the time difference between when the event took place and the app uploaded it to our server. In other words, this is the estimated difference between the client's local time and the actual time, according to our servers. The values of this field are usually positive, but can be negative as well if the device time setting is incorrect.

Q4. One last question is very important - can we ignore the event_server_timestamp_offset field and just rely on event_timestamp - as the exact date and time the event occurred on the clientside (not collected, not uplaoded, etc). If not- please explain how we can get the exact datetime of the event occuring on the clientside. But if yes please let me know why do we need the event_server_timestamp_offset field?

A4. Yes, you may actually ignore it and use event_timestamp alone. However, as mentioned earlier, event_timestamp could be off if the device time setting incorrect, but it shouldn't really affect the bigger picture of your analytics data as cases like this are usually one-off.

like image 119
Dark Templar Avatar answered Nov 16 '22 17:11

Dark Templar