Is there a column or combination of columns within BigQuery GA4 events data that uniquely identifies a record? I'm asking because the event_params
, user_properties
and items
fields contain nested json and when parsing them, I'm trying to ensure the related records are kept together.
I've appended a picture to the description below which shows the output when checking for multiple rows using those 4 columns you mentioned.
Notice how, for those event_name
, event_timestamp
, user_pseudo_id
, event_bundle_sequence_id
values, there are multiple rows with differing event_parms values?
Should we be combining the event_parms data from the multiple rows into one? I guess we were expecting all related event_parms values to display within one event_name
, event_timestamp
, user_pseudo_id
, event_bundle_sequence_id
record, if that makes sense?
select d.cnt,
g.event_date
, g.event_timestamp
, g.event_name
, g.event_previous_timestamp
, g.event_value_in_usd
, g.event_bundle_sequence_id
, g.event_server_timestamp_offset
, g.user_id
, g.user_pseudo_id
, g.user_first_touch_timestamp
, g.event_params
from dbo.GA4Events g
inner join (
select
event_date
, event_timestamp
, event_name
, event_previous_timestamp
, event_value_in_usd
, event_bundle_sequence_id
, event_server_timestamp_offset
, user_id
, user_pseudo_id
, user_first_touch_timestamp
, count(*) as cnt
from dbo.GA4Events
group by
event_date
, event_timestamp
, event_name
, event_previous_timestamp
, event_value_in_usd
, event_bundle_sequence_id
, event_server_timestamp_offset
, user_id
, user_pseudo_id
, user_first_touch_timestamp
having count(*) > 1) d on
isnull(g.event_date,'') = isnull(d.event_date,'')
and isnull(g.event_timestamp,'') = isnull(d.event_timestamp,'')
and isnull(g.event_name ,'') = isnull(d.event_name,'')
and isnull(g.event_previous_timestamp,'') = isnull(d.event_previous_timestamp,'')
and isnull(g.event_value_in_usd,'') = isnull(d.event_value_in_usd,'')
and isnull(g.event_bundle_sequence_id,'') = isnull(d.event_bundle_sequence_id,'')
and isnull(g.event_server_timestamp_offset,'') = isnull(d.event_server_timestamp_offset,'')
and isnull(g.user_id,'') = isnull(d.user_id,'')
and isnull(g.user_pseudo_id,'') = isnull(d.user_pseudo_id,'')
and isnull(g.user_first_touch_timestamp,'') = isnull(d.user_first_touch_timestamp,'')
order by
g.event_date
, g.event_timestamp
, g.event_name
, g.event_previous_timestamp
, g.event_value_in_usd
, g.event_bundle_sequence_id
, g.event_server_timestamp_offset
, g.user_id
, g.user_pseudo_id
, g.user_first_touch_timestamp
Would you try below ? Documentation says,
The user fields contain information that uniquely identifies the user associated with the event.
Field name | Data type | Description |
---|---|---|
user_id | STRING | The user ID set via the setUserId API. |
user_pseudo_id | STRING | The pseudonymous id (e.g., app instance ID) for the user. |
user_first_touch_timestamp | INTEGER | The time (in microseconds) at which the user first opened the app or visited the site. |
The event fields contain information that uniquely identifies an event.
Field name | Data type | Description |
---|---|---|
event_date | STRING | The date when the event was logged (YYYYMMDD format in the registered timezone of your app). |
event_timestamp | INTEGER | The time (in microseconds, UTC) when the event was logged on the client. |
event_previous_timestamp | INTEGER | The time (in microseconds, UTC) when the event was previously logged on the client. |
event_name | STRING | The name of the event. |
event_value_in_usd | FLOAT | The currency-converted value (in USD) of the event's "value" parameter. |
event_bundle_sequence_id | INTEGER | The sequential ID of the bundle in which these events were uploaded. |
event_server_timestamp_offset | INTEGER | Timestamp offset between collection time and upload time in micros. |
Google Analytics added batch_event_index
in the schema on July 16, 2024. With the addition of this, the combination of these five fields should be unique for each record in the GA4 BigQuery event export table: event_name
, event_timestamp
, user_pseudo_id
, event_bundle_sequence_id
, batch_event_index
.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With