In order to find the country for a specific IP address, I'm using the Maxmind IP address database. I've downloaded the database and imported it into Google BigQuery, so I can query it. In a separate table, I'm capturing IP-addresses from certain events in our systems. I would now like to join those two data sources.
The columns in the Maxmind database are as follows:
The columns in my event table are:
As documented here (https://dev.maxmind.com/geoip/legacy/csv/) there is a way to get an integer representation of the ip address, so I can use it to query the ip address and retrieve the country_code or country_name.
I have now constructed the following query:
SELECT
p.*,
g.country_code AS country_code
FROM
`dev.event_v1` p
INNER JOIN
`dev.geo_ip_countries` g
ON
SAFE_CAST(SPLIT(p.ip_address, ".")[OFFSET(0)] AS NUMERIC)*16777216 +
SAFE_CAST(SPLIT(p.ip_address, ".")[OFFSET(1)] AS NUMERIC)*65536 +
SAFE_CAST(SPLIT(p.ip_address, ".")[OFFSET(2)] AS NUMERIC)*256 +
SAFE_CAST(SPLIT(p.ip_address, ".")[OFFSET(3)] AS NUMERIC)
BETWEEN
SAFE_CAST(g.start_ip_num AS INT64)
AND
SAFE_CAST(g.end_ip_num AS INT64)
LIMIT 100
And however this works when using a limit, it doesn't for constructing a view.
Two questions: 1. Is there a way to simplify the query 2. Google BigQuery throws an error when I try to return a large result set:
Error: Query exceeded resource limits. 28099.974050246612 CPU seconds were used, and this query must use less than 5600.0 CPU seconds.
Any help is appreciated!
Solution Rewriting the query to the following worked and solved the resource limit issue too:
SELECT
p.*,
g.country_code
FROM
`dev.event_v1` p
INNER JOIN
`dev.geo_ip_countries` g
ON
NET.IP_TRUNC(NET.SAFE_IP_FROM_STRING(p.ip_address),16) = NET.IP_TRUNC(NET.SAFE_IP_FROM_STRING(g.start_ip_range),16)
WHERE
NET.SAFE_IP_FROM_STRING(p.ip_address)
BETWEEN
NET.SAFE_IP_FROM_STRING(g.start_ip_range)
AND
NET.SAFE_IP_FROM_STRING(g.end_ip_range)
Try below (BigQuery Standard SQL)
#standardSQL
SELECT
p.* EXCEPT(ip_address_num),
g.country_code AS country_code
FROM (
SELECT *,
SAFE_CAST(SPLIT(p.ip_address, ".")[OFFSET(0)] AS NUMERIC)*16777216 +
SAFE_CAST(SPLIT(p.ip_address, ".")[OFFSET(1)] AS NUMERIC)*65536 +
SAFE_CAST(SPLIT(p.ip_address, ".")[OFFSET(2)] AS NUMERIC)*256 +
SAFE_CAST(SPLIT(p.ip_address, ".")[OFFSET(3)] AS NUMERIC) ip_address_num
FROM `dev.event_v1`
) p
INNER JOIN (
SELECT
SAFE_CAST(g.start_ip_num AS INT64) start_ip_num,
SAFE_CAST(g.end_ip_num AS INT64) end_ip_num,
country_code
FROM `dev.geo_ip_countries`
) g
ON ip_address_num BETWEEN g.start_ip_num AND g.end_ip_num
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With