
Identifying Duplicate Values - Google BigQuery

I'm simply trying to identify duplicate values within BigQuery.

My code looks like:

SELECT
  address,
  title_1,
  COUNT(*)
FROM
  `target.querytable`
GROUP BY
  1,2
HAVING
  COUNT(*) > 1

I'm trying to identify duplicate values in the title_1 field and select the corresponding URL from the address column for each one, along with the number of times that title occurs. Ideally the output would look like:

[image: desired output]

Jordan Lowry asked Dec 05 '25 04:12


1 Answer

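The query below is for BigQuery Standard SQL. It uses a window function to count the rows that share each title_1 value while keeping every row's address: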

#standardSQL
SELECT * FROM (
  SELECT *, COUNT(1) OVER(PARTITION BY title_1) dup_count
  FROM `target.querytable`
)
WHERE dup_count > 1
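
As a rough illustration of why this differs from the query in the question: grouping by both address and title_1 counts each (address, title_1) pair separately, so if every address is unique no group ever exceeds 1. Partitioning by title_1 alone counts per title without collapsing the rows. The sketch below runs the same pattern against a few inline rows; the `sample_rows` CTE and its URLs are hypothetical stand-ins for `target.querytable`, not data from the question.

```sql
#standardSQL
-- Hypothetical sample rows standing in for `target.querytable`.
WITH sample_rows AS (
  SELECT 'https://example.com/a' AS address, 'Home' AS title_1 UNION ALL
  SELECT 'https://example.com/b', 'Home' UNION ALL
  SELECT 'https://example.com/c', 'About'
)
SELECT address, title_1, dup_count
FROM (
  -- Count how many rows share each title_1, keeping one output row per input row.
  SELECT *, COUNT(1) OVER (PARTITION BY title_1) AS dup_count
  FROM sample_rows
)
-- Keep only rows whose title appears more than once.
WHERE dup_count > 1
ORDER BY title_1, address
```

With these sample rows, the two 'Home' addresses come back with dup_count = 2, and the 'About' row is filtered out.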
Mikhail Berlyant answered Dec 09 '25 00:12


