Syntax to count the number of UNIQUE matching values

Question

I'm working with a dataset that looks like this:

    | Host      | Risk           | Name      |
    | 10.1.1.1  | Critical       | ValueA    |
    | 10.1.1.1  | Critical       | ValueA    |
    | 10.1.1.1  | Critical       | ValueA    |
    | 10.1.1.1  | Critical       | ValueA    |
    | 10.1.1.1  | Critical       | ValueA    |
    | 10.1.1.1  | Critical       | ValueA    |
    | 10.1.1.1  | High           | ValueB    |
    | 10.1.1.1  | High           | ValueB    |
    | 10.1.1.1  | High           | ValueB    |
    | 10.1.1.1  | Critical       | ValueC    |
    | 10.1.1.1  | Critical       | ValueC    |
    | 10.1.1.1  | Critical       | ValueC    |
    | 10.1.1.1  | Critical       | ValueC    |

I'm trying to figure out a query that will produce a summary that looks like this:

    | Host      | Critical     | High      |
    | 10.1.1.1  | 2            | 1         |

Critical has a "2" underneath because there are only 2 DISTINCT values of the Name field ("ValueA" and "ValueC"). High has a "1" underneath because there is only 1 distinct Name value ("ValueB"). The tricky part for me is that I'm not trying to count the number of rows, but just the distinct values that match. In case you're wondering, the data is duplicated because there are other columns that contain unique values, but they're irrelevant to this query.

The closest I was able to get on my own is as follows, but that only produced the "Critical" column, and I can't figure out how to add the logic to get the "High" column:

    select Host, COUNT(DISTINCT Name) as Critical
    from [table]
    WHERE Risk = 'Critical'
    group by 1;

Any advice would be much appreciated. I tried a "COUNTIF" function, but kept getting the error "Unrecognized function countif," which seemed odd since "COUNTIF" is listed in the BigQuery documentation (https://cloud.google.com/bigquery/docs/reference/standard-sql/functions-and-operators#countif). Also tried to make CASE work, but didn't make a lot of progress.

Thanks!

Mikhail Berlyant · Accepted Answer

Below is for BigQuery Standard SQL

#standardSQL
WITH `project.dataset.table` AS (
  SELECT '10.1.1.1' Host, 'Critical' Risk, 'ValueA' Name UNION ALL
  SELECT '10.1.1.1', 'Critical', 'ValueA' UNION ALL
  SELECT '10.1.1.1', 'Critical', 'ValueA' UNION ALL
  SELECT '10.1.1.1', 'Critical', 'ValueA' UNION ALL
  SELECT '10.1.1.1', 'Critical', 'ValueA' UNION ALL
  SELECT '10.1.1.1', 'Critical', 'ValueA' UNION ALL
  SELECT '10.1.1.1', 'High', 'ValueB' UNION ALL
  SELECT '10.1.1.1', 'High', 'ValueB' UNION ALL
  SELECT '10.1.1.1', 'High', 'ValueB' UNION ALL
  SELECT '10.1.1.1', 'Critical', 'ValueC' UNION ALL
  SELECT '10.1.1.1', 'Critical', 'ValueC' UNION ALL
  SELECT '10.1.1.1', 'Critical', 'ValueC' UNION ALL
  SELECT '10.1.1.1', 'Critical', 'ValueC' 
)
SELECT 
  Host, 
  COUNT(DISTINCT IF(Risk='Critical', Name, NULL)) Critical,
  COUNT(DISTINCT IF(Risk='High', Name, NULL)) High
FROM `project.dataset.table`
GROUP BY Host

with result

Row Host        Critical    High     
1   10.1.1.1    2           1

Syntax to count the number of UNIQUE matching values

Tags:

sql

google-bigquery

Josh Rogan

1 Answers

Mikhail Berlyant

Recent Activity

Donate For Us

Syntax to count the number of UNIQUE matching values

Tags:

sql

google-bigquery

Josh Rogan

1 Answers

Mikhail Berlyant

Related questions

Recent Activity

Donate For Us