I'm working with a dataset that looks like this:
| Host | Risk | Name |
| 10.1.1.1 | Critical | ValueA |
| 10.1.1.1 | Critical | ValueA |
| 10.1.1.1 | Critical | ValueA |
| 10.1.1.1 | Critical | ValueA |
| 10.1.1.1 | Critical | ValueA |
| 10.1.1.1 | Critical | ValueA |
| 10.1.1.1 | High | ValueB |
| 10.1.1.1 | High | ValueB |
| 10.1.1.1 | High | ValueB |
| 10.1.1.1 | Critical | ValueC |
| 10.1.1.1 | Critical | ValueC |
| 10.1.1.1 | Critical | ValueC |
| 10.1.1.1 | Critical | ValueC |
I'm trying to figure out a query that will produce a summary that looks like this:
| Host | Critical | High |
| 10.1.1.1 | 2 | 1 |
Critical has a "2" underneath because there are only 2 DISTINCT values of the Name field ("ValueA" and "ValueC"). High has a "1" underneath because there is only 1 distinct Name value ("ValueB"). The tricky part for me is that I'm not trying to count the number of rows, but just the distinct values that match. In case you're wondering, the data is duplicated because there are other columns that contain unique values, but they're irrelevant to this query.
The closest I was able to get on my own is as follows, but that only produced the "Critical" column, and I can't figure out how to add the logic to get the "High" column:
select Host, COUNT(DISTINCT Name) as Critical
from [table]
WHERE Risk = 'Critical'
group by 1;
Any advice would be much appreciated. I tried a "COUNTIF" function, but kept getting the error "Unrecognized function countif," which seemed odd since "COUNTIF" is listed in the BigQuery documentation (https://cloud.google.com/bigquery/docs/reference/standard-sql/functions-and-operators#countif). Also tried to make CASE work, but didn't make a lot of progress.
Thanks!
Below is for BigQuery Standard SQL
#standardSQL
WITH `project.dataset.table` AS (
SELECT '10.1.1.1' Host, 'Critical' Risk, 'ValueA' Name UNION ALL
SELECT '10.1.1.1', 'Critical', 'ValueA' UNION ALL
SELECT '10.1.1.1', 'Critical', 'ValueA' UNION ALL
SELECT '10.1.1.1', 'Critical', 'ValueA' UNION ALL
SELECT '10.1.1.1', 'Critical', 'ValueA' UNION ALL
SELECT '10.1.1.1', 'Critical', 'ValueA' UNION ALL
SELECT '10.1.1.1', 'High', 'ValueB' UNION ALL
SELECT '10.1.1.1', 'High', 'ValueB' UNION ALL
SELECT '10.1.1.1', 'High', 'ValueB' UNION ALL
SELECT '10.1.1.1', 'Critical', 'ValueC' UNION ALL
SELECT '10.1.1.1', 'Critical', 'ValueC' UNION ALL
SELECT '10.1.1.1', 'Critical', 'ValueC' UNION ALL
SELECT '10.1.1.1', 'Critical', 'ValueC'
)
SELECT
Host,
COUNT(DISTINCT IF(Risk='Critical', Name, NULL)) Critical,
COUNT(DISTINCT IF(Risk='High', Name, NULL)) High
FROM `project.dataset.table`
GROUP BY Host
with result
Row Host Critical High
1 10.1.1.1 2 1
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With