Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to count a feature x if y condition is true (bigquery)

I am new in machine learning and I am running one of my first machine learning problems. I do a lot of research by myself but could not find the solution for this question.

I am creating new variables from my dataset in Bigquery. One of the variables consists in count a variable (x) if a condition in variable (y) is met. I cannot use WHERE or HAVING since the condition is only related to the calculation of this specific variable.

The code is more or less like this:

COUNT(DISTINCT sessionid IF(date > “2018-06-01” && date < “2018-06-30”)) 
AS sessions_lastmonth

I know that this code is not correct and could not be but is just a way to express more or less what I need. The goal is to know the number of sessions in a certain period. I cannot use the count for date because in one day you can have more than one session and not every row is a different session (a session can have several lines because the user can go through many pages in the same session).

Thank you!

like image 564
Bruna Aranha Avatar asked Dec 05 '22 11:12

Bruna Aranha


1 Answers

Below for BigQuery Standard SQL

COUNT(DISTINCT IF(date >= '2018-06-01' AND date <= '2018-06-30', sessionid, NULL)) AS sessions_lastmonth
like image 187
Mikhail Berlyant Avatar answered Dec 16 '22 22:12

Mikhail Berlyant