Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Counting based on string and null

Here is my sample data:

id     FirstName      LastName     HouseNo     MyCount
1      A                  C          1-1         2
2      B                  C          1-1         2
4      D                  A                      3
5      F                  A                      3
6      J                  A                      3
7      Q                  X          1-2         3
8      D                  X          1-2         3
9      D                  X          1-2         3
10     A                  C          1-3         3
11     B                  C          1-3         3
12     C                  C          1-3         3
14     F                  K                      2
15     J                  K                      2
16     Q                  X          1-5         1

With the above data I want to take count of records with the same HouseNo and LastName.

For this I am using

SELECT COUNT(ID) AS _COUNT FROM MYTABLE GROUP BY LASTNAME, HOUSENO

but the statement above has one issue. In the data some of the records don't have HouseNo. In the example above ID 4,5,6 and 14,15 don't have HouseNo. So, the above statement is returning 5 but it should return 3 and 2 separately.

Main Aim

  1. Take the count based on LastName and HouseNo
  2. Take the count of those records that don't have HouseNo (They will comes in series).
  3. The coming count should be updated in MyCount

How do I get this count?

Edit For Bounty:

Sample Data

id  FirstName   LastName    HouseNo     MyCount     CountId
1   Imran       Khan        1-1         
2   Waseem      Khan        1-1         
3   Rihan       Khan        1-1         
4   Moiz        Shaikh      1-2         
5   Zbair       Shaikh      1-2         
6   Sultan      Shaikh      1-2         
7   Zaid        Khan                    
10  Parvez      Patel       1-3         
11  Ahmed       Patel       1-3         
12  Rahat       Syed        1-4         
13  Talha       Khan                    
14  Zia         Khan                    
15  Arshad      Patel       1-3         
16  Samad       Patel       1-3         
17  Raees       Syed        1-4         
18  Azmat       Khan                    
19  Imran       Khan                    

Expected Result :

id  FirstName   LastName    HouseNo     MyCount     CountId
1   Imran       Khan        1-1         3           1
2   Waseem      Khan        1-1         3           1
3   Rihan       Khan        1-1         3           1
4   Moiz        Shaikh      1-2         3           2
5   Zbair       Shaikh      1-2         3           2
6   Sultan      Shaikh      1-2         3           2
7   Zaid        Khan                    1           3
10  Parvez      Patel       1-3         2           4   
11  Ahmed       Patel       1-3         2           4
12  Rahat       Syed        1-4         1           5   
13  Talha       Khan                    2           6
14  Zia         Khan                    2           6   
15  Arshad      Patel       1-3         2           7   
16  Samad       Patel       1-3         2           7
17  Raees       Syed        1-4         1           8   
18  Azmat       Khan                    2           9
19  Imran       Khan                    2           9   
  1. In the sample data MyCount and CountId are blank and should be filled.
  2. MyCount will be based on HouseNo and LastName, please see ID 1 to 3, its last name is khan with house no 1-1 so the MyCount of ID 1 to 3 will be 3, and CountId will be 1.
  3. In the sample data there are many records that don't have HouseNo, so for that case same last name in a series will be counted. Please see ID 7, its count will be 1. See also ID 18 and 19, its count will be 2.
  4. CountId is a Serial no of id count. Please see ID 1 to 3, it is 1 due to same house no and same last name.
like image 821
Imran Ali Khan Avatar asked Dec 19 '22 13:12

Imran Ali Khan


1 Answers

It looks like the main confusion is caused by your SQL statement at the beginning of the question where you simply GROUP BY LASTNAME, HOUSENO.

If you wanted a simple grouping, your query would be correct. But, then you show us a more detailed sample data with expected result and it becomes clear that you want not just a grouping (which doesn't care about the order of rows in the data), but you want to group rows based on their sequence.

It is a classic problem called gaps-and-islands. In SQL Server 2008 it can be done using few calls to ROW_NUMBER function.

Sample data

DECLARE @T TABLE 
    (id int PRIMARY KEY
    ,FirstName nvarchar(50)
    ,LastName nvarchar(50)
    ,HouseNo nvarchar(50)
    ,MyCount int
    ,CountId int);

INSERT INTO @T (id, FirstName, LastName, HouseNo) VALUES
(1 , 'Imran ', 'Khan  ', '1-1'),
(2 , 'Waseem', 'Khan  ', '1-1'),
(3 , 'Rihan ', 'Khan  ', '1-1'),
(4 , 'Moiz  ', 'Shaikh', '1-2'),
(5 , 'Zbair ', 'Shaikh', '1-2'),
(6 , 'Sultan', 'Shaikh', '1-2'),
(7 , 'Zaid  ', 'Khan  ',  NULL),
(10, 'Parvez', 'Patel ', '1-3'),
(11, 'Ahmed ', 'Patel ', '1-3'),
(12, 'Rahat ', 'Syed  ', '1-4'),
(13, 'Talha ', 'Khan  ',  NULL),
(14, 'Zia   ', 'Khan  ',  NULL),
(15, 'Arshad', 'Patel ', '1-3'),
(16, 'Samad ', 'Patel ', '1-3'),
(17, 'Raees ', 'Syed  ', '1-4'),
(18, 'Azmat ', 'Khan  ',  NULL),
(19, 'Imran ', 'Khan  ',  NULL);

SELECT query

WITH
CTE_RN
AS
(
    SELECT
        id
        ,FirstName
        ,LastName
        ,HouseNo
        ,MyCount
        ,CountId
        ,ROW_NUMBER() OVER (PARTITION BY LastName, HouseNo ORDER BY ID) AS rn1
        ,ROW_NUMBER() OVER (ORDER BY ID) AS rn2
    FROM @T AS T
)
,CTE_GRoups
AS
(
    SELECT
        id
        ,FirstName
        ,LastName
        ,HouseNo
        ,MyCount
        ,CountId
        ,rn1
        ,rn2
        ,rn2-rn1 AS GroupNumber
        ,COUNT(ID) OVER (PARTITION BY LastName, HouseNo, rn2-rn1) AS NewMyCount
        ,MIN(ID) OVER (PARTITION BY LastName, HouseNo, rn2-rn1) AS GroupMinID
    FROM CTE_RN
)
SELECT
    id
    ,FirstName
    ,LastName
    ,HouseNo
    ,rn1
    ,rn2
    ,GroupNumber
    ,NewMyCount
    ,GroupMinID
    ,DENSE_RANK() OVER (ORDER BY GroupMinID) AS NewCountId
FROM CTE_GRoups
ORDER BY ID;

Result

+----+-----------+----------+---------+-----+-----+-------------+------------+------------+------------+
| id | FirstName | LastName | HouseNo | rn1 | rn2 | GroupNumber | NewMyCount | GroupMinID | NewCountId |
+----+-----------+----------+---------+-----+-----+-------------+------------+------------+------------+
|  1 | Imran     | Khan     | 1-1     |   1 |   1 |           0 |          3 |          1 |          1 |
|  2 | Waseem    | Khan     | 1-1     |   2 |   2 |           0 |          3 |          1 |          1 |
|  3 | Rihan     | Khan     | 1-1     |   3 |   3 |           0 |          3 |          1 |          1 |
|  4 | Moiz      | Shaikh   | 1-2     |   1 |   4 |           3 |          3 |          4 |          2 |
|  5 | Zbair     | Shaikh   | 1-2     |   2 |   5 |           3 |          3 |          4 |          2 |
|  6 | Sultan    | Shaikh   | 1-2     |   3 |   6 |           3 |          3 |          4 |          2 |
|  7 | Zaid      | Khan     | NULL    |   1 |   7 |           6 |          1 |          7 |          3 |
| 10 | Parvez    | Patel    | 1-3     |   1 |   8 |           7 |          2 |         10 |          4 |
| 11 | Ahmed     | Patel    | 1-3     |   2 |   9 |           7 |          2 |         10 |          4 |
| 12 | Rahat     | Syed     | 1-4     |   1 |  10 |           9 |          1 |         12 |          5 |
| 13 | Talha     | Khan     | NULL    |   2 |  11 |           9 |          2 |         13 |          6 |
| 14 | Zia       | Khan     | NULL    |   3 |  12 |           9 |          2 |         13 |          6 |
| 15 | Arshad    | Patel    | 1-3     |   3 |  13 |          10 |          2 |         15 |          7 |
| 16 | Samad     | Patel    | 1-3     |   4 |  14 |          10 |          2 |         15 |          7 |
| 17 | Raees     | Syed     | 1-4     |   2 |  15 |          13 |          1 |         17 |          8 |
| 18 | Azmat     | Khan     | NULL    |   4 |  16 |          12 |          2 |         18 |          9 |
| 19 | Imran     | Khan     | NULL    |   5 |  17 |          12 |          2 |         18 |          9 |
+----+-----------+----------+---------+-----+-----+-------------+------------+------------+------------+

Here I included in the result all intermediary steps, so you can see how it works. The main part is two sets of ROW_NUMBERs. The rn1 sequence restarts for each LastName, HouseNo. It is partitioned by LastName, HouseNo. The rn2 is a simple increasing sequence without gaps. We need it, because original ID defines the order, but can have gaps.

Then we subtract these two sequences and difference gives us the GroupNumber.

Counting the number of elements in a group is simple COUNT, which gives us NewMyCount.

Enumerating groups with sequential numbers without gaps is done in two steps. At first MIN gives an identifier for a group, then DENSE_RANK generates a sequence of NewCountId without gaps.


If you want to actually update original table with the calculated NewMyCount and NewCountId, it is easy to turn the SELECT query above into UPDATE query:

UPDATE query

WITH
CTE_RN
AS
(
    SELECT
        id
        ,FirstName
        ,LastName
        ,HouseNo
        ,MyCount
        ,CountId
        ,ROW_NUMBER() OVER (PARTITION BY LastName, HouseNo ORDER BY ID) AS rn1
        ,ROW_NUMBER() OVER (ORDER BY ID) AS rn2
    FROM @T AS T
)
,CTE_GRoups
AS
(
    SELECT
        id
        ,FirstName
        ,LastName
        ,HouseNo
        ,MyCount
        ,CountId
        ,rn1
        ,rn2
        ,rn2-rn1 AS GroupNumber
        ,COUNT(ID) OVER (PARTITION BY LastName, HouseNo, rn2-rn1) AS NewMyCount
        ,MIN(ID) OVER (PARTITION BY LastName, HouseNo, rn2-rn1) AS GroupMinID
    FROM CTE_RN
)
,CTE_Update
AS
(
    SELECT
        id
        ,FirstName
        ,LastName
        ,HouseNo
        ,MyCount
        ,CountId
        ,rn1
        ,rn2
        ,GroupNumber
        ,NewMyCount
        ,GroupMinID
        ,DENSE_RANK() OVER (ORDER BY GroupMinID) AS NewCountId
    FROM CTE_GRoups
)
UPDATE CTE_Update
SET
    MyCount = NewMyCount
    ,CountId = NewCountId
;

Result

SELECT *
FROM @T
ORDER BY ID;

+----+-----------+----------+---------+---------+---------+
| id | FirstName | LastName | HouseNo | MyCount | CountId |
+----+-----------+----------+---------+---------+---------+
|  1 | Imran     | Khan     | 1-1     |       3 |       1 |
|  2 | Waseem    | Khan     | 1-1     |       3 |       1 |
|  3 | Rihan     | Khan     | 1-1     |       3 |       1 |
|  4 | Moiz      | Shaikh   | 1-2     |       3 |       2 |
|  5 | Zbair     | Shaikh   | 1-2     |       3 |       2 |
|  6 | Sultan    | Shaikh   | 1-2     |       3 |       2 |
|  7 | Zaid      | Khan     | NULL    |       1 |       3 |
| 10 | Parvez    | Patel    | 1-3     |       2 |       4 |
| 11 | Ahmed     | Patel    | 1-3     |       2 |       4 |
| 12 | Rahat     | Syed     | 1-4     |       1 |       5 |
| 13 | Talha     | Khan     | NULL    |       2 |       6 |
| 14 | Zia       | Khan     | NULL    |       2 |       6 |
| 15 | Arshad    | Patel    | 1-3     |       2 |       7 |
| 16 | Samad     | Patel    | 1-3     |       2 |       7 |
| 17 | Raees     | Syed     | 1-4     |       1 |       8 |
| 18 | Azmat     | Khan     | NULL    |       2 |       9 |
| 19 | Imran     | Khan     | NULL    |       2 |       9 |
+----+-----------+----------+---------+---------+---------+
like image 71
Vladimir Baranov Avatar answered Jan 02 '23 12:01

Vladimir Baranov