Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Finding each customer group's most recent account

I have a table that contains customers information. Each customer is assigned a Customer ID (their SSN) that they retain as they open more accounts. Two customers may be on the same account, each with their own ID. The account numbers are not ordered by date.

I would like to find the most recent account of each customer or group of customers. If two customers have ever been on an account together, I want to return the most recent account either customer has been on.

Here is a sample table with some of the possible cases.

Example table ACCT:

acctnumber  date            Cust1ID     Cust2ID 
10000       '2016-02-01'    1110        NULL    --Case0-customer has only ever had
                                                --one account

10001       '2016-02-01'    1111        NULL    --Case1-one customer has multiple
10050       '2017-02-01'    1111        NULL    --accounts
400050      '2017-06-01'    1111        NULL
10089       '2017-12-08'    1111        NULL

10008       '2016-02-01'    1120        NULL    --Case2-customer has account(s) and later
10038       '2016-04-01'    1120        NULL
10058       '2017-02-03'    1120        1121    --gets account(s) with another customer

10002       '2016-02-01'    1112        NULL    --Case3-customer has account(s) and later
10052       '2017-02-02'    1113        1112    --becomes the second customer on another
10152       '2017-05-02'    1113        1112    --account(s)

10003       '2016-02-02'    1114        1115    --Case4-customer and second customer
7060        '2017-02-04'    1115        1114    --switch which is first and second

10004       '2016-02-02'    1116        1117    --Case5-second customer later gets
10067       '2017-02-05'    1117        NULL    --separate account(s)
10167       '2018-02-05'    1117        NULL

50013       '2016-01-01'    2008        NULL    --Case5b -customer has account(s) & later
50014       '2017-02-02'    2008        2009    --gets account(s) with second customer &
50015       '2017-04-04'    2008        NULL    --later still first customer gets
100015      '2018-05-05'    2008        NULL    --separate account(s)

30005       '2015-02-01'    1118        NULL    --Case6-customer has account(s) 
10005       '2016-02-01'    1118        NULL
10054       '2017-02-02'    1118        1119    --gets account(s) with another
40055       '2017-03-03'    1118        1119
10101       '2017-04-04'    1119        NULL    --who later gets separate account(s)
10201       '2017-05-05'    1119        NULL
30301       '2017-06-06'    1119        NULL
10322       '2018-01-01'    1119        NULL

10007       '2016-02-01'    1122        1123    --Case7-customers play musical chairs
10057       '2017-02-03'    1123        1124
10107       '2017-06-02'    1124        1125

50001       '2016-01-01'    2001        NULL    --Case8a-customers with account(s)
50002       '2017-02-02'    2001        2002    --together each later get separate
50003       '2017-03-03'    2001        NULL    --account(s)
50004       '2017-04-04'    2002        NULL

50005       '2016-01-01'    2003        NULL    --Case8b-customers with account(s)
50006       '2017-02-02'    2003        2004    --together each later get separate
50007       '2017-03-03'    2004        NULL    --account(s)
50008       '2017-04-04'    2003        NULL
50017       '2018-03-03'    2004        NULL
50018       '2018-04-04'    2003        NULL

50009       '2016-01-01'    2005        NULL    --Case9a-customer has account(s) & later
50010       '2017-02-02'    2005        2006    --gets account(s) with a second customer
50011       '2017-03-03'    2005        2007    --& later still gets account(s) with a
                                                --third customer

50109       '2016-01-01'    2015        NULL    --Case9b starts the same as Case9a, but
50110       '2017-02-02'    2015        2016    
50111       '2017-03-03'    2015        2017    
50112       '2017-04-04'    2015        NULL    --after all accounts with other customers
50122       '2017-05-05'    2015        NULL    --are complete, the original primary
                                                --customer begins opening individual
                                                --accounts again

Desired Results:

acctnumber  date            Cust1ID     Cust2ID 
10000       '2016-02-01'    1110        NULL    --Case0    
10089       '2017-12-08'    1111        NULL    --Case1
10058       '2017-02-03'    1120        1121    --Case2
10152       '2017-05-02'    1113        1112    --Case3
7060        '2017-02-04'    1115        1114    --Case4
10167       '2018-02-05'    1117        NULL    --Case5
100015      '2018-05-05'    2008        NULL    --Case5b
10322       '2018-01-01'    1119        NULL    --Case6
10107       '2017-06-02'    1124        1125    --Case7
50003       '2017-03-03'    2001        NULL    --Case8a result 1
50004       '2017-04-04'    2002        NULL    --Case8a result 2
50017       '2018-03-03'    2004        NULL    --Case8b result 1
50018       '2018-04-04'    2003        NULL    --Case8b result 2
50011       '2017-03-03'    2005        2007    --Case9a
50122       '2017-05-05'    2015        NULL    --Case9b

Alternatively, I would accept Case 7 outputting the two separate customer groups:

10007       '2016-02-01'    1122        1123    --Case7 result 1
10107       '2017-06-02'    1124        1125    --Case7 result 2

Because Cases 8a & 8b would represent the company acknowledging the customers are worthy of holding separate accounts, we would want to then consider their group as splitting, so it has separate sets of results.

Also, in most scenarios the customers have many accounts, and mix and matching the above cases overtime is common. For example, a single customer can have five accounts (Case 1), then later opens one or more accounts with another customer (Case 3) sometimes switching the primary account holder (Case 4) then afterwards the first customer begins opening individual accounts again (Case 5b).


I have attempted joining the table to a copy of itself whenever acctnumbers are unique and any of the Cust IDs match. However, this removes customers who have only had one account so I added a union of cust that have no matches on the custid or account number and groups by custid.

Unfortunately, the second piece does not only include custids from case 0 and there are some custids which are excluded all together that shouldn't be.

select
    max(date1) as date,
    cust1id1 as cust1id
from
(
select
    acctnumber as [acctnumber1],
    date as [date1],
    cust1id as [cust1id1],
    cust2id as [cust2id1]
from 
    acct
) t1
join
(
select
    acctnumber as [acctnumber2],
    date as [date2],
    cust1id as [cust1id2],
    cust2id as [cust2id2]
from 
    acct
) t2
on t1.date1 > t2.date2 and
(t1.cust1id1 = t2.cust1id2 or
t1.cust1id1 = t2.cust2id2 or
t1.cust2id1 = t2.cust2id2)
Group by
cust1id1
union
select
    max(date1) as date,
    cust1id1 as cust1id
from
(
select
    acctnumber as [acctnumber1],
    date as [date1],
    cust1id as [cust1id1],
    cust2id as [cust2id1]
from 
    acct
) t1
join
(
select
    acctnumber as [acctnumber2],
    date as [date2],
    cust1id as [cust1id2],
    cust2id as [cust2id2]
from 
    acct
) t2
on (t1.acctnumber1 != t2.acctnumber2 and
t1.cust1id1 != t2.cust1id2 and
t1.cust1id1 != t2.cust2id2 and
t1.cust2id1 != t2.cust2id2)
group by
cust1id1

Update

Thank you for all the great answers and comments so far. I have been trying out the queries and comparing results.

@VladimirBaranov has brought up a rare case that I had not previously considered in comments to other answers.

Similarly to case 7, it will be a bonus if Case8 is handled, but not expected.

Case 9 is important and the result for 9a and 9b should be handled.

Update 2

I noticed issues with my original set of 7 cases.

In more recent accounts, when a customer is no longer on the account, it was always the second borrower that remained. This was entirely unintentional, you can look at any of those examples and either customer can potentially be the remaining customer on the most recent account.

Also, each case had the minimum number of accounts to display exactly what the case was testing, but this is not common. Usually in each step of each case there can be 5, 10, 15 or more accounts before a customer switches to adding on a second customer, and those two can then have many accounts together.

Reviewing the answers I see many have index, create, update and other clauses specific to being able to edit the database. Unfortunately, I am on the consumer side of this database so I have read only access, and the program I can use to interact with the database automatically rejects them.

like image 566
Golden Ratio Avatar asked Feb 08 '17 16:02

Golden Ratio


People also ask

How do I get the most recent record in SQL?

Here is the syntax that we can use to get the latest date records in SQL Server. Select column_name, .. From table_name Order By date_column Desc; Now, let's use the given syntax to select the last 10 records from our sample table.

How do I find the oldest date in SQL query?

First, create an aggregate query that has two fields: GroupID and RecordDate. Group by the GroupID field, and choose the "Min" option for the RecordDate, to return the earliest date for each query. Now, create a new query, doing a left-join from your original table to your aggregate query, joining on the GroupID field.


3 Answers

I'd like to thank Jeff Breadner for the DDL with sample data.

You'll have to run the below query step-by-step, CTE-by-CTE and examine intermediate results to understand what it does. It assumes that AcctNumber is unique in the given table.

At first I want to find the latest account for each individual customer. It is a simple top-n-per-group query and I'm using a ROW_NUMBER approach here.

CTE_Customers makes a plain list of all individual customers by putting together Cust1ID and Cust2ID. CTE_RN assigns them row numbers. CTE_LatestAccounts gives latest account for each individual customer:

+------------------+------------+--------+
| LatestAcctNumber |  LatestDT  | CustID |
+------------------+------------+--------+
|            10000 | 2016-02-01 |   1110 |
|            10050 | 2017-02-01 |   1111 |
|            10052 | 2017-02-02 |   1112 |
|            10052 | 2017-02-02 |   1113 |
|             7060 | 2017-02-04 |   1114 |
|             7060 | 2017-02-04 |   1115 |
|            10004 | 2016-02-02 |   1116 |
|            10067 | 2017-02-05 |   1117 |
|            10054 | 2017-02-03 |   1118 |
|            10101 | 2017-06-02 |   1119 |
|            10058 | 2017-02-03 |   1120 |
|            10058 | 2017-02-03 |   1121 |
|            10007 | 2016-02-01 |   1122 |
|            10057 | 2017-02-03 |   1123 |
|            10107 | 2017-06-02 |   1124 |
|            10107 | 2017-06-02 |   1125 |
+------------------+------------+--------+

The task is complicated by having customer pairs which "propagates" the latest account to another customer.

Customer pairs are defined in the original table, so CTE_MaxLatestAccounts takes each row from the original table and joins latest accounts to it twice - for Cust1D and Cust2ID. For each pair I'm picking one of the two latest accounts - the most recent one. Thus a customer that belongs to a pair may get an account from its partner.

+---------+---------+-------------+---------------------+
| Cust1ID | Cust2ID | MaxLatestDT | MaxLatestAcctNumber |
+---------+---------+-------------+---------------------+
|    1110 | NULL    | 2016-02-01  |               10000 |
|    1111 | NULL    | 2017-02-01  |               10050 |
|    1111 | NULL    | 2017-02-01  |               10050 |
|    1120 | NULL    | 2017-02-03  |               10058 |
|    1120 | 1121    | 2017-02-03  |               10058 |
|    1112 | NULL    | 2017-02-02  |               10052 |
|    1113 | 1112    | 2017-02-02  |               10052 |
|    1114 | 1115    | 2017-02-04  |                7060 |
|    1115 | 1114    | 2017-02-04  |                7060 |
|    1116 | 1117    | 2017-02-05  |               10067 |
|    1117 | NULL    | 2017-02-05  |               10067 |
|    1118 | NULL    | 2017-02-03  |               10054 |
|    1118 | 1119    | 2017-06-02  |               10101 |
|    1119 | NULL    | 2017-06-02  |               10101 |
|    1122 | 1123    | 2017-02-03  |               10057 |
|    1123 | 1124    | 2017-06-02  |               10107 |
|    1124 | 1125    | 2017-06-02  |               10107 |
+---------+---------+-------------+---------------------+

The MaxLatestAcctNumber here is applicable to both Cust1ID and Cust2ID. The same customer may be listed several times here and we need to choose an entry again with the latest account. Here this is the latest account of a pair, not for individual customer.

The approach is the same as in the beginning. Put both Cust1ID and Cust2ID customers in a list: CTE_CustomersWithLatestAccountFromPair. Assign row numbers in CTE_CustomersWithLatestAccountFromPairRN and pick final account in CTE_FinalAccounts.

+---------------------+
| MaxLatestAcctNumber |
+---------------------+
|               10000 |
|               10050 |
|               10052 |
|               10052 |
|                7060 |
|                7060 |
|               10067 |
|               10067 |
|               10101 |
|               10101 |
|               10058 |
|               10058 |
|               10057 |
|               10107 |
|               10107 |
|               10107 |
+---------------------+

Now we just need to filter the original table and leave only those rows (accounts) that appear in this list. See the final result below.

Sample data

declare @ACCT table (
    AcctNumber int,
    dt date,
    Cust1ID int,
    Cust2ID int
);

insert into @ACCT values 
(10000, '2016-02-01', 1110, null),
(10001, '2016-02-01', 1111, null),
(10050, '2017-02-01', 1111, null),
(10008, '2016-02-01', 1120, null),
(10058, '2017-02-03', 1120, 1121),
(10002, '2016-02-01', 1112, null),
(10052, '2017-02-02', 1113, 1112),
(10003, '2016-02-02', 1114, 1115),
(7060,  '2017-02-04', 1115, 1114),
(10004, '2016-02-02', 1116, 1117),
(10067, '2017-02-05', 1117, null),
(10005, '2016-02-01', 1118, null),
(10054, '2017-02-03', 1118, 1119),
(10101, '2017-06-02', 1119, null),
(10007, '2016-02-01', 1122, 1123),
(10057, '2017-02-03', 1123, 1124),
(10107, '2017-06-02', 1124, 1125);

Query

WITH
CTE_Customers
AS
(
    SELECT
        AcctNumber
        ,dt
        ,Cust1ID AS CustID
    FROM @ACCT
    WHERE Cust1ID IS NOT NULL

    UNION ALL

    SELECT
        AcctNumber
        ,dt
        ,Cust2ID AS CustID
    FROM @ACCT
    WHERE Cust2ID IS NOT NULL
)
,CTE_RN
AS
(
    SELECT
        AcctNumber
        ,dt
        ,CustID
        ,ROW_NUMBER() OVER (PARTITION BY CustID ORDER BY dt DESC) AS rn
    FROM CTE_Customers
)
,CTE_LatestAccounts
-- this gives one row per CustID
AS
(
    SELECT
        AcctNumber AS LatestAcctNumber
        ,dt AS LatestDT
        ,CustID
    FROM CTE_RN
    WHERE rn = 1
)
,CTE_MaxLatestAccounts
AS
(
    SELECT
        A.Cust1ID
        ,A.Cust2ID
        ,CASE WHEN ISNULL(A1.LatestDT, '2000-01-01') > ISNULL(A2.LatestDT, '2000-01-01')
        THEN A1.LatestDT ELSE A2.LatestDT END AS MaxLatestDT
        ,CASE WHEN ISNULL(A1.LatestDT, '2000-01-01') > ISNULL(A2.LatestDT, '2000-01-01')
        THEN A1.LatestAcctNumber ELSE A2.LatestAcctNumber END AS MaxLatestAcctNumber
    FROM
        @ACCT AS A
        LEFT JOIN CTE_LatestAccounts AS A1 ON A1.CustID = A.Cust1ID
        LEFT JOIN CTE_LatestAccounts AS A2 ON A2.CustID = A.Cust2ID
)
,CTE_CustomersWithLatestAccountFromPair
AS
(
    SELECT
        Cust1ID AS CustID
        ,MaxLatestDT
        ,MaxLatestAcctNumber
    FROM CTE_MaxLatestAccounts
    WHERE Cust1ID IS NOT NULL

    UNION ALL

    SELECT
        Cust2ID AS CustID
        ,MaxLatestDT
        ,MaxLatestAcctNumber
    FROM CTE_MaxLatestAccounts
    WHERE Cust2ID IS NOT NULL
)
,CTE_CustomersWithLatestAccountFromPairRN
AS
(
    SELECT
        CustID
        ,MaxLatestDT
        ,MaxLatestAcctNumber
        ,ROW_NUMBER() OVER (PARTITION BY CustID ORDER BY MaxLatestDT DESC) AS rn
    FROM CTE_CustomersWithLatestAccountFromPair
)
,CTE_FinalAccounts
AS
(
    SELECT MaxLatestAcctNumber
    FROM CTE_CustomersWithLatestAccountFromPairRN
    WHERE rn = 1
)
SELECT *
FROM @ACCT AS A
WHERE A.AcctNumber IN (SELECT MaxLatestAcctNumber FROM CTE_FinalAccounts)
;

Result

+------------+------------+---------+---------+
| AcctNumber |     dt     | Cust1ID | Cust2ID |
+------------+------------+---------+---------+
|      10000 | 2016-02-01 |    1110 | NULL    |
|      10050 | 2017-02-01 |    1111 | NULL    |
|      10058 | 2017-02-03 |    1120 | 1121    |
|      10052 | 2017-02-02 |    1113 | 1112    |
|       7060 | 2017-02-04 |    1115 | 1114    |
|      10067 | 2017-02-05 |    1117 | NULL    |
|      10101 | 2017-06-02 |    1119 | NULL    |
|      10057 | 2017-02-03 |    1123 | 1124    |
|      10107 | 2017-06-02 |    1124 | 1125    |
+------------+------------+---------+---------+

This result matches your desired result, except the last case 7.

My query doesn't attempt to follow the chain of linked customers of arbitrary length and is limited to processing one pair at a time. That's why the case 7 result is not one row. The query will always pick row/account with the very last date (10107) and it may also pick account(s) in the middle of the chain. In this case it picked a row 10057, not 10007, because this is a later account for customers 1122 and 1123.


When I looked at the execution plan I saw that the query behind CTE_LatestAccounts is run essentially four times.

It is likely that if you save result of CTE_LatestAccounts into a temp table with proper indexes the overall performance would be better.

Something like this:

DECLARE @LatestAccounts TABLE 
    (LatestAcctNumber int, LatestDT date, CustID int PRIMARY KEY);

WITH
CTE_Customers
AS
(
    SELECT
        AcctNumber
        ,dt
        ,Cust1ID AS CustID
    FROM @ACCT
    WHERE Cust1ID IS NOT NULL

    UNION ALL

    SELECT
        AcctNumber
        ,dt
        ,Cust2ID AS CustID
    FROM @ACCT
    WHERE Cust2ID IS NOT NULL
)
,CTE_RN
AS
(
    SELECT
        AcctNumber
        ,dt
        ,CustID
        ,ROW_NUMBER() OVER (PARTITION BY CustID ORDER BY dt DESC) AS rn
    FROM CTE_Customers
)
,CTE_LatestAccounts
-- this gives one row per CustID
AS
(
    SELECT
        AcctNumber AS LatestAcctNumber
        ,dt AS LatestDT
        ,CustID
    FROM CTE_RN
    WHERE rn = 1
)
INSERT INTO @LatestAccounts (LatestAcctNumber, LatestDT, CustID)
SELECT LatestAcctNumber, LatestDT, CustID
FROM CTE_LatestAccounts;


WITH
CTE_MaxLatestAccounts
AS
(
    SELECT
        A.Cust1ID
        ,A.Cust2ID
        ,CASE WHEN ISNULL(A1.LatestDT, '2000-01-01') > ISNULL(A2.LatestDT, '2000-01-01')
        THEN A1.LatestDT ELSE A2.LatestDT END AS MaxLatestDT
        ,CASE WHEN ISNULL(A1.LatestDT, '2000-01-01') > ISNULL(A2.LatestDT, '2000-01-01')
        THEN A1.LatestAcctNumber ELSE A2.LatestAcctNumber END AS MaxLatestAcctNumber
    FROM
        @ACCT AS A
        LEFT JOIN @LatestAccounts AS A1 ON A1.CustID = A.Cust1ID
        LEFT JOIN @LatestAccounts AS A2 ON A2.CustID = A.Cust2ID
)
,CTE_CustomersWithLatestAccountFromPair
AS
(
    SELECT
        Cust1ID AS CustID
        ,MaxLatestDT
        ,MaxLatestAcctNumber
    FROM CTE_MaxLatestAccounts
    WHERE Cust1ID IS NOT NULL

    UNION ALL

    SELECT
        Cust2ID AS CustID
        ,MaxLatestDT
        ,MaxLatestAcctNumber
    FROM CTE_MaxLatestAccounts
    WHERE Cust2ID IS NOT NULL
)
,CTE_CustomersWithLatestAccountFromPairRN
AS
(
    SELECT
        CustID
        ,MaxLatestDT
        ,MaxLatestAcctNumber
        ,ROW_NUMBER() OVER (PARTITION BY CustID ORDER BY MaxLatestDT DESC) AS rn
    FROM CTE_CustomersWithLatestAccountFromPair
)
,CTE_FinalAccounts
AS
(
    SELECT MaxLatestAcctNumber
    FROM CTE_CustomersWithLatestAccountFromPairRN
    WHERE rn = 1
)
SELECT *
FROM @ACCT AS A
WHERE A.AcctNumber IN (SELECT MaxLatestAcctNumber FROM CTE_FinalAccounts)
;

If you really need to merge/group all linked customers into one row when the length of the chain is arbitrary, you can do it with recursive query like shown, for example, here: How to find all connected subgraphs of an undirected graph

Once you have tagged each customer with some GroupID, find the latest account for each individual customer as in the beginning of this query. Then find the latest account among the group (rather than for the simple pair as in this query).

The query that finds all subgraphs of an undirected graph in the linked question may be quite slow for a large dataset and there are efficient non-set based algorithms to do it.

If you know that the maximum length of the chain can't exceed some number, it is possible to make this recursive query more efficient.

like image 132
Vladimir Baranov Avatar answered Oct 25 '22 04:10

Vladimir Baranov


To apply logic to each subset a good operator to use is the CROSS APPLY operator. This allows us to find the most recent account for each Customer Id.

Setup

DECLARE @Stage TABLE
(
    AcctNumber INT
    ,[Date] DATETIME
    ,Cust1Id INT
   ,Cust2Id INT
)

INSERT INTO @Stage (AcctNumber, [Date] ,Cust1Id ,Cust2Id)
VALUES
(10000,'2.1.16',1110,NULL)
,(10001,'2.1.16',1111,NULL)
,(10050,'2.1.17',1111,NULL)
,(10008,'2.1.16',1120,NULL)
,(10058,'2.3.17',1120,1121)
,(10002,'2.1.16',1112,NULL)
,(10052,'2.2.17',1113,1112)
,(10003,'2.2.16',1114,1115)
,(7060,'2.4.17',1115,1114)
,(10004,'2.2.16',1116,1117)
,(10067,'2.5.17',1117,NULL)
,(10005,'2.1.16',1118,NULL)
,(10054,'2.3.17',1118,1119)
,(10101,'6.2.17',1119,NULL)
,(10007,'2.1.16',1122,1123)
,(10057,'2.3.17',1123,1124)
,(10107,'6.2.17',1124,1125)

--Additional Cases to cover
,(50001, '2016-01-01', 2001, NULL)
,(50002, '2017-02-02', 2001, 2002)
,(50003, '2017-03-03', 2001, NULL)
,(50004, '2017-04-04', 2002, NULL)

,(50005, '2016-01-01', 2003, NULL)
,(50006, '2017-02-02', 2003, 2004)
,(50007, '2017-03-03', 2004, NULL)
,(50008, '2017-04-04', 2003, NULL)

Execution

Cross Apply

;WITH Results AS(
    SELECT DISTINCT S2.*
    FROM @Stage S1
    CROSS APPLY (
        SELECT TOP 1 S2.*
        FROM @Stage S2
        WHERE 
            (S1.Cust1Id = S2.Cust1Id
            OR S1.Cust1Id = S2.Cust2Id
            OR S1.Cust2Id = S2.Cust1Id
            OR S1.Cust2Id = S2.Cust2Id)
        ORDER BY S2.[Date] DESC
            ) S2
)
SELECT R1.*
FROM Results R1
    LEFT JOIN Results R2
        ON R1.Cust2Id = R2.Cust1Id
WHERE R1.[Date] > R2.[Date]
    OR R2.AcctNumber IS NULL

The CROSS APPLY operators walk the cases backward to apply the logic to each joint account case while ensuring the most recent account is carried over. This alone covers most of the cases. The only lingering cases are the ones with 3 accounts being shifted between 3 customers. The self join and WHERE clause in the final select cover these.

Results

+------------+------------+---------+---------+
| AcctNumber | Date       | Cust1Id | Cust2Id |
| 7060       | 2017-02-04 | 1115    | 1114    |
| 10000      | 2016-02-01 | 1110    | NULL    |
| 10050      | 2017-02-01 | 1111    | NULL    |
| 10052      | 2017-02-02 | 1113    | 1112    |
| 10058      | 2017-02-03 | 1120    | 1121    |
| 10067      | 2017-02-05 | 1117    | NULL    |
| 10101      | 2017-06-02 | 1119    | NULL    |
| 10107      | 2017-06-02 | 1124    | 1125    |
| 50003      | 2017-03-03 | 2001    | NULL    |
| 50004      | 2017-04-04 | 2002    | NULL    |
| 50007      | 2017-03-03 | 2004    | NULL    |
| 50008      | 2017-04-04 | 2003    | NULL    |
+------------+------------+---------+---------+
like image 35
Matt Rowland Avatar answered Oct 25 '22 05:10

Matt Rowland


I'm sure there is a much easier approach, but this is what I've had in mind :

SELECT 
    a.acctnumber, 
    a.date, 
    a.Cust1ID, 
    a.Cust2ID 
FROM acct a
OUTER APPLY (
SELECT acctnumber
FROM (
SELECT *, 
    ROW_NUMBER() OVER(PARTITION BY acctnumber ORDER BY [date] DESC) AS ACC_RN,
    ROW_NUMBER() OVER(PARTITION BY CustomerID ORDER BY [date] DESC) AS RN
FROM (
SELECT 
     a1.acctnumber,
     a1.[date],
     a1.Cust1ID AS CustomerID
FROM acct a1
UNION 
SELECT 
     a2.acctnumber,
     a2.[date],
     a2.Cust2ID
FROM acct a2
) D
) C
WHERE 
    RN = 1
AND CustomerID IS NOT NULL
AND ACC_RN = 2
) acc
WHERE a.acctnumber IN(acc.acctnumber)
like image 37
iSR5 Avatar answered Oct 25 '22 05:10

iSR5