Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How can I find groups of records that match other groups of records (relational division?)

For setting consolidated account handling, I want to find out accounts that have "exactly the same" set of owners.

I think it might work to pivot the owners with dynamic sql, then use ranking functions, but I don't want to pursue that approach; I don't have an upper limit on how many names can be associated with a given account, so I want to avoid dynamic SQL.

My data (also this is at http://www.sqlfiddle.com/#!3/1d36e)

 CREATE TABLE allacctRels
 (account INT NOT NULL,
 module CHAR(3) NOT NULL,
 custCode CHAR(20) NOT NULL)


 INSERT INTO allacctrels
 (account, module, custCode)
 VALUES
 (1, 'DDA', 'Wilkie, Walker'),
 (1, 'DDA', 'Houzemeal, Juvy'),
 (2, 'CDS', 'Chase, Billy'),
 (2, 'CDS', 'Norman, Storm'),
 (3, 'CDS', 'Chase, Billy'),
 (3, 'CDS', 'Norman, Storm'),
 (7, 'CDS', 'Perkins, Tony'),
 (15, 'SVG', 'Wilkie, Walker'), --typo in name before mwigdahl's response
 (16, 'SVG', 'Wilkie, Walker'), -- corrected typo here too
 (606, 'DDA', 'Norman, Storm'),
 (606, 'DDA', 'Chase, Billy'),-- corrected 2nd typo found 
 (4, 'LNS', 'Wilkie, Walker'),
 (4, 'LNS', 'Houzemeal, Juvy'),
 (44, 'DDA', 'Perkins, Tony'),
 (222, 'DDA', 'Wilkie, Walker'),
 (222, 'DDA', 'Houzemeal, Juvy'),
 (17, 'SVG', 'Wilkie, Walker'), -- added these three rows in edit, SVG 17 doesn't match any dda 
 (17, 'SVG', 'Welch, Raquel'),
 (17, 'SVG', 'Houzemeal, Juvy')

I want to find out, for each MODULE-ACCOUNT, what the lowest DDA account is that has the exact same owners associated with it.

In the sample data, I would want these results, the third column being the lowest DDA account that has the same owners. The results should have the same number of rows as therea re module/account combos - one row per each row in "SELECT DISTINCT module, account FROM allAcctRels")

1, DDA, 1
2, CDS, 606
3, CDS, 606
15, SVG, NULL
16, SVG, NULL
606, DDA, 606
4, LNS, 1
7, CDS, 44
44, DDA, 44
222, DDA, 1
17, SVG, NULL -- added to original post.

SVG 15 and 16 don't match any DDA account, so it doesn't matter that they match each other, they get NULL for the account to consolidate to. EDIT: SVG 17 doesn't match anything, even though there is a DDA acct that has all its holders on in SVG 17, the combination of holders in SVG 17 doesn't occur for any one DDA acct. Every DDA account will match itself, unless a dda account with the same owners and lower DDA exists (as is the case for DDA 222).

I can see that one general approach is to pivot each account, group the pivoted table, and use row_number. Given the unbounded number of holders associated with each account, I think the pivoting would take dynamic SQL I'd rather avoid.

It seems to me that this is a "relational divsion" problem, with the relational division probably being "fed" by a CROSS APPLY. I tried writing a function that would take a table of account holders associated with a specific account and find the lowest dda account, along the lines shown below, the idea being to see if all the the number of people in a given account is the same as the number of people when that account is joined to a given dda account, but I can't figure out how to "feed" tables of account numbers in to the function.

-- this is what I tried but I'm not sure it the logic would work
-- and I can't figure out how to pass the account holders for each
-- account in.  This is a bit changed from the function I wrote, some
    -- extraneous fields removed and cryptic column names changed.  So it 
    -- probably won't run as is.

    -- to support a parameter type to a tape
-- CREATE type VisionCustomer as Table
-- (customer varchar(30))

CREATE FUNCTION  consolidatable 
(@custList dbo.VisionCustomer READONLY)
RETURNS char(10)
AS  
BEGIN
DECLARE @retval Varchar(10)
DECLARE @howmany int
select @howmany=Count(*) FROM @custlist;

SELECT @retval = min (acct) FROM allAcctRels
    JOIN @custlist
        On VendorCustNo = Customer
            WHERE acctType = 'DDA'
            GROUP BY acct
            HAVING (count(*) = @howmany)
            and
            COUNT(*) = (select Count(*) FROM allAcctRels X
    WHERE X.acctType = 'DDA'
    AND X.account = AllAcctRels.account) ;
RETURN @retval
END;
like image 962
Levin Magruder Avatar asked Apr 10 '12 19:04

Levin Magruder


People also ask

How are records grouped?

How Records are Grouped. The most common unit of records is called a "series.". A series of records were accumulated and used together for a specific purpose, during a distinct period of time, and the records in a series are usually arranged in a particular order.

What is an example of group record linkage?

An example of group (household) record linkage, and the corresponding MIL setting. Links between individual records correspond to instances while a bag is made of all links between the records in two groups. Record linkage is the process of identifying records that refer to the same entities from different data sources.

What are record Groups (RGs)?

These units, called " Record Groups " (RGs) at the National Archives, are designated according to the highest level of responsibility in an organization that created the records. Each RG has a number.

What is the most common unit of records?

The most common unit of records is called a "series." A series of records were accumulated and used together for a specific purpose, during a distinct period of time, and the records in a series are usually arranged in a particular order.


2 Answers

I believe this is what you are looking for ( http://www.sqlfiddle.com/#!3/f96c5/1 ):

;WITH AccountsWithOwners AS
(
  SELECT DISTINCT
    DA.module
    , DA.account
    , STUFF((SELECT 
                 ',' + AAR.custCode
               FROM allacctRels AAR 
               WHERE AAR.module = DA.module 
                 AND AAR.account = DA.account
               ORDER BY AAR.custCode
               FOR XML PATH(''))
              , 1, 1, '') AS Result
  FROM allacctRels DA
) 
, WithLowestDda AS
(
    SELECT
        AWO.module
        , AWO.account
        , MatchingAccounts.account AS DdaAccount
        , ROW_NUMBER() OVER(PARTITION BY AWO.module, AWO.account ORDER BY MatchingAccounts.account) AS Row
    FROM AccountsWithOwners AWO
    LEFT JOIN AccountsWithOwners MatchingAccounts
        ON MatchingAccounts.module = 'DDA'
        AND MatchingAccounts.Result = AWO.Result
)
SELECT
    account
    , module
    , DdaAccount
FROM WithLowestDda
WHERE Row = 1
like image 109
Jeremy Pridemore Avatar answered Sep 21 '22 10:09

Jeremy Pridemore


This actually turns out to be pretty simple, if I understand you correctly. Try this:

SELECT a.account, a.module, MIN(b.account) 
FROM allacctRels a
    LEFT JOIN allacctRels b ON a.custCode = b.custCode AND b.module = 'DDA'
GROUP BY a.account, a.module

EDIT: The above doesn't work after the clarifications, but this should. It is indeed a type of relational division. Probably not the most efficient query plan in the world, but it works.

SELECT a.account, a.module, MIN(b.account)
FROM allacctRels a
    LEFT JOIN allacctRels b ON b.module = 'DDA'
    AND
    -- first test is to confirm that the number of matching names for this combination equals the number of names for the DDA set...
    (
        SELECT COUNT(*) 
        FROM allacctRels b2 
            INNER JOIN allacctRels a2 ON b2.custCode = a2.custCode 
        WHERE a.account = a2.account AND b.account = b2.account
    ) = 
    (
        SELECT COUNT(*) 
        FROM allacctRels b2 
        WHERE b.account = b2.account
    )
    AND 
    -- second test is to confirm that the number of names for the DDA set equals the number of names for the base set...
    (
        SELECT COUNT(*) 
        FROM allacctRels b2 
        WHERE b.account = b2.account
    ) = 
    (
        SELECT COUNT(*) 
        FROM allacctRels a2 
        WHERE a.account = a2.account
    )
GROUP BY a.account, a.module
like image 41
mwigdahl Avatar answered Sep 22 '22 10:09

mwigdahl