SQL over clause - dividing partition into numbered sub-partitions

I have a challenge that I've come across on multiple occasions but have never been able to find an efficient solution to. Imagine I have a large table with data about, e.g., bank accounts and their possible revolving moves between debit and credit:

AccountId DebitCredit AsOfDate
--------- ----------- ----------
aaa       d           2018-11-01
aaa       d           2018-11-02
aaa       c           2018-11-03
aaa       c           2018-11-04
aaa       c           2018-11-05
bbb       d           2018-11-02
ccc       c           2018-11-01
ccc       d           2018-11-02
ccc       d           2018-11-03
ccc       c           2018-11-04
ccc       d           2018-11-05
ccc       c           2018-11-06

In the example above, I would like to assign sub-partition numbers within each AccountId, where the number is incremented each time DebitCredit shifts (in AsOfDate order). In other words, I would like this result:

AccountId DebitCredit AsOfDate   PartNo
--------- ----------- ---------- ------
aaa       d           2018-11-01      1
aaa       d           2018-11-02      1
aaa       c           2018-11-03      2
aaa       c           2018-11-04      2
aaa       c           2018-11-05      2

bbb       d           2018-11-02      1

ccc       c           2018-11-01      1
ccc       d           2018-11-02      2
ccc       d           2018-11-03      2
ccc       c           2018-11-04      3
ccc       d           2018-11-05      4
ccc       c           2018-11-06      5

I cannot really figure out how to do it quickly and efficiently. The operation has to be done daily on tables with millions of rows.

In this example it is guaranteed that each account has rows for consecutive dates. However, the customer might of course open an account on the 15th of the month and/or close it on the 26th.

The challenge is to be solved on SQL Server 2016, but a solution that would also work on 2012 (and maybe even 2008 R2) would be nice.

As you can imagine, there's no way of telling in advance whether an account will have only debit rows, only credit rows, or will revolve every day.


asked Nov 12 '18 07:11

Stanley Gade

1 Answer

If you have SQL Server 2012 or later, you can use lag() and a windowed running sum to get this:

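-- Outer query: a running sum of the change flags yields the sub-partition number.
select *,
    sum(PartNoAdd) over (partition by AccountId order by AsOfDate asc) as PartNo_calc
from
(
    -- Inner query: flag each row with 1 when DebitCredit differs from the
    -- previous row of the same account (or when there is no previous row), else 0.
    select *,
        case when DebitCredit = lag(DebitCredit, 1) over (partition by AccountId order by AsOfDate asc)
             then 0 else 1 end as PartNoAdd
    from t
) t2
order by AccountId asc, AsOfDate asc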
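
To try the query against the sample rows from the question, a minimal setup could look like the sketch below. The temp table #t is an assumption standing in for the table t used above, and the column types are guesses, since the question does not give the real schema.

```sql
-- Hypothetical setup: temp table #t stands in for the real table t.
create table #t
(
    AccountId   varchar(10) not null,
    DebitCredit char(1)     not null,
    AsOfDate    date        not null
);

insert into #t (AccountId, DebitCredit, AsOfDate) values
    ('aaa', 'd', '2018-11-01'), ('aaa', 'd', '2018-11-02'),
    ('aaa', 'c', '2018-11-03'), ('aaa', 'c', '2018-11-04'),
    ('aaa', 'c', '2018-11-05'),
    ('bbb', 'd', '2018-11-02'),
    ('ccc', 'c', '2018-11-01'), ('ccc', 'd', '2018-11-02'),
    ('ccc', 'd', '2018-11-03'), ('ccc', 'c', '2018-11-04'),
    ('ccc', 'd', '2018-11-05'), ('ccc', 'c', '2018-11-06');

-- Same logic as the query above, with the intermediate flag shown next to the result.
-- For account ccc the flags are 1,1,0,1,1,1 and their running sum is 1,2,2,3,4,5.
select AccountId, DebitCredit, AsOfDate, PartNoAdd,
       sum(PartNoAdd) over (partition by AccountId order by AsOfDate asc) as PartNo_calc
from
(
    select *,
           case when DebitCredit = lag(DebitCredit, 1) over (partition by AccountId order by AsOfDate asc)
                then 0 else 1 end as PartNoAdd
    from #t
) t2
order by AccountId asc, AsOfDate asc;

drop table #t;
```

If there is at most one row per AccountId and AsOfDate, adding an explicit frame such as rows unbounded preceding after the order by inside the sum's over clause should return the same numbers, and it is often reported to be cheaper than the default range frame on SQL Server. That is worth measuring on the real table rather than taking for granted.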

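In the inner query, PartNoAdd checks whether the previous DebitCredit for this account (in AsOfDate order) is the same as the current one. If it is, it returns 0 (we should add nothing); otherwise it returns 1. For the first row of an account, lag() returns NULL, the comparison is not true, and the row gets 1, so numbering starts at 1.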

Then the outer query computes a running sum of PartNoAdd for this account, from its first row up to the current row in AsOfDate order.
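
For 2008 R2, where lag() is not available, one possible fallback is the row_number() difference technique often used for gaps-and-islands problems. This is a different technique from the answer above, shown only as a sketch. It assumes at most one row per AccountId and AsOfDate; the CTE names, the RunKey and RunStart columns, and the inline sample rows standing in for the real table t are all made up for illustration.

```sql
-- Sketch for versions without lag(); uses only row_number(), min() over (partition by ...)
-- and dense_rank(). The first CTE is a stand-in for the real table t.
with t as
(
    select v.AccountId, v.DebitCredit, cast(v.AsOfDate as date) as AsOfDate
    from (values
        ('aaa', 'd', '2018-11-01'), ('aaa', 'd', '2018-11-02'),
        ('aaa', 'c', '2018-11-03'), ('aaa', 'c', '2018-11-04'),
        ('aaa', 'c', '2018-11-05'),
        ('bbb', 'd', '2018-11-02'),
        ('ccc', 'c', '2018-11-01'), ('ccc', 'd', '2018-11-02'),
        ('ccc', 'd', '2018-11-03'), ('ccc', 'c', '2018-11-04'),
        ('ccc', 'd', '2018-11-05'), ('ccc', 'c', '2018-11-06')
    ) as v (AccountId, DebitCredit, AsOfDate)
),
numbered as
(
    select AccountId, DebitCredit, AsOfDate,
           -- This difference stays constant within one unbroken run of the same
           -- DebitCredit value and changes when the run is interrupted.
           row_number() over (partition by AccountId order by AsOfDate)
         - row_number() over (partition by AccountId, DebitCredit order by AsOfDate) as RunKey
    from t
),
runs as
(
    select AccountId, DebitCredit, AsOfDate,
           -- First date of the run that each row belongs to.
           min(AsOfDate) over (partition by AccountId, DebitCredit, RunKey) as RunStart
    from numbered
)
select AccountId, DebitCredit, AsOfDate,
       -- Runs ordered by their start date get consecutive numbers 1, 2, 3, ...
       -- For account ccc this gives 1,2,2,3,4,5.
       dense_rank() over (partition by AccountId order by RunStart) as PartNo
from runs
order by AccountId, AsOfDate;
```

This variant needs several window sorts, so on millions of rows it may well be slower than the lag() version; where 2012 or later is available, the query above is probably the better starting point.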


answered Oct 13 '22 11:10

George Menoutis

Tags: sql, sql-server, tsql, sql-server-2016, ranking-functions
