I have a table with this structure
Create Table Example (
[order] INT,
[typeID] INT
)
With this data:
order|type
1 7
2 11
3 11
4 18
5 5
6 19
7 5
8 5
9 3
10 11
11 11
12 3
I need to get the count of each type based on the order, something like:
type|count
7 1
11 **2**
18 1
5 1
19 1
5 **2**
3 1
11 **2**
3 1
Context
Lets say that this table is about houses, so I have a list houses in an order. So I have
So I need to show that info condensed. I need to say:
So the count is based on the order. The DENSE_RANK function would help me if I were able to reset the RANK when the partition changes.
So I have an answer, but I have to warn you it's probably going to get some raised eyebrows because of how it's done. It uses something known as a "Quirky Update". If you plan to implement this, please for the love of god read through the linked article and understand that this is an "undocumented hack" which needs to be implemented precisely to avoid unintended consequences.
If you have a tiny bit of data, I'd just do it row by agonizing row for simplicity and clarity. However if you have a lot of data and still need high performance, this might do.
Requirements
What it does
You know how people tell you there is no implicit order to the data in a table? That's still true 99% of the time. Except we know that ultimately it HAS to be stored on disk in SOME order. And it's that order that we're exploiting here. By forcing a clustered index update and the fact that you can assign variables in the same update statement that columns are updated, you can effectively scroll through the data REALLY fast.
Let's set up the data:
if object_id('tempdb.dbo.#t') is not null drop table #t
create table #t
(
_order int primary key clustered,
_type int,
_grp int
)
insert into #t (_order, _type)
select 1,7
union all select 2,11
union all select 3,11
union all select 4,18
union all select 5,5
union all select 6,19
union all select 7,5
union all select 8,5
union all select 9,3
union all select 10,11
union all select 11,11
union all select 12,3
Here's the update statement. I'll walk through each of the components below
declare @Order int, @Type int, @Grp int
update #t with (tablockx)
set @Order = _order,
@Grp = case when _order = 1 then 1
when _type != @Type then @grp + 1
else @Grp
end,
@Type = _type,
_grp = @Grp
option (maxdop 1)
(tablockx)
. If you're working with a temp table, you know there's no contention on the table, but still it's a good habit to get into (if using this approach can even be considered a good habit to get into at all).@Order = _order
. This looks like a pointless statement, and it kind of is. However since _order
is the primary key of the table, assigning that to a variable is what forces SQL to perform a clustered index update, which is crucial to this working_order
is 1 (the first row), just set the @Grp
variable to 1. If, on any given row, the column value of _type
differs from the variable value of @type
, we increment the grouping variable. If the values are the same, we just stick with the @Grp
we have from the previous row.@Type
variable with the column _type
's value. Note this HAS to come after the assignment of @Grp
for it to have the correct value._grp = @Grp
. This is where the actual column value is updated with the results of step 3.option (maxdop 1)
. This means the Maximum Degree of Parallelism is set to 1. In other words, SQL cannot do any task parallelization which might lead to the ordering being off.Now it's just a matter of grouping by the _grp
field. You'll have a unique _grp
value for each consecutive batch of _type
.
Conclusion
If this seems bananas and hacky, it is. As with all things, you need to take this with a grain of salt, and I'd recommend really playing around with the concept to fully understand it if you plan to implement it because I guarantee nobody else is going to know how to troubleshoot it if you get a call in the middle of the night that it's breaking.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With