A brief explanation on the relevant domain part:
A Category is composed of four data:
So, Male Adult Black Rooster
forms one category. Some combinations may not exist, such as mighty mite black belt.
An Athlete fights Athletes of the same Category, and if he classifies, he fights Athletes of different Weight Divisions (but of the same Gender, Age and Belt).
To the modeling. I have a Category
table, already populated with all combinations that exists in the domain.
CREATE TABLE Category (
[Id] [int] IDENTITY(1,1) NOT NULL,
[AgeDivision_Id] [int] NULL,
[Gender] [int] NULL,
[BeltColor] [int] NULL,
[WeightDivision] [int] NULL
)
A CategorySet
and a CategorySet_Category
, which forms a many to many relationship with Category
.
CREATE TABLE CategorySet (
[Id] [int] IDENTITY(1,1) NOT NULL,
[Championship_Id] [int] NOT NULL,
)
CREATE TABLE CategorySet_Category (
[CategorySet_Id] [int] NOT NULL,
[Category_Id] [int] NOT NULL
)
Given the following result set:
| Options_Id | Championship_Id | AgeDivision_Id | BeltColor | Gender | WeightDivision |
|------------|-----------------|----------------|-----------|--------|----------------|
1. | 2963 | 422 | 15 | 7 | 0 | 0 |
2. | 2963 | 422 | 15 | 7 | 0 | 1 |
3. | 2963 | 422 | 15 | 7 | 0 | 2 |
4. | 2963 | 422 | 15 | 7 | 0 | 3 |
5. | 2964 | 422 | 15 | 8 | 0 | 0 |
6. | 2964 | 422 | 15 | 8 | 0 | 1 |
7. | 2964 | 422 | 15 | 8 | 0 | 2 |
8. | 2964 | 422 | 15 | 8 | 0 | 3 |
Because athletes may fight two CategorySets, I need CategorySet
and CategorySet_Category
to be populated in two different ways (it can be two queries):
One Category_Set
for each row, with one CategorySet_Category
pointing to the corresponding Category
.
One Category_Set
that groups all WeightDivisions in one CategorySet
in the same AgeDivision_Id, BeltColor, Gender. In this example, only BeltColor
varies.
So the final result would have a total of 10 CategorySet
rows:
| Id | Championship_Id |
|----|-----------------|
| 1 | 422 |
| 2 | 422 |
| 3 | 422 |
| 4 | 422 |
| 5 | 422 |
| 6 | 422 |
| 7 | 422 |
| 8 | 422 |
| 9 | 422 | /* groups different Weight Division for BeltColor 7 */
| 10 | 422 | /* groups different Weight Division for BeltColor 8 */
And CategorySet_Category
would have 16 rows:
| CategorySet_Id | Category_Id |
|----------------|-------------|
| 1 | 1 |
| 2 | 2 |
| 3 | 3 |
| 4 | 4 |
| 5 | 5 |
| 6 | 6 |
| 7 | 7 |
| 8 | 8 |
| 9 | 1 | /* groups different Weight Division for BeltColor 7 */
| 9 | 2 | /* groups different Weight Division for BeltColor 7 */
| 9 | 3 | /* groups different Weight Division for BeltColor 7 */
| 9 | 4 | /* groups different Weight Division for BeltColor 7 */
| 10 | 5 | /* groups different Weight Division for BeltColor 8 */
| 10 | 6 | /* groups different Weight Division for BeltColor 8 */
| 10 | 7 | /* groups different Weight Division for BeltColor 8 */
| 10 | 8 | /* groups different Weight Division for BeltColor 8 */
I have no idea how to insert into CategorySet
, grab it's generated Id, then use it to insert into CategorySet_Category
I hope I've made my intentions clear.
I've also created a SQLFiddle.
Edit 1: I commented in Jacek's answer that this would run only once, but this is false. It will run a couple of times a week. I have the option to run as SQL Command from C# or a stored procedure. Performance is not crucial.
Edit 2: Jacek suggested using SCOPE_IDENTITY
to return the Id. Problem is, SCOPE_IDENTITY
returns only the last inserted Id, and I insert more than one row in CategorySet
.
Edit 3: Answer to @FutbolFan who asked how the FakeResultSet is retrieved.
It is a table CategoriesOption
(Id, Price_Id, MaxAthletesByTeam)
And tables CategoriesOptionBeltColor
, CategoriesOptionAgeDivision
, CategoriesOptionWeightDivison
, CategoriesOptionGender
. Those four tables are basically the same (Id, CategoriesOption_Id, Value).
The query look like this:
SELECT * FROM CategoriesOption co
LEFT JOIN CategoriesOptionAgeDivision ON
CategoriesOptionAgeDivision.CategoriesOption_Id = co.Id
LEFT JOIN CategoriesOptionBeltColor ON
CategoriesOptionBeltColor.CategoriesOption_Id = co.Id
LEFT JOIN CategoriesOptionGender ON
CategoriesOptionGender.CategoriesOption_Id = co.Id
LEFT JOIN CategoriesOptionWeightDivision ON
CategoriesOptionWeightDivision.CategoriesOption_Id = co.Id
The solution described here will work correctly in multi-user environment and when destination tables CategorySet
and CategorySet_Category
are not empty.
I used schema and sample data from your SQL Fiddle.
First part is straight-forward
(ab)use MERGE
with OUTPUT
clause.
MERGE
can INSERT
, UPDATE
and DELETE
rows. In our case we need only to INSERT
. 1=0
is always false, so the NOT MATCHED BY TARGET
part is always executed. In general, there could be other branches, see docs. WHEN MATCHED
is usually used to UPDATE
; WHEN NOT MATCHED BY SOURCE
is usually used to DELETE
, but we don't need them here.
This convoluted form of MERGE
is equivalent to simple INSERT
, but unlike simple INSERT
its OUTPUT
clause allows to refer to the columns that we need.
MERGE INTO CategorySet
USING
(
SELECT
FakeResultSet.Championship_Id
,FakeResultSet.Price_Id
,FakeResultSet.MaxAthletesByTeam
,Category.Id AS Category_Id
FROM
FakeResultSet
INNER JOIN Category ON
Category.AgeDivision_Id = FakeResultSet.AgeDivision_Id AND
Category.Gender = FakeResultSet.Gender AND
Category.BeltColor = FakeResultSet.BeltColor AND
Category.WeightDivision = FakeResultSet.WeightDivision
) AS Src
ON 1 = 0
WHEN NOT MATCHED BY TARGET THEN
INSERT
(Championship_Id
,Price_Id
,MaxAthletesByTeam)
VALUES
(Src.Championship_Id
,Src.Price_Id
,Src.MaxAthletesByTeam)
OUTPUT inserted.id AS CategorySet_Id, Src.Category_Id
INTO CategorySet_Category (CategorySet_Id, Category_Id)
;
FakeResultSet
is joined with Category
to get Category.id
for each row of FakeResultSet
. It is assumed that Category
has unique combinations of AgeDivision_Id, Gender, BeltColor, WeightDivision
.
In OUTPUT
clause we need columns from both source and destination tables. The OUTPUT
clause in simple INSERT
statement doesn't provide them, so we use MERGE
here that does.
The MERGE
query above would insert 8 rows into CategorySet
and insert 8 rows into CategorySet_Category
using generated IDs.
Second part
needs temporary table. I'll use a table variable to store generated IDs.
DECLARE @T TABLE (
CategorySet_Id int
,AgeDivision_Id int
,Gender int
,BeltColor int);
We need to remember the generated CategorySet_Id
together with the combination of AgeDivision_Id, Gender, BeltColor
that caused it.
MERGE INTO CategorySet
USING
(
SELECT
FakeResultSet.Championship_Id
,FakeResultSet.Price_Id
,FakeResultSet.MaxAthletesByTeam
,FakeResultSet.AgeDivision_Id
,FakeResultSet.Gender
,FakeResultSet.BeltColor
FROM
FakeResultSet
GROUP BY
FakeResultSet.Championship_Id
,FakeResultSet.Price_Id
,FakeResultSet.MaxAthletesByTeam
,FakeResultSet.AgeDivision_Id
,FakeResultSet.Gender
,FakeResultSet.BeltColor
) AS Src
ON 1 = 0
WHEN NOT MATCHED BY TARGET THEN
INSERT
(Championship_Id
,Price_Id
,MaxAthletesByTeam)
VALUES
(Src.Championship_Id
,Src.Price_Id
,Src.MaxAthletesByTeam)
OUTPUT
inserted.id AS CategorySet_Id
,Src.AgeDivision_Id
,Src.Gender
,Src.BeltColor
INTO @T(CategorySet_Id, AgeDivision_Id, Gender, BeltColor)
;
The MERGE
above would group FakeResultSet
as needed and insert 2 rows into CategorySet
and 2 rows into @T
.
Then join @T
with Category
to get Category.IDs
:
INSERT INTO CategorySet_Category (CategorySet_Id, Category_Id)
SELECT
TT.CategorySet_Id
,Category.Id AS Category_Id
FROM
@T AS TT
INNER JOIN Category ON
Category.AgeDivision_Id = TT.AgeDivision_Id AND
Category.Gender = TT.Gender AND
Category.BeltColor = TT.BeltColor
;
This will insert 8 rows into CategorySet_Category
.
Here is not the full answer, but direction which you can use to solve this:
1st query:
select row_number() over(order by t, Id) as n, Championship_Id
from (
select distinct 0 as t, b.Id, a.Championship_Id
from FakeResultSet as a
inner join
Category as b
on
a.AgeDivision_Id=b.AgeDivision_Id and
a.Gender=b.Gender and
a.BeltColor=b.BeltColor and
a.WeightDivision=b.WeightDivision
union all
select distinct 1, BeltColor, Championship_Id
from FakeResultSet
) as q
2nd query:
select q2.CategorySet_Id, c.Id as Category_Id from (
select row_number() over(order by t, Id) as CategorySet_Id, Id, BeltColor
from (
select distinct 0 as t, b.Id, null as BeltColor
from FakeResultSet as a
inner join
Category as b
on
a.AgeDivision_Id=b.AgeDivision_Id and
a.Gender=b.Gender and
a.BeltColor=b.BeltColor and
a.WeightDivision=b.WeightDivision
union all
select distinct 1, BeltColor, BeltColor
from FakeResultSet
) as q
) as q2
inner join
Category as c
on
(q2.BeltColor is null and q2.Id=c.Id)
OR
(q2.BeltColor = c.BeltColor)
of course this will work only for empty CategorySet
and CategorySet_Category
tables, but you can use select coalese(max(Id), 0) from CategorySet
to get current number and add it to row_number
, thus you will get real ID which will be inserted into CategorySet row for second query
What I do when I run into these situations is to create one or many temporary tables with row_number() over clauses giving me identities on the temporary tables. Then I check for the existence of each record in the actual tables, and if they exist update the temporary table with the actual record ids. Finally I run a while exists loop on the temporary table records missing the actual id and insert them one at a time, after the insert I update the temporary table record with the actual ids. This lets you work through all the data in a controlled manner.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With