Complex Joining multiple tables in SQL Server for a fact table

Hi, I have four dimension tables and I am trying to insert data from them into a fact table.

I have a Gun_Sales table that contains most of the data for the fact table, plus primary keys from the other tables that I would like to join on: homicide_id from the Homicide table, article_id from the BBC table, incident_id from the Gun_Violence table and shooting_id from the School_Shootings table. The rest of the data is from the Gun_Sales table.

INSERT INTO [dbo].[FactGunSales]
        (
        sale_id,
        sale_date,
        sale_state,
        permit,
        hand_gun,
        long_gun,
        other_gun,
        multiple_gun,
        incident_id,
        homicide_id,
        article_id,
        shooitng_id)

So logically it is a full join of the Gun_Sales table and an inner join for the ID keys in the other tables, but I am struggling to get this to work.

Adding in the DDL for all the tables:

USE [Gun Violence]
GO

DROP TABLE IF EXISTS Gun_Violence

CREATE TABLE Gun_Violence(
        incident_id int PRIMARY KEY, 
        incident_date date,
        state_name varchar (50),
        city_name varchar(50),
        death int ,
        injury int ,
        )

DROP TABLE IF EXISTS Gun_Sales
CREATE TABLE  Gun_Sales(
            sale_id int  PRIMARY KEY,
            sale_date date,
            sale_state varchar(50),
            permit int,
            hand_gun int ,
            long_gun int ,
            other_gun int ,
            multiple_gun int ,
            )

DROP TABLE IF EXISTS School_Shootings
CREATE TABLE School_Shootings(
            shooting_id int  PRIMARY KEY,
            shooting_date date,
            shooting_state varchar(50),
            shooting_city varchar(50),
            shooting_death int ,
            shooting_injury int,
            )

DROP TABLE IF EXISTS Homicide
CREATE TABLE Homicide(
            Homicide_id int PRIMARY KEY,
            homicide_state varchar(50),
            homicide_victims int,
            homeicide_date date             
)

DROP TABLE IF EXISTS BBC
CREATE TABLE BBC(
            ariticle_id int PRIMARY KEY,
            article_date date,
            article_link varchar(1000),
            article_headline varchar(1000),
            article_count int,
            article_keyword varchar(100),
            article_month varchar(10),
            article_year int,
            article_state varchar(50)
)
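
The DDL for the target fact table is not shown above. A minimal sketch of what it might look like, assuming the column names from the INSERT column list (including the `shooitng_id` spelling used there), types mirrored from the source tables, and nullable foreign keys because a sale may have no matching row in the other tables:

```sql
-- Hypothetical DDL: the real FactGunSales definition is not given in the question.
-- Run this after the five tables above exist, since it references their primary keys.
DROP TABLE IF EXISTS FactGunSales

CREATE TABLE FactGunSales(
            -- Deliberately not a primary key: one sale can match several
            -- rows in the other tables, which would repeat sale_id.
            sale_id int NOT NULL,
            sale_date date,
            sale_state varchar(50),
            permit int,
            hand_gun int,
            long_gun int,
            other_gun int,
            multiple_gun int,
            -- Nullable keys pointing back at the source tables
            incident_id int NULL REFERENCES Gun_Violence(incident_id),
            homicide_id int NULL REFERENCES Homicide(Homicide_id),
            article_id int NULL REFERENCES BBC(ariticle_id),
            shooitng_id int NULL REFERENCES School_Shootings(shooting_id)
)
```

Whether sale_id can serve as the key depends on how many rows of the other tables share the same date and state as a sale.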

As mentioned above, I am trying to create a new fact table that has all columns from the Gun_Sales table and the primary keys from the other tables.

Thanks in advance

Rita Murran asked Nov 17 '18 15:11

1 Answer

@RitaMurran Is this right? Do you want something like this?

USE [Gun Violence]

go

INSERT INTO [dbo].[FACTGUNSALES]
            ([sale_id],
             [sale_date],
             [sale_state],
             [permit],
             [hand_gun],
             [long_gun],
             [other_gun],
             [multiple_gun],
             [incident_id],
             [homicide_id],
             [article_id],
             [shooitng_id])
-- All Gun_Sales columns, plus the key of each matching row in the other tables
SELECT [sale_id],
       [sale_date],
       [sale_state],
       [permit],
       [hand_gun],
       [long_gun],
       [other_gun],
       [multiple_gun],
       [dbo].[GUN_VIOLENCE].incident_id,
       [dbo].[HOMICIDE].homicide_id,
       [dbo].[BBC].ariticle_id,
       [dbo].[SCHOOL_SHOOTINGS].shooting_id
FROM   [dbo].[GUN_SALES]
       -- LEFT JOINs keep every sale, even when nothing matches on date and state
       LEFT JOIN [dbo].[GUN_VIOLENCE]
         ON ( [dbo].[GUN_VIOLENCE].incident_date = [dbo].[GUN_SALES].[sale_date] AND
              [dbo].[GUN_VIOLENCE].state_name = [dbo].[GUN_SALES].[sale_state] )
       LEFT JOIN [dbo].[SCHOOL_SHOOTINGS]
         ON ( [dbo].[SCHOOL_SHOOTINGS].shooting_date = [dbo].[GUN_SALES].[sale_date] AND
              [dbo].[SCHOOL_SHOOTINGS].shooting_state = [dbo].[GUN_SALES].[sale_state] )
       LEFT JOIN [dbo].[HOMICIDE]
         ON ( [dbo].[HOMICIDE].homeicide_date = [dbo].[GUN_SALES].[sale_date] AND
              [dbo].[HOMICIDE].homicide_state = [dbo].[GUN_SALES].[sale_state] )
       LEFT JOIN [dbo].[BBC]
         ON ( [dbo].[BBC].article_date = [dbo].[GUN_SALES].[sale_date] AND
              [dbo].[BBC].[article_state] = [dbo].[GUN_SALES].[sale_state] )
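
Because the joins match on date and state rather than on a key, a single sale can match several rows in the other tables and would then be inserted more than once. A quick check you could run before the INSERT, as a sketch: it assumes the tables from the question's DDL, and the aliases (gs, gv, ss, h, b) and the joined_rows column name are only illustrative.

```sql
-- List the sales that the date/state joins would turn into more than one row.
SELECT gs.sale_id,
       COUNT(*) AS joined_rows
FROM   dbo.Gun_Sales AS gs
       LEFT JOIN dbo.Gun_Violence AS gv
         ON gv.incident_date = gs.sale_date AND gv.state_name = gs.sale_state
       LEFT JOIN dbo.School_Shootings AS ss
         ON ss.shooting_date = gs.sale_date AND ss.shooting_state = gs.sale_state
       LEFT JOIN dbo.Homicide AS h
         ON h.homeicide_date = gs.sale_date AND h.homicide_state = gs.sale_state
       LEFT JOIN dbo.BBC AS b
         ON b.article_date = gs.sale_date AND b.article_state = gs.sale_state
GROUP  BY gs.sale_id
HAVING COUNT(*) > 1
ORDER  BY joined_rows DESC;
```

If this returns no rows, each sale matches at most one row per table and the INSERT adds exactly one fact row per sale.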

EDIT:

@RitaMurran, to answer your question:

"Is there a way to have only distinct values when the join happens?"

here is an example:

DECLARE @tbl_A TABLE (ID_A INT, col_A varChar(10))
INSERT  @tbl_A (ID_A,col_A)
SELECT 1 AS ID_A,'aaa' AS  col_A UNION ALL
SELECT 2        ,'bbb'           UNION ALL
SELECT 3        ,'ccc'           UNION ALL
SELECT 4        ,'ddd'           UNION ALL
SELECT 5        ,'eee'

DECLARE @tbl_B TABLE (ID_B INT,ID_A_FK INT, col_B varChar(10))
INSERT  @tbl_B (ID_B,ID_A_FK,col_B)
SELECT 1 AS ID_B,1 AS ID_A_FK,NULL AS  col_B UNION ALL
SELECT 2        ,1           ,NULL           UNION ALL
SELECT 3        ,3           ,NULL           UNION ALL
SELECT 4        ,3           ,NULL           UNION ALL
SELECT 5        ,3           ,NULL

SELECT *
FROM @tbl_A tbA
LEFT JOIN
    ( SELECT ID_A_FK, col_B FROM @tbl_B GROUP BY ID_A_FK, col_B ) tbB ON tbA.ID_A = tbB.ID_A_FK

The Output:

ID_A    col_A   ID_A_FK col_B
1       aaa     1       NULL
2       bbb     NULL    NULL
3       ccc     3       NULL
4       ddd     NULL    NULL
5       eee     NULL    NULL  

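If col_B has values, i.e. @tbl_B is populated with the following rows instead, the GROUP BY no longer collapses them and the output looks like this: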

INSERT  @tbl_B (ID_B,ID_A_FK,col_B)
SELECT 1 AS ID_B,1 AS ID_A_FK,'a1' AS  col_B UNION ALL
SELECT 2        ,1           ,'a2'           UNION ALL
SELECT 3        ,3           ,'c1'           UNION ALL
SELECT 4        ,3           ,'c2'           UNION ALL
SELECT 5        ,3           ,'c3'  

Output:

ID_A    col_A   ID_A_FK col_B
1       aaa     1       a1
1       aaa     1       a2
2       bbb     NULL    NULL
3       ccc     3       c1
3       ccc     3       c2
3       ccc     3       c3
4       ddd     NULL    NULL
5       eee     NULL    NULL
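
If exactly one row per ID_A is needed even when col_B differs, one option is to pick a single match per row with OUTER APPLY and TOP (1). This is a sketch on the same sample data, not the only way to do it; the choice of the lowest ID_B as the row to keep is an assumption made for illustration.

```sql
DECLARE @tbl_A TABLE (ID_A INT, col_A varchar(10))
INSERT @tbl_A (ID_A, col_A)
VALUES (1,'aaa'), (2,'bbb'), (3,'ccc'), (4,'ddd'), (5,'eee')

DECLARE @tbl_B TABLE (ID_B INT, ID_A_FK INT, col_B varchar(10))
INSERT @tbl_B (ID_B, ID_A_FK, col_B)
VALUES (1,1,'a1'), (2,1,'a2'), (3,3,'c1'), (4,3,'c2'), (5,3,'c3')

-- For each row of @tbl_A, keep at most one matching row of @tbl_B
-- (assumption: the match with the lowest ID_B is the one to keep).
SELECT tbA.ID_A, tbA.col_A, tbB.ID_A_FK, tbB.col_B
FROM @tbl_A tbA
OUTER APPLY
    ( SELECT TOP (1) b.ID_A_FK, b.col_B
      FROM @tbl_B b
      WHERE b.ID_A_FK = tbA.ID_A
      ORDER BY b.ID_B ) tbB
ORDER BY tbA.ID_A
```

This returns five rows, one per ID_A: a1 for ID_A 1, c1 for ID_A 3, and NULLs for the rows without a match.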
Raha Shafaei answered Oct 12 '22 20:10