I need to create a database using a CSV file with SSIS. The CSV file includes four columns:
I need to use the information of that table to populate the three tables I created in SQL below.
I have realized that what I need is to use one column of the Employee Table, EmployeeNumber
, and Group Table, GroupID
, to populate the EmployeeGroup table. For that, I thought that a Join Merge table is what I needed, but I created the Data Flow Task in SSIS, and the results are the same, no data displayed.
The middle table is the one used to relate the other tables.
I created the package in SSIS and the Employee and Group Tables are populated, but the EmployeeGroup table is not. EmployeeGroup will only show the EmployeeNumber and Group ID columns with no data.
I am new using SSIS, and I really do not know what else to do. I will really appreciate your help.
In this article, you will learn how to import CSV file into SQL server using SQL server management Studio. Log in to your database using SQL Server Management Studio. Right-click the database and select Tasks -> Import Data.
This can be done using only 2 Data Flow Task, but according to what the OP mentioned in the question I am new using SSIS, and I really do not know what else to do
, i will provide easiest solution which is 3 DataFlow Task to avoid using more components like MultiCast
.
Because you want to build a relational database and extract relations from the csv, you have to read the csv 3 times -consider it as 3 seperated files -.
First you have to import Employees and Groups Data, Then you have to import the relation table between them.
Each Import step can be done in a seperate Data Flow Task
SynchronousInputID Property
to None
And add an output column OutGroupname
with type DT_STR
In the Script section write the following Code:
Imports System.Collections.Generic
Private m_List As New List(Of String)
Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)
If Not Row.GroupName_IsNull AndAlso
Not String.IsNullOrEmpty(Row.GroupName.Trim) Then
If Not m_List.Contains(Row.GroupName.Trim) Then
m_List.Add(Row.GroupName.Trim)
CreateOutputRows(Row.GroupName.Trim)
End If
End If
End Sub
Public Sub CreateOutputRows(ByVal strValue As String)
Output0Buffer.AddRow()
Output0Buffer.OutGroupName = strValue
End Sub
On the OLEDB Destination map OutGroupName
to GroupName
Column
Groupname
Column : with a single difference that is you have to choose the EmployeeID
, Employee Name
, LoginName
columns as Input in the Script Component and Use the ID
Column instead of Groupname
column in the comparaisonIn The LookUp Transformation Component select Groups
Table as a Lookup table
Map GroupName
Columns and Get Group ID
as output
Choose Ignore Failure
in the Error Output Configuration
In Oledb Destination map columns as following
Note: GroupID
must be an Identity (set it in sql server)
You have to do the same steps as the 3 Data Flow Tasks solution, but instead of adding 2 Data Flow Tasks to Group
and Employee
, just add one Data Flow Task, and after the Flat File Source
add a MultiCast
component to duplicate the Flow. Then for the first flow use the same Script Component
and OLEDB Destination
used in the Employee
Data Flow Task, and for the second flow use the Script Component
and OLEDB Destination
related to Group
.
There are many method to import Flat file to SQL via T-SQL commands
Assuming that the installed version of Microsoft ACE OLEDB is Microsoft.ACE.OLEDB.12.0
and that the csv file location is C:\abc.csv
First Import data into Employee and Group Table
INSERT INTO [GROUP]
([Group Name])
SELECT
[Group Name]
FROM
OPENROWSET
(
'Microsoft.ACE.OLEDB.12.0','Text;Database=C:\;IMEX=1;','SELECT * FROM abc.csv'
) t
INSERT INTO [Employee]
([Employee Number],[Employee Name],[LoginName])
SELECT
[Employee Number],[Employee Name],[LoginName]
FROM
OPENROWSET
(
'Microsoft.ACE.OLEDB.12.0','Text;Database=C:\;IMEX=1;','SELECT * FROM abc.csv'
) t
Import the Employee_Group Data
INSERT INTO [EmployeeGroup]
([Employee Number],[GroupID])
SELECT
t1.[Employee Number],t2.[GroupID]
FROM
OPENROWSET
(
'Microsoft.ACE.OLEDB.12.0','Text;Database=C:\;IMEX=1;','SELECT * FROM abc.csv'
) t1 INNER JOIN GROUP t2 ON t1.[Group Name] = T2.[Group Name]
First Import data into Employee and Group Table
INSERT INTO [GROUP]
([Group Name])
SELECT
[Group Name]
FROM
OPENROWSET
(
'MSDASQL',
'Driver={Microsoft Text Driver (*.txt; *.csv)};
DefaultDir=C:\;',
'SELECT * FROM abc.csv'
) t
INSERT INTO [Employee]
([Employee Number],[Employee Name],[LoginName])
SELECT
[Employee Number],[Employee Name],[LoginName]
FROM
OPENROWSET
(
'MSDASQL',
'Driver={Microsoft Text Driver (*.txt; *.csv)};
DefaultDir=C:\;',
'SELECT * FROM abc.csv'
) t
Import the Employee_Group Data
INSERT INTO [EmployeeGroup]
([Employee Number],[GroupID])
SELECT
t1.[Employee Number],t2.[GroupID]
FROM
OPENROWSET
(
'MSDASQL',
'Driver={Microsoft Text Driver (*.txt; *.csv)};
DefaultDir=C:\;',
'SELECT * FROM abc.csv'
) t1 INNER JOIN GROUP t2 ON t1.[Group Name] = T2.[Group Name]
Note: You can Import Data to a staging table, then query this table, to avoid connecting many times to the csv File
There are many method to import csv files to SQL server, you can check the following links for additional informations.
I think the easiest solution would be to import the csv to a flat staging table and then use some insert into...select
statements to populate the target tables.
Assuming you know how to import to a flat table, the rest is quite simple:
INSERT INTO Employee (EmployeeNumber, EmployeeName, LoginName)
SELECT DISTINCT EmployeeNumber, EmployeeName, LoginName
FROM Stage
INSERT INTO [Group] (GroupName)
SELECT DISTINCT GroupName
FROM Stage
INSERT INTO EmployeeGroup(EmployeeNumber, GroupId)
SELECT DISTINCT EmployeeNumber, GroupId
FROM Stage s
INNER JOIN [Group] g ON s.GroupName = g.GroupName
You can see a live demo on rextester.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With