Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to import text files with the same name and schema but different directories into database?

I require to import multiple txt files with the same name and same schemas into the same table in SQL Server 2008 database. The problem that I have is that they are all in different directories:

TEST
     201304
            sample1.txt
            sample2.txt
     201305
            sample1.txt
            sample2.txt
     201306
            sample1.txt
            sample2.txt

Is there any way in SSIS that I can set this up?

like image 988
user2988281 Avatar asked Nov 13 '13 15:11

user2988281


1 Answers

Yes. You will want to use a Foreach File Container and then check the Traverse Subfolder option.

Edit

Apparently my answer wasn't cromulent enough, so please accept this working code which illustrates what my brief original answer stated.

Source data

I created 3 folders as described above to contain files sample1.txt and sample2.txt

C:\>MKDIR SSISDATA\SO\TEST\201304
C:\>MKDIR SSISDATA\SO\TEST\201305
C:\>MKDIR SSISDATA\SO\TEST\201306

The contents of the file are below. Each version of the file in each folder has the ID value incremented along with the text values altered to prove it has picked up the new file.

ID,value
1,ABC

Package generation

This part assumes you have BIDS Helper installed. It is not required for the solution but simply provides a common framework future readers could use to reproduce this solution

I created a BIML file with the following content. Even though I have the table create step in there, I needed to have that run on the target server prior to generating the package.

<Biml xmlns="http://schemas.varigence.com/biml.xsd">
    <!-- Create a basic flat file source definition -->
    <FileFormats>
        <FlatFileFormat
            Name="FFFSrc"
            CodePage="1252"
            RowDelimiter="CRLF"
            IsUnicode="false"
            FlatFileType="Delimited"
            ColumnNamesInFirstDataRow="true"
        >
            <Columns>
                <Column
                    Name="ID"
                    DataType="Int32"
                    Delimiter=","
                    ColumnType="Delimited"
                />
                <Column
                    Name="value"
                    DataType="AnsiString"
                    Delimiter="CRLF"
                    InputLength="20"
                    MaximumWidth="20"
                    Length="20"
                    CodePage="1252"
                    ColumnType="Delimited"
                    />
            </Columns>
        </FlatFileFormat>
    </FileFormats>

    <!-- Create a connection that uses the flat file format defined above-->
    <Connections>
        <FlatFileConnection
            Name="FFSrc"
            FileFormat="FFFSrc"
            FilePath="C:\ssisdata\so\TEST\201306\sample1.txt"
            DelayValidation="true"
        />
        <OleDbConnection
            Name="tempdb"
            ConnectionString="Data Source=localhost\dev2012;Initial Catalog=tempdb;Provider=SQLNCLI11.1;Integrated Security=SSPI;Auto Translate=False;"
        />

    </Connections>

    <!-- Create a package to illustrate how to apply an expression on the Connection Manager -->
    <Packages>
        <Package
            Name="so_19957451"
            ConstraintMode="Linear"
        >
            <Connections>
                <Connection ConnectionName="tempdb"/>
                <Connection ConnectionName="FFSrc">
                    <Expressions>
                        <!-- Assign a variable to the ConnectionString property. 
                        The syntax for this is ConnectionManagerName.Property -->
                        <Expression PropertyName="FFSrc.ConnectionString">@[User::CurrentFileName]</Expression>
                    </Expressions>
                </Connection>
            </Connections>

            <!-- Create a single variable that points to the current file -->
            <Variables>
                <Variable Name="CurrentFileName" DataType="String">C:\ssisdata\so\TEST\201306\sample1.txt</Variable>
                <Variable Name="FileMask" DataType="String">*.txt</Variable>
                <Variable Name="SourceFolder" DataType="String">C:\ssisdata\so\TEST</Variable>
                <Variable Name="RowCountInput" DataType="Int32">0</Variable>
                <Variable Name="TargetTable" DataType="String">[dbo].[so_19957451]</Variable>
            </Variables>

            <!-- Add a foreach file enumerator. Use the above -->
            <Tasks>
                <ExecuteSQL 
                    Name="SQL Create Table"
                    ConnectionName="tempdb">
                    <DirectInput>
                        IF NOT EXISTS (SELECT * FROM sys.tables T WHERE T.name = 'so_19957451' and T.schema_id = schema_id('dbo'))
                        BEGIN
                            CREATE TABLE dbo.so_19957451(ID int NOT NULL, value varchar(20) NOT NULL);
                        END
                    </DirectInput>
                </ExecuteSQL>
                <ForEachFileLoop
                    Name="FELC Consume files"
                    FileSpecification="*.csv"
                    ProcessSubfolders="true"
                    RetrieveFileNameFormat="FullyQualified"
                    Folder="C:\"
                    ConstraintMode="Linear"
                >
                    <!-- Define the expressions to make the input folder and the file mask 
                    driven by variable values -->
                    <Expressions>
                        <Expression PropertyName="Directory">@[User::SourceFolder]</Expression>
                        <Expression PropertyName="FileSpec">@[User::FileMask]</Expression>
                    </Expressions>
                    <VariableMappings>
                        <!-- Notice that we use the convention of User.Variable name here -->
                        <VariableMapping
                            Name="0"
                            VariableName="User.CurrentFileName"
                        />
                    </VariableMappings>
                    <Tasks>
                        <Dataflow Name="DFT Import file" DelayValidation="true">
                            <Transformations>
                                <FlatFileSource Name="FFS Sample" ConnectionName="FFSrc"/>
                                <RowCount Name="RC Source" VariableName="User.RowCountInput"/>
                                <OleDbDestination 
                                    Name="OLE_DST"
                                    ConnectionName="tempdb">
                                    <TableFromVariableOutput VariableName="User.TargetTable"/>                                  
                                </OleDbDestination>
                            </Transformations>
                        </Dataflow>
                    </Tasks>
                </ForEachFileLoop>
            </Tasks>
        </Package>
    </Packages>
</Biml>

Right click on the biml file and select Generate SSIS Package. At this point, you should have a package named so_19957451 added to your current SSIS project.

Package configuration

There's no need for any configuration because it's already been done via BIML but moar screenshots make for better answers.

This is the basic package

Control flow

Here are my variables

Variables

Configuration of the Foreach Loop, as called out in the MSDN article as well as my note of select the Traverse subfolder

Traverse Subfolder

Assign the value generated per loop to the variable Current

assign variables

The flat file source has an expression applied to the ConnectionString property to ensure it uses the Variable @User::CurrentFileName. This changes the source per execution of the loop.

enter image description here

Execution results

Results from the database

enter image description here

Match the output from the package execution

Information: 0x402090DC at DFT Import file, FFS Sample [2]: The processing of file "C:\ssisdata\so\TEST\201304\sample1.txt" has started.

Information: 0x402090DD at DFT Import file, FFS Sample [2]: The processing of file "C:\ssisdata\so\TEST\201304\sample1.txt" has ended.

Information: 0x402090DC at DFT Import file, FFS Sample [2]: The processing of file "C:\ssisdata\so\TEST\201304\sample2.txt" has started.

Information: 0x402090DD at DFT Import file, FFS Sample [2]: The processing of file "C:\ssisdata\so\TEST\201304\sample2.txt" has ended.

Information: 0x402090DC at DFT Import file, FFS Sample [2]: The processing of file "C:\ssisdata\so\TEST\201305\sample1.txt" has started.

Information: 0x402090DD at DFT Import file, FFS Sample [2]: The processing of file "C:\ssisdata\so\TEST\201305\sample1.txt" has ended.

Information: 0x402090DC at DFT Import file, FFS Sample [2]: The processing of file "C:\ssisdata\so\TEST\201305\sample2.txt" has started.

Information: 0x402090DD at DFT Import file, FFS Sample [2]: The processing of file "C:\ssisdata\so\TEST\201305\sample2.txt" has ended.

Information: 0x402090DC at DFT Import file, FFS Sample [2]: The processing of file "C:\ssisdata\so\TEST\201306\sample1.txt" has started.

Information: 0x402090DD at DFT Import file, FFS Sample [2]: The processing of file "C:\ssisdata\so\TEST\201306\sample1.txt" has ended.

Information: 0x402090DC at DFT Import file, FFS Sample [2]: The processing of file "C:\ssisdata\so\TEST\201306\sample2.txt" has started.

Information: 0x402090DD at DFT Import file, FFS Sample [2]: The processing of file "C:\ssisdata\so\TEST\201306\sample2.txt" has ended.

like image 146
billinkc Avatar answered Oct 14 '22 05:10

billinkc