Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Query a database based on result of query from another database

I am using SSIS in VS 2013. I need to get a list of IDs from 1 database, and with that list of IDs, I want to query another database, ie SELECT ... from MySecondDB WHERE ID IN ({list of IDs from MyFirstDB}).

like image 829
faujong Avatar asked May 02 '17 19:05

faujong


People also ask

How can I use one query result in another query?

Use the results of a query as a field in another query. You can use a subquery as a field alias. Use a subquery as a field alias when you want to use the subquery results as a field in your main query. Note: A subquery that you use as a field alias cannot return more than one field.

How do you call a query in another query?

A subquery is a query nested inside another query. It can be placed anywhere in a query, even within another subquery. The syntax is very simple – you just put your subquery inside parentheses and insert it into the main query, wherever it is needed.

Can I query from two different databases?

For querying multiple tables in different databases on the same server, all we have to do is use the fully qualified table name. The only condition is, the user logged into the query analyzer (or used for executing the query) should have permission on both the databases.


1 Answers

There is 3 Methods to achieve this:

1st method - Using Lookup Transformation

First you have to add a Lookup Transformation like @TheEsisia answered but there are more requirements:

  • In the Lookup you Have to write the query that contains the ID list (ex: SELECT ID From MyFirstDB WHERE ...)

  • At least you have to select one column from the lookup table

  • These will not filter rows , but this will add values from the second table

To filter rows WHERE ID IN ({list of IDs from MyFirstDB}) you have to do some work in the look up error output Error case there are 2 ways:

enter image description here

  1. set Error handling to Ignore Row so the added columns (from lookup) values will be null , so you have to add a Conditional split that filter rows having values equal NULL.

Assuming that you have chosen col1 as lookup column so you have to use a similar expression

ISNULL([col1]) == False
  1. Or you can set Error handling to Redirect Row, so all rows will be sent to the error output row, which may not be used, so data will be filtered

The disadvantage of this method is that all data is loaded and filtered during execution.

Also if working on network filtering is done on local machine (2nd method on server) after all data is loaded is memory.

2nd method - Using Script Task

To avoid loading all data, you can do a workaround, You can achieve this using a Script Task: (answer writen in VB.NET)

Assuming that the connection manager name is TestAdo and "Select [ID] FROM dbo.MyTable" is the query to get the list of id's , and User::MyVariableList is the variable you want to store the list of id's

Note: This code will read the connection from the connection manager

    Public Sub Main()

        Dim lst As New Collections.Generic.List(Of String)


        Dim myADONETConnection As SqlClient.SqlConnection  
    myADONETConnection = _  
        DirectCast(Dts.Connections("TestAdo").AcquireConnection(Dts.Transaction), _  
        SqlClient.SqlConnection)

        If myADONETConnection.State = ConnectionState.Closed Then
        myADONETConnection.Open()
        End If

        Dim myADONETCommand As New SqlClient.SqlCommand("Select [ID] FROM dbo.MyTable", myADONETConnection)

        Dim dr As SqlClient.SqlDataReader

        dr = myADONETCommand.ExecuteReader

        While dr.Read

            lst.Add(dr(0).ToString)

        End While


        Dts.Variables.Item("User::MyVariableList").Value = "SELECT ... FROM ... WHERE ID IN(" &  String.Join(",", lst) & ")"

        Dts.TaskResult = ScriptResults.Success
    End Sub

And the User::MyVariableList should be used as source (Sql command in a variable)

3rd method - Using Execute Sql Task

Similar to the second method but this will build the IN clause using an Execute SQL Task then using the whole query as OLEDB Source,

  1. Just add an Execute SQL Task before the DataFlow Task
  2. Set ResultSet property to single
  3. Select User::MyVariableList as Result Set
  4. Use the following SQL command

    DECLARE @str AS VARCHAR(4000)
    
    SET @str = ''
    
    SELECT @str = @str + CAST([ID] AS VARCHAR(255)) + ','
    FROM dbo.MyTable 
    
    SET @str = 'SELECT * FROM  MySecondDB WHERE ID IN (' + SUBSTRING(@str,1,LEN(@str) - 1) + ')'
    
    SELECT @str
    

enter image description here

enter image description here

If the column has string data type you should add quotation before and after values as below:

SELECT @str = @str + '''' + CAST([ID] AS VARCHAR(255)) + ''','
    FROM dbo.MyTable

Make sure that you have set the DataFlow Task Delay Validation property to True

like image 188
Hadi Avatar answered Nov 09 '22 19:11

Hadi