Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Pull data without altering the item sequence in the reference table

I want to lookup values listed in a temp table:

So let us say:

Create Table #mylist
(
eserial nvarchar(35) Collate SQL_Latin1_General_CP850_CI_AS,
refdate datetime
)

Insert Into #mylist (eserial, refdate) Values ('A', '2015-09-15')
Insert Into #mylist (eserial, refdate) Values ('B', '2015-09-14')
Insert Into #mylist (eserial, refdate) Values ('C', '2015-09-13')
Insert Into #mylist (eserial, refdate) Values ('D', '2015-09-12')

I need the result to be the Top 1 date less than the reference date.
And should be returned in the same sequence as is in the temporary table.

What I tried:

Select
    lst.eserial,
    lst.refdate,
    app.CREATEDDATETIME,
From #mylist lst
Outer Apply 
    (Select Top 1 rec.CREATEDDATETIME, rec.ESERIAL, rec.ITEMID
     From TableSource rec
     Where lst.eserial=rec.ESERIAL And rec.CREATEDDATETIME<lst.refdate
     Order By rec.CREATEDDATETIME Desc
    ) As app

This works but it is slow. Also, if the number of rows are increased, it does not consistently preserve the sequence of eserial. I need the query to preserve the order I put it in the temporary table.

Again my expected output is simply:

enter image description here

Where eserial is the same sequence as the temp table and CREATEDDATETIME is the maximum date less than the reference date. More like a conditional Vlookup if you know Excel.

like image 359
L42 Avatar asked Sep 04 '15 03:09

L42


3 Answers

It is not quite clear what you mean by

maintain the sequence of the items in the temporary table

, but if you want to get result ordered by eserial, then you have to add ORDER BY eserial to your query. Without ORDER BY the resulting rows can be returned in any order. This applies to any method that you choose.

So, taking your last query as a basis, it will look like this:

Select
    lst.eserial
    ,lst.refdate
    ,app.CREATEDDATETIME
From
    #mylist lst
    Outer Apply
    (
        Select Top 1 rec.CREATEDDATETIME
        From TableSource rec
        Where lst.eserial=rec.ESERIAL And rec.CREATEDDATETIME<lst.refdate
        Order By rec.CREATEDDATETIME Desc
    ) As app
ORDER BY lst.eserial;

To make it work fast and efficiently add an index to TableSource on (ESERIAL, CREATEDDATETIME). Order of columns in the index is important.

It is also important to know if there are any other columns that you use in OUTER APPLY query and how you use them. You mentioned column AREAID in the first variant in the question, but not in the last variant. If you do have more columns, then clearly show how you intend to use them, because the correct index would depend on it. The index on (ESERIAL, CREATEDDATETIME) is enough for the query I wrote above, but if you have more columns a different index may be required.

It would also help optimizer if you defined your temp table with a PRIMARY KEY:

Create Table #mylist
(
    eserial nvarchar(35) Collate SQL_Latin1_General_CP850_CI_AS PRIMARY KEY,
    refdate datetime
)

Primary key would create a unique clustered index.

One more important note. What is the type and collation of columns ESERIAL and CREATEDDATETIME in the main TableSource table? Make sure that types and collation of columns in your temp table matches the main TableSource table. If the type is different (varchar vs. nvarchar or datetime vs. date) or collation is different index may not be used => it will be slow.

Edit

You use the phrase "same sequence as the temp table" several times in the question, but it is not really clear what you mean by it. Your sample data doesn't help to resolve the ambiguity. The column name eserial also adds to the confusion. I can see two possible meanings:

  1. Return rows from temp table ordered by values in eserial column.
  2. Return rows from temp table in the same order as they were inserted.

My original answer implies (1): it returns rows from temp table ordered by values in eserial column.

If you want to preserve the order of rows as they were inserted into the table, you need to explicitly remember this order somehow. The easiest method is to add an IDENTITY column to the temp table and later order by this column. Like this:

Create Table #mylist
(
    ID int IDENTITY PRIMARY KEY,
    eserial nvarchar(35) Collate SQL_Latin1_General_CP850_CI_AS,
    refdate datetime
)

And in the final query use ORDER BY lst.ID.

like image 100
Vladimir Baranov Avatar answered Oct 17 '22 20:10

Vladimir Baranov


That's easy using identity. Query without Order is not guarantee to have order in SQL server.

Create Table #mylist
(
    seqId int identity(1,1),
    eserial nvarchar(35) Collate SQL_Latin1_General_CP850_CI_AS,
    refdate datetime
)

Use the table freely and put Order By seqId at the end of your query

Edit

Use MAX() instead of TOP 1 with order if you have no cluster index on ESERIAL, CREATEDDATETIME on the TableSource

https://stackoverflow.com/a/21420643/1287352

Select
    lst.eserial,
    lst.refdate,
    app.CREATEDDATETIME,
From #mylist lst
Outer Apply 
    (
        Select MAX(rec.CREATEDDATETIME), rec.ESERIAL, rec.ITEMID
        From TableSource rec
        Where lst.eserial = rec.ESERIAL And rec.CREATEDDATETIME < lst.refdate
        GROUP BY rec.ESERIAL, rec.ITEMID
    ) As app
ORDER BY lst.seqId 
like image 27
Eric Avatar answered Oct 17 '22 20:10

Eric


Perhaps the performance issue is due to indexing. Try adding the indexes below, removing UNIQUE if the keys are not unique.

CREATE UNIQUE NONCLUSTERED INDEX idx ON #mylist (eserial, refdate);
CREATE UNIQUE NONCLUSTERED INDEX idx ON TableSource (eserial, CREATEDDATETIME);
like image 1
Dan Guzman Avatar answered Oct 17 '22 21:10

Dan Guzman