
How can I populate missing rows in sql with static data?

Tags: sql, sql-server

I have collected one-minute 'tick' share price data for a number of securities over a period of weeks, and stored it in a table called 'intraday'. This table includes one-minute tick data between 08:01:00 and 16:30:00 each trading day.

Some securities do not trade every minute of every day, and when they do not trade, no row is created (missing ticks). I was hoping to insert a row for each missing tick, carrying the share price forward from the previous tick and setting the trade volume to 0.

For example, currently for one security I have the following stored:

ticker     date           tick                     cls        volume

ASC        20151231       1899-12-30 12:30:00      3453       2743
ASC        20151231       1899-12-30 12:29:00      3449       3490
ASC        20151231       1899-12-30 12:28:00      3436       930
ASC        20151231       1899-12-30 12:27:00      3435       255
ASC        20151231       1899-12-30 12:26:00      3434       4
ASC        20151231       1899-12-30 12:23:00      3444.59    54

(apologies for the annoying 1899-12-30 dates for each tick - these make it look a bit messy but do no harm, hence why they remain there currently)

What I would ideally like to be stored, is this:

ticker     date           tick                     cls        volume

ASC        20151231       1899-12-30 12:30:00      3453       2743
ASC        20151231       1899-12-30 12:29:00      3449       3490
ASC        20151231       1899-12-30 12:28:00      3436       930
ASC        20151231       1899-12-30 12:27:00      3435       255
ASC        20151231       1899-12-30 12:26:00      3434       4
ASC        20151231       1899-12-30 12:25:00      3444.59    0          < new row
ASC        20151231       1899-12-30 12:24:00      3444.59    0          < new row
ASC        20151231       1899-12-30 12:23:00      3444.59    54

So, for each distinct ticker and date value, there would be a range of values for each minute between 08:01:00 and 16:30:00. Some would be as they are currently stored, and the others would have a volume figure of 0, and a close value that copies the previous close value.

I'm absolutely stumped, and would appreciate any help you could potentially offer on this!

Kind regards.

Jambo asked Oct 19 '22 08:10


2 Answers

Use a recursive CTE to build a table of minutes for the range you need. I only covered the few minutes spanned by your sample data, but you can extend the range. Getting the last non-null value is a little tricky without a cursor, but it is possible. Here is a more thorough explanation on that if you are interested.

Demo of the code below: http://rextester.com/QDQR73738

setup:

create table test_data(ticker varchar(5), date integer, tick datetime, cls decimal(10,2), volume integer);
create table test_data2(ticker varchar(5), date integer, tick datetime, cls decimal(10,2), volume integer);

insert into test_data
    select 'ASC', 20151231, '1899-12-30 12:30:00', 3453,    2743 union all
    select 'ASC', 20151231, '1899-12-30 12:29:00', 3449,    3490 union all
    select 'ASC', 20151231, '1899-12-30 12:28:00', 3436,    930  union all
    select 'ASC', 20151231, '1899-12-30 12:27:00', 3435,    255  union all
    select 'ASC', 20151231, '1899-12-30 12:26:00', 3434,    4    union all
    select 'ASC', 20151231, '1899-12-30 12:23:00', 3444.59, 54   union all

    select 'BSC', 20151231, '1899-12-30 12:23:00', 3444.59, 54   union all
    select 'BSC', 20151231, '1899-12-30 12:28:00', 3436,    930 
;

query:

Declare @tickers Table (ticker varchar(5));
Insert into @tickers select distinct ticker from test_data;

Declare @ticker varchar(5);
While exists (Select * From @tickers)
  BEGIN
    select @ticker = min(ticker) from @tickers;

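    -- One row per minute for the window being filled; change the two
    -- timestamps to cover more of the trading day.
    with cte(tm) 
    as( Select cast('1899-12-30 12:23:00' as datetime) as tm
        union all
        Select dateadd(minute, 1, tm)
        from cte
        where tm < cast('1899-12-30 12:31:00' as datetime)
    )    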

    insert into test_data2 
    select 
      max(ticker) over (partition by grp order by tick rows unbounded preceding) ticker,
      max(date) over (partition by grp order by tick rows unbounded preceding) date,
      tick,
      max(cls) over (partition by grp order by tick rows unbounded preceding) cls,
      volume
    from (
          select
              ticker,
              date,
              tick,
              cls,
              volume,
              id,
              max(id1) OVER(ORDER BY tick ROWS UNBOUNDED PRECEDING) AS grp
          from (
                select 
                  td.ticker,
                  td.date,
                  coalesce(td.tick, cte.tm) tick,
                  td.cls,
                  coalesce(td.volume, 0) volume,
                  row_number() over (order by tick) id
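                from test_data td
                right outer join cte
                on td.tick = cte.tm
                and td.ticker = @ticker
                -- the sample holds a single trading day; with several days
                -- in the table, this join also needs to restrict td.date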
         ) cte2
         CROSS APPLY ( VALUES( CASE WHEN ticker IS NOT NULL THEN id END) )
            AS A(id1)
     ) cte3;

      Delete from @tickers where ticker = @ticker;
    End

select * from test_data2
order by ticker, tick;
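
If you extend the range to the whole trading day, bear in mind that a recursive CTE stops with an error after 100 levels of recursion by default, and 08:01 to 16:30 is 510 rows. A minimal sketch of the full-day minute table, assuming the same 1899-12-30 base date as the tick column (the name minutes is just illustrative):

```sql
-- Sketch: one row per minute from 08:01:00 to 16:30:00 inclusive.
WITH minutes(tm) AS (
    SELECT CAST('1899-12-30 08:01:00' AS datetime)
    UNION ALL
    SELECT DATEADD(minute, 1, tm)
    FROM minutes
    WHERE tm < CAST('1899-12-30 16:30:00' AS datetime)
)
SELECT tm
FROM minutes
ORDER BY tm
-- Raise the default limit of 100 recursion levels; 1000 is an arbitrary
-- ceiling chosen here that comfortably covers the 509 levels needed.
OPTION (MAXRECURSION 1000);
```

In the loop above, the same OPTION hint would go at the very end of the insert ... select statement that uses the CTE.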
msheikh25 answered Oct 20 '22 20:10


Let me know if this works. I tested it against the sample data you provided, using a temp table, and it filled in the missing rows. It may take a while depending on how many records you have. Because the loop orders rows by tick alone, you will need to run it once for each ticker symbol and date.

CREATE TABLE #Stocks (  ticker VARCHAR(5),
                    date DATE,
                    tick datetime,
                    cls money,
                    volume money)

INSERT INTO #Stocks (ticker, date, tick, cls, volume)
VALUES
('ASC','2015-12-31','1899-12-30 12:30:00',3453,2743),
('ASC','2015-12-31','1899-12-30 12:29:00',3449,3490),
('ASC','2015-12-31','1899-12-30 12:28:00',3436,930),
('ASC','2015-12-31','1899-12-30 12:27:00',3435,255),
('ASC','2015-12-31','1899-12-30 12:26:00',3434,4),
('ASC','2015-12-31','1899-12-30 12:23:00',3444.59,54)



DECLARE @Row INT = 1

-- Re-count on every pass: the table grows as gaps are filled, so a row count
-- captured once before the loop would stop short of the later rows.
WHILE @Row < (SELECT COUNT(*) FROM #Stocks)
BEGIN
    -- Is the next row (in tick order) exactly one minute after row @Row?
    IF (SELECT DATEADD(MI,1,tick) FROM (SELECT ROW_NUMBER() OVER (ORDER BY tick) RowNumber, * FROM #Stocks) AS T1 WHERE T1.RowNumber = @Row)
    <>
    (SELECT T2.tick FROM (SELECT ROW_NUMBER() OVER (ORDER BY tick) RowNumber, * FROM #Stocks) AS T2 WHERE T2.RowNumber = @Row + 1)
    BEGIN
        -- No: insert the missing minute, carrying the close forward with zero volume.
        -- The new row becomes row @Row + 1 and is checked on the next pass.
        INSERT INTO #Stocks (ticker, date, tick, cls, volume)
        SELECT T.ticker, T.date, DATEADD(MI,1,T.tick), T.cls, 0.00
        FROM (SELECT ROW_NUMBER() OVER (ORDER BY tick) RowNumber, * FROM #Stocks) AS T
        WHERE T.RowNumber = @Row
    END
    SET @Row = @Row + 1
END

SELECT *
FROM #Stocks
ORDER BY tick DESC
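
The loop above only fills gaps between existing rows, and it handles one ticker and date at a time. As an alternative, here is a rough set-based sketch that covers every ticker and date in one statement. This is a different technique from the loop (a minute table plus CROSS APPLY), and it assumes the #Stocks table created above and the 08:01 to 16:30 window from the question. The names minutes, days and prev are just illustrative.

```sql
-- Sketch only. The statement before this one must end with a semicolon.
WITH minutes(tm) AS (
    -- every minute of the trading day, on the same 1899-12-30 base date
    SELECT CAST('1899-12-30 08:01:00' AS datetime)
    UNION ALL
    SELECT DATEADD(minute, 1, tm)
    FROM minutes
    WHERE tm < CAST('1899-12-30 16:30:00' AS datetime)
),
days AS (
    -- each ticker/date combination that has at least one tick
    SELECT DISTINCT ticker, date FROM #Stocks
)
INSERT INTO #Stocks (ticker, date, tick, cls, volume)
SELECT d.ticker, d.date, m.tm, prev.cls, 0
FROM days AS d
CROSS JOIN minutes AS m
CROSS APPLY (
    -- most recent earlier tick for the same ticker and date
    SELECT TOP (1) s.cls
    FROM #Stocks AS s
    WHERE s.ticker = d.ticker
      AND s.date = d.date
      AND s.tick < m.tm
    ORDER BY s.tick DESC
) AS prev
WHERE NOT EXISTS (
    -- skip minutes that already have a row
    SELECT 1
    FROM #Stocks AS x
    WHERE x.ticker = d.ticker
      AND x.date = d.date
      AND x.tick = m.tm
)
OPTION (MAXRECURSION 1000);  -- the default of 100 levels is too low for 510 minutes
```

Because CROSS APPLY drops rows with no match, minutes before the first trade of a day are left out (there is no earlier price that day to carry forward). Minutes after the last trade are filled through 16:30.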
BJones answered Oct 20 '22 22:10