Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Compare 3 Consecutive rows in a table

Tags:

sql

teradata

Hi I have an interesting problem. I Have an Employee Table AS Follows

CREATE TABLE EMPLOYEE(
EMPLOYEE_ID INTEGER,
SALARY DECIMAL(18,2),
PAY_PERIOD DATE)

Now the tables have employees some of whom get paid monthly,some weekly, some biweekly and some daily. What we want is to find an Indicator saying 'Y' if the salary of three consecutive Pay Periods is equal. Lets take the following example.

Employee   Pay_Period     Salary

  1         01/01/2012    $500
  1         08/01/2012    $200
  1         15/01/2012    $200
  1         22/01/2012    $200
  1         29/01/2012    $700

In this case the indicator should be Yes because 3 consecutive pay periods have a salary of $200.

Since the number of pay periods is not constant I am unsure of how to write this code because I do not know from before hand how many left joins I will need.Since I am writing this in Teradata I tried using the RECURSIVE Function but got stumped. Any general ideas on how to proceed with this?I would prefer not creating a stored procedure or having PL/SQL logic.

like image 200
Eosphorus Avatar asked Dec 09 '22 20:12

Eosphorus


2 Answers

Teradata may not support LEAD and LAG in the manner that Oracle and SQL Server now support it but the premise of these functions is based on selecting the correct window for your Window Aggregate functions. In Teradata LEAD and LAG can be accomplished by using the ROWS BETWEEN clause in your Window Aggregate Function.

Here is how you can accomplish what you are looking to do using ROWS BETWEEN and a single pass at the table:

CREATE VOLATILE TABLE myTable
( myID SMALLINT NOT NULL,
  PayPeriod DATE NOT NULL,
  PayAmount DECIMAL(5,2) NOT NULL)
PRIMARY INDEX (myID) 
ON COMMIT PRESERVE ROWS;

INSERT INTO myTable VALUES (1, DATE '2012-01-01', 500);
INSERT INTO myTable VALUES (1, DATE '2012-01-08', 200);
INSERT INTO myTable VALUES (1, DATE '2012-01-15', 200);
INSERT INTO myTable VALUES (1, DATE '2012-01-22', 200);
INSERT INTO myTable VALUES (1, DATE '2012-01-29', 700);


SELECT myID
     , PayPeriod
     , PayAmount
     , MAX(PayAmount) OVER (PARTITION BY myID 
                                ORDER BY PayPeriod 
                            ROWS BETWEEN 1 FOLLOWING 
                                     AND 1 FOLLOWING) AS NextPayAmount_
     , MAX(PayAmount) OVER (PARTITION BY myID 
                                ORDER BY PayPeriod 
                            ROWS BETWEEN 2 FOLLOWING 
                                     AND 2 FOLLOWING) AS NextPayAmount2_
     , CASE WHEN NextPayAmount_ = PayAmount
             AND NextPayAmount2_ = PayAmount
            THEN 'Y'
            ELSE 'N'
       END PayIndicator_
  FROM myTable;

Results

1   2012-01-01  500 200 200 N
1   2012-01-08  200 200 200 Y
1   2012-01-15  200 200 700 N
1   2012-01-22  200 700   ? N
1   2012-01-29  700   ?   ? N
like image 129
Rob Paller Avatar answered Dec 23 '22 23:12

Rob Paller


Teradata does not have lead/lag. However, it does have row_number(). So, you can do what you want as:

with  as (
    select e.*,
           row_number() over (partition by employee_id order by pay_period) as seqnum
    from employee
)
select <whatever you want>
from emp e join
     emp e1 join
     on e.employee_id = e1.employee_id and
        e.seqnum = e1.seqnum+1
     emp e2
     on e.employee_id = e2.employee_id and
        e.seqnum = e2.seqnum+2
where e.salary = e1.salary and e.salary = e2.salary

I do have a couple of suggestions beyond this. First, your employee table should have one row per employee, with a primary key of employee_id. This table should be called something like EmployeeSalary. Second, your pay period should have two dates, a beginning and end date.

like image 28
Gordon Linoff Avatar answered Dec 24 '22 00:12

Gordon Linoff