Operating on consecutive rows/columns

Today we will learn how to operate upon two consecutive rows. The logic here can be extrapolated and applied to as many rows as you can.

To start with, suppose there is a process which runs every day and it produces certain quantity of product (as output). I need to know by how many products is my today’s process leading/trailing then yesterday.

Sample data:

CREATE TABLE test
(
process VARCHAR(10), prcs_dt DATE, qty INT
);

INSERT INTO test
(process, prcs_dt,qty)
VALUES
('P1',GETDATE(),10),
('P1',GETDATE()+1,20),
('P1',GETDATE()+2,25),
('P1',GETDATE()+3,30),
('P1',GETDATE()+4,40);
 

Data in test table looks like:

| PROCESS |    PRCS_DT | QTY |
------------------------------
|      P1 | 2012-07-22 |  10 |
|      P1 | 2012-07-23 |  20 |
|      P1 | 2012-07-24 |  25 |
|      P1 | 2012-07-25 |  30 |
|      P1 | 2012-07-26 |  40 |

To find out if the today’s output(qty) is leading/trailing than yesterday our Query should give the difference between two consecutive dates.

Thinking strategy: If qty of every previous process execution is present in the same row, it would have made our life easier. Just subtracting the new value with the old one would have solved the problem. This can be done by first ordering the processes by prcs_dt and then assign them a sequential row number. Once this is done, we can perform a self join on consecutive rows to bring them together (in a single row).

Step-by-step logic
Step 1: Order the data set by prcs_dt and create sequential row number for them

SELECT ROW_NUMBER() OVER (ORDER BY prcs_dt) seq,
process, prcs_dt, qty
FROM test;

Output:

| SEQ | PROCESS |    PRCS_DT | QTY |
------------------------------------
|   1 |      P1 | 2012-07-22 |  10 |
|   2 |      P1 | 2012-07-23 |  20 |
|   3 |      P1 | 2012-07-24 |  25 |
|   4 |      P1 | 2012-07-25 |  30 |
|   5 |      P1 | 2012-07-26 |  40 |

Step 2: Perform a self join on this result set such that two consecutive rows appears as a single row. To do this, I am using a CTE which holds the above result set and then perform a self join on CTE with a condition that seq of one CTE is equal to seq + 1 (sequence incremented by 1) of other CTE

WITH cte
AS
(
SELECT ROW_NUMBER() OVER (ORDER BY prcs_dt) seq,
process, prcs_dt, qty
FROM test
)
SELECT c1.seq seq1,c1.prcs_dt,c1.qty qty1,
c2.seq seq2,c2.qty qty2
FROM cte c1 LEFT JOIN cte c2
ON c1.seq = c2.seq+1;
 

Output:

| SEQ1 |    PRCS_DT | QTY1 |   SEQ2 |   QTY2 |
----------------------------------------------
|    1 | 2012-07-22 |   10 | (null) | (null) |
|    2 | 2012-07-23 |   20 |      1 |     10 |
|    3 | 2012-07-24 |   25 |      2 |     20 |
|    4 | 2012-07-25 |   30 |      3 |     25 |
|    5 | 2012-07-26 |   40 |      4 |     30 |

Step 3: Simply subtract the current quantity(qty1) from the previous one (qty2) to get the differential result

WITH cte
AS
(
SELECT ROW_NUMBER() OVER (ORDER BY prcs_dt) seq,
process, prcs_dt, qty
FROM test
)
SELECT c1.seq seq1,c1.prcs_dt,c1.qty qty1,
c2.seq seq2,c2.qty qty2,c1.qty-ISNULL(c2.qty,0) qty_diff
FROM cte c1 LEFT JOIN cte c2
ON c1.seq = c2.seq+1;

Output:

| SEQ1 |    PRCS_DT | QTY1 |   SEQ2 |   QTY2 | QTY_DIFF |
---------------------------------------------------------
|    1 | 2012-07-22 |   10 | (null) | (null) |       10 |
|    2 | 2012-07-23 |   20 |      1 |     10 |       10 |
|    3 | 2012-07-24 |   25 |      2 |     20 |        5 |
|    4 | 2012-07-25 |   30 |      3 |     25 |        5 |
|    5 | 2012-07-26 |   40 |      4 |     30 |       10 |

Kindly observe for the first row in the output above qty2 is NULL, this cannot be subtracted from qty1 as it will output a NULL value so I have used ISNULL function to replace NULL with zero.

 

Advertisements