Date sequence generator using CTE

Today we will learn how to generate a sequence of dates. Usually “Date sequence generator” has a wide variety of use, we will touch up this in our next blog.

Before getting into generating a sequence of dates let us first generate a number sequence . Here is the example to generate a sequence from 1 to 10

DECLARE @start INT = 1
DECLARE @end INT = 10

;WITH num_seq
AS
(
 SELECT @start seq
 UNION ALL
 SELECT seq + 1
 FROM num_seq
 WHERE seq + 1 <= @end
 )
SELECT *
FROM num_seq
| SEQ |
-------
| 1 |
| 2 |
| 3 |
| 4 |
| 5 |
| 6 |
| 7 |
| 8 |
| 9 |
| 10 |

Now let us expand this concept to generate a sequence of dates. Following example shows how to generate a date sequence from 01-JAN-2012 to 10-JAN-2012

DECLARE @start DATETIME = '01-JAN-2012'
DECLARE @end DATETIME = '10-JAN-2012'

;WITH date_seq
AS
(
 SELECT @start seq
 UNION ALL
 SELECT seq + 1
 FROM date_seq
 WHERE seq + 1 <= @end
 )
SELECT *
FROM date_seq
| SEQ |
-----------------------------
| January, 01 2012 00:00:00 |
| January, 02 2012 00:00:00 |
| January, 03 2012 00:00:00 |
| January, 04 2012 00:00:00 |
| January, 05 2012 00:00:00 |
| January, 06 2012 00:00:00 |
| January, 07 2012 00:00:00 |
| January, 08 2012 00:00:00 |
| January, 09 2012 00:00:00 |
| January, 10 2012 00:00:00 |

Isn’t it very simple!
Let us go a step further generate a date sequence of all Monday’s for the month of January,2012
To do this I need to get the first Monday which falls between my start and end date.

Here’s the logic to get first Monday of the given date (@start – taking this from previous example)

DECLARE @start DATETIME = '01-JAN-2012'
SELECT @start+2-DATEPART(dw,@start) Monday_dt
| MONDAY_DT |
-----------------------------
| January, 02 2012 00:00:00 |

The idea here is that DATEPART function with ‘dw’ as datepart will return week-day. DATEPART by default takes Sunday to be the first day of the week and returns an interger value 1 and for Monday it returns value as 2. Because I am interested in getting Monday, I have added an offset of +2 in my given date to nullify the DATEPART value to zero if it is Tuesday.

Something like this: @Start +2 (offset value ) – 2 (this 2 is returned if @Start is Monday)
Hence what I am left in the end is @Start [considering if it is Monday]

Let us take another example of @Start = ’01-JAN-2011’ (This date is Saturday)

DECLARE @start DATETIME = '01-JAN-2011'
SELECT @start+2-DATEPART(dw,@start) Monday_dt
| MONDAY_DT |
------------------------------
| December, 27 2010 00:00:00 |

This logic is correct, 1st of January,2011 being Saturday, this logic returns first Monday as 27th December,2011. But the point is that date sequence should always generate a date (Monday in our case) which is greater than or equal to start date. Hence I need to modify my logic to get this.

Here it is: I check if weekday is less than or equal to 2, if it is there is no need to modify the logic, if it is not I add +7 as offset to get Next Monday.

DECLARE @start DATETIME = '01-JAN-2011'
SELECT CASE
 WHEN DATEPART(dw, @Start)<=2
 THEN @start +2 -DATEPART(dw,@start)
 ELSE
 @start +2 + 7 -DATEPART(dw,@start)
 END Monday_dt
| MONDAY_DT |
-----------------------------
| January, 03 2011 00:00:00 |

Looking at the logic above for same date I am getting a Monday which is greater than start date.

Now embed this logic in the CTE to generate an anchor record and make a small modification in the recursive code to add an offset of +7 to start date

DECLARE @start DATETIME = '01-JAN-2012'
DECLARE @end DATETIME = '31-JAN-2012'

;WITH date_seq
AS
(
 SELECT CASE
 WHEN DATEPART(dw, @Start)<=2
 THEN @start +2 -DATEPART(dw,@start)
 ELSE
 @start +2 + 7 -DATEPART(dw,@start)
 END Monday_seq
 UNION ALL
 SELECT Monday_seq + 7
 FROM date_seq
 WHERE Monday_seq + 7 <= @end
 )
SELECT *
FROM date_seq
| MONDAY_SEQ |
-----------------------------
| January, 02 2012 00:00:00 |
| January, 09 2012 00:00:00 |
| January, 16 2012 00:00:00 |
| January, 23 2012 00:00:00 |
| January, 30 2012 00:00:00 |

Hope you enjoyed reading this post.

Happy Learning
Lokesh

Operating on consecutive rows/columns

Today we will learn how to operate upon two consecutive rows. The logic here can be extrapolated and applied to as many rows as you can.

To start with, suppose there is a process which runs every day and it produces certain quantity of product (as output). I need to know by how many products is my today’s process leading/trailing then yesterday.

Sample data:

CREATE TABLE test
(
process VARCHAR(10), prcs_dt DATE, qty INT
);

INSERT INTO test
(process, prcs_dt,qty)
VALUES
('P1',GETDATE(),10),
('P1',GETDATE()+1,20),
('P1',GETDATE()+2,25),
('P1',GETDATE()+3,30),
('P1',GETDATE()+4,40);
 

Data in test table looks like:

| PROCESS |    PRCS_DT | QTY |
------------------------------
|      P1 | 2012-07-22 |  10 |
|      P1 | 2012-07-23 |  20 |
|      P1 | 2012-07-24 |  25 |
|      P1 | 2012-07-25 |  30 |
|      P1 | 2012-07-26 |  40 |

To find out if the today’s output(qty) is leading/trailing than yesterday our Query should give the difference between two consecutive dates.

Thinking strategy: If qty of every previous process execution is present in the same row, it would have made our life easier. Just subtracting the new value with the old one would have solved the problem. This can be done by first ordering the processes by prcs_dt and then assign them a sequential row number. Once this is done, we can perform a self join on consecutive rows to bring them together (in a single row).

Step-by-step logic
Step 1: Order the data set by prcs_dt and create sequential row number for them

SELECT ROW_NUMBER() OVER (ORDER BY prcs_dt) seq,
process, prcs_dt, qty
FROM test;

Output:

| SEQ | PROCESS |    PRCS_DT | QTY |
------------------------------------
|   1 |      P1 | 2012-07-22 |  10 |
|   2 |      P1 | 2012-07-23 |  20 |
|   3 |      P1 | 2012-07-24 |  25 |
|   4 |      P1 | 2012-07-25 |  30 |
|   5 |      P1 | 2012-07-26 |  40 |

Step 2: Perform a self join on this result set such that two consecutive rows appears as a single row. To do this, I am using a CTE which holds the above result set and then perform a self join on CTE with a condition that seq of one CTE is equal to seq + 1 (sequence incremented by 1) of other CTE

WITH cte
AS
(
SELECT ROW_NUMBER() OVER (ORDER BY prcs_dt) seq,
process, prcs_dt, qty
FROM test
)
SELECT c1.seq seq1,c1.prcs_dt,c1.qty qty1,
c2.seq seq2,c2.qty qty2
FROM cte c1 LEFT JOIN cte c2
ON c1.seq = c2.seq+1;
 

Output:

| SEQ1 |    PRCS_DT | QTY1 |   SEQ2 |   QTY2 |
----------------------------------------------
|    1 | 2012-07-22 |   10 | (null) | (null) |
|    2 | 2012-07-23 |   20 |      1 |     10 |
|    3 | 2012-07-24 |   25 |      2 |     20 |
|    4 | 2012-07-25 |   30 |      3 |     25 |
|    5 | 2012-07-26 |   40 |      4 |     30 |

Step 3: Simply subtract the current quantity(qty1) from the previous one (qty2) to get the differential result

WITH cte
AS
(
SELECT ROW_NUMBER() OVER (ORDER BY prcs_dt) seq,
process, prcs_dt, qty
FROM test
)
SELECT c1.seq seq1,c1.prcs_dt,c1.qty qty1,
c2.seq seq2,c2.qty qty2,c1.qty-ISNULL(c2.qty,0) qty_diff
FROM cte c1 LEFT JOIN cte c2
ON c1.seq = c2.seq+1;

Output:

| SEQ1 |    PRCS_DT | QTY1 |   SEQ2 |   QTY2 | QTY_DIFF |
---------------------------------------------------------
|    1 | 2012-07-22 |   10 | (null) | (null) |       10 |
|    2 | 2012-07-23 |   20 |      1 |     10 |       10 |
|    3 | 2012-07-24 |   25 |      2 |     20 |        5 |
|    4 | 2012-07-25 |   30 |      3 |     25 |        5 |
|    5 | 2012-07-26 |   40 |      4 |     30 |       10 |

Kindly observe for the first row in the output above qty2 is NULL, this cannot be subtracted from qty1 as it will output a NULL value so I have used ISNULL function to replace NULL with zero.

 

Performing SUM over a column of datatype TIME

Few days back one of my colleague asked me a very interesting problem of adding time. He wanted to know how to add time in a similar fashion just add we add integers using SUM() function.

This can be done very easily: Let us first create a table time_store to store TIME format [hh:mm:ss.msc] and insert some values in this table.


CREATE TABLE time_store
  (
     tme TIME
  )
GO

--Following code loops 5 times
--and inserts time in time_store table with a delay of 5 seconds

WAITFOR DELAY '000:00:5'
INSERT INTO time_store VALUES (CONVERT(TIME,GETDATE()))
GO 5

SELECT * FROM time_store
GO

tme
----------------
02:09:05.8370000
02:09:10.8500000
02:09:15.8670000
02:09:20.8800000
02:09:25.8930000

Adding time is a simple three step process:
1) Convert your time to seconds
2) Add all the seconds together
3) Convert your seconds back in time format [hh:mm:ss.msc]

Below code snippet gives you the result.


SELECT CONVERT(TIME, DATEADD(s, SUM(( DATEPART(hh, tme) * 3600 ) + ( DATEPART(mi, tme) * 60 ) + DATEPART(ss, tme)), 0)) AS total_time
FROM   time_store

total_time
----------------
10:46:15.0000000

To understand, let us investigate the code step by step

1) Convert time into seconds

Formula: Hours x 60(mins) x 60(secs) + Minutes x 60 (secs) + Seconds


SELECT tme,
       ( DATEPART(hh, tme) * 3600 ) +
       ( DATEPART(mi, tme) * 60 ) +
         DATEPART(ss, tme) AS tme_in_sec
FROM   time_store

tme              tme_in_sec
---------------- -----------
02:09:05.8370000 7745
02:09:10.8500000 7750
02:09:15.8670000 7755
02:09:20.8800000 7760
02:09:25.8930000 7765

2) Use the simple SUM() formula now to add seconds


SELECT SUM(
             ( DATEPART(hh, tme) * 3600 ) +
             ( DATEPART(mi, tme) * 60 ) +
               DATEPART(ss, tme)
          )    AS total_sec
FROM   time_store

total_sec
-----------
38775

3) Convert total seconds to time format [hh:mm:ss]

Simple tip to do this is to use ADDDATE() function. Using this function you should add total seconds to a dummy date. This will return you with a datetime value. Clip-off the date part by converting datetime to time using CONVERT() function.

Let us firt generate a dummy datetime, this can be done with:


SELECT DATEADD(s, 0, 0) AS dummy_datetime

dummy_datetime
-----------------------
1900-01-01 00:00:00.000

Here we are adding zero seconds to dummy date. If we add total seconds we calculated earlier to this dummy date, it will automatically convert seconds to time format.


DECLARE @total_sec INT    -- declared variable to hold total seconds

SELECT @total_sec  =  SUM(( DATEPART(hh, tme) * 3600 ) +
                          ( DATEPART(mi, tme) * 60 ) +
                            DATEPART(ss, tme))
FROM   time_store

SELECT DATEADD(s, @total_sec, 0) AS total_datetime
FROM   time_store

total_datetime
-----------------------
1900-01-01 10:46:15.000

Observe the date and time part, date part matches with that of dummy date but total seconds are converted into time format. Now to get rid of the date part, convert this datetime to time format using CONVERT() function.


SELECT CONVERT(TIME, DATEADD(s, @total_sec, 0)) AS total_time
FROM   time_store

total_time
----------------
10:46:15.0000000

If you think this is the end, then you are wrong, there is a catch here. Let us try this logic on a different set of data and check if it works fine or not.

New data set being the following, let us apply our code:


tme
----------------
13:06:55.8030000
13:07:00.8300000
13:07:05.8400000
13:07:10.8500000
13:07:15.8600000

DECLARE @total_sec INT

SELECT @total_sec  =  SUM(( DATEPART(hh, tme) * 3600 ) +
                          ( DATEPART(mi, tme) * 60 ) +
                            DATEPART(ss, tme))
FROM   time_store

SELECT CONVERT(TIME, DATEADD(s, @total_sec, 0)) AS total_time
FROM   time_store

total_time
----------------
17:35:25.0000000

Do you think this is correct?
Absolute No No!!

Let us remove the CONVERT() function from the above code and check the result in datetime format


SELECT DATEADD(s, @total_sec, 0) AS total_datetime
FROM   time_store

total_datetime
-----------------------
1900-01-03 17:35:25.000

Clearly observe the date part, dummy date has got incremented by two days. With a small modification in the code, we can actually calcuate total time in days, hours:mins:secs


SELECT @total_sec / ( 24 * 3600 ) as total_days, convert(time,DATEADD(s, @total_sec, 0)) AS total_datetime

total_days  total_datetime
----------- ----------------
2           17:35:25.0000000

If we apply the same code to the data set chosen earlier, it gives the following result.


total_days  total_datetime
----------- ----------------
0           10:46:15.0000000

Hope you all enjoyed reading this post.

Happy Learning
Lokesh Vij