T-SQL : Grouping based on chronological switch periods

Question

I have the below table extracted from a Type 2 Slowly Changing dimension. I want to group rows to only have distinct classifications and their respective 'switch' time periods in a chronological order. A simple GROUP BY won't work because of the classifications that can 'come back', like the classification 'K' that 'came back' in RowNum 5.

Current Data :

RowNum_Helper	Item_ID	Classification	Start_Date	End_Date
1	X	K	2021-01-07 02:02:44.060	2021-01-08 02:03:45.980
2	X	K	2021-01-08 02:03:45.980	2021-01-31 20:03:38.993
3	X	N	2021-01-31 20:03:38.993	2021-03-26 14:44:08.617
4	X	N	2021-03-26 14:44:08.617	2021-04-08 02:02:51.013
5	X	K	2021-04-08 02:02:51.013	2021-10-13 02:02:51.013
6	X	N	2021-10-13 02:02:51.013	2021-10-14 02:02:51.013
7	X	K	2021-10-14 02:02:51.013	9999-12-31 23:59:59.999
8	A	K	2021-06-14 02:02:51.013	9999-12-31 23:59:59.999

Desired data :

Item_ID	Classification	Earliest_Start_Date	Latest_End_Date
X	K	2021-01-07 02:02:44.060	2021-01-31 20:03:38.993
X	N	2021-01-31 20:03:38.993	2021-04-08 02:02:51.013
X	K	2021-04-08 02:02:51.013	2021-10-13 02:02:51.013
X	N	2021-10-13 02:02:51.013	2021-10-14 02:02:51.013
X	K	2021-10-14 02:02:51.013	9999-12-31 23:59:59.999
A	K	2021-06-14 02:02:51.013	9999-12-31 23:59:59.999

I tired using windowed functions like SUM OVER PARTITION BY etc, but can't really find an elegant way of doing it. Any help would be appreciated.

SQLFiddle seems to be offline, I used StackExchangeDE to create table with values for playground

Lewis · Accepted Answer · 2021-10-07 09:29:01Z

Try an approach like this.

First identify the rows that have changed by using LAG. The do a running total on those change points. Once you know that you can do the Group By but include the groups.

DROP TABLE IF EXISTS #Source
CREATE TABLE #Source (RowNum_Helper INT, ItemID CHAR(1), Classification CHAR(1), Start_Date DATETIME2, End_Date DATETIME2)

INSERT INTO #Source
SELECT 1, 'X','K','2021-01-07 02:02:44.060', '2021-01-08 02:03:45.980' UNION ALL
SELECT 2, 'X','K','2021-01-08 02:03:45.980', '2021-01-31 20:03:38.993' UNION ALL
SELECT 3, 'X','N','2021-01-31 20:03:38.993', '2021-03-26 14:44:08.617' UNION ALL
SELECT 4, 'X','N','2021-03-26 14:44:08.617', '2021-04-08 02:02:51.013' UNION ALL
SELECT 5, 'X','K','2021-04-08 02:02:51.013', '2021-10-13 02:02:51.013' UNION ALL
SELECT 6, 'X','N','2021-10-13 02:02:51.013', '2021-10-14 02:02:51.013' UNION ALL
SELECT 7, 'X','K','2021-10-14 02:02:51.013', '9999-12-31 23:59:59.999' UNION ALL
SELECT 8, 'A','K','2021-06-14 02:02:51.013', '9999-12-31 23:59:59.999'
Go

WITH IdentifyChangedRows
AS (SELECT RowNum_Helper,
           ItemID,
           Classification,
           Start_Date,
           End_Date,
           LAG(Classification) OVER (PARTITION BY ItemID ORDER BY Start_Date) AS Classification_Lag
    FROM #Source S
   ),
     CreateGroups
AS (SELECT RowNum_Helper,
           ItemID,
           Classification,
           Start_Date,
           End_Date,
           SUM(   CASE
                      WHEN Classification_Lag = Classification
                           OR Classification_Lag IS NULL THEN
                          0
                      ELSE
                          1
                  END
              ) OVER (ORDER BY RowNum_Helper
                      ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
                     ) AS ChangeGroup
    FROM IdentifyChangedRows
   )
SELECT ItemID,
       Classification,
       MIN(Start_Date) Start_Date,
       MAX(End_Date) End_Date
FROM CreateGroups G
GROUP BY ItemID,
         Classification,
         ChangeGroup
ORDER BY ItemID DESC,
         Start_Date

That works very well indeed... Thank you !
– Yass T
Commented Oct 7, 2021 at 16:49 — Yass T, Commented Oct 7, 2021 at 16:49

Stack Exchange Network

T-SQL : Grouping based on chronological switch periods

1 Answer 1

Your Answer

Not the answer you're looking for? Browse other questions tagged
sql-server
t-sql
gaps-and-islands
or ask your own question.

Hot Network Questions

T-SQL : Grouping based on chronological switch periods

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Not the answer you're looking for? Browse other questions tagged sql-servert-sqlgaps-and-islands or ask your own question.

Related

Hot Network Questions

Not the answer you're looking for? Browse other questions tagged
sql-server
t-sql
gaps-and-islands
or ask your own question.