Retrieve gaps in dates in SQL Server



  • I'm trying to figure out how to retrieve gaps in time-series data. Below is my example.

    CREATE TABLE MYTABLE
        ([ID] int, [DATE] DATE)
    ;
    

    INSERT INTO MYTABLE
    ([ID], [DATE])
    VALUES
    (1, '2022-01-01'),
    (1, '2022-02-01'),
    (1, '2022-03-01'),
    (1, '2022-04-01'),
    (1, '2022-05-01'),
    (1, '2022-06-01'),
    (1, '2022-10-01'),
    (1, '2022-11-01'),
    (1, '2022-12-01'),
    (2, '2022-01-01'),
    (2, '2022-02-01'),
    (2, '2022-03-01'),
    (2, '2022-04-01'),
    (2, '2022-05-01'),
    (2, '2022-06-01'),
    (2, '2022-07-01'),
    (2, '2022-08-01'),
    (2, '2022-10-01'),
    (2, '2022-11-01'),
    (2, '2022-12-01')
    ;

    The date field follows a convention where the day is always 1st of the month. For the above example, the gaps would be the following.

    | ID |       DATE |
    |----|------------|
    |  1 | 2022-07-01 |
    |  1 | 2022-08-01 |
    |  1 | 2022-09-01 |
    |  2 | 2022-09-01 |
    

    How would I write a query to return the above results?



  • Here is one straightforward way. First you generate all dates between min and max in MYTABLE:

    with min_max(min_dt, max_dt) as ( 
        select min([DATE]), max([DATE]) from MYTABLE
    ), all_dt(d) as (
        select min_dt from min_max
        union all 
        select DATEADD(month, 1, d) from all_dt
        where d < (select max_dt from min_max)
    )
    select y.d 
    from all_dt y
    

    Now you can take the cartesian product between that and the set of users in MYTABLE

    with min_max(min_dt, max_dt) as ( 
        select min([DATE]), max([DATE]) from MYTABLE
    ), all_dt(d) as (
        select min_dt from min_max
        union all 
        select DATEADD(month, 1, d) from all_dt
        where d < (select max_dt from min_max)
    )
    select x.id, y.d 
    from all_dt y, (select distinct [ID] as id from MYTABLE) as x
    

    I used the "," join since I was not sure whether SQL-server supports explicit CROSS JOIN. Finally you can subtract all existing rows from that set:

    with min_max(min_dt, max_dt) as ( 
        select min([DATE]), max([DATE]) from MYTABLE
    ), all_dt(d) as (
        select min_dt from min_max
        union all 
        select DATEADD(month, 1, d) from all_dt
        where d < (select max_dt from min_max)
    )
    select x.id, y.d 
    from all_dt y, (select distinct [ID] as id from MYTABLE) as x
    except
    select [ID], [DATE] from MYTABLE
    

    I would advise you to avoid identifiers like DATE and ID, but I assume this is just an example

    EDIT: CROSS JOIN apparently works according to my fiddle, so you can rephrase that as:

    ...
    SELECT x.id, y.d 
    FROM all_dt y 
    CROSS JOIN (SELECT distinct [ID] AS id FROM MYTABLE) AS x
    EXCEPT
    SELECT [ID], [DATE] FROM MYTABLE
    

    https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=03c144940a57fe33a9dafd44666e74d5




Suggested Topics

  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2