How to create "On this day in history" query



  • I'm using Databricks and I have a table with a list of event from various years. I want to return the event most recent to today's date from each year. For example,

    Today's date is 6th May and my table is thus:

    Year (int) Date (date) Event (str)
    2021 2021-08-04 Ate apple
    2021 2021-04-16 Flew plane
    2020 2020-10-11 Swam 100 miles
    2020 2020-03-07 Did backflip
    2020 2020-01-01 Tidied room
    2019 2019-09-30 Found 10 pence
    2018 2018-02-22 Lost 10 pence

    So I would want to return:

    On this day in history your most recent achievements were:

    Year Date Event
    2021 2021-04-16 Flew plane
    2020 2020-03-07 Did backflip
    2018 2018-02-22 Lost 10 pence

    Is there a neat way of doing this?...and by neat I mean, without creating extra columns or tables i.e. by comparing CURRENT_DATE to my Date field.



  • One way of achieving what you want might be to use the https://docs.databricks.com/sql/language-manual/functions/row_number.html to order matching event dates within each year (window), then pick only the most recent date each year.

    -- Use a common table expression to pre-select and order dates we want
    with past_events as ( 
       select 
          Date, Event, 
          -- within each year, order by the day since the beginning of that year
          -- in the descending order, and assign row numbers accordingly
          row_number() over (partition by year(Date) order by dayofyear(Date) desc) rn
       from some_table
       -- only pick days from each year in the past that's no later than today
       where dayofyear(Date) 

    This is "paper code" as I have no way of testing it with Databricks, but it should give you the basic idea.


    You may need to https://docs.databricks.com/sql/language-manual/sql-ref-identifiers.html#delimited-identifiers if Date or Event happen to be reserved words in Databricks.


Log in to reply
 


Suggested Topics

  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2