Move Functions from Where Clause to Select Statement



  • I have a union query that runs abysmally slow I believe mostly because there are two functions in the where clause of each union. I am pretty sure that there is no getting around the unions, but there may be a way to move the functions from the where of each. I won't post ALL of the union sections because I don't think it is necessary as they are all almost identical with the exception of one table in each. The first function was created by someone else but it takes a date, and uses the "frequency" value like "years, months, days, etc." and the "interval" value like 3, 4, 90 to calculate the new "Due Date". For instance, a date of today with a frequency of years, and an interval of 3, would produce the date 4/21/2025. Here is the actual function:

    ALTER FUNCTION [dbo].[ReturnExpiration_IntervalxFreq](@Date datetime2,@int int, @freq int)  
    RETURNS datetime2   
    AS     
    BEGIN  
        declare @d datetime2;
        SELECT @d = case when @int = 1 then null-- '12-31-9999' 
                         when @int = 2 then dateadd(day,@freq,@date) 
                         when @int = 3 then dateadd(week,@freq,@date) 
                         when @int = 4 then dateadd(month,@freq,@date) 
                         when @int = 5 then dateadd(quarter,@freq,@date) 
                         when @int = 6 then dateadd(year,@freq,@date) 
                    end
        RETURN @d;
    

    The query itself is supposed to find and identify records whose Due Date has past or is within 90 days of the current date. Here is what each section of the union looks like

    SELECT
        R.RequirementId
        , EC.EmployeeCompanyId
        , EC.CompanyId
        , DaysOverdue = 
            CASE WHEN 
                R.DueDate IS NULL
            THEN 
                CASE WHEN 
                     EXISTS(SELECT 1 FROM tbl_Training_Requirement_Compliance RC WHERE RC.EmployeeCompanyId = EC.EmployeeCompanyId AND RC.RequirementId = R.RequirementId AND RC.Active = 1 AND ((DATEDIFF(DAY, R.DueDate, GETDATE()) > -91 OR R.DueDate Is Null ) OR (DATEDIFF(DAY, dbo.ReturnExpiration_IntervalxFreq(TRC.EffectiveDate, R.IntervalId, R.Frequency), GETDATE()) > -91)) OR R.IntervalId IS NULL)
                THEN 
                    DateDiff(day,ISNULL(dbo.ReturnExpiration_IntervalxFreq(TRC.EffectiveDate, R.IntervalId, R.Frequency), '12/31/9999'),getdate())
                ELSE 
                0
                END
            ELSE
                DATEDIFF(day,R.DueDate,getdate()) 
            END 
            ,CASE WHEN 
                    EXISTS(SELECT 1 FROM tbl_Training_Requirement_Compliance RC WHERE RC.EmployeeCompanyId = EC.EmployeeCompanyId AND RC.RequirementId = R.RequirementId AND RC.Active=1 AND (GETDATE() > dbo.ReturnExpiration_IntervalxFreq(RC.EffectiveDate, R.IntervalId, R.Frequency) OR R.IntervalId IS NULL))
                THEN 
                    CONVERT(VARCHAR(12),dbo.ReturnExpiration_IntervalxFreq(TRC.EffectiveDate, R.IntervalId, R.Frequency), 101)
                ELSE 
                    CONVERT(VARCHAR(12),R.DueDate,101)
                END As DateDue
    FROM
        @Employees AS EC
        INNER JOIN dbo.tbl_Training_Requirement_To_Position TRP ON TRP.PositionId = EC.PositionId
        INNER JOIN @CompanyReqs R ON R.RequirementId = TRP.RequirementId
        LEFT OUTER JOIN tbl_Training_Requirement_Compliance TRC ON TRC.EmployeeCompanyId = EC.EmployeeCompanyId AND TRC.RequirementId = R.RequirementId AND TRC.Active = 1
    WHERE
        NOT EXISTS(SELECT 1 FROM tbl_Training_Requirement_Compliance RC WHERE RC.EmployeeCompanyId = EC.EmployeeCompanyId AND RC.RequirementId = R.RequirementId  AND RC.Active = 1)
        OR ((DATEDIFF(DAY, R.DueDate, GETDATE()) > -91 OR R.DueDate Is Null ) OR (DATEDIFF(DAY, dbo.ReturnExpiration_IntervalxFreq(TRC.EffectiveDate, R.IntervalId, R.Frequency), GETDATE()) > -91))
    

    UNION...

    It is supposed to exclude records that either don't exist at all on the tbl_Training_Requirement_Compliance table, or if they do exist, once the frequency an intervals have been calculated, would have a new due date that is within 90 days of the current date. I am hoping that someone with much more experience and expertise in SQL Server can show me a way, if possible, to remove the functions from the WHERE clause and help the performance of this stored procedure.



  • That sure is some query you got there...

    Your query will go single threaded because of the scalar valued function. (unless you use SQL Server 2019 and the moon and stars align perfectly). Look into table valued functions.

    CREATE OR ALTER FUNCTION [dbo].[ReturnExpiration_IntervalxFreq](@date datetime2,@int int, @freq int)  
    RETURNS TABLE   
    AS     
        RETURN SELECT case when @int = 1 then null-- '12-31-9999' 
                     when @int = 2 then dateadd(day,@freq,@date) 
                     when @int = 3 then dateadd(week,@freq,@date) 
                     when @int = 4 then dateadd(month,@freq,@date) 
                     when @int = 5 then dateadd(quarter,@freq,@date) 
                     when @int = 6 then dateadd(year,@freq,@date) 
                end AS Expiration
    GO
    

    Then use it like so

    SELECT ...
    FROM        @Employees AS EC
    INNER JOIN dbo.tbl_Training_Requirement_To_Position TRP ON TRP.PositionId = EC.PositionId
    INNER JOIN @CompanyReqs R ON R.RequirementId = TRP.RequirementId
    LEFT OUTER JOIN tbl_Training_Requirement_Compliance TRC ON (TRC.EmployeeCompanyId = EC.EmployeeCompanyId AND TRC.RequirementId = R.RequirementId AND TRC.Active = 1)
    OUTER APPLY dbo.ReturnExpiration_IntervalxFreq(TRC.EffectiveDate, R.IntervalId, R.Frequency) RE
    

    Now you can select and filter on RE.Expiration. You won't get any help from indexes if you still want to filter on Expiration, but at least things will go parallel.

    The OUTER APPLY is effectively a LEFT JOIN, use CROSS APPLY to do an INNER JOIN, but if R.IntervalID can be null, you probably need to check the logic.

    The predicate:

    WHERE
        NOT EXISTS(SELECT 1 FROM tbl_Training_Requirement_Compliance RC 
            WHERE RC.EmployeeCompanyId = EC.EmployeeCompanyId 
                AND RC.RequirementId = R.RequirementId  
                AND RC.Active = 1)
    OR (
            (DATEDIFF(DAY, R.DueDate, GETDATE()) > -91 
            OR R.DueDate Is Null ) 
        OR
            (DATEDIFF(DAY, dbo.ReturnExpiration_IntervalxFreq(TRC.EffectiveDate, 
            R.IntervalId, R.Frequency), GETDATE()) > -91)
        )
    

    The NOT EXISTS part is just you asking whether the left join returns anything, so why not just ask if TRC.ID IS NULL.

    The DATEDIFF(DAY, R.DueDate, GETDATE()) > -91 part effectively forces a full scan of the index. It's better to just add 91 days to GETDATE() and ask if R.DueDate < DATEADD(DAY, 91, GETDATE()), since the dateadd needs only be calculated once, and the predicate can be executed as a seek - provided that Duedate is indexed of course.

    @Employees and @CompanyRecs look like table variables. I'm going to advise caution when using those - especially if you declare them inside your stored proc. Altering content of a table variable will usually force single threaded execution. Another thing is that SQL Server will screw up the estimates badly. Try looking up the estimated rows from those variables, my guess would be that SQL Server assumes only 1 row, and if you have thousands or millions, then your memory grants will be insufficient forcing the use of disk. There are practically no downsides to using #temptables in stead, but the upside that SQL Server will be able to estimate them correctly due to statistics. And you can index them.

    Good luck 😉


Log in to reply
 


Suggested Topics

  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2