Need help tuning a complicated query



  • I've inherited a rather complex stored procedure that is causing timeouts in our Production environment. I've used SQL Sentry Plan Explorer to help me look at some of the issues. I've identified a few, but I'm having a hard time coming up with optimizations. We are using SQL Server 2019 with the database compatibility level set to 150.

    This sproc returns data to a web client via an API call, so the performance of the page load is very important. This stored procedure has been identified as the bottleneck through several app profiling sessions.

    Here are some of the issues identified:

    1. Timeouts in PROD, where we have 4+ million rows of data (ActualElapsedms = 11499)
    2. Estimated Subtree Cost is a whopping 187.74 (our next slowest sproc has a subtree cost of 0.653)
    3. Parts of the query plan show Residual I/O (surprisingly one of them is even an Index Seek)
    4. There is a tempdb spill data warning
    5. Logical Reads is almost 128k
    6. Sentry shows 3 Missing JOIN Predicates on the INSERT INTO @ValidRows... statement, but I don't know how to identify them.

    There are indexes on the tables involved and, from what I can see, they appear to be sufficient. However, I see a number of Index Scans being called out in the plan as problem areas (yellow highlight).

    Here is a link to the Actual Execution Plan: https://www.brentozar.com/pastetheplan/?id=S1TN4NOS9

    CREATE VIEW [dbo].[DailyNotePublishedContentView]
    WITH SCHEMABINDING
    AS 
    SELECT
        T.Id AS DailyNoteContentId,
        T.DateModified,
        T.ModifiedBy,
        T.Region,
        T.DateAdded,
        T.CreatedBy,
        V.Id AS VersionId,
        V.DateAdded AS VersionDateAdded,
        V.CreatedBy as VersionCreatedBy,
        V.ContentType,
        V.DateDue,
        V.IsPrivate,
        V.ProjectId,
        V.PublishDate,
        V.AuthorTeamId,
        V.ContactEmail,
        V.Content,
        V.ContentText,
        V.Summary,
        V.SummaryText,
        V.AllowAddendums,
        StoreTeamsList = (SELECT STRING_AGG(CONVERT(VARCHAR(max),STA.[Name]), ', ')     
            FROM
                [dbo].[DailyNoteContentStoreTeam] (NOLOCK) ST
                INNER JOIN [dbo].[Audiences] (NOLOCK) STA ON STA.Id = ST.AudienceId AND STA.IsActive = 1
            WHERE
                ST.DailyNoteContentVersionId = V.Id 
        ),  
        AssignedToList = (SELECT STRING_AGG(CONVERT(VARCHAR(max),TAA.[Name]), ', ')     
            FROM
                [dbo].[DailyNoteContentAssignedTo] (NOLOCK) TA
                INNER JOIN [dbo].[Audiences] (NOLOCK) TAA ON TAA.Id = TA.AudienceId AND TAA.IsActive = 1
            WHERE
                TA.DailyNoteContentVersionId = V.Id         
        )       
    FROM
        [dbo].[DailyNoteContent] (NOLOCK) T 
        CROSS APPLY(
                SELECT TOP 1
                    V.Id,
                    V.DateAdded,
                    V.CreatedBy,
                    V.ContentType,
                    V.DateDue,
                    V.IsPrivate,
                    V.ProjectId,
                    V.PublishDate,
                    V.AuthorTeamId,
                    V.ContactEmail,
                    V.Content,
                    V.ContentText,
                    V.Summary,
                    V.SummaryText,
                    V.AllowAddendums
                FROM
                    [dbo].[DailyNoteContentVersion] (NOLOCK) V
                WHERE
                    V.DailyNoteContentId = T.Id
                ORDER BY V.DateAdded DESC
            ) AS V
        -- Only include published content
        INNER JOIN [dbo].[DailyNotePublishStatus] (NOLOCK) PS ON 
            PS.PublishDate = V.PublishDate 
            AND PS.RegionId = T.Region 
            AND PS.IsReadyToPublish = 1
            -- TODO: Account for user defined publish time/timezones?
            AND PS.PublishDate GO



  • 9.308s of the 11.5s elapsed time for your query is consumed by the batch mode hash match right join at node 7 in the provided execution plan. This is unexpected for ~4M build-side and ~13k probe-side rows.

    The most likely explanation is the ~30k hash pages spilled to tempdb, but even that spill does not fully account for the time. It seems most likely your tempdb is under severe pressure, due either to insufficient raw I/O performance or to allocation contention. You should look into that and fix it if you can. It may be that the system is making excessive use of table variables and other things that put pressure on tempdb.
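
    If you want to confirm allocation contention while the workload is running, one quick check (a minimal sketch; interpret the results in context) is to look for PAGELATCH waits on tempdb (database_id 2) pages:

    SELECT
        owt.session_id,
        owt.wait_type,
        owt.wait_duration_ms,
        owt.resource_description
    FROM sys.dm_os_waiting_tasks AS owt
    WHERE
        owt.wait_type LIKE N'PAGELATCH%'
        AND owt.resource_description LIKE N'2:%';  -- database_id 2 = tempdb

    Repeated waits on allocation pages such as 2:1:1 (PFS), 2:1:2 (GAM), or 2:1:3 (SGAM) point to allocation contention; more equally-sized tempdb data files, or the SQL Server 2019 memory-optimized tempdb metadata option, may help there.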

    That said, the spill ought to be avoidable. I notice from the plan you have memory grant feedback disabled, perhaps for good reason. Still, one underlying cause is the inaccurate cardinality estimation from the table variables.

    You are making use of https://docs.microsoft.com/en-us/sql/relational-databases/performance/intelligent-query-processing#table-variable-deferred-compilation , but that only helps so much. Initial estimates can easily prove to be out of date or otherwise unrepresentative on subsequent executions of the plan. For example, the estimate on @UserAudiencesTable is 20 rows, but 157 are encountered at runtime.

    The cross joins (leading to the missing join predicate warnings) are introduced by the optimizer as it seeks to implement the complicated correlated OR EXISTS clauses in the section of the query commented with "Optional: Targeted to user location". The inaccurate cardinality estimates are multiplied through these joins, leading to the spill at the hash join.

    This brings me to another issue. The query is of a kind frequently referred to as a "kitchen sink" query - it attempts to accommodate many queries with optional components. The presence of IS [NOT] NULL OR expressions is the giveaway here. These make life very difficult for cardinality estimation, and hence final plan quality. Throw in a complex view as part of a query with multiple outer joins, applies, and subqueries, and you have a recipe for poor outcomes.
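
    For anyone unfamiliar with the pattern, it typically looks something like the fragment below. This is purely illustrative; the parameter names and data types are invented, and only the view and column names come from your posted code:

    DECLARE
        @RegionId       int      = NULL,  -- hypothetical optional filters
        @AuthorTeamId   int      = NULL,
        @PublishedAfter datetime = NULL;

    SELECT PCV.DailyNoteContentId, PCV.VersionId, PCV.PublishDate
    FROM dbo.DailyNotePublishedContentView AS PCV
    WHERE (@RegionId IS NULL OR PCV.Region = @RegionId)                      -- optional filter 1
        AND (@AuthorTeamId IS NULL OR PCV.AuthorTeamId = @AuthorTeamId)      -- optional filter 2
        AND (@PublishedAfter IS NULL OR PCV.PublishDate >= @PublishedAfter); -- optional filter 3

    Each "@x IS NULL OR column = @x" pair is easy to write and hard to estimate, because the optimizer has to produce one plan that works whether or not each filter applies.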

    As a quick experiment, I suggest you add OPTION (RECOMPILE) to the INSERT query. This will give the optimizer some scope to simplify things by embedding the runtime values of variables, and use the actual runtime cardinality of the table variables. Compilation time for the statement is not excessive at 318ms compared with execution time, so you might even find this is a 'good enough' solution for now.
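
    Schematically, that looks like the sketch below. Keep your real @ValidRows definition and INSERT body (they are not reproduced in the question, so the columns here are invented); the only change is appending the hint:

    -- Sketch only: illustrative columns, not your actual @ValidRows definition
    DECLARE @ValidRows table (DailyNoteContentId int NOT NULL, VersionId int NOT NULL);

    INSERT INTO @ValidRows (DailyNoteContentId, VersionId)
    SELECT PCV.DailyNoteContentId, PCV.VersionId
    FROM dbo.DailyNotePublishedContentView AS PCV
    OPTION (RECOMPILE);  -- re-optimize each execution with actual variable values and table variable row counts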

    The results of using the recompile hint might at least inform the general direction of future improvements you might choose to make, such as generating only the needed parts of the statement using carefully-written dynamic SQL.
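
    A rough sketch of that approach, assuming @RegionId and @AuthorTeamId are optional procedure parameters (names invented here), is to build only the predicates you need and run the text with sp_executesql so each statement shape can still cache and reuse a plan:

    DECLARE @sql nvarchar(max) = N'
    SELECT PCV.DailyNoteContentId, PCV.VersionId
    FROM dbo.DailyNotePublishedContentView AS PCV
    WHERE 1 = 1';

    -- Append only the filters that apply to this call
    IF @RegionId IS NOT NULL
        SET @sql += N' AND PCV.Region = @RegionId';

    IF @AuthorTeamId IS NOT NULL
        SET @sql += N' AND PCV.AuthorTeamId = @AuthorTeamId';

    EXEC sys.sp_executesql
        @sql,
        N'@RegionId int, @AuthorTeamId int',
        @RegionId = @RegionId,
        @AuthorTeamId = @AuthorTeamId;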

    You might also look into using temporary tables instead of table variables here, if statistics on those tables might benefit plan quality. Better indexes (replacing or adding to the existing clustered index) might also be worth considering. Providing additional indexes can make it easier for the optimizer to find an obviously efficient plan.
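
    As a sketch of the temp table idea (the definition of @UserAudiencesTable is not in the question, so the single AudienceId column and its type are assumptions):

    -- Temp tables have column statistics; table variables do not
    CREATE TABLE #UserAudiences
    (
        AudienceId int NOT NULL PRIMARY KEY  -- match the real type of Audiences.Id
    );

    INSERT INTO #UserAudiences (AudienceId)
    SELECT UA.AudienceId                     -- or populate it straight from the original source
    FROM @UserAudiencesTable AS UA;

    -- ...then reference #UserAudiences instead of @UserAudiencesTable in the big INSERT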

    Residual I/O is not necessarily a problem, but it does indicate that a predicate has to be applied after the seeking operation to further qualify rows. This may or may not be avoidable with better indexing.
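
    As one illustration only (the exact key and INCLUDE columns depend on what the residual predicate actually references in your plan), the TOP (1) ... ORDER BY DateAdded DESC pattern in the view's CROSS APPLY is the sort of thing a purpose-built index can satisfy with a simple seek and no sort:

    CREATE NONCLUSTERED INDEX IX_DailyNoteContentVersion_ContentId_DateAdded
    ON dbo.DailyNoteContentVersion (DailyNoteContentId, DateAdded DESC)
    INCLUDE (PublishDate);  -- extend the INCLUDE list as needed; avoid dragging the large LOB columns in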

    In summary, tuning this query and workload would be a consulting engagement, and there is not nearly enough information in the question to venture beyond the general commentary above, but it might give you some ideas to pursue.

    Ultimately, using table variables and writing complex queries over complex views are not especially compatible with reliably good execution plans and system performance.

    You should also consider patching your instance beyond the current CU8 (October 2020 vintage). I can't promise that will improve performance here, but you might benefit from some of the improvements and fixes released in the past 18 months or so.

    Oh, and try to remove those NOLOCK hints; they're not magic performance enhancers. If you can live with reading dirty data and so on, run your workload at READ UNCOMMITTED isolation instead.
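
    The session-level equivalent, if you do decide dirty reads are acceptable for this call, is a single statement at the top of the procedure:

    SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;  -- same read behaviour as NOLOCK on every table, set once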

    Definitely look into your tempdb situation as well.



