Avoiding Hash match to improve CPU usage for a specific query



  • We are trying to resolve or figure out change in plan for one of the top sql queries by CPU.

    This query we have found (it's a vendor-based product) which unfortunately I do not have access to extract out and share it here, is expected to take less than 5 ms CPU time. The so-called bad plan takes over 50 ms CPU time and create an issue on high volume day. I know that 50ms may sound very less but this query is executed 5 million times in an hour from 30+ app servers so you can understand that big difference is sort of creating issues.

    Below is the plan with 50+ms CPU time (bad plan)

    enter image description here

    Below is the plan using under 5 ms CPU time (good plan)

    enter image description here

    What we tried?

    1. Updating stats(sp_updatestats) and clearing plan cache for this specific query - 1 in 10 times it will work

    2. adding forceseek in query to test and we get good plan. Problem is this change cannot be done in code as this being vendor product it could take over 3 months for final push.

    3. Thought of plan guide, but somehow does not get picked up. Not to expertise in using it as this query is executed like sp_execute @p1, @p2, @p3........ . These parameters are not constant and keep changing on every run. Probably doing something wrong so not sure if we can actually hint forceseek in plan guide for such process

    Please advise. I know it's very hard to assist without actual schema and query and just looking at plan, but any inputs what you think between 2 plans may assist. I can add details from exec plan if required. SQL version is 2017 but compatible mode is 2012



  • The Hash Match operation is less likely your issue here. More so, notice in your fast plan with the Clustered Index Seek there's actually 0 rows being returned. This is the difference that's likely driving the different performance you're seeing. It's literally a different amount of data being processed at this point, which can be influenced by a number of factors.

    By the way, the amount of data being joined together at that step is what determines what kind of join operation gets used. Generally Nested Loops are good for small amounts of data whereas a Hash Match is better for larger sets of data, especially if one set is ordered already. So those operations appear to be correct for each execution plan you provided, respectively.

    Can the Clustered Index Scan be optimized?...possibly, but unadvisable with the information you've provided. And if there was a way, likely it would either be an index or query change, which sounds out of the applicable options for you anyway, since this is a vendor application.

    If you truly proved out that this query is causing server contention, the only thing I can advise that you can change, with the given information, is increase your server resources. If you want to pursue additional help, we'll need additional information, even if anonymized. But this sounds like a vendor problem that you shouldn't be doing the work for them to fix their software for them anyway.




Suggested Topics

  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2