Why is a table scan for small data so slow
I've got a table with 200k rows which lives on an SSD. I'm doing a lookup using a comparison on a varchar field where ~20% of entries are null; the rest are 10 numeric characters with a random distribution.
I expect a search to take a bit of time without an index, but not the 13s it takes now. The execution plan attributes all 13s to "Clustered Index Scan". What could be going wrong here? This amount of data can be read and grepped in less than a second, so I'm surprised to see SQL Server doing >10x the work.
The profiler says: CPU 359, reads 11143.
Different slow run (3m 21s) plan: https://www.brentozar.com/pastetheplan/?id=rkNF4LF6K
Edit: After compressing the table at Page level I'm down to sub-second results. After turning off compression I'm still at sub-second. A few hours later, I'm back to very long delays, this time >1 min.
So as I mentioned in the comments, you appear to be having a cardinality estimation (https://docs.microsoft.com/en-us/sql/relational-databases/performance/cardinality-estimation-sql-server?view=sql-server-ver15) issue, which is causing an unnecessarily large memory grant (as Martin pointed out). You should always compare your execution plans even when both runs perform the same operations, because the runtime statistics may differ between the slow and fast runs, which appears to be the case here.
You can observe the cardinality estimate issue in your case by looking at the clustered index scan operation on the right side of your execution plan. It shows an Actual Number of Rows of 5, while the plan's Estimated Number of Rows is over 7,000. This significant overestimate causes the query to request, and wait for, much more memory than it needs. (Sometimes you'll see the opposite occur, where the cardinality estimate is an underestimate, which under-requests memory and/or generates a less efficient execution plan as a result.)
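To see those estimated and actual row counts yourself without the graphical plan viewer, you can capture the actual execution plan for a single run. A minimal sketch, assuming a hypothetical `dbo.Patients` table with the columns from your predicates:

```sql
-- Emit the actual execution plan XML (containing both EstimateRows and
-- ActualRows per operator) alongside the result set.
SET STATISTICS XML ON;

SELECT *
FROM dbo.Patients   -- hypothetical table name
WHERE REPLACE(MEDICARE_NO, ' ', '') LIKE '111111111%'
  AND STATUS_CODE NOT IN ('D', 'X', 'I', 'Z');

SET STATISTICS XML OFF;
```

Comparing those two numbers on the clustered index scan for a fast run versus a slow run is the quickest way to confirm the estimate is what changed.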
Your issue likely results from non-SARGable (https://www.sqlshack.com/how-to-use-sargable-expressions-in-t-sql-queries-performance-advantages-and-examples/) and hard-to-optimize predicates in your `WHERE` clause, most likely these:

`WHERE (REPLACE(MEDICARE_NO, ' ', '') LIKE '111111111%') AND (STATUS_CODE NOT IN ('D', 'X', 'I', 'Z'))`

Using functions in predicates can cause cardinality estimate issues like the one you're currently seeing, so replacing the `REPLACE()` function with a different solution would probably fix your issue. You can either store the data with the spaces already removed, or add a computed column (https://docs.microsoft.com/en-us/sql/relational-databases/tables/specify-computed-columns-in-a-table?view=sql-server-ver15) to your table that applies the `REPLACE()` function. You can even make it a persisted computed column and index it, since the `REPLACE()` function is deterministic.
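A minimal sketch of the persisted computed column approach (the table name `dbo.Patients` and index name are assumptions; the column names come from your predicates):

```sql
-- Pre-apply REPLACE() in a persisted computed column, then index it so
-- the LIKE 'prefix%' predicate can become an index seek.
ALTER TABLE dbo.Patients
    ADD MEDICARE_NO_CLEAN AS REPLACE(MEDICARE_NO, ' ', '') PERSISTED;

CREATE INDEX IX_Patients_MedicareNoClean
    ON dbo.Patients (MEDICARE_NO_CLEAN)
    INCLUDE (STATUS_CODE);  -- covers the other predicate

-- The query then filters on the plain column, with no function call:
SELECT *
FROM dbo.Patients
WHERE MEDICARE_NO_CLEAN LIKE '111111111%'
  AND STATUS_CODE NOT IN ('D', 'X', 'I', 'Z');
```

Because the expression is deterministic, the optimizer can also match the original `REPLACE(MEDICARE_NO, ' ', '')` expression to the computed column in many cases, so existing queries may benefit without being rewritten.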
Your other predicate, `AND STATUS_CODE NOT IN ('D', 'X', 'I', 'Z')`, could also be rewritten to maximize performance. `NOT IN` is an inequality operator and is difficult for the engine to optimize. You're better off using `IN` and listing the values that are valid, which is then an equality operation. Additionally, the `IN` clause is just syntactic sugar for multiple `OR` clauses. Sometimes that also throws off the optimizer, and you can try replacing it with a `UNION` instead, one branch for each value from your `IN` clause. Of course that would require you to repeat your query multiple times, which hurts maintainability and readability a little bit, but most times you can refactor the core part of your query into a CTE first and then `UNION` against that CTE for each value in the `IN` list.
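A sketch of that CTE-plus-`UNION` refactor. The valid status codes `'A'` and `'B'` are invented for illustration; substitute your real list:

```sql
-- Refactor the core query into a CTE, then UNION ALL one equality
-- branch per valid STATUS_CODE value instead of using NOT IN.
WITH Matches AS (
    SELECT MEDICARE_NO, STATUS_CODE
    FROM dbo.Patients  -- hypothetical table name
    WHERE REPLACE(MEDICARE_NO, ' ', '') LIKE '111111111%'
)
SELECT * FROM Matches WHERE STATUS_CODE = 'A'
UNION ALL
SELECT * FROM Matches WHERE STATUS_CODE = 'B';
```

`UNION ALL` is used rather than `UNION` because the branches filter on distinct values and can't overlap, so the duplicate-elimination sort of `UNION` would be wasted work.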
But I'd recommend focusing on eliminating your use of the `REPLACE()` function in your `WHERE` clause first; my instinct says that's your bigger issue here. You can quickly prove it out by selecting your data into a temp table first, with `REPLACE(MEDICARE_NO, ' ', '') AS SomeColumn` as one of the columns you select into the temp table. Then use that temp table in your main query instead, so your main query's `WHERE` clause becomes `WHERE SomeColumn LIKE '111111111%'`.
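That test can be sketched as follows (again assuming a hypothetical `dbo.Patients` table):

```sql
-- Stage the data with REPLACE() already applied...
SELECT REPLACE(MEDICARE_NO, ' ', '') AS SomeColumn,
       STATUS_CODE
INTO   #Staged
FROM   dbo.Patients;

-- ...then filter on the plain column, with no function in the predicate.
SELECT *
FROM   #Staged
WHERE  SomeColumn LIKE '111111111%'
  AND  STATUS_CODE NOT IN ('D', 'X', 'I', 'Z');

DROP TABLE #Staged;
```

If the second query comes back fast even though the staging step scans the whole table, that confirms the `REPLACE()` in the predicate (and the estimate it produces) was the problem.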