Azure MI - clustered and nonclustered indexes with the same columns - different execution plans
I have a problem to understand why I am getting 2 different execution plans based on whether clustered index is involved or not. It's happening in Azure managed instance, I have no possibility to test it on-prem.
The table with its clustered index goes as follows (it has approx 30+ additonal columns but I omitted those for clarity):
CREATE TABLE [schemaX].[TransactionJournal]( [Shi] [bigint] NOT NULL, [Ohs] [bigint] NOT NULL, [MatId] [bigint] NOT NULL, [Dop] [smallint] NULL, [OldShi] [bigint] NULL, [OldOhs] [bigint] NULL, [Qty] [decimal](18, 8) NULL, [Price] [decimal](18, 8) NULL, CONSTRAINT [PK_TransactionJournal] PRIMARY KEY CLUSTERED ( [MatId] ASC, [Shi] ASC, [Ohs] ASC )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = OFF, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ) GO
The nonclustered index has following definition:
CREATE NONCLUSTERED INDEX [IND_T_TransactionJournal01] ON [schemaX].[TransactionJournal] ( [Shi] ASC, [Ohs] ASC, [MatId] ASC )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) GO
Now, when I issue following query:
SET STATISTICS TIME, IO ON; SELECT Shi, Ohs, MatId FROM [schemaX].[TransactionJournal] WITH(INDEX = 1) ORDER BY MatId ASC, Shi ASC, Ohs ASC OFFSET 2000000 ROWS FETCH NEXT 2000 ROWS ONLY
, I get Clustered index scan (why not seek?) on TransactionJournal which has around 190GB and over 250M records. For 2000 rows, I am waiting 30 minutes.
With following query:
SELECT Shi, Ohs, MatId FROM [schemaX].[TransactionJournal] WITH(INDEX = IND_T_TransactionJournal01) ORDER BY Shi ASC, Ohs ASC,MatID ASC OFFSET 2000000 ROWS FETCH NEXT 2000 ROWS ONLY
I get index seek on TransactionJournal and the 2000 rows come under 2 seconds.
I know I have different column order in these C and NC indexes and in the ORDER BY clauses of both queries (however ORDER BY column order is aligned with index column order in both cases), but I fail to understand why the difference is so huge (i.e. why I get scan in one case and seek in the other, although scan is not always bad and seek not always good).
EDIT: As per J.D.'s request, here are the XMLs representing the execution plans (for some reason, the pastetheplan.com website does not accept these XMLs as valid exec plans):
The one with nonclustered index seek: https://pastebin.com/ggQ7bVkv
The one with Clustered Index Scan: https://pastebin.com/8v4V7Ch2
Marcee last edited by
Why are you forcing the index it uses with an index hint instead of letting the optimizer decide that for itself? When you force it to use the clustered index, it appears to be hitting the https://www.brentozar.com/archive/2019/10/how-to-think-like-the-sql-server-engine-whats-the-tipping-point/ where it thinks a scan will be more efficient.
My guess is this is because your clustered index key definition is not ordering the columns the same way as your
ORDER BYclause in your first query. If you try re-writing your clustered index to match the
ORDER BYclause or re-write your query so the
ORDER BYclause matches the clustered index definition like the below, does it result in a Clustered Index Seek operation now?
SELECT Shi, Ohs, MatId FROM [schemaX].[TransactionJournal] WITH(INDEX = 1) ORDER BY Shi ASC, Ohs ASC, MatId ASC OFFSET 2000000 ROWS FETCH NEXT 2000 ROWS ONLY