Detail on MS SQL CTEs
I am trying to find out how MS CTEs actually access data. Trying to find out if CTEs will actually run slower when pulling data and what is the order that a CTE will pull data. Does a CTE pull the data when a view is called or does it happen in the last select statement when creating the result set? How often does the CTE have to make a database call for additional data?
inna last edited by
The main difference between https://www.erikdarlingdata.com/category/common-table-expression/ and temporary objects is that CTEs do not materialize a result set, and every time you reference them the inner query needs to be re-executed. I'm not bringing https://www.erikdarlingdata.com/category/table-variables/ into the picture because it introduces additional performance complexities.
Common Table Expression Problems
I'm using this demo as a portable example. There are no performance issues here because it's a one row table, but it's sufficient to get the point across.
CREATE TABLE #t ( id int NOT NULL PRIMARY KEY );
id = 1;
We have a one-row table, and now let's hit it with a CTE:
WITH t AS ( SELECT t.id FROM #t AS t ) SELECT t.* FROM t JOIN t AS t2 ON t2.id = t.id JOIN t AS t3 ON t3.id = t.id;
The resulting query plan looks like this:
For each join to the Common Table Expression, we end up hitting the base table. If you have CTEs that are referenced multiple times, you're usually better off using a #temp table to materialize the result. In Common Table Expressions with more complex queries in them, this can be especially painful.
While row goals do not materialize Common Table Expression results, they do (at least up to the current version of SQL Server) offer an https://dba.stackexchange.com/q/221555/32281 around the query inside of them.
As an example of how this can work, take a look at these two queries:
SELECT c = COUNT_BIG(*) FROM dbo.Users AS u JOIN dbo.Posts AS p ON p.OwnerUserId = u.AccountId JOIN dbo.Comments AS c ON c.PostId = p.ParentId JOIN dbo.Votes AS v ON v.PostId = c.PostId WHERE c.Score > 1000;
c = COUNT_BIG()
FROM dbo.Users AS u
JOIN dbo.Posts AS p
ON p.OwnerUserId = u.AccountId
SELECT TOP (2147483647)
FROM dbo.Comments AS c
JOIN dbo.Votes AS v
ON v.PostId = c.PostId
WHERE c.Score > 1000
) AS c ON c.PostId = p.ParentId;
Here are the query plans for them:
The join order is altered because SQL Server has to honor the https://sqlperformance.com/2018/02/sql-plan/setting-and-identifying-row-goals . There's no real performance difference here, but you get the idea. If there's a particular thing you want to happen together, this can be one way to achieve that without a temporary object, but sans the materialization.
For an example of tuning a query that misused Common Table expressions by using #temp tables instead, check out this post of mine:
There's a video over there that I can't embed here that walks through identifying things like poor cardinality estimates coming out of Common Table Expressions, and using each one as a stopping point to materialize results.