Dynamic SQL and loops
I've got some batch job processes that are all built on the same pattern: load new data into a temp table, work out which operations are needed based on the data in the temp table, and then run some dynamic SQL to shuffle the data from the temp tables into the final tables.
These operations can ultimately be operating on, say, 100 million rows.
The original code built a dynamic INSERT statement and then ran it with sp_executesql in a WHILE loop, in batches of 100,000 rows.
I thought thousands of invocations of sp_executesql might be more expensive than executing one statement with the WHILE loop built in, so I tried that instead. To my great surprise, it took 8 times longer to execute than running EXEC sp_executesql many thousands of times.
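For reference, the original approach looked roughly like this. This is a sketch, not the actual code: the parameter list and the loop exit condition are assumptions based on the description above.

```sql
-- Sketch of the original pattern: the WHILE loop lives OUTSIDE the dynamic SQL,
-- and sp_executesql runs one parameterized batch per iteration.
DECLARE @minRowID int = 0,
        @maxRowID int = @batchSize;

SET @sql = N'INSERT INTO ' + @destinationTable + N' (' + @columnList + N')
             SELECT ' + @columnsForSelect + N'
             FROM ' + @fromTable + N' AS D
             INNER JOIN ' + @workTable + N' AS W
                 ON D.[RowID] = W.[RowID_Source]
             WHERE W.RowID > @minRowID AND W.RowID <= @maxRowID;';

WHILE (@minRowID < @totalCounts)
BEGIN
    -- The statement text is identical on every call; only the parameter
    -- values change, so the cached plan can be reused each time.
    EXEC sp_executesql @sql,
         N'@minRowID int, @maxRowID int',
         @minRowID = @minRowID, @maxRowID = @maxRowID;

    SET @minRowID += @batchSize;
    SET @maxRowID += @batchSize;
END
```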
Anyone have an idea why putting the WHILE loop inside a dynamic SQL statement would perform so badly?
```sql
SET @sql = '
DECLARE @minRowID int = 0,
        @maxRowID int = @batchSize,
        @totalInsertedCounts bigint = 0;
WHILE (1 = 1)
BEGIN
    INSERT INTO ' + @destinationTable + ' (' + @columnList + ')
    SELECT ' + @columnsForSelect + '
    FROM ' + @fromTable + ' AS D (NOLOCK)
    INNER JOIN ' + @workTable + ' AS W (NOLOCK)
        ON D.[RowID] = W.[RowID_Source]
    WHERE W.RowID > @minRowID AND W.RowID <= @maxRowID;

    SET @counts = @@ROWCOUNT;
    SET @totalInsertedCounts += @counts;

    IF (@totalInsertedCounts % 500000 = 0)
    BEGIN
        DECLARE @percent float = ROUND(@totalInsertedCounts * 100 / CAST(@totalCounts AS float), 3);
        DECLARE @percentStr varchar(20) = CAST(@percent AS varchar);
        RAISERROR(''Insert completed percent: %s'', 10, 1, @percentStr) WITH NOWAIT;
    END

    SET @minRowID += @batchSize;
    SET @maxRowID += @batchSize;

    -- exit once every RowID window has been processed
    -- (the upper bound of the WHERE clause and this exit condition were
    -- cut off in the original post and are reconstructed here)
    IF (@minRowID >= @totalCounts) BREAK;
END
SET @counts = @totalInsertedCounts;';
```
Based on the comments, I captured the plans for both versions — the WHILE loop around the dynamic statement (many sp_executesql executions) and the WHILE loop inside a single dynamic statement:
https://www.brentozar.com/pastetheplan/?id=rkDUlLKb5 - WHILE loop running one dynamic statement per batch
https://www.brentozar.com/pastetheplan/?id=S1sjxIFW5 - WHILE loop inside the dynamic statement
It might be due to the query optimizer generating a new execution plan for each iteration of the loop, as opposed to what happens when you execute sp_executesql many times. Check the remarks in the docs (https://docs.microsoft.com/en-us/sql/relational-databases/system-stored-procedures/sp-executesql-transact-sql?view=sql-server-ver15#remarks):
sp_executesql can be used instead of stored procedures to execute a Transact-SQL statement many times when the change in parameter values to the statement is the only variation. Because the Transact-SQL statement itself remains constant and only the parameter values change, the SQL Server query optimizer is likely to reuse the execution plan it generates for the first execution.
WHILE might cause different behavior, resulting in bad performance. You should also check the execution plan from each method, as that will sharpen your guesses when tuning a query.
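To illustrate that remark: when the statement text passed to sp_executesql stays byte-for-byte identical across calls and only the parameter values change, the plan compiled on the first call can be reused. A minimal sketch, against a hypothetical table (dbo.SomeTable and its RowID column are assumptions for illustration):

```sql
DECLARE @stmt nvarchar(200) =
    N'SELECT COUNT(*) FROM dbo.SomeTable WHERE RowID > @lo AND RowID <= @hi;';

-- Both executions can share one cached plan: the statement text is
-- identical, and only the parameter values differ between the calls.
EXEC sp_executesql @stmt, N'@lo int, @hi int', @lo = 0,      @hi = 100000;
EXEC sp_executesql @stmt, N'@lo int, @hi int', @lo = 100000, @hi = 200000;
```

Running the cache query below after these two calls should show usecounts = 2 for the single cached entry, rather than two separate entries.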
See the following query from https://www.sqlshack.com/understanding-sql-server-query-plan-cache/ to check whether an execution plan is being reused:
```sql
SELECT cplan.usecounts,
       cplan.objtype,
       qtext.text,
       qplan.query_plan
FROM sys.dm_exec_cached_plans AS cplan
CROSS APPLY sys.dm_exec_sql_text(plan_handle) AS qtext
CROSS APPLY sys.dm_exec_query_plan(plan_handle) AS qplan
ORDER BY cplan.usecounts DESC;
```