Report Uptime Metrics for SQL Server by Database/Application
I have a requirement to report monthly metrics on SQL Server uptime, by application/database. This would be at the cluster level. I.E., if a secondary replica went offline, but a primary was still available to process transactions, uptime would still be considered 100%.
I've looked at various tools that might serve this purpose, and a few come close. But, none of them seem to capture anything more than if the SQL Server service is online and accepting connections. They also fail to aggregate these metrics at the cluster/AG level. Meaning these uptime reports would take a hit if a secondary replica were to go offline.
For example, let's say a database goes offline or a log file fills up, and transactions are unable to process against a single database. Those tools would say that SQL Server is up, but I would still have people saying it was a database issue. Thus, these metrics would need to reflect that SQL was not fully up at that time.
The best idea of I'm come up with at this point is to create a SQL Agent job that inserts a record into a Canary table in each database, once a minute. Then at the end of the month, to query that table and divide the previous month's row count by the expected row count. I figured there was no better way to prove a database was actually available than to try to insert a row.
I already have the above solution developed, tested and working. But I'm curious if anyone knows of a better way to do this. Including any goods tools or DMVs I may have overlooked, that I could use to extrapolate end-user availability metrics of all databases on a SQL Instance?
This is the solution I'm using, which is working.
The below stored procedure can be used to populate a Canary table, and can be executed each minute from a SQL Agent job.
CREATE PROCEDURE [dbo].[SQLUptime] AS BEGIN SET NOCOUNT ON
DECLARE @DatabaseName nvarchar(128), @Query nvarchar(max); DECLARE cDatabases CURSOR FOR SELECT [name] FROM sys.databases WHERE database_id > 4 OPEN cDatabases; FETCH NEXT FROM cDatabases INTO @DatabaseName WHILE @@FETCH_STATUS = 0 BEGIN BEGIN TRY IF COALESCE(sys.fn_hadr_is_primary_replica(@DatabaseName),1) = 1 BEGIN SET @Query = 'USE [' + @DatabaseName + ']' + CHAR(13)+CHAR(10) + CHAR(13)+CHAR(10) + 'IF NOT EXISTS (SELECT 1 FROM sys.schemas WHERE name = N''dba'')' + CHAR(13)+CHAR(10) + 'BEGIN' + CHAR(13)+CHAR(10) + ' EXECUTE sp_executesql N''CREATE SCHEMA [dba]''' + CHAR(13)+CHAR(10) + 'END' + CHAR(13)+CHAR(10) + CHAR(13)+CHAR(10) + 'IF NOT EXISTS (SELECT 1 FROM sys.objects WHERE object_id = OBJECT_ID(N''[dba].[Canary]'') AND type in (N''U''))' + CHAR(13)+CHAR(10) + 'BEGIN' + CHAR(13)+CHAR(10) + ' CREATE TABLE dba.Canary (' + CHAR(13)+CHAR(10) + ' [Time] Datetime NOT NULL' + CHAR(13)+CHAR(10) + ' )' + CHAR(13)+CHAR(10) + CHAR(13)+CHAR(10) + ' CREATE CLUSTERED INDEX CIX_Canary_Time ON dba.Canary ([Time]);' + CHAR(13)+CHAR(10) + 'END' + CHAR(13)+CHAR(10) + CHAR(13)+CHAR(10) + 'INSERT INTO dba.Canary ([Time])' + CHAR(13)+CHAR(10) + 'VALUES (getdate());' + CHAR(13)+CHAR(10) + CHAR(13)+CHAR(10) + 'DELETE FROM dba.Canary' + CHAR(13)+CHAR(10) + 'WHERE [Time] < DATEADD(MONTH, -12, GETDATE());' /*PRINT @Query*/ EXECUTE sp_executesql @Query END END TRY BEGIN CATCH SELECT ERROR_NUMBER(), ERROR_STATE(), ERROR_SEVERITY(), ERROR_PROCEDURE(), ERROR_LINE(), ERROR_MESSAGE() END CATCH FETCH NEXT FROM cDatabases INTO @DatabaseName END CLOSE cDatabases; DEALLOCATE cDatabases;
The following query can then be used each month to report uptime based the number of rows successfully inserted into the Canary table during the previous month.
DECLARE @StartDate datetime, @CutoffDate datetime, @TotalMinutes int, @output int, @SQL nvarchar(500), @Parameters nvarchar(500), @Database nvarchar(128);
SET @StartDate = DATEADD(MONTH,DATEDIFF(MONTH,0,GETDATE())-1,0);
SET @CutoffDate = DATEADD(DAY,0,DATEADD(MONTH,1,@StartDate));
IF @CutoffDate > GETDATE()
SET @CutoffDate = GETDATE();
SET @TotalMinutes = DATEDIFF(MINUTE, @StartDate,@CutoffDate);
SET @Parameters = N'@retvalOUT int OUTPUT';
IF OBJECT_ID('tempdb..#UptimeMinutes') IS NOT NULL DROP TABLE #UptimeMinutes
CREATE TABLE #UptimeMinutes (
DECLARE cursorDatabases CURSOR
WHERE database_id > 4;
FETCH NEXT FROM cursorDatabases INTO @Database;
WHILE @@FETCH_STATUS = 0
SET @SQL = N'USE [' + @Database + '] SELECT @retvalOUT = COUNT(*) FROM dba.Canary';
EXEC sp_executesql @SQL, @Parameters, @retvalOUT=@output OUTPUT; INSERT INTO #UptimeMinutes (DatabaseName,UptimeMinutes) SELECT @Database, @output; FETCH NEXT FROM cursorDatabases INTO @Database;
SELECT SUM(UptimeMinutes) InstanceUptimeMinutes,
SUM(@TotalMinutes) AS TargetInstanceUptimeMinutes,
CONVERT(DECIMAL(10,3),(SUM(UptimeMinutes)/CONVERT(DECIMAL,SUM(@TotalMinutes))) * 100) AS InstanceUptimePercentage
@TotalMinutes AS TargetDatabaseUptimeMinutes,
CONVERT(DECIMAL(10,3),(UptimeMinutes/CONVERT(DECIMAL,@TotalMinutes)) * 100) AS DatabaseUptimePercentage