Report Uptime Metrics for SQL Server by Database/Application



  • I have a requirement to report monthly metrics on SQL Server uptime, by application/database. This would be at the cluster level. I.E., if a secondary replica went offline, but a primary was still available to process transactions, uptime would still be considered 100%.

    I've looked at various tools that might serve this purpose, and a few come close. But, none of them seem to capture anything more than if the SQL Server service is online and accepting connections. They also fail to aggregate these metrics at the cluster/AG level. Meaning these uptime reports would take a hit if a secondary replica were to go offline.

    For example, let's say a database goes offline or a log file fills up, and transactions are unable to process against a single database. Those tools would say that SQL Server is up, but I would still have people saying it was a database issue. Thus, these metrics would need to reflect that SQL was not fully up at that time.

    The best idea of I'm come up with at this point is to create a SQL Agent job that inserts a record into a Canary table in each database, once a minute. Then at the end of the month, to query that table and divide the previous month's row count by the expected row count. I figured there was no better way to prove a database was actually available than to try to insert a row.

    I already have the above solution developed, tested and working. But I'm curious if anyone knows of a better way to do this. Including any goods tools or DMVs I may have overlooked, that I could use to extrapolate end-user availability metrics of all databases on a SQL Instance?



  • This is the solution I'm using, which is working.

    The below stored procedure can be used to populate a Canary table, and can be executed each minute from a SQL Agent job.

    CREATE PROCEDURE [dbo].[SQLUptime]
    AS
    BEGIN 
        SET NOCOUNT ON
    
    DECLARE
        @DatabaseName nvarchar(128),
        @Query nvarchar(max);
    
    DECLARE cDatabases CURSOR FOR
        SELECT [name]
        FROM sys.databases
        WHERE database_id > 4
    
    OPEN cDatabases;
    FETCH NEXT FROM cDatabases INTO @DatabaseName
    
    WHILE @@FETCH_STATUS = 0
    BEGIN
        BEGIN TRY
            IF COALESCE(sys.fn_hadr_is_primary_replica(@DatabaseName),1) = 1    
            BEGIN
                SET @Query = 'USE [' + @DatabaseName + ']' + CHAR(13)+CHAR(10) + CHAR(13)+CHAR(10) + 
                    'IF NOT EXISTS (SELECT 1 FROM sys.schemas WHERE name = N''dba'')' + CHAR(13)+CHAR(10) + 
                    'BEGIN' + CHAR(13)+CHAR(10) +
                    '    EXECUTE sp_executesql N''CREATE SCHEMA [dba]''' + CHAR(13)+CHAR(10) +
                    'END' +  CHAR(13)+CHAR(10) + CHAR(13)+CHAR(10) +
            
                    'IF NOT EXISTS (SELECT 1 FROM sys.objects WHERE object_id = OBJECT_ID(N''[dba].[Canary]'') AND type in (N''U''))' + CHAR(13)+CHAR(10) +
                    'BEGIN' + CHAR(13)+CHAR(10) +
                    '    CREATE TABLE dba.Canary (' + CHAR(13)+CHAR(10) +
                    '        [Time] Datetime NOT NULL' + CHAR(13)+CHAR(10) +
                    '    )' + CHAR(13)+CHAR(10) + CHAR(13)+CHAR(10) +
            
                    '    CREATE CLUSTERED INDEX CIX_Canary_Time ON dba.Canary ([Time]);' + CHAR(13)+CHAR(10) +
                    'END' + CHAR(13)+CHAR(10) + CHAR(13)+CHAR(10) +
    
                    'INSERT INTO dba.Canary ([Time])' + CHAR(13)+CHAR(10) +
                    'VALUES (getdate());' + CHAR(13)+CHAR(10) + CHAR(13)+CHAR(10) +
            
                    'DELETE FROM dba.Canary' + CHAR(13)+CHAR(10) +
                    'WHERE [Time] < DATEADD(MONTH, -12, GETDATE());'
    
            /*PRINT @Query*/
            EXECUTE sp_executesql @Query
        END
        END TRY
        BEGIN CATCH
            SELECT
                ERROR_NUMBER(),
                ERROR_STATE(),
                ERROR_SEVERITY(),
                ERROR_PROCEDURE(),
                ERROR_LINE(),
                ERROR_MESSAGE()
        END CATCH
        
        FETCH NEXT FROM cDatabases INTO @DatabaseName
    END
    
    CLOSE cDatabases;
    DEALLOCATE cDatabases;
    

    END

    The following query can then be used each month to report uptime based the number of rows successfully inserted into the Canary table during the previous month.

    DECLARE
        @StartDate    datetime,
        @CutoffDate   datetime,
        @TotalMinutes int,
        @output       int,
        @SQL          nvarchar(500),
        @Parameters   nvarchar(500),
        @Database     nvarchar(128);
    

    SET @StartDate = DATEADD(MONTH,DATEDIFF(MONTH,0,GETDATE())-1,0);
    SET @CutoffDate = DATEADD(DAY,0,DATEADD(MONTH,1,@StartDate));

    IF @CutoffDate > GETDATE()
    SET @CutoffDate = GETDATE();

    SET @TotalMinutes = DATEDIFF(MINUTE, @StartDate,@CutoffDate);
    SET @Parameters = N'@retvalOUT int OUTPUT';

    IF OBJECT_ID('tempdb..#UptimeMinutes') IS NOT NULL DROP TABLE #UptimeMinutes

    CREATE TABLE #UptimeMinutes (
    DatabaseName nvarchar(128),
    UptimeMinutes int
    )

    DECLARE cursorDatabases CURSOR
    FOR
    SELECT [name]
    FROM sys.databases
    WHERE database_id > 4;

    OPEN cursorDatabases;
    FETCH NEXT FROM cursorDatabases INTO @Database;

    WHILE @@FETCH_STATUS = 0
    BEGIN
    SET @SQL = N'USE [' + @Database + '] SELECT @retvalOUT = COUNT(*) FROM dba.Canary';

    EXEC sp_executesql @SQL, @Parameters, @retvalOUT=@output OUTPUT;
    
    INSERT INTO #UptimeMinutes (DatabaseName,UptimeMinutes)
    SELECT @Database, @output;
    
    FETCH NEXT FROM cursorDatabases INTO @Database;
    

    END

    SELECT SUM(UptimeMinutes) InstanceUptimeMinutes,
    SUM(@TotalMinutes) AS TargetInstanceUptimeMinutes,
    CONVERT(DECIMAL(10,3),(SUM(UptimeMinutes)/CONVERT(DECIMAL,SUM(@TotalMinutes))) * 100) AS InstanceUptimePercentage
    FROM #UptimeMinutes;

    SELECT DatabaseName,
    DatabaseUptimeMinutes,
    @TotalMinutes AS TargetDatabaseUptimeMinutes,
    CONVERT(DECIMAL(10,3),(UptimeMinutes/CONVERT(DECIMAL,@TotalMinutes)) * 100) AS DatabaseUptimePercentage
    FROM #UptimeMinutes;

    CLOSE cursorDatabases;
    DEALLOCATE cursorDatabases;




Suggested Topics

  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2