Should I offload my SQL Server backups to a secondary replica?
My production server doesn't have enough horsepower to run backups locally without causing impact to production. I'd like to offload them to my secondary replicas, to free up resources on my primary.
General questions and concerns I have are:
- What types of backups can I take on my secondary replicas?
- I take differential backups today, how will that work?
- How will this affect my RPO/RTO?
- How do I configure the backup preference?
- I've configured the backup preference, but it's not being adhered to. What did I do wrong?
- What about integrity checks? Shouldn't I take those where I take my backups?
- Will I break my log chains?
- Should I even consider offloading my backups, or should I look at other alternatives?
Types of backups you can take on a secondary replicas
You can take Copy-only full backups and log backups. You can not take a traditional full backup on a secondary replica because the database is read-only, and SQL Server cannot clear the differential bitmap flag.
Still want to take differential backups
You'll have to take this on your primary replicas, along with your full backups. Remember, you can only take Copy-only full backups on your secondary replica. To be able to reset the differential bitmap flag, you'll have to do that on your primary.
Impact to RPO/RTO
Your RPO/RTO can be greatly impacted depending on where you take your backups. Sycrnoization to secondary replicas can fall behind. Secondary replicase can be rebooted when your backup jobs are supposed to be running, plus any of a half-dozen other reasons those backups can become unreliable. Nothing would be worse than planning for a 15-minute RPO, only to find out your secondary replica was 30-minutes behind the last time a transaction log backup ran against it.
Configuring the backup preference
This is pretty easy to setup within the Availability Group wizard. You have a few options.
- Prefer Secondary - This is the most common setting if you're looking to offload your backups to a secondary replica.
- Secondary Only - be aware that if you choose this setting and no secondary replica available when it's time for you backup jobs to run, they'll simply not run for that period of time.
- Primary - This is a pretty safe setting. If you don't have a primary online and available to take backups, you have bigger issues to deal with than backup preference.
- Any Replica - I'd avoid this one. You either want your backups running on a primary or a secondary. Want you don't want is them flip flopping between the two randomly.
After setting the preference, you'll want to set the priority. This is a (0-100), 0 being the lowest, 100 being the highest. This setting is useful to help avoid backups running on all of your secondary replicas at the same time, with each thinking they are the highest priority. Typically, you'll set one replica to 100, and decrement other replicas, as appropriate.
I've configured the backup preference, but it's not being adhered to. What did I do wrong?
Setting up the backup preference and priority alone, is not enough to ensure your backups run where you think they will. Your backups jobs must be aware of this setting. Essentially, configuring these values on your availability only exposes these settings through a system function called sys.fn_hadr_backup_is_preferred_replica ( ‘dbname’ ). You'll want to make sure you backups jobs are checking the result of this function before attempting a backup against any database.
What about integrity checks? Shouldn't I take those where I take my backups?
Yes, you do want to run integrity checks where you're taking backups, so that you'll know you have valid backups that you'll be able to restore. However, in an availability group, you'll actually want to make sure you running integrity checks on all replicas, regardless of where you take your backups. Running integrity checks and taking backups on a secondary replica alone does not guarantee there is no corruption on your primary replica. You secondary replicas are only getting transaction log details from the primary. They would not know if the primary had a corrupted page sitting on disk. If you are not also running integrity checks on your primary, it could go undetected for weeks, months or longer. You may not find out until you no longer have valid backups to revert to.
You might say, "well, I'll just failover to my secondary replica". And that might work. But I'm not looking to that is my primary protection against corruption. I want to be alerted to it as soon as possible, and still have valid backups to go back to, just in case.
What about log chains
Be careful with Log Chains when taking log backups on availability groups. No matter what node you take a log backup from on an Availability Group, that log chain is persisted across all nodes. If you have log backups running on more than one node, for any reason, you'll need all of those in the event you need to restore to a point in time. Keep things simple here, limit all log backups to the same node.
My preference is to keep all of my backups running on my primary replica. If synchronization falls behind, for any reason, I know I can still rely on my backups to be the most up-to-date copy I can get. If corruption occurs, I'll be notified quickly and can react. I can also depend one automatic page repair from a secondary replica. If I'm running into an issue where it's a concern to run backups on my primary replicas, I'm questioning if the server has enough resources or if query or index tuning is required to help the server perform better. I'm not looking to limit my ability to recover a database properly, by offloading my backups to another server.
More information is available https://brendan-mccaffrey.com/2022/02/11/offloading-database-backups-to-secondary-replicas-should-i-or-shouldnt-i/ and https://docs.microsoft.com/en-us/sql/database-engine/availability-groups/windows/active-secondaries-backup-on-secondary-replicas-always-on-availability-groups?view=sql-server-ver15 .