|
Backup Overview

Administrator 1 tries to bring up the backup server and it fails. Scratching his head, he tries to figure out what went wrong. He then turns to the last backup, which was taken 15 hours ago, and attempts to restore it. It turns out that it is corrupted. Administrator 1 is now completely lost. Frantically paging through documents, books, and manuals, he attempts to find a way out. Meanwhile, the CEO is standing there demanding to know why the system is still down when the backup plan submitted to him guaranteed complete recovery inside 15 minutes.

Administrator 2 came in at the usual 8 AM and immediately checked the backups, as was the normal routine. Everything looked fine, but she was not sure. As was the normal morning routine, she took the backup of the online sales database, since it was so critical, over to the testing server and restored it. Problem! The database backup appeared to be corrupted. She immediately went to the backup server to determine whether it was still intact. It was. The restore job that would have overwritten and corrupted the backup server was scheduled to run in about an hour. This job was stopped.

Management was immediately called and informed of the problem while the administrator went back to the running production database to determine whether it was still intact or fixable online. The production database appeared to be intact, but an inspection of the Event Logs showed that a couple of transitory disk errors had been logged. She made the decision to switch over to the backup server and informed management. The sales applications were switched over to the backup server without any loss of service. All of the unnecessary services and applications on the production server were then shut down while the administrator determined which rows existed in the database she had taken offline that were not reflected in the backup system now online.
Two transactions were moved over and all data was captured. At about this time, the server with problems crashed with critical disk errors.

Which administrator's shoes would you have liked to be in?

In an operational environment, things can get rather routine. This is no excuse for not following proper procedures. By restoring a backup of a critical system, you can detect errors that have crept in but are not yet critical enough to bring a system down. This allows you to avoid the system outage. These types of things are what DBAs get paid for. Avoid a failure or prevent data loss in the case of a crash and you have earned every penny you have been paid for your time.

Even if the system had crashed, Administrator 2 would have performed much better and gotten the system back online much more quickly than Administrator 1. This is due simply to training and familiarity. If you do a restore of a database every day, then when you have to do it under pressure, the restore is simple routine because you have done it so many times.

The training also serves another extremely vital role in your backup strategy. Doing constant restores allows you to validate the integrity of your backups. This process serves as a constant test of your backup plan. Conditions change over time, and the backup plan you devised may have holes in it 6 months from now. By constantly testing, you can discover and plug these holes before a crash exposes them at the wrong time.
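To make the daily restore test concrete, here is a minimal sketch of how such a routine might be scripted. It is only an illustration, not a prescribed procedure: it assumes SQL Server 7.0 or later (where `RESTORE` and `RESTORE VERIFYONLY` are available), and the backup path, the test database name, and the helper functions `quote_ident`, `quote_literal`, and `restore_test_script` are all hypothetical names made up for this example. The Python code simply builds the T-SQL statements in the order you would run them against the testing server.

```python
from typing import List


def quote_ident(name: str) -> str:
    """Bracket-quote a SQL Server identifier, doubling any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


def quote_literal(value: str) -> str:
    """Quote a string as a Unicode T-SQL literal, doubling single quotes."""
    return "N'" + value.replace("'", "''") + "'"


def restore_test_script(backup_file: str, test_db: str) -> List[str]:
    """Return the T-SQL statements for one restore test, in run order.

    backup_file and test_db are placeholders (assumptions for this sketch);
    substitute the backup device and test database you actually use.
    """
    disk = quote_literal(backup_file)
    return [
        # 1. Quick check that the backup set is complete and readable.
        f"RESTORE VERIFYONLY FROM DISK = {disk};",
        # 2. The real test: restore over the previous copy on the test server.
        f"RESTORE DATABASE {quote_ident(test_db)} FROM DISK = {disk} "
        "WITH REPLACE, RECOVERY;",
        # 3. Check the consistency of the database that was just restored.
        f"DBCC CHECKDB ({quote_literal(test_db)}) WITH NO_INFOMSGS;",
    ]


if __name__ == "__main__":
    # Hypothetical path and database name, for illustration only.
    for statement in restore_test_script(r"D:\backup\sales.bak", "SalesTest"):
        print(statement)
```

The printed statements can be run through whatever query tool or scheduled job you already use. If the testing server has a different drive layout than production, the `RESTORE DATABASE` statement would also need `MOVE` clauses for the data and log files. Note that `RESTORE VERIFYONLY` alone does not prove the data is usable, which is why the sketch follows it with a full restore and a consistency check.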
|
All content on this site, except where noted, represents an original work of Michael R. Hotek and is protected by applicable copyright laws. The SQL Server FAQ is the sole work of Neil Pike. No page, portion of a page, or download may be used for commercial purposes in whole or in part without the express, written permission of the applicable author.