Every company that uses any type of computer system absolutely must have a disaster recovery strategy. In other words, a plan must be in place to protect valuable data and get critical computers back in operation in the event of an operational mistake, system failure or building disaster such as fire, flood, tornado... You get the picture. It is fairly common for modern hard drives to fail because of the number of moving parts and tight tolerances inherent in their design. Before we talk about tape drives, backup sets, offline storage, etc., I would like you to answer a few questions:
In the event of a system failure on the computer(s) that hold(s) your company's critical data, how many hours of data can you afford to lose?
In other words, let's say that your main server crashed right now. It is determined that the main drive where your data "lives" is toast, totally unrecoverable. If your server backs up at night and it's now 3:00 PM, you have lost at least 7 hours of data. This could represent Office document creation and edits as well as data entered into your accounting system. In some cases it could also mean unprocessed online orders and email. What does this mean to your company's ability to conduct business? For some of you, it might be simply an inconvenience and mean a few extra hours of overtime for the receptionist/bookkeeper. For others, it could be disastrous.
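To put a number on that first question, here is a minimal sketch of the arithmetic in Python. The 11:00 PM backup time and the 8:00 AM start of the workday are my assumptions for illustration (the 7 hours above only works out if the day starts at 8:00 AM), and the function name is made up for this example.

```python
from datetime import datetime

def hours_at_risk(last_backup, failure, workday_start):
    """Return (elapsed_hours, worked_hours) lost if the drive dies at `failure`.

    elapsed_hours: clock time since the last good backup finished.
    worked_hours:  portion of that time that falls after the workday began.
    """
    if failure < last_backup:
        raise ValueError("failure time is earlier than the last backup")
    elapsed = (failure - last_backup).total_seconds() / 3600
    start_of_loss = max(last_backup, workday_start)
    worked = max(0.0, (failure - start_of_loss).total_seconds() / 3600)
    return elapsed, worked

# Assumed schedule: backup finished at 11:00 PM, work began at 8:00 AM,
# and the server crashed at 3:00 PM.
last_backup = datetime(2007, 1, 16, 23, 0)
workday_start = datetime(2007, 1, 17, 8, 0)
failure = datetime(2007, 1, 17, 15, 0)

print(hours_at_risk(last_backup, failure, workday_start))  # (16.0, 7.0)
```

The second number is the 7 hours of office work mentioned above. The first is the full exposure, which matters if orders or email keep arriving overnight.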
The next question: Once your critical server has crashed, how many hours can you afford for it to be down?
Even if you have up-to-the-minute backups, it still might take several hours to get the recovered server back to where it needs to be to function properly.
The answers to these two questions should pretty much shape your disaster recovery strategy. The most basic strategy that I employ for my small business customers is an automated backup to tape that runs nightly. There is usually a 1-2 week rotation of tapes. The most recent backup is physically taken off-site by the person responsible for managing the backups. Under this scenario, the data loss exposure is up to a full day of data, with a system rebuild cycle of about 4 hours.
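One way to check a strategy against your two answers is sketched below in Python. The exposure figures for the nightly tape strategy (up to 24 hours of data, about 4 hours to rebuild) come from the paragraph above. The class, the function and the sample tolerances are hypothetical names and numbers I made up for illustration.

```python
from dataclasses import dataclass

@dataclass
class Strategy:
    name: str
    max_data_loss_hours: float  # worst-case hours of data lost
    restore_hours: float        # approximate time to get the server back

def acceptable(strategy, tolerable_loss_hours, tolerable_downtime_hours):
    """True if the strategy stays within both of the company's tolerances."""
    return (strategy.max_data_loss_hours <= tolerable_loss_hours
            and strategy.restore_hours <= tolerable_downtime_hours)

# Figures from the nightly tape scenario described above.
nightly_tape = Strategy("nightly tape", max_data_loss_hours=24, restore_hours=4)

# A company that can lose a day of data and be down for 8 hours:
print(acceptable(nightly_tape, 24, 8))  # True
# A company that can only afford to lose 2 hours of data:
print(acceptable(nightly_tape, 2, 8))   # False
```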
For companies that cannot afford to lose up to a day's work, a more aggressive backup strategy can be employed. Software can be installed on critical servers that creates a system "snapshot" several times during the day. These snapshots contain a complete image of everything on the server, including the OS, programs and data. In the event of system failure, once the server is ready from a hardware perspective, a "bare metal" restore can be performed from the latest image file. This process can get the server fully functional with a minimal loss of data in just a few hours. If a complete server is kept in reserve, ready to deploy when needed, the restore process can take as little as an hour. For organizations that simply cannot be down at all, a complete failover server can be kept in a remote data center. Thus when the primary server fails, the backup server simply takes over.
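To see how much the snapshot approach shrinks the exposure, here is a rough calculation in Python. It assumes the snapshots are evenly spaced around the clock, which real snapshot software may not do, and the function name is my own.

```python
def worst_case_loss_hours(snapshots_per_day):
    """Longest gap between backups, assuming evenly spaced snapshots over 24 hours."""
    if snapshots_per_day < 1:
        raise ValueError("need at least one snapshot per day")
    return 24 / snapshots_per_day

print(worst_case_loss_hours(1))   # 24.0 (the nightly backup case)
print(worst_case_loss_hours(4))   # 6.0
print(worst_case_loss_hours(24))  # 1.0
```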
In addition to server backups, some companies may want to look at what is stored on the computers that access the servers. If your company does not have an "all critical data belongs on the server" policy, it is quite possible that valuable data is being stored on some of the local C: drives that make up the network. In that case, the backup strategy must include the client workstations as well.
A final point (and this is a very important one) is that periodic confidence tests must be performed to make sure that the data on the backup media can indeed be restored. A flawless, well thought out backup strategy is of no use if the data cannot be restored when needed. A confidence test is a kind of dry run restore to make sure the data will be recoverable.
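A confidence test can be as simple as restoring a folder to a scratch location and comparing it against the original. The Python sketch below is one way to do that comparison, not a feature of any particular backup product. The function names and folder arguments are hypothetical, and it assumes you have already restored the files with your own backup software.

```python
import hashlib
from pathlib import Path

def sha256_of(path):
    """Checksum a file in chunks so large files do not fill memory."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as f:
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()

def confidence_test(source_dir, restored_dir):
    """List files that are missing from, or differ in, the restored copy."""
    source_dir, restored_dir = Path(source_dir), Path(restored_dir)
    problems = []
    for src in sorted(source_dir.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source_dir)
        restored = restored_dir / rel
        if not restored.is_file():
            problems.append(f"missing: {rel.as_posix()}")
        elif sha256_of(src) != sha256_of(restored):
            problems.append(f"differs: {rel.as_posix()}")
    return problems
```

An empty list means every file came back intact. Keep in mind that files edited since the backup ran will show up as "differs", so it is best to test against a folder that has not changed since the backup.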
In this post I have tried to cover some of the most important considerations when it comes to a disaster recovery plan. Since a whole book could be written on the subject, I have just scratched the surface. If you would like some assistance planning your strategy, please contact me:
quandtster@gmail.com
Saturday, January 20, 2007