Implementing a bare metal backup system as part of your Disaster Recovery Planning can reduce your downtime by a day and a half when you suffer a server failure.
Restoring a failed server is faster if you have implemented a backup system that allows you to restore directly from a recovery CD. A simple calculation shows this could easily save one and a half days of downtime, during which office productivity could be down by 37.5%. That’s a significant reduction in your disaster recovery time.
Bare metal recovery
In this article I hope to convince you of the need to implement a backup policy that includes bare metal recovery for all computers (servers, desktops and laptops) as part of your disaster recovery plan.
In previous posts I have explained the difference between backups and archives and the importance of implementing comprehensive archiving policies, so this article concentrates on backup policies.
However, just to re-iterate: backups are there to protect your computers, and the data they hold, from loss or corruption that might result from a variety of incidents: viruses, hacking, accidental or deliberate destruction or corruption. Archives are there to preserve documents and other information in a known state for retrieval at a later date.
The key difference is that backups are short-term and for use in an emergency, whereas archives are long-term and designed for easy searching and retrieval.
Consider the following scenario:
Imagine that you manage a successful small business with 100 employees, 20 of whom use a computer on a daily basis. To meet their needs, you have an in-house server running some form of office productivity software (Lotus Notes, Microsoft Small Business Server, etc.), plus Line of Business (LoB) applications like accounting, CRM and possibly production control. Your computer users (probably) have Windows computers loaded with Microsoft Office or Open Office, plus the client side of any LoB applications. (Note: the same principles apply if you use Macs.)
The desktop computers are configured to store all data on the server and the server has one of the commonly used backup software suites running on it (e.g. Acronis). Your backup regime backs up the whole server once a week, with additional daily differential backups to tape or disk.
You might be thinking that you have your bases covered with this typical scenario, but have you? What would actually need to happen if something caused the server to fail, and how long would you be without its services?
Typical causes of server failure are:
- Hardware: power supply or hard disk failures are the most common, though they are easily protected against in a decent server-class machine.
- Software: virus attack, buggy patch update, operating system corruption, buggy driver or hacker attack are the most likely causes.
Whatever the cause, you’ve totally lost the server, so what do you do in this typical scenario?
- (Possibly) source and replace the failed hardware (1 to 4 days, depending on your maintenance agreement. You do have a maintenance agreement, don’t you?)
- Boot the server from the Recovery CD if there is one, or from the original installation CD if there isn’t (and you have it!).
- If there is a recovery CD, it should be a pretty easy process of following instructions to restore the server to the state of the last backup.
If there isn’t, you will probably need to:
- Re-install and configure the Operating System from the CD (2-3 hours)
- Update the operating system to the patch state required by the backup software (this could take at least a day, depending on your network bandwidth)
- Re-install and configure the Backup software (maybe half a day)
- Run the Backup software and restore the server as before.
It is the steps of installing and patching the operating system and re-installing the backup software that take the extra time if you can’t restore directly onto the bare metal.
The additional costs of not having bare metal recovery
Let’s assume that you didn’t have to replace hardware, but you did need to reinstall the operating system, its patches and the backup software before restoring the server. You could have added one to one and a half days to your downtime. During this period you have no email, no LoB applications, no files or documents. How much productivity will you lose?
Let’s say that 10 of those 20 computer users can still be 50% productive and the other 10 can be 75% productive. That’s an average productivity of 62.5%, or a total reduction in productivity of 37.5% across your office staff. If your server controlled your production environment then it could be much higher.
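The arithmetic behind that 37.5% figure can be sketched in a few lines of Python. This is only an illustration: the function name `productivity_loss` and the way the staff are grouped are my own assumptions, and you should substitute the headcounts and percentages that apply to your own office.

```python
def productivity_loss(groups):
    """Return the overall fraction of productivity lost.

    groups is a list of (headcount, fraction_still_productive) pairs.
    The function name and data layout are illustrative assumptions.
    """
    total_staff = sum(count for count, _ in groups)
    # Each person loses (1 - fraction_still_productive) of their output.
    lost = sum(count * (1.0 - productive) for count, productive in groups)
    return lost / total_staff


# The scenario from the article: 10 users at 50%, 10 users at 75%.
loss = productivity_loss([(10, 0.50), (10, 0.75)])
print(f"{loss:.1%}")  # 37.5%
```

Splitting the staff into more groups (for example, users who cannot work at all without the server) only means adding more pairs to the list.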
Only you can calculate the actual loss in productivity expressed as loss in revenue, but it doesn’t take much for this to be higher than the cost of implementing a proper bare metal backup environment.
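As a starting point for that calculation, the sketch below turns the scenario into lost person-days and then into a cost. The 20 users, the 37.5% loss and the one and a half days come from the scenario above; the function names and the cost per person-day are hypothetical placeholders, not real figures, so replace them with your own.

```python
def lost_person_days(users, loss_fraction, downtime_days):
    """Person-days of work lost while the server is down."""
    return users * loss_fraction * downtime_days


def downtime_cost(users, loss_fraction, downtime_days, cost_per_person_day):
    """Lost person-days multiplied by what a person-day is worth to you."""
    return lost_person_days(users, loss_fraction, downtime_days) * cost_per_person_day


# Scenario figures from the article: 20 users, 37.5% loss, 1.5 extra days.
days = lost_person_days(20, 0.375, 1.5)
print(days)  # 11.25

# PLACEHOLDER value only: 200 per person-day is an assumption for illustration.
print(downtime_cost(20, 0.375, 1.5, cost_per_person_day=200))  # 2250.0
```

Compare the result with the quoted cost of a bare metal backup product to see roughly how many incidents it would take to pay for itself.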
What about the workstations?
I’ve concentrated on the server environment here because it tends to be the single point of failure, but the principle also applies to desktop and laptop computers, particularly if it is not possible to swap out failed computers or transfer the user of the failed computer to another similarly configured machine.
Take a hard look at your current backup regime and play the game of consequences to assess whether it truly meets your needs. If it doesn’t, or you’re not sure: give me a call on 01480 476 297 and have a chat.
Agdon Associates and Business Continuity UK are no longer in business. This website is not being updated: it has been left online solely as a source of useful information on Business Continuity.