Almost every business-class device now includes some form of RAID array, designed to improve data access speeds and add a layer of protection against data loss should a hard disk drive fail. Depending on the size and type of array, you may find that the system could survive the failure of several physical disks, continuing to make data available until you can replace them.
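To make "depending on the size and type of array" concrete, the sketch below lists how many simultaneous disk failures the common RAID levels are guaranteed to survive. The function name is purely illustrative, and RAID 10 is assumed to be built from two-way mirrored pairs; treat it as a rough guide rather than a statement about any particular controller.

```python
def guaranteed_failures_tolerated(level: str, disks: int) -> int:
    """Worst-case number of simultaneous disk failures the array survives.

    Hypothetical helper for illustration; RAID 10 assumes two-way mirrors.
    """
    minimum_disks = {"0": 2, "1": 2, "5": 3, "6": 4, "10": 4}
    if level not in minimum_disks:
        raise ValueError(f"unsupported RAID level: {level}")
    if disks < minimum_disks[level]:
        raise ValueError(f"RAID {level} needs at least {minimum_disks[level]} disks")

    if level == "0":
        return 0          # striping only: any single failure loses the array
    if level == "1":
        return disks - 1  # every disk is a full copy
    if level == "5":
        return 1          # single parity
    if level == "6":
        return 2          # double parity
    # RAID 10: losing both disks of one mirrored pair is fatal, so only one
    # failure is guaranteed, although more may be survivable in practice.
    return 1
```

For example, an eight-disk RAID 6 array is guaranteed to keep running with two failed disks, whereas a RAID 0 array of any size tolerates none.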
First things first
The first task in any RAID failure is to assess the severity of the situation. It is a very good sign if the machine is still running, as this suggests that the number of failed disks is still within what the array can tolerate.
Log on to the affected system and consult the RAID controller status. This should clearly indicate the state of the array; the status will be reported as ‘degraded’ where a disk has failed. It should then be a case of simply replacing each failed drive with a spare and allowing the array to rebuild itself automatically.
A word of warning: Always ensure you have a full backup before pulling and replacing faulty drives. Another system failure in the middle of the rebuild process could result in complete data loss, leaving you in a really bad place.
Another piece of advice: For the quickest, most effective RAID rebuild, consider locking users out of the system completely until the process is complete. This may be unpopular, but nowhere near as unpopular as losing the data altogether.
If you cannot log into the system, the rebuild process fails more than once or the array management system reports a complete failure, things are about to get a lot more complicated.
A complete RAID array failure
If the whole array has failed, the first thing you need to do is stop. Do not try to rebuild the array, repair drives or power the system on until you have taken advice from a RAID recovery specialist. The potential for causing lasting damage to the array at this point is enormous.
The good news is that although everything may sound hopeless, a specialist can probably still get your data back, even from a RAID 0 array. How you proceed beyond the initial call to an expert depends on their processes. For instance, RAID recovery experts may be able to retrieve the data in your datacentre or back at their local office, according to your needs and location. Some providers operate strictly on-site or strictly off-site services.
Whichever way you choose to proceed, the process will be roughly the same. Using proprietary disk recovery tools and their extensive experience, engineers will clone the faulty disks, creating an exact sector-by-sector copy of each. They then attempt to rebuild corrupt blocks and indexes on those cloned drives to restore them to their original state.
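To illustrate what a sector-by-sector copy involves, here is a minimal sketch that copies a source device or image file block by block, fills any unreadable block with zeros and records where the gaps are. It is not the proprietary tooling specialists use and is no substitute for their advice; the function name, the 4096-byte block size and the zero-fill policy are all assumptions made for the example. It relies on `os.pread`, which is available on Unix-like systems.

```python
import os


def clone_image(src_path, dst_path, block_size=4096):
    """Copy src to dst block by block; return offsets of blocks that could not be read.

    Illustrative only: unreadable or short blocks are zero-filled in the copy.
    """
    bad_offsets = []
    src = os.open(src_path, os.O_RDONLY)
    try:
        size = os.lseek(src, 0, os.SEEK_END)  # total bytes in the source
        with open(dst_path, "wb") as dst:
            for offset in range(0, size, block_size):
                length = min(block_size, size - offset)
                try:
                    block = os.pread(src, length, offset)
                except OSError:
                    block = b""  # read error: treat the whole block as lost
                if len(block) < length:
                    bad_offsets.append(offset)
                    block = block.ljust(length, b"\0")
                dst.write(block)
    finally:
        os.close(src)
    return bad_offsets
```

Working from a copy like this means any later repair attempts touch the clone rather than the failing original, which is why cloning comes first in the process described above.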
Finally, the rebuilt drives will be ready for re-insertion into the array in your datacentre, where you can perform any additional recovery or reconfiguration. Your system should then be good to go back into production.
But again, do not attempt RAID array rebuilds without first seeking professional advice. And keep your fingers crossed that you never have to put these instructions into practice!
[Image credit: Neal Edgeworth, Flickr]