|
A wide variety of failures can cause
physical damage to storage media. CD-ROMs can have their metallic substrate or dye layer scratched off; hard disks can suffer any of several mechanical failures, such as head crashes and failed motors; tapes can simply break.
Physical damage always causes at least some data loss, and in many cases the logical structures of the file system are damaged as well.
This causes logical damage that must be dealt with before any files can be rescued from the failed media.
Most physical damage cannot be repaired by end users. For example, opening a hard disk in a normal environment can allow dust to settle on the surface, causing further damage to the platters and complicating the recovery process.
Furthermore, end users generally do not have the hardware or technical expertise required to make these repairs; therefore, data recovery companies are consulted. These firms use Class 100
clean room facilities to protect the media while repairs are being made.
The extracted raw image can be used to reconstruct useable data after any logical damage has been repaired. Once that is complete, the files may be in useable form. Data recovery success rates are below 100%.
Data recovery can even be accomplished for alternate digital media such as Flash cards for digital cameras, old tape drives, etc.
Logical damage
Far more common than physical damage is logical damage to a file system. Logical damage is primarily caused by power outages that prevent file system structures from being completely written to the storage medium, but problems with hardware (especially RAID controllers) and drivers, as well as system crashes, can have the same effect. The result is that the file system is left in an inconsistent state.
This can cause a variety of problems, such as strange behavior (e.g., infinitely
recurring directories, drives reporting negative amounts of free space), system crashes, or an actual loss of data. Various programs exist to correct these inconsistencies, and most operating systems come with at least a rudimentary repair tool for their native file systems.
Linux, for instance, comes with the fsck utility, Mac OS X has Disk Utility and Microsoft Windows provides chkdsk. Third-party utilities are also available, and some can produce superior results by recovering data even when the disk cannot be recognized by the operating system's repair utility.
Two main techniques are used by these repair programs. The first, consistency checking, involves scanning the logical structure of the disk and checking to make sure that it is consistent with its specification. For instance, in most file systems, a directory must have at least two entries: a dot (.) entry that points to itself, and a dot-dot (..) entry that points to its parent.
A file system repair program can read each directory and make sure that these entries exist and point to the correct directories. If they do not, an error message can be printed and the problem corrected. Both chkdsk and fsck work in this fashion. This strategy suffers from a major problem, however; if the file system is sufficiently damaged, the consistency check can fail completely.
In this case, the repair program may crash trying to deal with the mangled input, or it may not recognize the drive as having a valid file system at all.
The second technique for file system repair is to assume very little about the state of the file system to be analyzed, and using any hints that any undamaged file system structures might provide, rebuild the file system from scratch.
This strategy involves scanning the entire drive and making note of all file system structures and possible file boundaries, then trying to match what was located to the specifications of a working file system. Some third-party programs use this technique, which is notably slower than consistency checking.
It can, however, recover data even when the logical structures are almost completely destroyed. This technique generally does not repair the underlying file system, but merely allows for data to be extracted from it to another storage device.
While most logical damage can be either repaired or worked around using these two techniques, data recovery software can never guarantee that no data loss will occur. For instance, in the FAT file system, when two files claim to share the same allocation unit ("cross-linked"), data loss for one of the files is essentially guaranteed.
The increased use of journaling file systems, such as NTFS 5.0, ext3, and xfs, is likely to reduce the incidence of logical damage. These file systems can always be "rolled back" to a consistent state, which means that the only data likely to be lost is what was in the drive's cache at the time of the system failure.
However, regular system maintenance should still include the use of a consistency checker. This can protect both against bugs in the file system software and latent incompatibilities in the design of the storage hardware. One such incompatibility is the result of the disk controller reporting that file system structures have been saved to the disk when it has not actually occurred.
This can often occur if the drive stores data in its write cache, then claims it has been written to the disk. If power is lost, and this data contains file system structures, the file system may be left in an inconsistent state such that the journal itself is damaged or incomplete. One solution to this problem is to use hardware that does not report data as written until it actually is written.
Another is using disk controllers equipped with a battery backup so that the waiting data can be written when power is restored.
Finally, the entire system can be equipped with a battery backup (see UPS) that may make it possible to keep the system on in such situations, or at least to give enough time to shut down properly.

|