The first step in any forensic data recovery operation or computer forensic investigation is to create an exact duplicate of the media to be examined. As a rule, analysis should never be performed on the original media, because the investigative process can make irrecoverable changes to the source data. Since the original cannot be used, it becomes imperative to make an exact copy that investigators can examine. This is commonly referred to as making a bitstream image. It is also possible to simply copy each bit of data from one hard disk to another, but problems arise when the hard disks are not exactly the same size, as it can be hard to tell where the copied data ends and the data that was already on the disk begins. This problem is averted by copying the data into a file called an image. Files are much easier to handle, and can be split and recombined if necessary for transport or storage. Some image formats use compression to decrease the size of the image, while others do not.
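As a rough illustration of the idea, the following Python sketch copies a source byte-for-byte into an image file, splits the image into segments, and recombines them. It is a minimal sketch rather than a forensic tool: the function names, the 1 MiB chunk size, and the numbered-suffix naming of the segments are assumptions made for this example, and the sketch performs no write blocking or error handling of its own.

```python
CHUNK_SIZE = 1024 * 1024  # read 1 MiB at a time (arbitrary choice for this sketch)


def create_image(source_path, image_path):
    """Copy every byte of source_path into image_path; return the bytes copied."""
    copied = 0
    with open(source_path, "rb") as src, open(image_path, "wb") as dst:
        while True:
            chunk = src.read(CHUNK_SIZE)
            if not chunk:
                break
            dst.write(chunk)
            copied += len(chunk)
    return copied


def split_image(image_path, segment_size):
    """Split an image into numbered segment files; return their paths in order.

    Each segment is held in memory while it is written, so segment_size
    should be kept modest in this simplified version.
    """
    segments = []
    with open(image_path, "rb") as img:
        index = 0
        while True:
            data = img.read(segment_size)
            if not data:
                break
            segment_path = f"{image_path}.{index:03d}"  # hypothetical naming scheme
            with open(segment_path, "wb") as seg:
                seg.write(data)
            segments.append(segment_path)
            index += 1
    return segments


def join_segments(segment_paths, output_path):
    """Concatenate segments, in the order given, back into a single image."""
    with open(output_path, "wb") as out:
        for path in segment_paths:
            with open(path, "rb") as seg:
                while True:
                    chunk = seg.read(CHUNK_SIZE)
                    if not chunk:
                        break
                    out.write(chunk)
```

In practice the source would be a device rather than an ordinary file, and the recombined image would be hashed to confirm that it matches the image before it was split.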
A myriad of computer programs are available to create these images, but by far the most ubiquitous is a program known as dd. dd began as a Unix utility used to perform low-level reads and writes between various devices. It has spawned many offspring, including ports to Windows and improved versions for Linux such as dcfldd, which was developed specifically with forensic purposes in mind. dcfldd’s primary improvement over dd is the ability to hash the data as it is copied, so that the image can be confirmed to be an exact copy of the source.
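The benefit of hashing during the copy can be sketched in a few lines of Python. This is not how dd or dcfldd are implemented; it is only an illustration of the general approach, in which each chunk read from the source is fed to the hash function and written to the image in the same pass. The function name and its parameters are assumptions made for this example.

```python
import hashlib


def image_with_hash(source_path, image_path, algorithm="md5", chunk_size=1024 * 1024):
    """Copy source_path to image_path and return the hex digest of the data read.

    The source is read only once: every chunk is passed to the hash
    function and written to the image in the same loop iteration.
    """
    hasher = hashlib.new(algorithm)
    with open(source_path, "rb") as src, open(image_path, "wb") as dst:
        for chunk in iter(lambda: src.read(chunk_size), b""):
            hasher.update(chunk)
            dst.write(chunk)
    return hasher.hexdigest()
```

The digest returned describes the data as it was read from the source, so hashing the finished image file afterwards and comparing the two values would confirm that the image was written correctly.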
Cryptographic hash functions are mainstays in the computer forensic field. These functions take a block of data of any size, apply a one-way algorithm to it, and return a fixed-length string called a hash value. They are constructed so that if even a single bit differs between two files, the files will have drastically different hash values. The two most popular hashing algorithms are MD5 and SHA-1. Both have been around for over ten years, and some weaknesses have been discovered in them, but they are still widely used by the forensic community. After a bitstream image has been made of the original data, one or both of these hash functions are commonly run on both sets of data and the hash values compared. If they match, the image can be treated as an exact bit-for-bit copy of the original. The hash value is often referred to as a digital fingerprint, as the chance of accidentally running across two different sets of data with the same hash value is so small as to be negligible.
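The comparison described above can be sketched with Python's standard hashlib module. The sketch computes both MD5 and SHA-1 over a file in a single pass and treats two files as matching only when both digests agree; the function names are chosen for this example and are not taken from any particular forensic tool.

```python
import hashlib


def file_digests(path, chunk_size=1024 * 1024):
    """Return the MD5 and SHA-1 hex digests of the file at path."""
    md5 = hashlib.md5()
    sha1 = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
            sha1.update(chunk)
    return {"md5": md5.hexdigest(), "sha1": sha1.hexdigest()}


def images_match(original_path, image_path):
    """Return True only if both the MD5 and SHA-1 digests of the two files agree."""
    return file_digests(original_path) == file_digests(image_path)
```

Whatever the size of the input, the MD5 digest is always 32 hexadecimal characters long and the SHA-1 digest 40, and changing a single bit of the input produces a different value for each.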
One of the cardinal rules of forensic analysis is never to alter the original evidence. Because most common operating systems, Windows in particular, constantly make changes to the active hard drive without the knowledge of the average user, it becomes important to take additional steps to ensure that the original data is not inadvertently changed. Most forensic analysts would hold that the safest way to image a hard drive is to physically remove it from the suspect machine and attach it to an examination machine via a hardware write blocker. A hardware write blocker is like a one-way valve; it only allows information to flow from the hard drive to the computer, not the other way around. The computer can attempt to write to the drive, and on some operating systems may even appear to succeed, but a hardware write blocker does not allow any data to actually be modified. Some flash drives, memory cards and floppy disks have limited built-in hardware write blocking in the form of a switch or tab on the side of the media. While this works in most cases, it is good practice to use write-blocking devices that have been vetted by the forensic community. In any case, it is preferable to have some form of backup write blocking to ensure that no data is inadvertently altered.
While hardware write blockers are preferable in the vast majority of cases, they tend to be rather expensive, and their price puts them out of reach for most students and casual forensic investigators. The next most viable option is software write blocking, which can be achieved in many ways. For example, in Linux it is possible to mount a device as read-only; this blocks nearly all write attempts, but because nothing physically prevents writes, they are still possible. Some Linux distributions made with forensics in mind mount all devices as read-only by default; Helix takes this approach. If the investigator would like to perform the examination from a machine running Windows, it is possible to use the registry to write-protect all USB storage devices. When the hard drive is attached via a USB cable or USB card reader, this approach can be an inexpensive and effective alternative.
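Before examining a device on Linux, it can be worth confirming that it really was mounted read-only. The following Python sketch parses text in the format of /proc/mounts (device, mount point, file system type, options) and reports whether every mount entry for a given device carries the "ro" option. The function names and the device path used in the usage note are hypothetical, and the check only reports the mount flag; it does not prevent writes made directly to the raw device.

```python
def device_mount_options(mounts_text, device):
    """Return a list of option sets, one per mount entry found for device."""
    entries = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Expected fields: device, mount point, fs type, options, dump, pass
        if len(fields) >= 4 and fields[0] == device:
            entries.append(set(fields[3].split(",")))
    return entries


def is_mounted_read_only(mounts_text, device):
    """Return True if device is mounted and every one of its mounts is read-only."""
    entries = device_mount_options(mounts_text, device)
    return bool(entries) and all("ro" in options for options in entries)
```

On a Linux examination machine the text could be obtained by reading /proc/mounts, for example with open("/proc/mounts").read(), and the device argument would be the evidence partition, such as the hypothetical /dev/sdb1.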
All the methods outlined above fall under the category of imaging colloquially known as ‘dead imaging.’ In dead imaging, the hard drive to be imaged has been powered off by some means and is imaged by the investigator outside of its original computer. This method has the advantage that the image is made with a set of known-good programs that have not been tampered with. If the imaging were done on the original computer, an advanced user might have altered basic processes to hide data, or a rootkit might interfere with the imaging process. Dead imaging, when done properly, is very unlikely to alter any of the original data.
The alternative to dead imaging is ‘live imaging.’ In live imaging, the investigator is able to get to the computer to be imaged while the hard drive is still in the computer and powered on. This method, in which it is nearly impossible to avoid altering some of the original data, is sometimes the only one available. For example, the disk may be encrypted, and if it is powered off all the data on it will be shielded from the investigator by the encryption. Also, if the system is powered off, all the data in RAM will be lost to the investigator, because RAM is a form of volatile storage. Whether data should be acquired by live or dead imaging should be evaluated on a case-by-case basis, as each incident and the circumstances around it are unique. All the examples of imaging in this paper are of the dead imaging variety.
Another important consideration is where the image will be stored. It is considered good practice in the forensic community to image to a wiped storage device to avoid any possibility of data contamination. Saving an image file on a disk with preexisting data is perfectly fine if the hash value of the image matches that of the original, but it is generally seen as better practice to eliminate all other data from the drive before imaging to it, to quash any question of the image being tainted. Hence, many investigators use a completely separate hard drive for each image, and never image to the hard drive containing the operating system of their examination machine.
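One simple way to confirm that a destination has been wiped is to read it back and check that nothing but zero bytes remains. The Python sketch below does this under the assumption that wiping means overwriting with zeros; some wiping tools write other patterns, in which case the check would need to be adapted. The function name is chosen for this example.

```python
def is_wiped(path, chunk_size=1024 * 1024):
    """Return True if every byte readable from path is zero.

    An empty file also returns True, since it contains no non-zero bytes.
    """
    with open(path, "rb") as target:
        while True:
            chunk = target.read(chunk_size)
            if not chunk:
                return True  # reached the end without finding a non-zero byte
            if chunk.count(0) != len(chunk):
                return False  # at least one byte in this chunk is non-zero
```

Running the check against the whole destination device before imaging, and recording the result, gives the investigator something concrete to point to if the cleanliness of the destination is later questioned.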