Deleted File Recovery using foremost

For this example, a Linux program called foremost will be used to recover files, both existing and deleted, from a .dd image.  foremost is what is known as a data-carving utility.  It operates by scanning the raw data and extracting runs of data that match a defined pattern.

foremost references its configuration file for a set of file headers, footers and other information that it uses to determine whether a set of data actually is a file or not.  Since foremost just looks at the file data, and not at any table entries, inodes, or anything similar, it can be used on virtually any image, whether it be of a hard drive, swap file, hibernation file, or RAM dump.  It also makes no distinction between existing and deleted files, as long as the deleted files’ headers have not been overwritten.  By default foremost can search for the following file types: jpg, gif, png, bmp, avi, exe, mpg, wav, riff, wmv, mov, pdf, ole, doc, zip, rar, htm, and cpp.  Other file types can be searched for by adding a custom definition to the configuration file.

foremost finds files by examining the image file for file headers.  Most types of files begin and end with a set string of bytes that indicates what type of file it is, and optionally also contain metadata about the file.  For example, .jpg files can contain information in them regarding what type of camera created the image.  foremost only looks at the header and footer; again, in the case of a .jpg, the header and footer are 0xffd8 and 0xffd9, respectively.  When it finds a header that appears in its list, it searches for a set number of bytes for the footer; if it does not find the footer after this number of bytes, it stops searching and moves on to the next header.  If it finds a matching footer, it carves out all the data between the header and footer and saves it as a file of the corresponding type.  As a result, foremost can only find contiguous files; therefore, if the image is heavily fragmented, it will not find many files.
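
The header-and-footer search described above can be sketched in a few lines of Python.  This is only an illustration of the idea, not foremost's actual implementation: the function name carve and the max_len parameter are hypothetical, and the two-byte JPEG markers are the simplified values quoted above.

```python
# Simplified JPEG markers, as quoted in the text (assumption: two bytes each).
JPEG_HEADER = b"\xff\xd8"
JPEG_FOOTER = b"\xff\xd9"

def carve(data, header, footer, max_len):
    """Return each header...footer run whose footer lies within max_len bytes of its header."""
    carved = []
    pos = 0
    while True:
        start = data.find(header, pos)
        if start == -1:
            break  # no more headers in the image
        # Only search for the footer inside the size limit for this file type.
        window_end = min(len(data), start + max_len)
        end = data.find(footer, start + len(header), window_end)
        if end == -1:
            pos = start + 1  # no footer in range: give up on this header and move on
            continue
        end += len(footer)
        carved.append(data[start:end])  # everything from header through footer
        pos = end
    return carved

# A tiny made-up "image": one complete file, then a header with no footer.
image = (b"\x00" * 16 + JPEG_HEADER + b"picture-bytes" + JPEG_FOOTER
         + b"\x00" * 8 + JPEG_HEADER + b"x" * 50)
files = carve(image, JPEG_HEADER, JPEG_FOOTER, max_len=32)
print(len(files))
```

Because the sketch copies everything between a header and the next footer, it shares the limitation noted above: a fragmented file would be carved with foreign data in the middle, or missed entirely.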

The command that will invoke foremost is given below:

foremost -v -i /media/disk/test_image.dd -o /media/disk/foremost

The -v flag puts foremost in verbose mode, which means it displays information about its progress to the screen.  All that information can be found in an audit file that foremost creates, so -v can be omitted if the user desires.  The next two flags, -i and -o, specify the input image file and the output directory, respectively.  As entered above, this command will have foremost extract all file types that it recognizes; if the user wants to limit the search to a single file type, for example .jpg, adding the -t jpg flag would extract only files of type .jpg.  If the user has created a custom configuration file with additional file signature definitions, foremost can be directed to use that file with the -c flag followed by the path to the configuration file.  As with nearly all Linux programs, more information about foremost’s syntax and usage can be found by typing man foremost.

When foremost runs with the -v flag, it displays each file that it recovers, showing the name it has given it (since this type of analysis has no way of determining the original filename), the size, the file offset, and a comments section, which can display information such as image dimensions.

Figure 13: foremost running

Once foremost has finished running, it will display a summary of what it has found.  In this example, foremost was able to carve out 273 files of eight different types.

Figure 14: foremost summary


If the user then navigates to the directory where foremost saved its output, it will contain the audit file, which holds all the information about foremost’s progress, and a set of folders, one for each type of extracted file.  If the user navigates to the directory where the recovered jpgs reside using a graphical file browser, thumbnails of all the recovered images can be seen.  In many situations, this is the best way to view the images, as the generated names of the images will be meaningless.  However, if the user is looking for a specific image and knows its file size, it is possible to search through the files that way instead of visually inspecting them.

Figure 15: jpg directory of foremost output in a graphical file browser
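
Searching by file size, as suggested above, could be scripted along these lines.  This is a minimal sketch: the helper name files_of_size is hypothetical, and it assumes the exact size of the wanted file in bytes is known.

```python
from pathlib import Path

def files_of_size(directory, size_bytes):
    """Return the files directly inside `directory` that are exactly `size_bytes` long."""
    return sorted(
        p for p in Path(directory).iterdir()
        if p.is_file() and p.stat().st_size == size_bytes
    )

# Example call (the path comes from this walkthrough, the size is made up):
#   files_of_size("/media/disk/foremost/jpg", 48213)
```

Keep in mind that a carved file may not be exactly the same length as the original, so a small tolerance around the expected size may be needed in practice.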

If the user is searching for a specific text document, he should look in the ole directory.  OLE was developed by Microsoft and stands for Object Linking and Embedding.  OLE allows embedding of documents within one another; an example of this is when a user places a picture inside a Microsoft Word document.  foremost will detect many Word documents as type ole instead of doc or docx.  Therefore, if a user is interested in text files, he should examine both the ole and doc directories, if they exist.

If the investigator is searching for a single document, one fast way of searching is to use the command grep.  In this example, suppose the investigator is looking for a “Risk Management Policy”.  To quickly search all the files instead of examining them all individually, he could use the command:

grep "Risk Management Policy" /media/disk/foremost/ole/*

This instructs grep to look inside all the files in the /media/disk/foremost/ole/ directory for the string “Risk Management Policy”.  In this case, the search was successful, and one file was found that contained that string.  By default, when multiple files are searched grep displays the name of the file in which the match was found.

Figure 16: grep search
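
The same kind of search can be sketched in Python.  The helper name files_containing is hypothetical.  The sketch also looks for the phrase encoded as two-byte (UTF-16) characters, on the assumption that some Word documents store their text that way, in which case a search for the plain single-byte string could miss it.

```python
from pathlib import Path

def files_containing(directory, phrase):
    """Return the names of files in `directory` whose raw bytes contain `phrase`."""
    # Search for the phrase both as single-byte text and as UTF-16 little-endian.
    needles = [phrase.encode("utf-8"), phrase.encode("utf-16-le")]
    hits = []
    for path in sorted(Path(directory).iterdir()):
        if path.is_file():
            data = path.read_bytes()
            if any(needle in data for needle in needles):
                hits.append(path.name)
    return hits

# Example call, using the directory from this walkthrough:
#   files_containing("/media/disk/foremost/ole", "Risk Management Policy")
```

Reading each file fully into memory is fine for recovered documents; very large files would call for reading in chunks instead.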

Imaging Using dcfldd

In this example, a 128MB USB thumb drive will be imaged on a Linux system using dcfldd onto a 1GB USB thumb drive.  dcfldd is an improved version of dd; most of the syntax is identical, with just a few functions added.  It is important to determine the names that Linux uses to refer to both of the USB drives that will be used in the imaging process.  This can be done by entering sudo fdisk -l in a terminal window.  This will list all the disks that Linux sees, as well as where in the /dev directory each is located.  In this example, the USB drive that will be imaged is located at /dev/sdb, and the drive that the image will be saved on is /dev/sdc.

Figure 1: Displaying disk names

It is important to write protect the drive to be imaged as soon as possible after it has been attached to the computer.  While a properly configured forensic Linux machine will not write to the evidence disk, it is good to take precautions to block write attempts, both from the system and the user.  Now that the drive’s location is known, the next step is to change the permissions.

The command ls -lha /dev | grep sd will list all the entries in the /dev folder that contain the letters sd.  Since all the disks being used contain sd in the name, this will filter out all the devices that are not of interest.  This command allows the user to view the permissions of the drives; as it is now, sdb is writable.  To change this, use the chmod command.  Entering sudo chmod 440 /dev/sdb sets the permissions for the disk sdb so that the owner and the group can only read, not write.  Enter ls -lha /dev | grep sd again to view the new permissions and verify that this is the case.  Keep in mind that permission bits do not restrict processes running as root, so this mainly guards against accidental writes by ordinary users.

Figure 2: Displaying permissions
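
To see how an octal mode such as 440 maps onto the permission string printed by ls, Python's stat module can render it.  This is a small sketch: the helper name describe is hypothetical, and 660 is only an assumed example of a writable starting mode, not necessarily what the drive in this example had.

```python
import stat

def describe(mode):
    """Render an octal permission mode the way ls shows it for a block device."""
    return stat.filemode(stat.S_IFBLK | mode)

print(describe(0o660))  # brw-rw----  (owner and group may read and write)
print(describe(0o440))  # br--r-----  (owner and group may only read)
```

The leading b marks a block device; the next three groups of three characters are the owner, group, and other permissions.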

The next step is to use the dcfldd utility to create a copy of the drive.  In this case, an image will be created of the first partition on the sdb device, so the source will be /dev/sdb1.  By invoking the mount command, it can be seen that the destination drive has been mounted as /media/disk.  The command to create the image is as follows (enter as one line):

dcfldd if=/dev/sdb1 of=/media/disk/test_image.dd hash=md5,sha1 hashlog=/media/disk/hashlog.txt

Each of the options in this dcfldd command is discussed next.  The if parameter identifies the source of the data to be imaged, in this case, /dev/sdb1.  The of option directs dcfldd where to write the output of the data acquisition.  One nice feature of dcfldd is that multiple of= paths can be specified, allowing for multiple copies of the image to be created simultaneously.  This is useful if the examiner wants to create a local copy of the image, and a remote backup or archival copy on a network file server or local tape drive.  Special caution should be used when specifying if and of.  If the write blocking fails or is not used at all, switching these two parameters will result in the blank destination drive being copied on top of and overwriting the evidence drive.  Because of the dire consequences of such a mix-up, the original dd was jokingly thought to stand for ‘data destroyer.’  The next parameters are what make dcfldd so much better for forensic purposes than dd.  The hash attribute allows the user to specify what kind of cryptographic hash algorithms will be applied to the data.  The default is MD5, but in this example both MD5 and SHA-1 will be used.  The final attribute, hashlog, specifies where the output of the hashing should be directed; in this case, it will be to a text file in the same directory as the disk image.
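
The idea of copying block by block while hashing the data as it passes through, and writing to more than one destination at once, can be sketched in Python as follows.  This is a conceptual illustration only, not how dcfldd is implemented; the function name image_with_hashes and its parameters are hypothetical.

```python
import hashlib

def image_with_hashes(source, destinations, algorithms=("md5", "sha1"), block_size=512):
    """Copy `source` to every path in `destinations`, hashing the data on the way through."""
    hashers = {name: hashlib.new(name) for name in algorithms}
    outputs = [open(dest, "wb") for dest in destinations]
    try:
        with open(source, "rb") as src:  # the source is only ever opened for reading
            while True:
                block = src.read(block_size)
                if not block:
                    break  # end of the source data
                for hasher in hashers.values():
                    hasher.update(block)
                for out in outputs:
                    out.write(block)
    finally:
        for out in outputs:
            out.close()
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}

# Example call (paths taken from this walkthrough; requires read access to the device):
#   image_with_hashes("/dev/sdb1", ["/media/disk/test_image.dd"])
```

Hashing during the copy means the source only has to be read once to obtain both the image and its hash values.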

While the image is being created, dcfldd will display a line that shows how many blocks have been written, and how many megabytes that corresponds to.  Once the image process has completed, a message will appear indicating how many complete blocks were copied.  The block size can be specified as a flag in the dcfldd command by adding bs=[block_size]; the default is 512 bytes.  If the number of blocks is followed by a +0, then exactly that many complete blocks of data were written.  If the number is followed by a +1, that means that that many complete blocks of data were written, plus one partial block of data.
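
The block-count notation is simple arithmetic, sketched below.  The function name record_count is hypothetical, and the 512-byte default simply follows the figure given above.

```python
def record_count(total_bytes, block_size=512):
    """Return the 'full+partial' block notation for a copy of `total_bytes` bytes."""
    full, remainder = divmod(total_bytes, block_size)
    partial = 1 if remainder else 0
    return f"{full}+{partial}"

print(record_count(1024))  # 2+0: exactly two complete blocks
print(record_count(1300))  # 2+1: two complete blocks plus one partial block
```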

Once the image has been created, it is very important to verify that it is indeed an exact, bit-for-bit copy of the original data.  There are a few ways that this can be done.  One method is to use dcfldd again.  If the following command is run, it will hash both the source (specified with if) and the file given by vf and report if their hash values match.  If they are the same, it will report Match; if not, it will report Mismatch.

dcfldd if=/dev/sdb1 vf=/media/disk/test_image.dd verifylog=/media/disk/verifylog.txt

Another method to verify that the two are identical is to directly hash both files and compare.  The programs md5sum and sha1sum (named md5 and sha1 on some systems) perform their respective hash function on the file specified.  Referring back to the file that was imaged earlier in this example, if the user were to enter sudo md5sum /media/disk/test_image.dd /dev/sdb1, and compare the two returned hash values, they should be the same.  Also, because the hash flag was set when dcfldd was run, the hashlog file has the calculated hash values already, so those may be referenced as well.  If the hashes match, the image creation process was successful.  Otherwise, the whole process can be repeated; sometimes errors in copying the data will cause verification to fail.  Note that, if even one bit of data has been altered, the two sets of data will have drastically different hash values.

Figure 3: Hashing and comparing values
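
The hash-and-compare check can also be scripted.  The sketch below uses Python's hashlib; the helper names hash_file and images_match are hypothetical, and reading a raw device such as /dev/sdb1 this way would require sufficient privileges.

```python
import hashlib

def hash_file(path, algorithm="md5", chunk_size=1024 * 1024):
    """Hash a file in chunks so that large images need not fit in memory."""
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()

def images_match(path_a, path_b, algorithms=("md5", "sha1")):
    """Return True only if both files have identical digests under every algorithm."""
    return all(hash_file(path_a, a) == hash_file(path_b, a) for a in algorithms)

# Example call, using the paths from this walkthrough:
#   images_match("/dev/sdb1", "/media/disk/test_image.dd")
```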

Imaging Using FTK Imager

AccessData produces a commercial forensic examination program called the Forensic Toolkit, or FTK.  While the FTK examination program costs thousands of dollars, AccessData also offers a no-cost companion program called FTK Imager.  FTK Imager is more flexible than dd in that it allows the user to create images of physical disks, logical drives, CD/DVD drives, and even folders.  It also can save the images in multiple formats, including the proprietary formats .e01 and SMART, and the old standby .dd.

Before starting the imaging process, first be sure that some sort of write protection is in place; see Write-Blocking Using the Windows Registry if you don’t have a hardware write blocker handy.  For this example I imaged a 1 GB USB flash drive.

Launch FTK Imager (if running Vista right-click and “Run as administrator” or FTK will not be able to see the physical disks) and select File > Create Disk Image.  A dialog box will appear like Figure 1 at the bottom of the post; for this example ‘Physical Drive’ should be selected.

Next, the drive to be imaged should be selected from the drop down box.  In this example, the examination workstation has three drives attached.  PHYSICALDRIVE0 is a RAID array that has been detected as a 499GB SCSI device, PHYSICALDRIVE1 is the 1GB USB flash drive, and PHYSICALDRIVE6 is a 500GB USB external hard drive.  For this scenario, PHYSICALDRIVE1 should be selected.

The next screen verifies that the image source chosen was PHYSICALDRIVE1, and then prompts the user to select where the image file should be saved.  Just like dcfldd, FTK Imager has the option to save the image to multiple places concurrently; this is useful if the investigator wants to save both a local copy of the image and a copy over the network to a file server (when saving several gigabytes of data across a LAN/WAN, it is important to be aware of the available bandwidth).  At the bottom of the screen several check boxes are present.  The ‘Verify images after they are created’ option is checked by default, and should be left checked in the vast majority of cases.  The ‘Create directory listings…’ option when checked will generate a .csv file with a list of all the files, including those that have been deleted, present in the image.  To add an image destination, click the ‘Add…’ button.

Select ‘Raw (dd)’ as the image type and click ‘Next >’.  At this screen some optional fields allow the investigator to enter information about the investigation, including case and evidence numbers, description, examiner name, and notes.  These can be filled in if desired, then click ‘Next >’.

This screen prompts the user to select both the image destination folder and filename.  At the bottom are two options.  The first is ‘Image Fragment Size (MB)’.  This field specifies the size in megabytes of each chunk that FTK Imager should split the image file into; this can be helpful if the image is very large or will be transported or archived on CDs or DVDs.  If a value is entered in this field larger than the size of the data to be imaged, only one file will be created and it will be the size of the data.  For this example, if the default value of 1500 MB is left, FTK Imager will create one 1GB file since the USB drive is only 1GB.  The second option deals with compression; dd images cannot be compressed, but some proprietary formats, like .e01, can.  Once the image destination folder and filename have been entered, the ‘Finish’ button is available and sends the user back to the previous screen when pressed.  At this point more image destinations can be added, or the ‘Start’ button can be pressed, which will begin the imaging process.
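
The effect of the fragment size setting is plain arithmetic, sketched below.  This only illustrates the splitting rule described above and says nothing about how FTK Imager names or writes its files; the function name fragment_sizes is hypothetical, and 1GB is treated as 1000 MB for simplicity.

```python
def fragment_sizes(image_mb, fragment_mb=1500):
    """Return the size in MB of each piece an image of `image_mb` would be split into."""
    full, rest = divmod(image_mb, fragment_mb)
    return [fragment_mb] * full + ([rest] if rest else [])

print(fragment_sizes(1000))       # [1000]: smaller than one fragment, so a single file
print(fragment_sizes(4000, 650))  # six 650 MB pieces and one 100 MB piece
```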

Once ‘Start’ is pressed, a box will appear with the elapsed time and the estimated time left.  Once the imaging finishes, FTK will begin verifying the image by hashing both the source device and the generated image with both the MD5 and SHA-1 algorithms.

Once the image has been created and verified, a window with the results of the image and the verification will appear; it lists things like the hash values of the source and destination and whether they match, the name of the generated image file, the number of sectors imaged, and if any bad sectors were found.  Another window will also appear showing the progress of the creation of the directory listing, if that option was checked.  Once these two boxes have been closed, the box that showed the progress of the image creation process will be visible again, this time with an ‘Image Summary…’ button.  This button will open a text file that has been created in the same directory as the image that lists all sorts of important information about the imaging process, including the optional case and investigator information that could have been entered in the imaging process, information about the physical geometry of the imaged disk, model and serial numbers if available, when the data acquisition and verification started and completed, and the hash values.  All this information is very valuable to have, especially if there is the possibility that the results of the forensic investigation could end up in a courtroom.  Figure 9 shows the Image Summary information for the created test image.

Now the investigator has a dd image of the USB drive suitable for examination by a wide range of forensic software and a log file of important information.  Using the dd image format has the benefit of being supported by virtually every forensic program, but it does not offer fancy settings like compression, which can be useful.  From here a variety of tools can be used to analyze this image, both proprietary and open-source.  Some of these tools and analysis methods will be examined in later posts.

Write Blocking Using the Windows Registry

It is possible to use the Windows registry to write protect USB mass storage devices.  An investigator can combine this USB write-blocking trick with a USB-IDE or USB-SATA adapter to protect the vast majority of evidence drives that he or she might encounter.  The write-blocking functionality was added with Windows XP SP2, and has worked with all subsequent Windows versions, including Windows Vista (but I have not tested this with Windows 7).  Below is a step by step guide to create a write-protect switch for USB devices on Windows.

  1. Select Start > Run or press Windows key + R
  2. Type regedit in the box that pops up.  This opens up the Windows Registry Editor.
  3. In the tree in the left pane of the editor, navigate to HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet and highlight the ‘Control’ key by clicking on it.
  4. Right-Click on the ‘Control’ key and select New > Key
  5. Name the new key StorageDevicePolicies
  6. Right-Click on StorageDevicePolicies and select New > DWORD
  7. Name it WriteProtect
  8. Right-Click on WriteProtect and select Modify
  9. Change the value of WriteProtect to a 1; this enables write protection
  10. Right-Click on StorageDevicePolicies and select Export.  This creates a .reg file that will apply this key to the registry when double-clicked.  Save this file on your Desktop as ‘USB Write Protection On’.
  11. Right-Click on WriteProtect and select Modify; change the value to 0.  This allows writes to occur once more.
  12. Right-Click on StorageDevicePolicies and select Export again.  Save this .reg file on your Desktop as ‘USB Write Protection Off’.
  13. Now simply double-click on either .reg file to enable or disable USB write protection.

Note: From my experience the write-protection only applies to devices plugged into the computer after the registry changes have been applied.  It may still be possible to write to the disk if it was attached prior to the “USB Write Protection On” file being applied.  Be sure to always apply this setting before plugging in any evidence items.
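
For reference, the two exported .reg files should contain little more than the key path and the WriteProtect value.  The Python sketch below builds that text; it assumes the standard "Windows Registry Editor Version 5.00" text format, and the function name write_protect_reg is hypothetical.

```python
# Key created in steps 3 through 5 above.
KEY = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\StorageDevicePolicies"

def write_protect_reg(enabled):
    """Return the text of a .reg file that turns USB write protection on or off."""
    value = 1 if enabled else 0
    return (
        "Windows Registry Editor Version 5.00\r\n"
        "\r\n"
        f"[{KEY}]\r\n"
        f'"WriteProtect"=dword:{value:08x}\r\n'
    )

print(write_protect_reg(True))   # contents for 'USB Write Protection On'
print(write_protect_reg(False))  # contents for 'USB Write Protection Off'
```

Comparing this text against the files exported in steps 10 and 12 is a quick way to confirm that each file sets the value it is named for.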

Creating a Forensically-Sound Image

The first step in any forensic data recovery operation or computer forensic investigation is to create an exact duplicate of the media to be examined.  In most cases, analysis should not be performed on the original media, as the investigative process can make irrecoverable changes to the source data.  Since the original cannot be used, it becomes imperative to make an exact copy of the original that investigators can examine.  This is commonly referred to as making a bitstream image.  It also is possible to simply copy each bit of data from one hard disk to another, but problems arise when the hard disks are not exactly the same size, as it can be hard to tell where the copied data ends and the data that was already on the disk begins.  This problem is averted by copying the data into a file called an image.  Files are much easier to handle, and can be split and recombined if necessary for transport or storage.  Some image files use compression schemes to decrease the size of the image while others do not.

There are a myriad of computer programs available to create these images, but by far the most ubiquitous is a program known as dd.  dd began as a Unix utility that was used to perform low level reads and writes between various devices.  It has spawned many offspring, including ports over to Windows and newer improved versions for Linux such as dcfldd developed specifically with forensic purposes in mind.  dcfldd’s primary improvement over dd is the ability to hash files to ensure that they are exact copies of one another.

Cryptographic hash functions are mainstays in the computer forensic field.  These functions take a block of data of any size, apply a one-way algorithm to it, and return a fixed-length string called a hash value.  These functions are constructed so that if even a single bit of data differs between two files, they will have drastically different hash values.  The two most popular hashing algorithms are known as MD5 and SHA-1.  Both of these have been around for over ten years, and some weaknesses have been discovered in them, but they are still used widely by the forensic community.  After a bitstream image has been made of the original data, one or both of these hash functions are commonly run on both sets of data, and the hash values compared.  If they match, the image is an exact bit-for-bit copy of the original.  This hash value is often referred to as a digital fingerprint, as the chance of accidentally running across two different sets of data with the same hash value is nearly a mathematical impossibility.
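
Both properties, the fixed-length output and the drastic change caused by a single altered bit, are easy to observe with Python's hashlib.  This is a small demonstration on made-up sample data; the helper name digests is hypothetical.

```python
import hashlib

def digests(data):
    """Return the MD5 and SHA-1 hex digests of `data`."""
    return {name: hashlib.new(name, data).hexdigest() for name in ("md5", "sha1")}

original = b"forensic image data"
altered = bytearray(original)
altered[0] ^= 0x01  # flip a single bit in the first byte
altered = bytes(altered)

for name in ("md5", "sha1"):
    before = digests(original)[name]
    after = digests(altered)[name]
    # Count how many hex characters differ between the two digests.
    changed = sum(x != y for x, y in zip(before, after))
    print(name, len(before), "hex characters,", changed, "differ after a one-bit change")
```

Whatever the size of the input, an MD5 digest is always 32 hexadecimal characters long and a SHA-1 digest is always 40.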

One of the cardinal rules of a forensic analyst is to never alter the original evidence.  Since most common operating systems, namely Windows, are constantly making changes to the active hard drive without the knowledge of the average user, it becomes important to take other steps to ensure that the original data is not inadvertently changed.  Most forensic analysts would put forth that the safest way to image a hard drive is to physically remove it from the suspect machine and attach it to an examination machine via a hardware write blocker.  A hardware write blocker is like a one-way valve; it only allows information to flow from the hard drive to the computer, not the other way around.  The computer can attempt to write things to the drive, and on some operating systems even appear to do so, but hardware write blockers do not allow any data to actually be modified.  Some flash drives, memory cards and floppy disks have limited built-in hardware write blocking in the form of a switch or tab on the side of the media.  While this works in most cases, it is always seen as a good practice to use write blocking devices that have been vetted by the forensic community.  In any case, it is preferable to have some sort of backup write blocking to ensure that no data is inadvertently altered.

While hardware write blockers are preferable in the vast majority of cases, they tend to be rather expensive and their price puts them out of reach for most students or the casual forensic investigator.  The next most viable option is known as software write blocking.  Software write blocking can be achieved in many ways.  For example, in Linux it is possible to mount a device as read only; this blocks nearly all write attempts, but as there is nothing physically preventing writes, they still are possible.  Some Linux distributions made with forensics in mind mount all devices as read-only by default; Helix takes this approach.  If the investigator would like to perform the examination from a machine running Windows, it is possible to use the registry to write-protect all USB storage devices.  When the hard drive is attached via a USB cable or USB card reader, this approach can be an inexpensive and effective alternative.

All the methods outlined above fall under the category of imaging colloquially known as ‘dead imaging.’  Dead imaging is when the hard drive that is to be imaged has been powered off by some means and will be imaged by the investigator outside of its original computer.  This method has the advantage of performing the image with a set of known-good programs that have not been tampered with.  If the imaging were to be done on the original computer, advanced users may have altered basic processes to hide data or a rootkit may interfere with the imaging process.  Dead imaging when done properly is very unlikely to alter any of the original data.

The alternative to dead imaging is ‘live imaging.’  Live imaging is when the investigator is able to get to the computer to be imaged while the hard drive is still in the computer and powered on.  This method, in which it is nearly impossible not to alter any of the original data, is sometimes the only one available.  For example, the disk may be encrypted, and if it is powered off all the data on it will be shielded from the investigator by the encryption.  Also, if the system is powered off, all the data in the RAM will be lost to the investigator, because RAM is a form of volatile storage.  Whether data should be acquired by live or dead imaging should be evaluated on a case-by-case basis, as each incident and the circumstances around it are unique.  All the examples of imaging in this paper are of the dead imaging variety.

Another important consideration is where the image will be stored.  It is considered a good practice in the forensic community to image to a wiped storage device to avoid any possibility of data contamination.  While saving an image file on a disk with preexisting data on it is perfectly fine if the hash value of the image matches the original, it is generally seen as a better practice to eliminate all other data prior to imaging to the drive to quash any question of the image being tainted.  Hence, many investigators use a completely separate hard drive for each image, never imaging to the hard drive containing the operating system for their examination machine.
