
Recover Data From Corrupted Drives (File Systems: FAT32, NTFS): Part 1

13 Dec 2013 | CPOL | 15 min read
Recover data from corrupted drives, FAT32, NTFS.

Introduction 

One year ago I was working on an important project at my company. Suddenly my computer hung, and I soon realized that it was seriously infected by a virus, so I restarted it. After the restart there was an error on my screen: "No Drive Found". My drive was corrupted, which was dreadful because I had not made a backup of my code. I asked my uncle what to do and he reminded me of drive recovery software, but I could only find demo versions, so I decided to recover my valuable data myself and started researching drives, file systems, etc. Now I humbly write this article so that others can solve problems related to drive corruption just as I did.

Recovering your data from corrupted hard drives is neither an easy nor a short task. To do it you have to understand hard drives, file systems, their structures, their features and how they work. In addition you have to understand the level of the corruption. The only things this project needs from you are your full concentration, dedication and programming skill. I will try my best to give you all the information you need to recover your valuable data without relying on the commercial recovery software on the market. So let our adventure begin; any suggestions, corrections and additions will be appreciated.

Some Basic Concepts

  1. Hard drives are composed of several spinning magnetic disks (platters) stacked like CDs in a CD case. Each disk can store data on both sides and has one read/write HEAD per side. On each side, data is stored on concentric rings called TRACKS; the tracks at the same position on all sides together form a CYLINDER. Tracks are subdivided into SECTORS (blocks).
  2. Important to note: Each SECTOR is 512 bytes long (by default).
  3. In computers, the smallest unit is 1 bit, but the smallest unit of data a program normally accesses and processes is 1 byte. When data is read from or written to a hard drive, the smallest unit is a SECTOR. This means that if you want to read or write 10 or 20 bytes, the computer actually reads or writes 512 bytes.
  4. Important to note: To recover data from a hard drive you work with sectors, not bytes, i.e. you have to fetch 512 bytes (one sector) in every shot. If you didn't understand, don't worry, I will show you in the coding section.
  5. Any piece of data on a hard drive is uniquely addressable by cylinder, head and sector. This addressing scheme is called the CHS addressing scheme. It is DEPRECATED because it is very complicated, so don't worry too much about it, but if you are interested please Google it for yourself.
  6. Since CHS addressing has been deprecated, drives now use Logical Block Addressing (LBA). LBA is a simple and easily understandable scheme that maps the CHS addresses onto a sequential set of addresses: first sector = LBA(0), second sector = LBA(1), etc.
  7. NOTE: To work out the size of your hard drive: HD size = block size * total number of blocks, where block size = 512 bytes (default).
  8. Before you begin the recovery process, you have to understand three things:
    • Master Boot Record (MBR): This is where you begin. On a partitioned drive it always occupies the first sector, LBA(0), which is 512 bytes long. A drive whose MBR holds boot code and an active partition is bootable; bootable drives are those on which the OS exists. I will explain this later in this tutorial.
    • Volume Boot Record (VBR): A Volume Boot Record (also known as a volume boot sector, a partition boot record or a partition boot sector) is a type of boot sector introduced by the IBM Personal Computer. It is the first sector of a partition of your hard drive (e.g., C:\, D:\, E:\, etc.) and describes that partition. Every partition (C:\, D:\, etc.) has its own VBR. I will explain this in my second tutorial.
    • File Systems: A file system is a specification of how files are stored and retrieved, designed to reduce wasted space and the complexity of storing and retrieving files. If you want to recover your files you have to know which file system exists on your drive (either NTFS or FAT32). I will explain these two file systems, and how to recover data when they are corrupted, in my third tutorial.
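
The addressing and size rules in the list above can be sketched in a few lines of C#. This is only an illustration under stated assumptions: the class name DiskGeometry and its method names are made up for this example, the geometry values (heads per cylinder, sectors per track) have to come from the drive, and the sector size is assumed to be the default 512 bytes.

```csharp
using System;

// Hypothetical helper (not part of the demo project) illustrating CHS, LBA and size arithmetic.
public static class DiskGeometry
{
    public const int BytesPerSector = 512; // default sector size assumed throughout the article

    // Conventional CHS-to-LBA mapping. Cylinders and heads count from 0, sectors from 1.
    public static long ChsToLba(int cylinder, int head, int sector,
                                int headsPerCylinder, int sectorsPerTrack)
    {
        if (sector < 1 || sector > sectorsPerTrack)
            throw new ArgumentOutOfRangeException(nameof(sector));
        if (head < 0 || head >= headsPerCylinder)
            throw new ArgumentOutOfRangeException(nameof(head));
        return ((long)cylinder * headsPerCylinder + head) * sectorsPerTrack + (sector - 1);
    }

    // Byte offset of a sector from the start of the drive.
    public static long LbaToByteOffset(long lba)
    {
        return lba * BytesPerSector;
    }

    // HD size = block size * total number of blocks.
    public static long DiskSizeInBytes(long totalSectors)
    {
        return totalSectors * BytesPerSector;
    }
}
```

For example, with 255 heads per cylinder and 63 sectors per track, cylinder 1, head 0, sector 1 maps to LBA 16065, and LBA 2048 starts at byte offset 1,048,576.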

What is data corruption?

Data or drive corruption is a situation in which your Operating System (OS) is unable to retrieve files and their information or properties. There are many possible reasons: your file system may be corrupted, your MBR or VBR may be corrupted, or your hard drive or pen drive may be physically damaged, etc. You can recover your data fairly easily if your corrupted drive fulfills the following requirements.

  • The drive and its sectors can be recognized by your operating system (OS)
  • Your program can access the drive and its sectors

The main aim of data recovery is to recover the data that remains on your drive after damage or corruption has occurred. That means you have to traverse each and every LBA to search for your files, which your OS can no longer do.

Architecture

The figure below gives you a complete view of how your MBR, VBRs and file systems are placed on your drive.

Image 1

 

  • The above figure gives an abstract view of a bootable drive
  • Every partition contains a VBR in its first sector
  • The MBR contains a Partition Table that holds the address of every VBR
  • Each VBR contains the address of the sector where that partition's data begins
  • Files and folders on your drive are organized in tree structures (B+ trees in the case of NTFS directory indexes). I will explain this structure in my File System tutorial
  • Non-bootable drives such as pen drives may have no MBR; only the VBR and the rest exist

 

Level Of Corruption

In order to recover your data from a corrupted drive, you first have to know at what level your drive or data is corrupted. The level decides how much of your data you can recover. The levels are:

  1. Level 1: This is the simple one, where only the partition table of your MBR is corrupted. In this case you traverse every LBA (sector) searching for the VBR, and once you find the VBR you can find your data.
  2. Level 2: At this level some fields of your VBR are corrupted, which makes your OS unable to recognize your file system. In this case you have to analyse the VBR to get the address of your root directory in FAT32 or of the MFT in NTFS. If you get it, you will find it easy to search for and recover your corrupted and deleted data; otherwise you have to traverse and search sector by sector.
  3. Level 3: From the levels above we can infer that every node contains the address of its succeeding nodes, like a tree: the MBR contains the addresses of the corresponding VBRs, each VBR contains the address of the start of your file nodes, and so on. If these addresses or references are corrupted, it is very difficult to recover all data. In this case the amount you can recover depends directly on your knowledge: the deeper your knowledge of file systems, the more data you can recover.
  4. Level 4: In this case it is not guaranteed that you can recover your data, because the corruption may be physical damage to your hardware, such as a drive burnt in a fire, damaged by a fall, or exposed to water. In this case your data may still be recoverable in a data recovery lab.
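
The sector-by-sector search mentioned for Level 1 and Level 2 could look roughly like the sketch below. It is not the code of the demo project: the class name VbrScanner and its methods are hypothetical, and it relies on three facts about the on-disk formats: boot sectors end with the bytes 55 AA, an NTFS VBR carries the text "NTFS    " at byte offset 3, and a FAT32 VBR carries "FAT32   " at byte offset 82. The disk is passed in as a Stream so the same code can run against a raw drive or an image file.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

// Hypothetical scanner: walks a disk (or disk image) one sector at a time
// and reports sectors that look like an NTFS or FAT32 volume boot record.
public static class VbrScanner
{
    private const int SectorSize = 512;

    // Returns "NTFS", "FAT32", or null when the sector does not look like a VBR.
    public static string IdentifyVbr(byte[] sector)
    {
        if (sector == null || sector.Length < SectorSize) return null;
        if (sector[510] != 0x55 || sector[511] != 0xAA) return null; // no boot signature

        if (Encoding.ASCII.GetString(sector, 3, 8) == "NTFS    ") return "NTFS";
        if (Encoding.ASCII.GetString(sector, 82, 8) == "FAT32   ") return "FAT32";
        return null;
    }

    // Scans up to maxSectors sectors from the stream's current position.
    // Each hit is (sector index relative to the starting position, file system name).
    public static List<KeyValuePair<long, string>> Scan(Stream disk, long maxSectors)
    {
        var hits = new List<KeyValuePair<long, string>>();
        var buffer = new byte[SectorSize];
        for (long lba = 0; lba < maxSectors; lba++)
        {
            if (ReadFully(disk, buffer) < SectorSize) break; // end of disk or image
            string kind = IdentifyVbr(buffer);
            if (kind != null) hits.Add(new KeyValuePair<long, string>(lba, kind));
        }
        return hits;
    }

    // Stream.Read may return fewer bytes than requested, so loop until the buffer is full.
    private static int ReadFully(Stream s, byte[] buffer)
    {
        int total = 0;
        while (total < buffer.Length)
        {
            int n = s.Read(buffer, total, buffer.Length - total);
            if (n == 0) break;
            total += n;
        }
        return total;
    }
}
```

A hit is only a candidate: NTFS also keeps a backup copy of its boot sector at the end of the volume, so the same volume can show up twice in the results.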

 

So now we all have basic knowledge about data and drives. As I told you earlier, data recovery is not a simple and easy task, so we will do it in parts. In this first tutorial I discuss the Master Boot Record (MBR) and how to write applications that read and analyze it. Note: the MBR is essential to the operating system; without it the OS cannot start, so please make your experiments on the MBR carefully. This tutorial and the attached demo project are for educational purposes only. The author is not responsible for any damage you may cause during your experiments or through use of the demo project. Use at your own risk.

In my second tutorial I will explain Volume Boot Record (VBR) and its structure.

In my third tutorial I will explain root directories in FAT-32 and MFT in NTFS and how to search for your files and allow recovery by your applications.

Background

The Master Boot Record (MBR) always occupies the first sector (LBA(0)) of a drive (hard disk, pen drive, memory card, etc.) when that drive is bootable. It is responsible for bootstrapping the operating system on a BIOS (Basic Input/Output System) based computer. The MBR contains a few things that help the system boot.

  1. BootStrap Code: When your computer boots it needs to execute code to load the operating system or whatever software you want. The first code to run lives in the BIOS. It checks what hardware is present and runs a few tests to make sure everything is OK to boot. Then, following the boot order you (or your computer manufacturer) specified, it loads the first sector of each disk in turn. When it finds one that is marked as an MBR, it transfers execution to it. The code in that sector is called the BootStrap Code (usually only 440 bytes). Its job is to look through the partition table for the active partition (i.e. the partition that holds the OS boot files, generally C:\), find the starting sector of that partition (see the figure given below), load that partition's boot sector, which in turn loads the boot files (e.g., NTLDR, Boot.ini, etc.) into memory, and transfer control to it; that is how your OS starts. You do not have to go deep into this section because it is of no use in the recovery process, but if you are interested in writing your own OS you will need it. The next six bytes act as an identifier, like a product key: a 4-byte disk identifier followed by 2 bytes that are usually zero.
  2. Partition Table: This is the section you have to take seriously, because the partition table contains information about the partitions (like C:\, E:\, F:\, etc). The partition table is 64 bytes in size. A partition is a part of the hard drive that has been logically separated to act as its own volume as far as an operating system is concerned, and it can have an independent file system structure. For each partition the table records whether it is the active partition (i.e. the one containing the operating system), the address of the sector that contains the partition's VBR, the partition's size, and whether the partition is NTFS or FAT formatted. To recover your files you first need the partitions (drives like C:\) that contain them, and for that you need this partition table. (See the figure given below.)
  3. Disk Signature: The MBR and every VBR end with a two-byte signature (55 AA), also called the boot signature. This signature tells you whether a sector may contain an MBR or a VBR: if a particular sector of a drive ends with it, that sector may be an MBR or a VBR, so we will use this signature to find the MBR and VBRs on our drive. (See the figure given below.)

Image 2

Here we deal only with the Partition Table and the Disk Signature (Boot Signature).
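
The layout just described (boot code first, then the 64-byte partition table at offset 1BEh, then the two signature bytes at 1FEh) can be expressed as a small helper. This is a sketch, not the demo project's code; the class name MbrLayout and its members are assumptions made for this example.

```csharp
using System;

// Hypothetical helper that splits a 512-byte MBR sector into the regions described above.
public static class MbrLayout
{
    public const int SectorSize = 512;
    public const int PartitionTableOffset = 0x1BE; // 446: the partition table starts here
    public const int PartitionTableLength = 64;    // four entries of 16 bytes each
    public const int SignatureOffset = 0x1FE;      // 510: the bytes 55 AA

    // True when the sector ends with the boot signature 55 AA.
    public static bool HasBootSignature(byte[] sector)
    {
        return sector != null
            && sector.Length >= SectorSize
            && sector[SignatureOffset] == 0x55
            && sector[SignatureOffset + 1] == 0xAA;
    }

    // Returns a copy of the 64-byte partition table, or throws if the signature is missing.
    public static byte[] ExtractPartitionTable(byte[] sector)
    {
        if (!HasBootSignature(sector))
            throw new ArgumentException("Sector does not end with 55 AA.", nameof(sector));
        var table = new byte[PartitionTableLength];
        Array.Copy(sector, PartitionTableOffset, table, 0, PartitionTableLength);
        return table;
    }
}
```

Checking the signature before touching the table is a cheap guard against parsing a sector that was never an MBR in the first place.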

Partition Table

There are two types of partition table:

  • The generic 64-byte primary partition table
  • The Extended Partition Table

 

Analysis of 64 Bytes Primary Partition Table

To investigate the master partition table, read the bytes between offsets 1BEh and 1FDh, taking the following structure of the generic partition table into consideration.

The 64 Bytes Primary Partition Table
Address (offsets)    Length (in bytes)    Partitions
1BEh - 1CDh          16 Bytes             Partition 1 (C:\)
1CEh - 1DDh          16 Bytes             Partition 2 (D:\)
1DEh - 1EDh          16 Bytes             Partition 3 (E:\)
1EEh - 1FDh          16 Bytes             Partition 4 (F:\)

In a generic 64-byte primary partition table, all four partition entries sit in a single partition table. I will explain with the help of the figure given below.

Image 3

Now we analyze the structure of the 16 Bytes Partition table entry.

Let's take Partition 1 (C:\ Drive) and dissect it to see what it contains. The bytes that the C:\ Drive Partition Table entry contains are given clearly in the figure below.

Image 4

 

  1. Boot Indicator: This is the first byte of the partition table entry. It indicates whether the partition is active, i.e. whether it is the partition the computer boots from (the one holding the OS files, drivers, etc.). If this field contains 80h (128 in decimal) the partition is active; usually that is C:\ (partition 1), which contains the boot and system files of Windows. An active partition is also called a system partition. For non-active partitions the field is 00h.
  2. Starting CHS value: Since the CHS addressing scheme is deprecated, you do not need this value. Ignore it.
  3. Partition Type Descriptor: This field is one byte long and is important because it tells you which file system is implemented on the partition. Since every file system has its own layout, it is very important to know which one is implemented on your hard drive. Some of the hexadecimal flags you may find in this field are:
    • 00h ------------ No Partition (No File System)
    • 01h ------------ DOS FAT-12 (File System)
    • 04h ------------ DOS FAT-16 (File System, partitions under 32 MB)
    • 05h ------------ Extended DOS 3.3 (Extended Partition)
    • 06h ------------ DOS 3.31 (Large File System, FAT-16 for 32 MB and above)
    • 07h ------------ Windows NT (NTFS File System)
    • 0Bh ------------ Windows 95 (FAT-32 File System)
    • For information about other file systems, see the linked list HEXADECIMAL FLAGS FOR PARTITION TYPES
  4. Ending CHS value: Ignore it.
  5. Address of sector containing VBR: As explained earlier, every partition has a Volume Boot Record (VBR) in its first sector. This 4-byte field holds the address (LBA) of that sector, stored least significant byte first, so it is an important field. Because it is a sector number and a sector is 512 bytes long, convert it to decimal and multiply it by 512 to get the byte offset of the VBR from the beginning of the drive.
  6. Size of partition: This 4-byte field gives the size of the partition in sectors. Since you know the address of the partition's VBR and its size, you can deduce the address of the last sector of the partition:
    Address of last sector = address of VBR + size of partition (in sectors) - 1
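
To make the six fields concrete, here is a minimal sketch of a parser for one 16-byte entry. The class name PartitionEntry and its members are hypothetical and independent of the PartitionTable structure used by the demo project; the byte positions follow the entry layout described above (boot indicator at byte 0, type at byte 4, VBR address at bytes 8 to 11, size at bytes 12 to 15), and the type names are the ones from the list above.

```csharp
using System;

// Hypothetical parser for a single 16-byte partition table entry.
public sealed class PartitionEntry
{
    public const int EntrySize = 16;
    public const int BytesPerSector = 512; // default sector size assumed by the article

    public bool IsActive { get; private set; }    // boot indicator == 80h
    public byte TypeCode { get; private set; }    // partition type descriptor
    public uint StartLba { get; private set; }    // sector that holds the VBR
    public uint SectorCount { get; private set; } // partition size in sectors

    // Byte offset of the VBR from the beginning of the drive.
    public long StartByteOffset { get { return (long)StartLba * BytesPerSector; } }

    // Address of last sector = address of VBR + size in sectors - 1 (meaningless for empty entries).
    public long LastLba { get { return (long)StartLba + SectorCount - 1; } }

    public static PartitionEntry Parse(byte[] data, int offset)
    {
        if (data == null) throw new ArgumentNullException(nameof(data));
        if (offset < 0 || data.Length - offset < EntrySize)
            throw new ArgumentException("Need 16 bytes starting at offset.", nameof(offset));
        return new PartitionEntry
        {
            IsActive = data[offset] == 0x80,
            TypeCode = data[offset + 4],
            StartLba = ReadUInt32LE(data, offset + 8),
            SectorCount = ReadUInt32LE(data, offset + 12)
        };
    }

    // Maps the type byte to the names used in the article's list.
    public string DescribeType()
    {
        switch (TypeCode)
        {
            case 0x00: return "No Partition";
            case 0x01: return "FAT-12";
            case 0x04: return "FAT-16";
            case 0x05: return "Extended";
            case 0x06: return "FAT-16 (large)";
            case 0x07: return "NTFS";
            case 0x0B: return "FAT-32";
            default: return "Other (" + TypeCode.ToString("X2") + "h)";
        }
    }

    // The on-disk fields are little-endian: least significant byte first.
    private static uint ReadUInt32LE(byte[] d, int o)
    {
        return (uint)d[o] | ((uint)d[o + 1] << 8) | ((uint)d[o + 2] << 16) | ((uint)d[o + 3] << 24);
    }
}
```

For instance, the entry bytes 80 01 01 00 07 FE FF FF 00 08 00 00 00 00 40 06 describe an active NTFS partition whose VBR is at sector 2048 (byte offset 1,048,576) and which is 104,857,600 sectors long.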

 

Analysis of Extended Partition Table

In the Primary Partition Table we have seen that four 16-byte partition table entries exist in one 64-byte partition table, but the Extended Partition Table is like a linked list. There is more than one partition table, each containing one partition entry and a reference (address) to the next partition table. I will explain this to you with the help of a figure.

Image 5

The Extended Partition entries usually have a Partition Type of either 05h or 0Fh, depending upon the size of the disk. The only way to figure out how many Logical Drives there are within an Extended Partition is to jump to each Extended partition table in the Extended Boot Records (EBRs) until you have found the last EBR table. EBRs can be described as being chained together, each one linking to the next EBR table. Therefore, to obtain the complete layout of any hard disk that contains an Extended partition, you need a copy or a summary of the data in the Extended partition tables of each EBR as well as the Master Partition Table.
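
Following the EBR chain might be sketched as below. The class name EbrWalker and the readSector callback are hypothetical; the callback is assumed to return the 512-byte sector at a given absolute LBA. The sketch also assumes the usual EBR convention: the first entry of each EBR gives the logical drive's start relative to that EBR, and the second entry gives the next EBR's position relative to the start of the extended partition.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical walker for the linked list of Extended Boot Records.
public static class EbrWalker
{
    private const int FirstEntry = 0x1BE;  // logical drive described by this EBR
    private const int SecondEntry = 0x1CE; // link to the next EBR, if any

    // Returns the absolute LBA of the first sector (VBR) of every logical drive found.
    public static List<long> FindLogicalVolumes(Func<long, byte[]> readSector, long extendedStartLba)
    {
        var volumeStarts = new List<long>();
        var visited = new HashSet<long>(); // guards against loops in a corrupted chain
        long ebrLba = extendedStartLba;

        while (visited.Add(ebrLba))
        {
            byte[] ebr = readSector(ebrLba);
            if (ebr == null || ebr.Length < 512 || ebr[510] != 0x55 || ebr[511] != 0xAA)
                break; // not a valid boot record: stop here

            // Entry 1: start is relative to this EBR.
            if (ebr[FirstEntry + 4] != 0x00)
                volumeStarts.Add(ebrLba + ReadUInt32LE(ebr, FirstEntry + 8));

            // Entry 2: start is relative to the beginning of the extended partition.
            uint nextRelative = ReadUInt32LE(ebr, SecondEntry + 8);
            if (ebr[SecondEntry + 4] == 0x00 || nextRelative == 0)
                break; // last EBR in the chain
            ebrLba = extendedStartLba + nextRelative;
        }
        return volumeStarts;
    }

    // On-disk values are little-endian.
    private static uint ReadUInt32LE(byte[] d, int o)
    {
        return (uint)d[o] | ((uint)d[o + 1] << 8) | ((uint)d[o + 2] << 16) | ((uint)d[o + 3] << 24);
    }
}
```

The visited set is a design choice for recovery work: on a corrupted drive a link can point back to an earlier EBR, and without the guard the loop would never end.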

Using the Code

To get your hands on the MBR of your bootable drives, you first need access to the hardware, and for that you need the Windows API. Without it you cannot reach the raw drive because of Windows security policies (the program generally also has to run with administrator rights). To import functions from Windows DLL files into your C# code you use the [DllImport] attribute.

C#
//
// Windows API Import
// (requires: using System; using System.Runtime.InteropServices; and unsafe code enabled)
//
[DllImport("kernel32.dll", CharSet = CharSet.Auto, SetLastError = true)]
public static extern IntPtr CreateFile(string lpFileName, uint dwDesiredAccess, 
       uint dwShareMode, IntPtr lpSecurityAttributes, 
       uint dwCreationDisposition, uint dwFlagsAndAttributes, IntPtr hTemplateFile);

// Overlapped is not used for synchronous reads; callers pass 0.
[DllImport("kernel32.dll", SetLastError = true)]
public static extern unsafe bool ReadFile(IntPtr hFile, void* pBuffer, int NumberOfBytesToRead, 
       int* pNumberOfBytesRead, int Overlapped);

// SetFilePointer returns a DWORD (the low 32 bits of the new position), so the
// return type is uint; lDistanceToMove is a signed LONG.
[DllImport("kernel32.dll", SetLastError = true)]
public static extern unsafe uint SetFilePointer(IntPtr hFile, int lDistanceToMove, int* lpDistanceToMoveHigh, 
       uint dwMoveMethod);

[DllImport("kernel32.dll", SetLastError = true)]
public static extern bool CloseHandle(IntPtr hObject);

...

To find the drives you can access, you can use either the registry or the WMI manager to list the active drives on your computer.

C#
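//
// WMI MANAGER (requires a reference to System.Management)
//
using (ManagementObjectSearcher MOSear = new ManagementObjectSearcher("SELECT * FROM Win32_DiskDrive"))
{
    foreach (ManagementObject obj2 in MOSear.Get())
    {
        // Model may be null for some devices; Convert.ToString turns null into "".
        listBox1.Items.Add(Convert.ToString(obj2["Model"]));
    }
}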

...

First you have to open your drive by creating a handle to it with the CreateFile function in kernel32.dll. Then read the sector into a buffer. You have to define a structure for the fields and use Marshal.PtrToStructure() to map the bytes of the sector onto that structure in memory. See the code below:

C#
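//
// partition table
//
...
    // 0x80000000 = GENERIC_READ, 1 = FILE_SHARE_READ, 3 = OPEN_EXISTING,
    // 0x20000000 = FILE_FLAG_NO_BUFFERING (reads must be whole sectors)
    IntPtr File_Han = WindowAPICLass.CreateFile(@"\\.\PhysicalDrive" + i, 
       0x80000000, 1, IntPtr.Zero, 3, 0x20000000, IntPtr.Zero);
    // CreateFile returns INVALID_HANDLE_VALUE (-1) on failure
    if (File_Han != new IntPtr(-1))
    {
        int bytes_read = 0;
        byte[] byteArray = new byte[512];
        fixed (byte* ref_b = byteArray)
        {
            // Read the first sector (0x200 = 512 bytes), which holds the MBR
            WindowAPICLass.ReadFile(File_Han, (void*)ref_b, 0x200, &bytes_read, 0);
            
      ....
      /////////////////
      ..........
      
            // Copy the 64-byte partition table (offset 446 = 1BEh) into unmanaged memory
            Marshal.Copy(byteArray, 446, outArray, 64);
            // Map those 64 bytes onto the PartitionTable structure
            PartitionTable PTLB = (PartitionTable)Marshal.PtrToStructure(outArray, typeof(PartitionTable));
      ............
      ////////////////
      ............          

        WindowAPICLass.CloseHandle(File_Han);
        
    }
}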

...

Points of Interest

From this analysis I have learned that the MBR is one of the most wonderful and important structures the operating system depends on. The Windows API is something that must be understood. It was a very adventurous experience for me to analyze the MBR, the VBR, root directories in FAT32, etc., and it was a pleasure to finally understand the most widely used file systems like NTFS and FAT32.

History

  • Initial version: 10-dec-2013. Please wait for some time for my next article: "Analysis and Role of VBR in Recovery of Data."

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)