Introduction
One of the key features of the NTFS file system is its recoverability. Together with a repair program (e.g. chkdsk.exe), NTFS guarantees metadata consistency even after critical failures like power outages. For this purpose, NTFS employs a special mechanism called journaling. Because file system changes cannot be flushed to the volume at the very moment they are made, journaling keeps track of those changes in a file called $LogFile. Understanding this logging file is the first step towards the volume repair process.
So, what data is stored in the logfile? As far as I know, the logfile records every change altering the NTFS structure, from the MFT data stream through bitmap streams to index allocations. Some metadata describing the logging state must be stored as well, such as which record was logged last, or which records have not yet been flushed onto the volume. One thing you definitely won't find there are changes to the contents of non-resident data attributes (except for the MFT itself, of course).
Logfile Structure
The logfile is the third file record in the MFT, and its size varies across volumes. Usually, for performance reasons, the file is not fragmented. During logging, the logfile does not expand, but wraps around: a record overflowing the end of the logfile continues from the beginning, overwriting the older ones. Thus, a 64MB logfile can, for example, hold the changes made during 3 hours of internet browsing.
Each record written has its log sequence number - LSN. The LSN can be easily computed from the record's file offset and a sequence number. As far as I know, the sequence number is increased each time the logfile wraps or the volume is mounted. This number is used to distinguish old records from new ones.
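As an illustration, here is a minimal sketch of that relation, assuming the usual encoding where the low bits of an LSN address the record in 8-byte units and the high bits hold the sequence number (the exact split, seq_bits below, is stored in the restart area and depends on the logfile size):

def decompose_lsn(lsn, seq_bits):
    # Hypothetical helper: split an LSN into (sequence number, file offset),
    # assuming the low (64 - seq_bits) bits address the record in 8-byte
    # units and the high bits hold the wrap/sequence number.
    offset_bits = 64 - seq_bits
    sequence = lsn >> offset_bits
    file_offset = (lsn & ((1 << offset_bits) - 1)) << 3
    return sequence, file_offset

# For a 64MB logfile, offset_bits would be 23; LSN 0x14b0935e3 would then
# map to byte offset 0x49af18, i.e. a record on the page at 0x49a000.
print(decompose_lsn(0x14b0935e3, 64 - 23))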
The basic structure of the logfile can be described as a set of pages, where the first two pages are reserved for metadata and the rest for the log records. Let's start with the metadata. These pages start with the RSTR (restart) signature and contain some basic information:
- System page size (of the partition creator)
- Log page size (can be equal to the system page size)
- List of logging clients; I have never seen more than one client, NTFS itself
- LSN of the snapshot for each client
- Logfile file size
- Version
The important piece of information on the restart pages is the LSN of the client's snapshot. A snapshot is taken regularly by the logging facility and written to the logfile; later on, it can be used during the volume recovery process. The logfile keeps two restart pages because when a failure occurs while writing the first page, the second page holds the backup (and vice versa). When the PC is shut down properly, the two restart pages should be equal.
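To make this more concrete, here is a ctypes sketch of the restart page header and of one client record. It loosely follows publicly documented NTFS layouts; the field names are mine, and the article's layout module may name things differently:

import ctypes

class RestartPageHeader(ctypes.LittleEndianStructure):
    # A loose reconstruction of the RSTR page header.
    _pack_ = 1
    _fields_ = [
        ("magic", ctypes.c_char * 4),              # "RSTR" signature
        ("usa_offset", ctypes.c_uint16),           # update sequence array offset
        ("usa_count", ctypes.c_uint16),            # update sequence array size
        ("chkdsk_lsn", ctypes.c_uint64),           # only set when chkdsk ran
        ("system_page_size", ctypes.c_uint32),     # page size of the partition creator
        ("log_page_size", ctypes.c_uint32),        # size of one log page
        ("restart_area_offset", ctypes.c_uint16),  # where the restart area begins
        ("minor_version", ctypes.c_int16),
        ("major_version", ctypes.c_int16),
    ]

class LogClientRecord(ctypes.LittleEndianStructure):
    # One entry of the client list; NTFS is usually the only client.
    _pack_ = 1
    _fields_ = [
        ("oldest_lsn", ctypes.c_uint64),           # oldest LSN the client still needs
        ("client_restart_lsn", ctypes.c_uint64),   # LSN of the client's snapshot
        ("prev_client", ctypes.c_uint16),
        ("next_client", ctypes.c_uint16),
        ("seq_number", ctypes.c_uint16),
        ("reserved", ctypes.c_uint8 * 6),
        ("client_name_length", ctypes.c_uint32),   # name length in bytes
        ("client_name", ctypes.c_uint16 * 64),     # UTF-16, usually "NTFS"
    ]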
The remaining pages hold the logged records. At their beginning lies the so-called buffer zone: each record is first written to one of the pages in the buffer zone, and the whole page is then copied to its final position in the log.
< metadata >< buffering >   < log area >
+-----+-----+-----+-----+ ~ +-----+-----+-----+ ~
|RSTR |RSTR |RCRD |RCRD |   |RCRD |RCRD |RCRD |
+-----+-----+-----+-----+ ~ +-----+-----+-----+ ~
   |                           ^
   +----- Snapshot LSN --------+
Every page holding records (RCRD signature) starts with a header that tells us the following (a sketch of this header follows the list):
- LSN of the last record on the page
- LSN of the last record which ends on this page (sometimes the same as the previous one)
- Page count and page position
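Here is a simplified ctypes sketch of such a page header, again loosely following the public NTFS layouts; the article's layout.RecordPageHeader apparently wraps last_lsn in a small union named copy (hence page_header.copy.last_lsn later in the code), and may differ in other details as well:

import ctypes

class RecordPageHeader(ctypes.LittleEndianStructure):
    # A loose reconstruction of the RCRD page header.
    _pack_ = 1
    _fields_ = [
        ("magic", ctypes.c_char * 4),             # "RCRD" signature
        ("usa_offset", ctypes.c_uint16),          # update sequence array offset
        ("usa_count", ctypes.c_uint16),           # update sequence array size
        ("last_lsn", ctypes.c_uint64),            # LSN of the last record on the page
        ("flags", ctypes.c_uint32),               # page state flags
        ("page_count", ctypes.c_uint16),          # pages in the current write
        ("page_position", ctypes.c_uint16),       # position within that write
        ("next_record_offset", ctypes.c_uint16),  # where the next record starts
        ("reserved", ctypes.c_uint8 * 6),
        ("last_end_lsn", ctypes.c_uint64),        # LSN of the last record ending here
    ]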
This information serves organizational purposes. On each page, the change records are chained into transactions. Each change written to the logfile consists of two parts: a log record header and the client's data. The log record header holds information about the current transaction:
- ID of the transaction
- LSN of the previous record in the transaction
- LSN of the record which should be undone next in this transaction
- Length of the client's data
The client's data follows immediately after the log record header and contains the following information (a simplified sketch of this header follows the list):
- Type of operation to be made when redoing this change - redo operation
- Type of the operation to be made when undoing this change - undo operation
- Data for the redo operation
- Data for the undo operation
- Target attribute to which the operation will be applied
- Target virtual cluster number (VCN) where the operation will be applied
- Target block within the VCN
- Target record offset
- Target attribute offset
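As with the page header, here is a simplified ctypes sketch of this client data header. The redo_op and undo_op names match the article's code (layout.ClientLogHeader); the remaining field names are my own guesses, loosely based on the publicly documented NTFS log record layout:

import ctypes

class ClientLogHeader(ctypes.LittleEndianStructure):
    # A loose reconstruction of the client data header.
    _pack_ = 1
    _fields_ = [
        ("redo_op", ctypes.c_uint16),               # operation applied when redoing
        ("undo_op", ctypes.c_uint16),               # operation applied when undoing
        ("redo_offset", ctypes.c_uint16),           # offset of the redo data
        ("redo_length", ctypes.c_uint16),           # length of the redo data
        ("undo_offset", ctypes.c_uint16),           # offset of the undo data
        ("undo_length", ctypes.c_uint16),           # length of the undo data
        ("target_attribute", ctypes.c_uint16),      # index into the open attribute table
        ("lcns_to_follow", ctypes.c_uint16),        # number of mapped clusters appended
        ("record_offset", ctypes.c_uint16),         # offset within the target record
        ("attribute_offset", ctypes.c_uint16),      # offset within the target attribute
        ("cluster_block_offset", ctypes.c_uint16),  # target block within the VCN
        ("reserved", ctypes.c_uint16),
        ("target_vcn", ctypes.c_uint64),            # virtual cluster the change applies to
    ]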
After each transaction, the volume should be in a consistent state. A top-level action, such as appending to a file, can be logged as the following transaction (a list of all operations can be found in the source code):
- Set new size of the attribute
- Update attribute value
- Commit
That should be almost everything about the basic logfile structure. Now I want to mention the structure of the snapshot. Its LSN can be found on the restart pages, and the snapshot itself consists of the following parts (a sketch of the corresponding record follows the list):
- LSN of the checkpoint record. The checkpoint record is the entry point for the analysis pass of the recovery process. The tables mentioned below do not reflect state introduced by records written after this checkpoint record.
- Table of open attributes, which holds information about the non-resident attributes in use when the snapshot was taken.
- Table of attribute names, which contains the names of the attributes from the open attribute table.
- Dirty page table, which holds one entry for each page that was dirty (modified) at the time the snapshot was taken.
- Transaction table, which contains one entry for every transaction not yet committed at the time the snapshot was taken.
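The checkpoint record referenced from the restart page might look roughly like the following sketch. This is an assumption based on publicly documented NTFS recovery structures; the field names are mine and may not match the Logfile.py module:

import ctypes

class RestartRecord(ctypes.LittleEndianStructure):
    # A speculative reconstruction of the NTFS client restart (snapshot) record.
    _pack_ = 1
    _fields_ = [
        ("major_version", ctypes.c_uint32),
        ("minor_version", ctypes.c_uint32),
        ("start_of_checkpoint_lsn", ctypes.c_uint64),   # entry point of the analysis pass
        ("open_attribute_table_lsn", ctypes.c_uint64),  # where the open attribute table was dumped
        ("attribute_names_lsn", ctypes.c_uint64),       # where the attribute names were dumped
        ("dirty_page_table_lsn", ctypes.c_uint64),      # where the dirty page table was dumped
        ("transaction_table_lsn", ctypes.c_uint64),     # where the transaction table was dumped
        ("open_attribute_table_length", ctypes.c_uint32),
        ("attribute_names_length", ctypes.c_uint32),
        ("dirty_page_table_length", ctypes.c_uint32),
        ("transaction_table_length", ctypes.c_uint32),
    ]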
Using the Code
First of all, we need the journal file. However, it's hidden and not directly accessible. That's why I also made a small dump_logfile.py script for dumping its content to a regular file:
> dump_logfile.py -p \\.\C:
The logfile processing is kept in the Logfile.py module. For demo purposes, the project contains a main file where I put the basic usage of the LogFile class. So once the logfile is prepared, we can retrieve the metadata from the first two pages:
with open(args.file[0], "rb+") as logfile_stream:
    lrb, lrbb = logfile.get_lsn_restart_blocks(logfile_stream)
The return values lrb and lrbb contain the restart metadata. The first one always holds the context with the higher current LSN; the second one serves as a backup. The snapshot LSN is our starting point: from that LSN, we will start parsing the remaining records of the logfile.
Pass one of the contexts to the LogFile class. You can pass either the most up-to-date one, lrb, or the backup, lrbb. If you use the first one, the script may crash, because some of the required data may not have been written to the logfile yet. Thus, I recommend using the backup:
journal = logfile.LogFile(logfile_stream, lrbb)
Passing the restart context is important, because from its structures, the LogFile class knows how to handle the input file and how to work with LSN numbers.
Now we can, for example, dump the last LSN of every log page to a file:
# Integer division: how many whole log pages remain after the current position.
npages = (journal.lcb.file_size - logfile_stream.tell()) // \
    journal.lcb.log_page_size
for _ in range(npages):
    page = logfile_stream.read(journal.lcb.log_page_size)
    page_header = layout.RecordPageHeader.from_buffer_copy(page)
    pages_stream.write(
        "{:>10};{:>18};{:>18};{}\n".format(
            hex(logfile_stream.tell() - journal.lcb.log_page_size),
            hex(page_header.copy.last_lsn),
            hex(page_header.last_end_lsn),
            hex(page_header.flags)))
The output in my case looks like this:
Page offset;Last LSN;Last end LSN;Flags
...
0x49a000; 0x14b0935e3; 0x14b0935cb;0x3
0x49b000; 0x14b09374b; 0x14b09374b;0x3
0x49c000; 0x1490939e3; 0x1490939e3;0x1
0x49d000; 0x149093bf2; 0x149093bf2;0x1
...
My logfile is currently logging to the page at file offset 0x49b000. A few seconds later, this is no longer true, since the tail of the log has moved on and the following pages no longer hold the old records with LSNs starting with 0x149:
0x49a000; 0x14b0935e3; 0x14b0935cb;0x3
0x49b000; 0x14b0937ca; 0x14b0937bf;0x3
0x49c000; 0x14b0939f0; 0x14b0939df;0x3
0x49d000; 0x14b093bf3; 0x14b093b45;0x3
0x49e000; 0x14b093dbe; 0x14b093db3;0x1
0x49f000; 0x14b093f12; 0x14b093f12;0x3
0x4a0000; 0x1490941fa; 0x1490941e9;0x1
0x4a1000; 0x1490943e3; 0x1490943e3;0x1
To observe the transactions written in the logfile, just pick one LSN from the dumped pages and pass it to the log records generator:
with open(pages_file, "w") as records_stream:
    records_stream.write("""LSN;Previous LSN;Undo next LSN;"""
                         """Redo operation;Undo operation;Transaction\n""")
    for record, data in journal.records(0x1480557170):
        client_header = layout.ClientLogHeader.from_buffer_copy(data)
        records_stream.write(
            "{:>18};{:>18};{:>18};{};{};{}\n".format(
                hex(record.this_lsn),
                hex(record.client_previous_lsn),
                hex(record.client_undo_next_lsn),
                layout.LOG_OPERATION.get(client_header.redo_op),
                layout.LOG_OPERATION.get(client_header.undo_op),
                record.transaction_id))
The output can be somewhat confusing, but it shows the change records linked into transactions by their LSN numbers. The last number is the transaction ID; however, it may be the same for all records, because there are no parallel transactions running:
LSN;Previous LSN;Undo next LSN;Redo operation;Undo operation;Transaction
...
0x1480569bf2; 0x0; 0x0;SetBitsInNonresidentBitMap;ClearBitsInNonresidentBitMap;24
0x1480569c08; 0x1480569bf2; 0x1480569bf2;Noop;DeallocateFileRecordSegment;24
0x1480569c14; 0x1480569c08; 0x1480569c08;AddIndexEntryAllocation;DeleteIndexEntryAllocation;24
0x1480569c2c; 0x1480569c14; 0x1480569c14;InitializeFileRecordSegment;Noop;24
0x1480569c5c; 0x1480569c2c; 0x1480569c2c;SetNewAttributeSizes;SetNewAttributeSizes;24
0x1480569c6f; 0x1480569c5c; 0x1480569c5c;UpdateNonResidentValue;Noop;24
0x1480569c84; 0x1480569c6f; 0x1480569c6f;SetNewAttributeSizes;SetNewAttributeSizes;24
0x1480569c97; 0x1480569c84; 0x0;ForgetTransaction;CompensationLogRecord;24
...
0x14805715e4; 0x0; 0x0;OpenAttributeTableDump;Noop;24
0x1480571ff2; 0x14805715e4; 0x14805715e4;AttributeNamesDump;Noop;24
...
From the output, I selected two transactions (a transaction ends with the ForgetTransaction operation). The first one (0x1480569bf2) first sets bits in a bitmap (probably allocating a new file record from the MFT), then updates the index of some other file, likely the parent directory (0x1480569c14). It then initializes the newly allocated file record (0x1480569c2c) and updates the size of its standard attribute (0x1480569c84). But that is only my guess :). The second transaction (0x14805715e4) is actually the start of a snapshot: it begins with a dump of the open attributes, followed by a dump of the attribute names; the next records would be a dump of the dirty pages and finally a dump of the transaction table.
History
I found a bug in my code where the last LSN of a buffer page was overwritten with a wrong LSN. This bug also caused problems when iterating over the log records. The code has been updated and the bug should be gone.