Introduction
Alternate Data Streams (ADS) are a somewhat controversial subject. Although I have never run into problems with this feature myself, I have come across plenty of websites which explicitly state that ADS is evil!
If you haven't heard the term Alternate Data Stream before, then this is for you: usually, when you store something in a file, say a text file, that is all that the file contains. However, NTFS has a feature that lets you save a completely different set of data in the same file as an alternate stream. To make matters worse, this 'alternate' stream is not reported by Explorer or, by default, by DOS commands like 'dir'.
So, it is possible to have a file whose alternate stream holds hundreds of megabytes of data while its primary data stream holds none. The 'primary data stream' is the default stream that data is normally written to and read from when you save something in a file.
You might find some novel uses for alternate data streams, e.g., storing metadata about the file. Though alternate data streams are not displayed directly by Windows, it is possible to get hold of them using a handful of Win32 API calls. This is what I am going to discuss in this article. A side note: there are similar articles on the internet, but most of them are in C#, and I feel C++ is best suited when we want to work at the Win32 API level, so here is an article which is totally MFC based.
The Basics
There are two sets of APIs which can be used to enumerate data streams. The first consists of FindFirstStreamW() and FindNextStreamW(). These functions are quite similar to the FindFirstFile family of APIs. However, the problem is that they are available only on Windows Server 2003 and later (on client editions, Windows Vista and later), so they cannot be used on Windows XP or Windows 2000.
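For readers on a system that does have these functions, the first approach can be sketched roughly as below. This is only an illustration, not code from the demo project: ListStreamNames() and BareStreamName() are hypothetical helper names I made up, and the Win32 part is compiled only when _WIN32 is defined. The API reports names in the decorated form ":TheStream:$DATA" (the primary stream shows up as "::$DATA"), so the helper strips the decoration.

```cpp
#include <string>
#include <vector>
#ifdef _WIN32
#include <windows.h>   // FindFirstStreamW needs an SDK targeting Vista / Server 2003 or later
#endif

// Hypothetical helper: turn a decorated name such as L":TheStream:$DATA"
// into L"TheStream". The primary stream, L"::$DATA", yields an empty string.
std::wstring BareStreamName(const std::wstring& decorated)
{
    std::wstring name = decorated;
    if (!name.empty() && name[0] == L':')
        name.erase(0, 1);                       // drop the leading colon
    const std::wstring::size_type colon = name.find(L':');
    if (colon != std::wstring::npos)
        name.erase(colon);                      // drop ":$DATA" (the stream type)
    return name;
}

// Hypothetical helper: list the bare stream names of a file.
// Returns an empty list if the file cannot be opened, or on non-Windows builds.
std::vector<std::wstring> ListStreamNames(const wchar_t* path)
{
    std::vector<std::wstring> names;
#ifdef _WIN32
    WIN32_FIND_STREAM_DATA data;
    HANDLE hFind = FindFirstStreamW(path, FindStreamInfoStandard, &data, 0);
    if (hFind == INVALID_HANDLE_VALUE)
        return names;
    do {
        names.push_back(BareStreamName(data.cStreamName));
    } while (FindNextStreamW(hFind, &data));
    FindClose(hFind);
#else
    (void)path;   // nothing to enumerate outside Windows
#endif
    return names;
}
```

Calling ListStreamNames(L"c:\\a_new_file.txt") on a file with one alternate stream should give two entries: an empty name for the primary stream and the name of the alternate stream.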
The second set of functions belongs to the Backup API: BackupRead() and BackupSeek(). This is what we are going to use in our program.
Before we go into any more detail, let's quickly see how we can create a file which has an alternate data stream. Typing 'notepad.exe c:\a_new_file.txt:TheStream' at a command prompt will create an alternate data stream called 'TheStream' on the file c:\a_new_file.txt.
While in Notepad, type in something and save the file. Now go to Windows Explorer and look at the properties of the file: the file size is shown as zero.
Run the same command again and voila, you'll see the contents in Notepad.
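The same "file:stream" syntax can also be used from code. The sketch below is my own illustration rather than part of the demo: MakeStreamPath() and WriteStream() are hypothetical names, and it assumes the C++ runtime passes the path through to the file system unchanged, which as far as I know is the case with the Microsoft runtime on an NTFS volume. On any other file system the colon is simply treated as part of an ordinary file name (or rejected).

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: build the "file:stream" path that addresses a named stream.
std::string MakeStreamPath(const std::string& file, const std::string& stream)
{
    return file + ":" + stream;
}

// Hypothetical helper: write text into a named stream of a file.
// Assumption: we are on Windows and the file lives on an NTFS volume.
bool WriteStream(const std::string& file, const std::string& stream,
                 const std::string& text)
{
    std::ofstream out(MakeStreamPath(file, stream).c_str(), std::ios::binary);
    if (!out)
        return false;               // the stream could not be opened or created
    out << text;
    return static_cast<bool>(out);  // false if the write failed
}
```

For example, WriteStream("c:\\a_new_file.txt", "TheStream", "hello") would be the programmatic counterpart of typing the text in Notepad and saving it.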
Main files:
- AltDataStream.h and AltDataStream.h.cpp - Define a small class called CAltDataStream which does the work of enumerating the data streams of a file.
The main function of this class is GetStreams(), which returns a list of pointers to a byte array.
This is how the function works: every stream (data) is preceded by the stream header. This header contains information like the name and size of the stream. So, if we have two alternate data streams, then first we will encounter the stream header of the first stream followed by the data for it. Just after the data of the first stream, we will encounter the stream header of the second stream followed by its data. So it is just a simple matter of reading in the important bits and skipping over the actual data which we don't need, since we want to only get the name and the size of the streams. However, the function in our class does read a portion of the data, 64 bytes to be precise, which acts as a preview of the actual data.
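To make the header-then-data layout concrete, here is a small portable sketch that walks an in-memory buffer laid out the same way. It is not the CAltDataStream code: StreamInfo, AppendStream() and WalkStreams() are hypothetical names, the buffer stands in for what the Backup API would hand back, and the code assumes a little-endian machine. The fixed part of a WIN32_STREAM_ID is 20 bytes (stream id, attributes, 64-bit size, name size in bytes), followed by the UTF-16 name and then the data.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical record for what we keep per stream: name, data size, stream id.
struct StreamInfo {
    std::u16string name;
    std::uint64_t  size;
    std::uint32_t  id;
};

// Fixed part of the header: id (4) + attributes (4) + size (8) + name size (4).
const std::size_t kHeaderSize = 20;

// Test helper: append one header + name + data record to the buffer.
void AppendStream(std::vector<std::uint8_t>& buf, std::uint32_t id,
                  const std::u16string& name, const std::string& data)
{
    const std::uint32_t attrs = 0;
    const std::uint64_t size = data.size();
    const std::uint32_t nameBytes =
        static_cast<std::uint32_t>(name.size() * sizeof(char16_t));
    auto put = [&buf](const void* p, std::size_t n) {
        const std::uint8_t* b = static_cast<const std::uint8_t*>(p);
        buf.insert(buf.end(), b, b + n);
    };
    put(&id, 4); put(&attrs, 4); put(&size, 8); put(&nameBytes, 4);
    put(name.data(), nameBytes);
    put(data.data(), data.size());
}

// Read each header and name, then skip over the data to reach the next header.
std::vector<StreamInfo> WalkStreams(const std::vector<std::uint8_t>& buf)
{
    std::vector<StreamInfo> out;
    std::size_t pos = 0;
    while (buf.size() - pos >= kHeaderSize) {
        std::uint32_t id = 0, nameBytes = 0;
        std::uint64_t size = 0;
        std::memcpy(&id, &buf[pos], 4);
        std::memcpy(&size, &buf[pos + 8], 8);
        std::memcpy(&nameBytes, &buf[pos + 16], 4);
        pos += kHeaderSize;
        if (nameBytes % 2 != 0 || nameBytes > buf.size() - pos)
            break;                                // malformed or truncated name
        StreamInfo info;
        info.id = id;
        info.size = size;
        info.name.resize(nameBytes / 2);
        if (nameBytes > 0)
            std::memcpy(&info.name[0], &buf[pos], nameBytes);
        pos += nameBytes;
        if (size > buf.size() - pos)
            break;                                // truncated data
        out.push_back(info);
        pos += static_cast<std::size_t>(size);    // skip the data itself
    }
    return out;
}
```

Building a buffer with AppendStream(buf, 1, u"", "hello") followed by AppendStream(buf, 4, u":TheStream:$DATA", "abc") and passing it to WalkStreams() gives two entries; 1 and 4 are the values of BACKUP_DATA and BACKUP_ALTERNATE_DATA.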
So, to begin, we first need to open the file and get a file handle; we do this using the good old CreateFile() API.
This is immediately followed by a call to the BackupRead() function. It is important to read a little more than the size of the WIN32_STREAM_ID structure, because the structure ends with the variable-length stream name and the extra bytes are what hold that name. The structure is defined in winbase.h (just include Windows.h).
These are the parameters of BackupRead():
- HANDLE hFile - handle to the file we opened.
- LPBYTE lpBuffer - buffer where the data will be stored.
- DWORD nNumberOfBytesToRead - how many bytes we want to read.
- LPDWORD lpNumberOfBytesRead - how many bytes were actually read.
- BOOL bAbort - initially pass FALSE; when we are done with reading, pass TRUE.
- BOOL bProcessSecurity - we are not going to need this.
- LPVOID* lpContext - helps the API keep track of its usage. Make sure you initialize it to NULL before making the first call to BackupRead(). Save this pointer and pass it to all subsequent BackupRead() and BackupSeek() calls.
Now that we have read the first stream header, we store the metadata, i.e., the name and the size of the stream. We can then seek (BackupSeek()) to the end of the stream's data. Just beyond this is the next stream header, so we repeat the process. The loop ends when BackupRead() reports zero bytes read, which means there is no more data; a return value of FALSE indicates that the read itself failed.
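Put together, the read-header, read-name, seek-past-data loop might look roughly like the sketch below. This is a simplified illustration and not the GetStreams() implementation from the demo: StreamEntry and EnumerateStreams() are hypothetical names, it reads the fixed header and the name in two separate BackupRead() calls instead of over-reading, and it skips the 64-byte preview. The Win32 part is compiled only when _WIN32 is defined; elsewhere the function simply returns false.

```cpp
#include <cstddef>
#include <string>
#include <vector>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical record: stream name (empty for the primary stream), size and id.
struct StreamEntry {
    std::wstring       name;
    unsigned long long size;
    unsigned long      id;
};

// Hypothetical helper: collect the streams of a file via the Backup API.
// Returns false if the file cannot be opened or a read fails.
bool EnumerateStreams(const wchar_t* path, std::vector<StreamEntry>& out)
{
#ifdef _WIN32
    HANDLE hFile = CreateFileW(path, GENERIC_READ, FILE_SHARE_READ, NULL,
                               OPEN_EXISTING, FILE_FLAG_BACKUP_SEMANTICS, NULL);
    if (hFile == INVALID_HANDLE_VALUE)
        return false;

    // Size of the fixed part of the header, i.e. everything before the name.
    const DWORD headerSize =
        static_cast<DWORD>(offsetof(WIN32_STREAM_ID, cStreamName));
    LPVOID context = NULL;      // must be NULL before the first BackupRead()
    bool ok = true;
    for (;;) {
        WIN32_STREAM_ID sid;
        DWORD read = 0;
        if (!BackupRead(hFile, reinterpret_cast<LPBYTE>(&sid), headerSize,
                        &read, FALSE, FALSE, &context)) { ok = false; break; }
        if (read == 0) break;                        // no more streams
        if (read < headerSize) { ok = false; break; }

        StreamEntry e;
        e.id = sid.dwStreamId;
        e.size = static_cast<unsigned long long>(sid.Size.QuadPart);
        if (sid.dwStreamNameSize > 0) {              // name size is in bytes
            std::vector<wchar_t> name(sid.dwStreamNameSize / sizeof(wchar_t));
            if (!BackupRead(hFile, reinterpret_cast<LPBYTE>(name.data()),
                            sid.dwStreamNameSize, &read, FALSE, FALSE, &context)
                || read != sid.dwStreamNameSize) { ok = false; break; }
            e.name.assign(name.begin(), name.end());
        }
        out.push_back(e);

        // Skip the data; the return value is ignored here because the next
        // BackupRead() tells us whether another header follows.
        DWORD lowSeeked = 0, highSeeked = 0;
        BackupSeek(hFile, sid.Size.LowPart, sid.Size.HighPart,
                   &lowSeeked, &highSeeked, &context);
    }
    DWORD unused = 0;           // bAbort = TRUE releases the context
    BackupRead(hFile, NULL, 0, &unused, TRUE, FALSE, &context);
    CloseHandle(hFile);
    return ok;
#else
    (void)path; (void)out;      // the Backup API exists only on Windows
    return false;
#endif
}
```

A caller would declare a std::vector<StreamEntry>, pass it in along with the file path, and then print the name and size of each entry.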
Running the demo
To run the demo application, create a file which has one or more alternate data streams. Extract the file ADS.EXE from the demo.zip file and run it as "ads.exe c:\stream.txt".
Important - Alternate data streams work only on NTFS volumes, so all this is not going to work on a FAT32 volume.
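If you would rather check this from code before trying to enumerate streams, one possible approach is to ask the volume whether it supports named streams. The sketch below is an assumption on my part and not something the demo does: VolumeSupportsNamedStreams() is a hypothetical name, and it relies on GetVolumeInformationW() reporting the FILE_NAMED_STREAMS flag. Outside Windows it simply returns false.

```cpp
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: does the volume with this root (e.g. L"C:\\")
// support named streams? Returns false if the volume cannot be queried.
bool VolumeSupportsNamedStreams(const wchar_t* rootPath)
{
#ifdef _WIN32
    DWORD flags = 0;
    if (!GetVolumeInformationW(rootPath, NULL, 0, NULL, NULL, &flags, NULL, 0))
        return false;                       // invalid root or inaccessible volume
    return (flags & FILE_NAMED_STREAMS) != 0;
#else
    (void)rootPath;                         // no NTFS-style streams to query here
    return false;
#endif
}
```

Note that the root path needs its trailing backslash, as in L"C:\\".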
Please mail your comments to siddharth_b@yahoo.com. I'd love to hear from you.