Introduction
If you have ever wanted to build a backup/restore feature that processes each file incrementally, here is a small, ready-to-use library that demonstrates the power of the IRdcLibrary COM API.
Background
COM, Win32 and C++, as usual. I don't tolerate more beginners than me here :P.
Using the Code
The DIFF library can read from and write to files and memory. For reading data, you use Microsoft's IRdcFileReader interface. For your convenience, I provide 2 helper classes in the code that can provide the IRdcFileReader interface:
MemoryRdcFileReader, which can read from memory. The memory must persist while this interface exists (i.e., the interface does not copy the memory).
Constructor:
MemoryRdcFileReader(const char* dd,unsigned long long size);
FileRdcFileReader, which can read from an existing HANDLE. The handle must have been opened with GENERIC_READ and must be seekable.
Constructor:
FileRdcFileReader(HANDLE hF);
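The lifetime rule for MemoryRdcFileReader is easier to see in a portable stand-in. The sketch below is not the library's class: ViewReader and its Read method are hypothetical names, loosely modelled on the offset-based reads that an IRdcFileReader serves. It keeps only a pointer and a size, so the caller's buffer has to outlive it.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical, portable stand-in for a non-owning memory reader.
// It stores only a pointer and a size; nothing is copied.
class ViewReader {
public:
    ViewReader(const char* data, unsigned long long size)
        : data_(data), size_(size) {
        assert(data != nullptr || size == 0);
    }

    unsigned long long Size() const { return size_; }

    // Copies up to `want` bytes starting at `offset` into `out`.
    // Returns the number of bytes copied and sets `eof` when the
    // read reached (or started past) the end of the buffer.
    std::size_t Read(unsigned long long offset, std::size_t want,
                     char* out, bool& eof) const {
        if (offset >= size_) { eof = true; return 0; }
        const unsigned long long left = size_ - offset;
        const std::size_t n = static_cast<std::size_t>(
            std::min<unsigned long long>(want, left));
        std::memcpy(out, data_ + offset, n);
        eof = (offset + n == size_);
        return n;
    }

private:
    const char* data_;            // not owned: the caller keeps it alive
    unsigned long long size_;
};
```

If the buffer were a local std::vector<char> that went out of scope before the reader did, every later Read would touch freed memory, which is exactly what the "memory must persist" rule guards against.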
For writing data, my functions require an instance of a DIFFWRITER class. I also provide a FileDiffWriter (which can provide a DIFFWRITER from a HANDLE which must have been opened with GENERIC_WRITE) and a MemoryDiffWriter which writes to a vector<char>. These helper classes provide the member function GetReader(IRdcFileReader**) that returns a MemoryRdcFileReader or a FileRdcFileReader.
Incremental Backup
Let's say you have a file named 1.txt in c:\test, and you have backed it up to c:\test2. Now let's say that you have modified 1.txt in the original c:\test directory, so c:\test\1.txt is newer. Your incremental backup steps are now as follows (error checking is removed for simplicity; hX1 and hX2 are open handles to the old and the new 1.txt, hS1 and hS2 are handles to the two signature files):
CComPtr<IRdcFileReader> fil1;
CComPtr<IRdcFileReader> fil2;
fil1.Attach(new DIFF::FileRdcFileReader(hX1));
fil2.Attach(new DIFF::FileRdcFileReader(hX2));
DIFF::DIFF d;
DIFF::FileDiffWriter sig1(hS1);
DIFF::FileDiffWriter sig2(hS2);
d.GenerateSignature(fil1,sig1);
d.GenerateSignature(fil2,sig2);
This saves the RDC signatures of the old 1.txt and the new 1.txt to the two signature files.
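To give a feel for what a signature is, think of it as a compact list of per-chunk hashes that lets two sides compare files without exchanging their contents. The sketch below is a toy and not RDC: it assumes fixed-size blocks and the FNV-1a hash, whereas RDC derives its chunk boundaries from the data itself. Fnv1a and ToySignature are hypothetical names, not part of the library.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// FNV-1a, 64-bit: a simple non-cryptographic hash, used only for illustration.
inline std::uint64_t Fnv1a(const char* p, std::size_t n) {
    std::uint64_t h = 14695981039346656037ULL;    // FNV offset basis
    for (std::size_t i = 0; i < n; ++i) {
        h ^= static_cast<unsigned char>(p[i]);
        h *= 1099511628211ULL;                    // FNV prime
    }
    return h;
}

// Hypothetical toy signature: one hash per fixed-size block.
// The last block may be shorter than `block`.
inline std::vector<std::uint64_t> ToySignature(const std::string& data,
                                               std::size_t block) {
    assert(block > 0);
    std::vector<std::uint64_t> sig;
    for (std::size_t off = 0; off < data.size(); off += block) {
        const std::size_t n = std::min(block, data.size() - off);
        sig.push_back(Fnv1a(data.data() + off, n));
    }
    return sig;
}
```

Comparing two such lists entry by entry shows which blocks changed. Fixed-size blocks break down as soon as a byte is inserted, because every later block shifts, which is one reason real schemes pick boundaries from the content.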
Your next step is to generate a diff between the 2 files. My library provides you with 2 options: either you create a self-contained diff, or a standalone diff.
A self-contained diff allows you to reconstruct the updated file from the source file offline.
A standalone diff is a lot smaller, but it needs parts of the remote file. So if you are to use standalone diffs, you have to implement an IRdcFileReader yourself that downloads the requested portions of the updated file from a server. Since this needs a server, we will focus on self-contained diffs here.
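The difference between the two kinds of diff can be sketched with a toy model. This is not the RDC format and not the library's code: it assumes fixed-size blocks matched by content, and ToyCmd, ToyMakeDiff, ToyFetch and ToyReconstruct are hypothetical names. A self-contained diff embeds the new bytes, while a standalone diff only records which ranges of the new file are needed and asks a callback for them during reconstruction.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <vector>

// One toy diff command. Offsets and lengths are in bytes.
struct ToyCmd {
    enum Kind { CopyOld, Literal, NeedNew } kind;
    std::size_t offset;   // CopyOld: offset in old file; NeedNew: offset in new file
    std::size_t length;
    std::string bytes;    // Literal only: the new data itself
};

// Matches fixed-size blocks of `newer` against blocks of `older`.
// selfContained == true  -> unmatched blocks are embedded as Literal bytes.
// selfContained == false -> unmatched blocks become NeedNew references.
inline std::vector<ToyCmd> ToyMakeDiff(const std::string& older,
                                       const std::string& newer,
                                       std::size_t block, bool selfContained) {
    assert(block > 0);
    std::map<std::string, std::size_t> index;   // block content -> offset in old file
    for (std::size_t off = 0; off < older.size(); off += block)
        index.emplace(older.substr(off, block), off);

    std::vector<ToyCmd> diff;
    for (std::size_t off = 0; off < newer.size(); off += block) {
        const std::string chunk = newer.substr(off, block);
        const auto hit = index.find(chunk);
        if (hit != index.end())
            diff.push_back({ToyCmd::CopyOld, hit->second, chunk.size(), ""});
        else if (selfContained)
            diff.push_back({ToyCmd::Literal, 0, chunk.size(), chunk});
        else
            diff.push_back({ToyCmd::NeedNew, off, chunk.size(), ""});
    }
    return diff;
}

// Supplies a range of the new file, e.g. by downloading it from a server.
using ToyFetch = std::function<std::string(std::size_t offset, std::size_t length)>;

// Rebuilds the new file. `fetchNew` is only called for NeedNew commands,
// so it may be empty when the diff is self-contained.
inline std::string ToyReconstruct(const std::string& older,
                                  const std::vector<ToyCmd>& diff,
                                  const ToyFetch& fetchNew) {
    std::string out;
    for (const ToyCmd& c : diff) {
        switch (c.kind) {
        case ToyCmd::CopyOld: out += older.substr(c.offset, c.length); break;
        case ToyCmd::Literal: out += c.bytes; break;
        case ToyCmd::NeedNew:
            assert(fetchNew != nullptr);   // standalone diffs need a reader
            out += fetchNew(c.offset, c.length);
            break;
        }
    }
    return out;
}
```

In this model the standalone diff carries no file data at all, which is why it is smaller, and the fetch callback plays the role that a custom IRdcFileReader plays in the real library.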
Now, you want to create the diff between the old and the new file.
CComPtr<IRdcFileReader> xsig1;
CComPtr<IRdcFileReader> xsig2;
sig1.GetReader(&xsig1);
sig2.GetReader(&xsig2);
DIFF::MemoryDiffWriter diff1_2_sc;
d.GenerateDiff(xsig1,xsig2,r2,diff1_2_sc);
This generates a self-contained diff (if you pass NULL instead of r2 in the 3rd parameter, it generates a standalone diff). Instead of uploading your new file c:\test\1.txt, you can now upload this diff, which is smaller. How much smaller it is depends on how much the files differ.
Diff Restore
Now say that you have the original (master) backup of 1.txt and you want to restore the newer 1.txt. You need the original 1.txt and a self-contained diff file:
DIFF::FileDiffWriter reco1(hX3);
d.Reconstruct(r1,xdiff1,0,reco1);
And that's all. Your newer 1.txt file has been reconstructed from the original file and the diff. If the diff is a standalone diff, you would have to pass a reader as the 3rd parameter of DIFF::Reconstruct, which would be asked to provide portions of the updated file.
Points of Interest
- The API requires Vista or newer.
- The library runs ONLY in a multithreaded COM environment (CoInitializeEx(0, COINIT_MULTITHREADED);).
- The constructor throws an exception if the COM object cannot be created.
- There's a similarity API I must investigate more.
- The library uses smart pointers, both COM ATL and C++.
Have fun!
History
- 21st March, 2016: Implemented as .h file, with some bugfixes
- 21st October, 2015: A few bugfixes and some experimental multiple-sig stuff
- 6th January, 2015: First release (and Happy NY:))