Our latest addition to the open-source projects we feature on the site is tarlib, a small C++ library that you can use in Windows applications that need to handle TAR files. Most archiving tools for Windows support TAR archives, so if you just need to extract or create one, you can use any of them (my favorite is 7-zip). But when you need to do this from within your own app, things get a bit more complicated. Solutions already exist: you can use, for instance, the LZMA SDK (from 7-zip) or the commercial library Chilkat. My proposal is a library with a simple API that enables you to process TAR files with ease.
TAR Description
If you need tarlib, you must already know something about TAR files. Anyways, you can get more information in the following articles:
Here is a short summary of the TAR format:
- TAR archives consist of a series of objects, the most common being files and folders
- Each such object is preceded by a header (of 512 bytes)
- The information in the header is encoded in ASCII and numbers are written in octal
- The file data is written unaltered, but it is padded up to a multiple of 512 bytes
- The end of the archive is marked with at least two consecutive 512-byte entries filled with zeros
- There are different versions of the TAR format (UNIX V7, “old GNU” and GNU, STAR and POSIX) and different implementations
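To make the summary above more concrete, here is a minimal sketch of the POSIX (ustar) header layout together with three small helpers: one that decodes an octal ASCII field, one that rounds a data size up to the next 512-byte boundary, and one that recognizes a zero-filled block. The names UstarHeader, parse_octal, padded_size and is_zero_block are made up for this illustration and are not part of tarlib; other TAR variants interpret some header fields differently.

```cpp
#include <cstddef>
#include <cstdint>

// POSIX ustar header: every field is a fixed-width ASCII string.
// (Illustrative struct, not the header type used by tarlib.)
struct UstarHeader {
    char name[100];
    char mode[8];
    char uid[8];
    char gid[8];
    char size[12];      // size of the data in bytes, octal ASCII
    char mtime[12];
    char chksum[8];
    char typeflag;      // '0' or '\0' = regular file, '5' = directory
    char linkname[100];
    char magic[6];
    char version[2];
    char uname[32];
    char gname[32];
    char devmajor[8];
    char devminor[8];
    char prefix[155];
    char padding[12];   // fills the header up to one full block
};
static_assert(sizeof(UstarHeader) == 512, "a TAR header occupies one 512-byte block");

const std::size_t kBlockSize = 512;

// Decodes an octal ASCII field. Leading spaces are skipped and decoding
// stops at the first character that is not an octal digit (NUL or space).
std::uint64_t parse_octal(const char* field, std::size_t width) {
    std::uint64_t value = 0;
    std::size_t i = 0;
    while (i < width && field[i] == ' ') ++i;
    for (; i < width && field[i] >= '0' && field[i] <= '7'; ++i)
        value = value * 8 + static_cast<std::uint64_t>(field[i] - '0');
    return value;
}

// Number of bytes the data occupies in the archive, padding included.
std::uint64_t padded_size(std::uint64_t size) {
    return (size + kBlockSize - 1) / kBlockSize * kBlockSize;
}

// True if all 512 bytes of the block are zero (end-of-archive marker).
bool is_zero_block(const char* block) {
    for (std::size_t i = 0; i < kBlockSize; ++i)
        if (block[i] != 0) return false;
    return true;
}
```

For instance, a size field holding "00000001750" decodes to 1000 bytes, and those 1000 bytes occupy 1024 bytes (two blocks) in the archive, so the next header starts 1024 bytes after the end of the current one.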
Library
tarlib is written in C++ with Visual Studio and requires Windows XP or later (because of the file system APIs it uses, which were introduced with Windows XP). The library is provided as a set of C++ files (headers and cpps) that you can include in your application.
Note that the current version (v1.1):
- is able to read (and process) existing TAR files
- does not support creation of TAR files
- supports parsing tar objects representing files and folders (as these are the most common objects on Windows at least)
Library API
There are a few classes/structures the library provides for handling TAR files.
tarFile: the representation of a TAR file.
- bool open(std::string const &filename, tarFileMode mode, tarFormatType type): opens the specified TAR file for reading or writing (writing is not supported in v1.1)
- bool extract(std::string const &folder): extracts the content of the archive (files and folders) to the specified destination
- tarEntry get_first_entry(): retrieves the first entry in a TAR archive
- tarEntry get_next_entry(): retrieves the next entry in a TAR archive
- void rewind(): re-positions the file cursor at the beginning of the archive
tarEntry: represents an object in a TAR file. It contains the header for the entry and methods to process the entry:
- bool is_empty(): indicates whether this is an empty entry (empty entries are used to mark the end of the archive)
- bool is_md5(): indicates whether this is an entry that contains the MD5 hash of the actual TAR file (always found at the end of the archive)
- void rewind(): re-positions the file cursor at the beginning of the object’s data (so you can read it again)
- bool extract(std::string const &folder): extracts the current entry (file or folder) to the specified folder
- size_t read(char* buffer, size_t chunksize = tarChunkSize): reads from the current position in the object’s data to the provided buffer; this function does not read past the end of the object’s data
- static tarEntry makeEmpty(): creates a tarEntry representing an empty object
- static tarEntry makeMD5(char* buffer, size_t size): creates a tarEntry from a buffer containing the MD5 hash for the TAR object
Examples
Example 1: Extract a TAR archive to a specified folder using tarFile:
void extract1(std::string const &filename, std::string const &destination)
{
   // open the archive for reading and extract everything in one call
   tarFile tarf(filename, tarModeRead);
   tarf.extract(destination);
}
Example 2: Extract a TAR archive to a specified folder using a loop that iterates through the entries of the TAR archive:
void extract2(std::string const &filename, std::string const &destination)
{
   tarFile tarf(filename, tarModeRead);
   tarEntry entry = tarf.get_first_entry();
   // a while loop (rather than do/while) also copes with an archive that has no entries
   while(!entry.is_empty())
   {
      if(entry.header.indicator == tarEntryDirectory)
      {
         createfolder(path_combine(destination, entry.header.filename));
      }
      else if(entry.header.indicator == tarEntryNormalFile ||
              entry.header.indicator == tarEntryNormalFileNull)
      {
         entry.extractfile_to_folder(destination);
      }
      entry = tarf.get_next_entry();
   }
}
Example 3: A simplified version of the 2nd example:
void extract3(std::string const &filename, std::string const &destination)
{
   tarFile tarf(filename, tarModeRead);
   tarEntry entry = tarf.get_first_entry();
   // extract() handles both files and folders, so no type check is needed here
   while(!entry.is_empty())
   {
      entry.extract(destination);
      entry = tarf.get_next_entry();
   }
}
Example 4: Explicitly process the entries of a TAR file (no automatic extraction to disk; the data can be processed in memory):
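// requires <list> and <string> in addition to the tarlib headers
void extract4(std::string const& filename)
{
   tarFile tarf(filename, tarModeRead);
   std::list<tarEntry> entries;

   // first pass: collect all the entries of the archive
   tarEntry entry = tarf.get_first_entry();
   while(!entry.is_empty())
   {
      entries.push_back(entry);
      entry = tarf.get_next_entry();
   }

   // second pass: read the data of each file, chunk by chunk
   for(std::list<tarEntry>::iterator it = entries.begin();
       it != entries.end();
       ++it)
   {
      tarEntry& current = *it;
      if(current.header.indicator == tarEntryNormalFile ||
         current.header.indicator == tarEntryNormalFileNull)
      {
         current.rewind();
         char chunk[8*1024];
         size_t total = 0;
         while(total < current.header.filesize)
         {
            size_t readBytes = current.read(chunk, sizeof(chunk));
            if(readBytes == 0)
               break; // nothing more could be read; avoid looping forever
            // process the readBytes bytes available in chunk here
            total += readBytes;
         }
      }
   }
}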
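The read loop in Example 4 can be tried out in isolation, without a TAR file at hand. The sketch below is only an illustration: FakeEntry is a hypothetical in-memory stand-in that mimics the read(buffer, chunksize) signature of tarEntry, and read_all is a made-up helper that collects the chunks into a string. It assumes that read returns 0 once no more data is available, which is worth verifying against the actual tarlib sources.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for tarEntry (not part of tarlib):
// it serves bytes from an in-memory string instead of a TAR file.
struct FakeEntry {
    std::string data;
    std::size_t pos;

    explicit FakeEntry(const std::string& d) : data(d), pos(0) {}

    // Copies at most chunksize bytes into buffer and returns the number
    // of bytes copied; returns 0 once all the data has been consumed.
    std::size_t read(char* buffer, std::size_t chunksize) {
        std::size_t n = std::min(chunksize, data.size() - pos);
        std::memcpy(buffer, data.data() + pos, n);
        pos += n;
        return n;
    }
};

// Reads up to expected_size bytes from the entry, chunk by chunk,
// and returns them as a single string.
template <typename Entry>
std::string read_all(Entry& entry, std::size_t expected_size) {
    std::string content;
    char chunk[8 * 1024];
    while (content.size() < expected_size) {
        std::size_t got = entry.read(chunk, sizeof(chunk));
        if (got == 0)
            break;  // short read: stop instead of looping forever
        content.append(chunk, got);
    }
    return content;
}
```

Because read_all is a template, the same loop should work with any type that offers a compatible read method; with a real tarEntry you would pass entry.header.filesize as the expected size.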