Introduction
For the beginning .NET developer, one of the most important underlying concepts is the stream. A stream can be written to and read from. Conceptually, it is a pipe with two ends, much like a network connection; either way, the overall concept is input and output. This article is meant to give the beginner a sharper focus on I/O as a whole, with the end goal of approaching streams, encryption, remoting, and .NET network programming in general.
The System.IO namespace defines the classes used to navigate and manipulate files, directories, and drives. The file system classes fall into two groups: informational and utility. Most of the informational classes derive from the FileSystemInfo base class. These classes, FileInfo and DirectoryInfo, expose the system information about files and directories. In addition, the DriveInfo class represents a drive in the file system. Although it is still an informational class, it does not derive from FileSystemInfo because it does not share the same behavior (for example, you can delete folders, but not drives). The utility classes (File, Directory, and Path) provide static methods to perform operations on file system objects such as files, directories, and file system paths.
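As a rough illustration of the utility classes, the sketch below uses only static methods of Path, Directory, and File. The scratch folder name io-sketch and the file name a.txt are made up for the example, and the class and method names are hypothetical.

```csharp
using System;
using System.IO;

public static class UtilitySketch
{
    // Creates a scratch directory and a file in it, then reports whether the file exists.
    public static bool CreateScratchFile(string dir, string fileName)
    {
        Directory.CreateDirectory(dir);               // does nothing if it already exists
        string path = Path.Combine(dir, fileName);    // joins with the platform separator
        File.WriteAllText(path, "hello");
        return File.Exists(path);
    }

    public static void Main()
    {
        // Hypothetical scratch location under the user's temp folder.
        string dir = Path.Combine(Path.GetTempPath(), "io-sketch");
        Console.WriteLine(CreateScratchFile(dir, "a.txt"));
        Console.WriteLine(Path.GetExtension("a.txt"));
        Directory.Delete(dir, true);                  // remove the folder and its contents
    }
}
```

No object is created for the file or the directory here; every call takes a path string, which is the main difference from the informational classes.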
A Note on File System Management
The .NET Framework’s I/O is built on top of standard Win32 functionality. Instead of working with and manipulating HANDLEs directly, that is, via calls to APIs like OpenFile, ReadFile, WriteFile, and CloseHandle, you are given higher-level abstractions that manage the HANDLEs for you behind the scenes. A set of methods across the FileInfo and DirectoryInfo classes (and a few on FileStream) enables you to interact with the file system. Consider the following code, but do not try to understand it yet, as its parts will be explained:
using System;
using System.IO;
public sealed class Program {
public static void Main()
{
DirectoryInfo dInfo = new DirectoryInfo(@"C:\Program Files\");
DirectoryInfo[] dirs = dInfo.GetDirectories("*", SearchOption.TopDirectoryOnly);
Console.WriteLine("{0} subdirectories:", dInfo.FullName);
foreach (DirectoryInfo subDir in dirs)
{
Console.WriteLine(" {0}", subDir.Name);
}
FileInfo[] files = dInfo.GetFiles();
Console.WriteLine("files:");
foreach (FileInfo file in files)
{
Console.WriteLine(" {0} ({1} bytes)", file.Name, file.Length);
}
}
}
Yields this output:
C:\Program Files\ subdirectories:
Business Objects
CE Remote Tools
Common Files
Internet Explorer
Microsoft SQL Server
Microsoft SQL Server Compact Edition
Microsoft Synchronization Services
etc.
files:
desktop.ini (174 bytes)
How to Get Information about a File
To obtain information about a specific file, create a new FileInfo object by using the path to the file and then access the FileInfo object’s properties. Recall that FileSystemInfo is the base class and provides the basic functionality for all informational file system classes. Below is a list of the most important FileSystemInfo properties:
Attributes: gets or sets the FileAttributes of the current file or directory.
CreationTime: gets or sets the time the current file or directory was created.
Exists: determines whether the file or directory exists.
FullName: gets the full path to the file or directory.
LastAccessTime: gets or sets the time the file or directory was last accessed.
LastWriteTime: gets or sets the time the file or directory was last written to.
Name: gets the name of the file or directory.
So a FileInfo object must be created before its properties can be accessed, in this case Exists, FullName, and Name (all defined by the FileSystemInfo class). Admittedly this is very basic code, and the experienced developer can skip it:
using System;
using System.IO;
public sealed class Program {
public static void Main() {
FileInfo theFile = new FileInfo(@"C:\Windows\System32\Config.NT");
if (theFile.Exists)
{
Console.WriteLine("Filename: {0}", theFile.Name);
Console.WriteLine("Path: {0}", theFile.FullName);
}
}
}
OUTPUT
Filename: Config.NT
Path: C:\Windows\System32\Config.NT
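The other FileSystemInfo properties are read the same way. Below is a minimal sketch that assumes a throwaway file created with Path.GetTempFileName rather than a real system file; the helper name IsReadOnly is hypothetical and simply tests one flag of the Attributes property.

```csharp
using System;
using System.IO;

public static class FileInfoSketch
{
    // Returns true when the read-only flag is set in the entry's attributes.
    public static bool IsReadOnly(FileSystemInfo info)
    {
        return (info.Attributes & FileAttributes.ReadOnly) == FileAttributes.ReadOnly;
    }

    public static void Main()
    {
        string path = Path.GetTempFileName();   // creates an empty temporary file
        FileInfo info = new FileInfo(path);
        Console.WriteLine("Created:   {0}", info.CreationTime);
        Console.WriteLine("Written:   {0}", info.LastWriteTime);
        Console.WriteLine("Read-only: {0}", IsReadOnly(info));
        info.Delete();                          // clean up the temporary file
    }
}
```

Because the helper takes a FileSystemInfo, it accepts a DirectoryInfo just as well as a FileInfo.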
In addition to exposing data about a file, the FileInfo object allows operations to be performed on the file. Once a valid FileInfo object is obtained, all you have to do is call the CopyTo method to make a copy of your file, as the code below shows:
using System;
using System.IO;
public sealed class Program {
public static void Main() {
FileInfo theFile = new FileInfo(@"C:\Windows\System32\Config.NT");
theFile.CopyTo(@"c:\file.txt");
}
}
For output:
c:\Windows\Microsoft.NET\Framework\v2.0.50727>csc.exe CopytheFile.cs
c:\Windows\Microsoft.NET\Framework\v2.0.50727>type c:\file.txt
REM Windows MS-DOS Startup File
REM
REM CONFIG.SYS vs CONFIG.NT
REM CONFIG.SYS is not used to initialize the MS-DOS environment.
REM CONFIG.NT is used to initialize the MS-DOS environment unless a
and so on ….
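One detail worth knowing about CopyTo: the single-argument overload throws an IOException if the destination already exists, while the two-argument overload lets you ask for an overwrite. The sketch below demonstrates both, assuming temporary files so that nothing under C:\ is touched; the class and method names are hypothetical.

```csharp
using System;
using System.IO;

public static class CopySketch
{
    // Copies source to dest, replacing dest if it is already there.
    public static FileInfo CopyOverwrite(string source, string dest)
    {
        FileInfo src = new FileInfo(source);
        return src.CopyTo(dest, true);            // second argument: overwrite
    }

    public static void Main()
    {
        string source = Path.GetTempFileName();
        string dest = source + ".copy";
        File.WriteAllText(source, "first");
        CopyOverwrite(source, dest);
        try
        {
            new FileInfo(source).CopyTo(dest);    // destination exists by now
        }
        catch (IOException ex)
        {
            Console.WriteLine("Copy refused: {0}", ex.Message);
        }
        File.Delete(source);
        File.Delete(dest);
    }
}
```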
The DirectoryInfo class follows the same underlying concept and provides the basic functionality to access and manipulate a single directory in the file system. Below are the DirectoryInfo class’s most important properties and methods:
DirectoryInfo Properties
Parent: gets the DirectoryInfo object for the parent directory of the current directory.
Root: gets the root portion of the directory’s path as a DirectoryInfo object.
DirectoryInfo Methods
Create: creates the directory described in the current DirectoryInfo object.
CreateSubdirectory: creates a new directory as a child directory of the current directory.
GetDirectories: retrieves an array of DirectoryInfo objects that represent the subdirectories of the current directory.
GetFiles: retrieves an array of FileInfo objects that represent all the files in the current directory.
GetFileSystemInfos: retrieves an array of FileSystemInfo objects for both the files and the subdirectories in the current directory.
MoveTo: moves the current directory to a new location.
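A short sketch of several of these members working together follows. It assumes a scratch folder named dirinfo-sketch under the temp directory (a made-up name), creates one subdirectory and one file in it, and then lists both through GetFileSystemInfos.

```csharp
using System;
using System.IO;

public static class DirectorySketch
{
    // Builds root\logs plus one file, and returns how many entries root contains.
    public static int BuildAndCount(string rootPath)
    {
        DirectoryInfo root = new DirectoryInfo(rootPath);
        root.Create();                               // safe if it already exists
        root.CreateSubdirectory("logs");
        File.WriteAllText(Path.Combine(root.FullName, "readme.txt"), "demo");
        return root.GetFileSystemInfos().Length;     // files and subdirectories together
    }

    public static void Main()
    {
        string rootPath = Path.Combine(Path.GetTempPath(), "dirinfo-sketch");
        Console.WriteLine("Entries: {0}", BuildAndCount(rootPath));
        foreach (FileSystemInfo entry in new DirectoryInfo(rootPath).GetFileSystemInfos())
        {
            string kind = entry is DirectoryInfo ? "dir " : "file";
            Console.WriteLine("{0} {1}", kind, entry.Name);
        }
        Directory.Delete(rootPath, true);            // remove the scratch folder
    }
}
```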
Accessing the files in a directory is much like accessing file information. To enumerate the files in a directory, create a DirectoryInfo object by passing the path to the directory to its constructor (using the new operator), and then call the GetFiles method:
using System;
using System.IO;
public sealed class Program {
public static void Main() {
DirectoryInfo theDir = new DirectoryInfo(@"C:\Windows\");
Console.WriteLine("Directory: {0}", theDir.FullName);
foreach (FileInfo file in theDir.GetFiles())
{
Console.WriteLine("File: {0}", file.Name);
}
}
}
OUTPUT
Directory: C:\Windows\
File: bfsvc.exe
File: bootstat.dat
File: bthservsdp.dat
File: csup.txt
File: DtcInstall.log
File: explorer.exe
File: fveupdate.exe
and so on …
Finally in this section, the DriveInfo class models a drive and provides methods and properties to query for drive information. Use DriveInfo to determine which drives are available and what type of drives they are. Call the static GetDrives method of the DriveInfo class and loop through the array of DriveInfo objects it returns. When writing this information to the console, pass the property you want as the argument that fills the {0} placeholder in the format string:
using System;
using System.IO;
public class Program {
public static void Main() {
DriveInfo[] drives = DriveInfo.GetDrives();
foreach (DriveInfo drive in drives)
{
Console.WriteLine("Drive: {0}", drive.Name);
Console.WriteLine("Type: {0}", drive.DriveType);
}
}
}
OUTPUT
Drive: C:\
Type: Fixed
Drive: D:\
Type: Removable
Drive: E:\
Type: Removable
Drive: F:\
Type: CDRom
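Some DriveInfo properties, such as DriveFormat and TotalFreeSpace, can throw if the drive is not ready (an empty CD drive, for instance). A hedged sketch of guarding against that with the IsReady property is shown below; the Describe helper is a hypothetical name, and on unusual mounts the properties may still fail for other reasons such as missing permissions.

```csharp
using System;
using System.IO;

public static class DriveSketch
{
    // Describes a drive without touching properties that need a ready drive.
    public static string Describe(DriveInfo drive)
    {
        if (!drive.IsReady)
        {
            return drive.Name + " (not ready)";
        }
        return string.Format("{0} {1}, {2} bytes free",
            drive.Name, drive.DriveFormat, drive.TotalFreeSpace);
    }

    public static void Main()
    {
        foreach (DriveInfo drive in DriveInfo.GetDrives())
        {
            Console.WriteLine(Describe(drive));
        }
    }
}
```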
Reading and Writing Files: Understanding Streams
A stream is a pipe connecting a source and a destination and, as such, serves as a common way to deal with both sequential and random access to data within the .NET Framework. Depending on its backing store, a stream may support reading, writing, seeking, or any combination of these. In the .NET Framework, all streams derive from the abstract Stream base class, which provides the basic interface and implementation for every stream in the Framework. Examine the Stream properties:
CanRead: determines whether the stream supports reading.
CanSeek: determines whether the stream supports seeking.
CanTimeout: determines whether the stream can time out.
CanWrite: determines whether the stream can be written to.
Length: gets the length in bytes of the stream.
Position: gets or sets the position of the virtual cursor that marks the current location in the stream.
ReadTimeout: gets or sets the stream’s timeout for read operations.
WriteTimeout: gets or sets the stream’s timeout for write operations.
Stream’s Methods
Close: closes the stream and releases any resources associated with it.
Flush: clears any buffers within the stream and forces changes to be written to the underlying system or device.
Read: performs a sequential read of a specified number of bytes from the current position.
ReadByte: performs the read of a single byte and updates the position by moving it by one.
Seek: sets the position within the stream.
SetLength: specifies the length of the stream.
Write: writes information to the stream as a number of bytes.
WriteByte: writes a single byte to the stream and updates the position.
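To see several of these members in action without touching the disk, the sketch below uses a MemoryStream, which is one concrete Stream implementation. This is only an illustration under that assumption; the helper name WriteThenReadFirst is hypothetical.

```csharp
using System;
using System.IO;

public static class StreamMembersSketch
{
    // Writes the bytes, rewinds to the start, and reads the first byte back.
    public static int WriteThenReadFirst(Stream s, byte[] data)
    {
        s.Write(data, 0, data.Length);   // Position advances by data.Length
        s.Flush();
        s.Seek(0, SeekOrigin.Begin);     // only valid when CanSeek is true
        return s.ReadByte();             // -1 would mean end of stream
    }

    public static void Main()
    {
        using (Stream s = new MemoryStream())
        {
            Console.WriteLine("CanRead={0} CanSeek={1} CanWrite={2} CanTimeout={3}",
                s.CanRead, s.CanSeek, s.CanWrite, s.CanTimeout);
            int first = WriteThenReadFirst(s, new byte[] { 65, 66, 67 });
            Console.WriteLine("First={0} Length={1} Position={2}",
                first, s.Length, s.Position);
        }
    }
}
```

Because the helper is written against the abstract Stream type, the same code would work with a FileStream or any other seekable, writable stream.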
Reading Data
Each stream has a single data source, representing a backing store of bytes that can be read from or written to. This is why streams have a common base class: working with data as a flow is a common way to work with data. The source has an end point, called End-of-Stream (EOS): ReadByte signals it by returning -1, and Read signals it by returning 0. You can read single bytes or blocks of bytes at a time. Earlier in this article, I created a file by copying the contents of config.nt to c:\file.txt. The following code reads that file back; it could be written using either of the instance methods int Read(byte[] buffer, int offset, int count) or int ReadByte():
using System;
using System.IO;
public sealed class Program {
public static void Main()
{
using (Stream s = new FileStream(@"c:\file.txt", FileMode.Open))
{
int read;
while ((read = s.ReadByte()) != -1)
{
Console.Write("{0} ", read);
}
}
}
}
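The listing above uses ReadByte. For comparison, here is a minimal sketch of the block-oriented Read(byte[], int, int) method, which returns the number of bytes actually read and 0 at the end of the stream. The 4096-byte buffer size and the temporary file are arbitrary choices for the example, and the class name is hypothetical.

```csharp
using System;
using System.IO;

public static class BlockReadSketch
{
    // Counts the bytes in a stream by reading it in fixed-size blocks.
    public static long CountBytes(Stream s)
    {
        byte[] buffer = new byte[4096];
        long total = 0;
        int read;
        // Read returns 0 at end of stream and may return fewer bytes than requested.
        while ((read = s.Read(buffer, 0, buffer.Length)) > 0)
        {
            total += read;
        }
        return total;
    }

    public static void Main()
    {
        string path = Path.GetTempFileName();
        File.WriteAllBytes(path, new byte[10000]);
        using (Stream s = new FileStream(path, FileMode.Open, FileAccess.Read))
        {
            Console.WriteLine("{0} bytes", CountBytes(s));
        }
        File.Delete(path);
    }
}
```

Reading in blocks avoids one method call per byte, which usually matters for anything larger than a small file.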
You will notice the FileMode enumeration. It belongs to a group of types that facilitate reading and writing data. Most such operations begin with the File class, which can perform several types of operations:
- Atomic operations to read or write all the contents of a file.
- Operations to create or open files for writing.
The File Class Static/Shared Methods
AppendAllText: appends a specified string to an existing file.
AppendText: opens a file (or creates a new file) and returns a StreamWriter that is prepared to append text to the file.
Copy: copies a file to a new file.
Create: creates a new file and returns a FileStream object.
CreateText: creates or opens a file and returns a StreamWriter object that is ready to have text written to it.
Move: moves a file from one place to another.
Open: opens an existing file and returns a FileStream object.
OpenRead: opens an existing file and returns a read-only FileStream object.
OpenText: opens an existing file and returns a StreamReader object.
OpenWrite: opens an existing file (or creates a new one) for writing and returns a FileStream object.
ReadAllBytes: opens a file, reads the contents of the file into a byte array, and closes the file in one atomic operation.
ReadAllLines: opens a file, reads the contents of the file into an array of strings (one per line), and closes the file in one atomic operation.
ReadAllText: opens a file, reads the contents of it into a string, and closes the file in one atomic operation.
WriteAllBytes: opens a file, writes the contents of a byte array into it, and closes the file in one atomic operation.
WriteAllLines: opens a file, writes the contents of a string array into it, and closes the file in one atomic operation.
WriteAllText: opens a file, writes the contents of a string into it, and closes the file in one atomic operation.
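The whole-file methods are the quickest way to get started, since no stream has to be opened or closed by hand. A small sketch, assuming a temporary file and hypothetical class and method names:

```csharp
using System;
using System.IO;

public static class FileStaticSketch
{
    // Writes two lines, appends a third, and returns the lines read back.
    public static string[] WriteAppendRead(string path)
    {
        File.WriteAllLines(path, new string[] { "one", "two" });
        File.AppendAllText(path, "three" + Environment.NewLine);
        return File.ReadAllLines(path);
    }

    public static void Main()
    {
        string path = Path.GetTempFileName();
        foreach (string line in WriteAppendRead(path))
        {
            Console.WriteLine(line);
        }
        File.Delete(path);
    }
}
```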
The FileAccess enumeration provides members that are used to determine the rights required when opening a file. Below are the FileAccess members:
Read: specifies that the file should be opened for read-only access.
Write: specifies that the file should be opened to be written to.
ReadWrite: specifies full access.
Now recall the code above and consider the FileMode enumeration:
Append: opens a file and moves the pointer in the FileStream to the end of the file. Can be used only with FileAccess.Write.
Create: creates a new file. If the file already exists, it is overwritten.
CreateNew: creates a new file. If the file already exists, an IOException is thrown.
Open: opens an existing file.
OpenOrCreate: opens an existing file. If the file does not exist, it is created.
Truncate: opens an existing file but empties it so that it is zero bytes long.
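The difference between FileMode.Create and FileMode.CreateNew is easiest to see by running both against the same path. In the sketch below, CreateNew fails with an IOException the second time because the file is already there, while Create silently replaces it. The randomly named temporary file and the TryCreateNew helper are assumptions made for the example.

```csharp
using System;
using System.IO;

public static class FileModeSketch
{
    // Returns true if CreateNew succeeded, false if an IOException was raised
    // (which is what happens when the file is already there).
    public static bool TryCreateNew(string path)
    {
        try
        {
            using (FileStream fs = new FileStream(path, FileMode.CreateNew, FileAccess.Write))
            {
                fs.WriteByte(1);
            }
            return true;
        }
        catch (IOException)
        {
            return false;
        }
    }

    public static void Main()
    {
        string path = Path.Combine(Path.GetTempPath(), Guid.NewGuid().ToString("N") + ".bin");
        Console.WriteLine(TryCreateNew(path));    // the file did not exist yet
        Console.WriteLine(TryCreateNew(path));    // the file exists now
        using (FileStream fs = new FileStream(path, FileMode.Create))
        {
            // Create replaces the existing file with an empty one.
        }
        Console.WriteLine(new FileInfo(path).Length);
        File.Delete(path);
    }
}
```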
An excellent tool for analyzing .NET Framework classes is .NET Reflector. It will both analyze and disassemble a class in order to find its data members and method members. Here are the members of the FileInfo class as the tool displays them:
public sealed class FileInfo : FileSystemInfo
{
private string _name;
public FileInfo(string fileName);
private FileInfo(SerializationInfo info, StreamingContext context);
internal FileInfo(string fullPath, bool ignoreThis);
public StreamWriter AppendText();
public FileInfo CopyTo(string destFileName);
public FileInfo CopyTo(string destFileName, bool overwrite);
public FileStream Create();
public StreamWriter CreateText();
[ComVisible(false)]
public void Decrypt();
public override void Delete();
[ComVisible(false)]
public void Encrypt();
public FileSecurity GetAccessControl();
public FileSecurity GetAccessControl(AccessControlSections includeSections);
public void MoveTo(string destFileName);
public FileStream Open(FileMode mode);
public FileStream Open(FileMode mode, FileAccess access);
public FileStream Open(FileMode mode, FileAccess access, FileShare share);
public FileStream OpenRead();
public StreamReader OpenText();
public FileStream OpenWrite();
[ComVisible(false)]
public FileInfo Replace(string destinationFileName,
string destinationBackupFileName);
[ComVisible(false)]
public FileInfo Replace(string destinationFileName,
string destinationBackupFileName, bool ignoreMetadataErrors);
public void SetAccessControl(FileSecurity fileSecurity);
public override string ToString();
public DirectoryInfo Directory { get; }
public string DirectoryName { get; }
public override bool Exists { get; }
public bool IsReadOnly { get; set; }
public long Length { get; }
public override string Name { get; }
}
The FileStream Class
The FileStream class provides the basic functionality to open file streams for reading and writing. Below are the FileStream type’s properties:
CanRead: determines whether the stream supports reading (inherited from the Stream class).
CanSeek: determines whether the stream supports seeking (inherited from the Stream class).
CanTimeout: determines whether the stream can time out (inherited from the Stream class).
CanWrite: determines whether the stream can be written to (inherited from the Stream class).
Handle: gets the stream’s underlying file handle.
Length: gets the length (in bytes) of the stream (inherited from the Stream class).
Name: gets the name of the file.
Position: gets or sets the position of the virtual cursor that marks the current location in the stream (inherited from the Stream class).
ReadTimeout: gets or sets the stream’s timeout for read operations (inherited from the Stream class).
WriteTimeout: gets or sets the stream’s timeout for write operations (inherited from the Stream class).
The FileStream’s Methods
Close: closes the stream and releases any resources associated with it.
Flush: clears any buffers within the stream and forces changes to be written to the underlying system or device.
Lock: prevents other processes from changing all or part of the file.
Read: performs a sequential read of a specified number of bytes from the current position and advances the position by the number of bytes actually read.
ReadByte: performs the read of a single byte and updates the position by moving it by one. This is equivalent to calling Read to read a single byte (inherited from the Stream class, as are Close and Flush).
Seek: sets the position within the stream (inherited from the Stream class).
SetLength: specifies the length of the stream.
Unlock: allows other processes to change all or part of the underlying file.
Write: writes information to the stream as a number of bytes and updates the current position to reflect the new write position.
Here is the code to exemplify the seek concept:
using System;
using System.IO;
public sealed class Program {
public static void Main()
{
using (Stream s = new FileStream(@"c:\note.txt", FileMode.Open))
{
s.Seek(8, SeekOrigin.Current);
Console.WriteLine(s.ReadByte());
s.Seek(0, SeekOrigin.Begin);
Console.WriteLine(s.ReadByte());
s.Seek(-1, SeekOrigin.End);
Console.WriteLine(s.ReadByte());
}
}
}
OUTPUT
98
35
10
The StreamReader Class
The StreamReader class provides the basic functionality to read text from a Stream-derived class. Before turning to it, here is an example that works on the raw bytes instead, copying one stream to another with the Read and Write methods:
using System;
using System.IO;
public sealed class Program {
public static void Main()
{
using (Stream from = new FileStream(@"C:\note.txt", FileMode.Open))
using (Stream to = new FileStream(@"C:\Note2.txt", FileMode.Create)) // Create, so a longer existing Note2.txt leaves no stale bytes behind
{
int readCount;
byte[] buffer = new byte[1024];
while ((readCount = from.Read(buffer, 0, 1024)) != 0)
{
to.Write(buffer, 0, readCount);
}
}
}
}
Notice the output. The file Note.txt is a Perl script meant to calculate the CRC of a file:
C:\Windows\MICROS~1.NET\FRAMEW~1\V20~1.507>type c:\note.txt
#! /usr/bin/perl -w
# computes and prints to stdout the CRC-32 values of the given files
use lib qw( blib/lib lib );
use Archive::Zip;
use FileHandle;
my $totalFiles = scalar(@ARGV);
foreach my $file (@ARGV) {
if ( -d $file ) {
warn "$0: ${file}: Is a directory\n";
next;
}
my $fh = FileHandle->new();
if ( !$fh->open( $file, 'r' ) ) {
warn "$0: $!\n";
next;
}
binmode($fh);
my $buffer;
my $bytesRead;
my $crc = 0;
while ( $bytesRead = $fh->read( $buffer, 32768 ) ) {
$crc = Archive::Zip::computeCRC32( $buffer, $crc );
}
printf( "%08x", $crc );
print("\t$file") if ( $totalFiles > 1 );
print("\n");
}
C:\Windows\MICROS~1.NET\FRAMEW~1\V20~1.507>type c:\note2.txt
#! /usr/bin/perl -w
# computes and prints to stdout the CRC-32 values of the given files
use lib qw( blib/lib lib );
use Archive::Zip;
use FileHandle;
and so on ….
To the beginning .NET developer, it should be clear how to perform a simple operation like reading from a file, because opening a file is a common occurrence. In the simplest sense, you ask the File class to open a stream by specifying the path to the file. When you open a file to read its contents, you pass FileMode.Open to open the existing file and FileAccess.Read to get read-only access to it. The File.Open method returns a FileStream object. A file stream is just that: a stream, and you can read it by calling the Read or ReadByte method of the Stream class. To read the file as text, you can simply create a new StreamReader object that wraps the FileStream, as shown below:
using System;
using System.IO;
public sealed class Program
{
public static void Main()
{
FileStream myFile = File.Open(@"C:\Note.txt", FileMode.Open, FileAccess.Read);
StreamReader rdr = new StreamReader(myFile);
Console.Write(rdr.ReadToEnd());
rdr.Close();
myFile.Close();
}
}
Executing this program prints the same Perl script that forms the contents of Note.txt. Now suppose we use the earlier byte-reading code to read Note.txt one byte at a time:
using System;
using System.IO;
public sealed class Program {
public static void Main()
{
using (Stream s = new FileStream(@"c:\Note.txt", FileMode.Open))
{
int read;
while ((read = s.ReadByte()) != -1)
{
Console.Write("{0} ", read);
}
}
}
}
The output is the raw byte values:
35 33 32 47 117 115 114 47 98 105 110 47 112 101 114 108 32 45 119 13 10 35 32 9
9 111 109 112 117 116 101 115 32 97 110 100 32 112 114 105 110 116 115 32 116 11
1 32 115 116 100 111 117 116 32 116 104 101 32 67 82 67 45 51 50 32 118 97 108 1
17 101 115 32 111 102 32 116 104 101 32 103 105 118 101 110 32 102 105 108 101 1
15 13 10 117 115 101 32 108 105 98 32 113 119 40 32 98 108 105 98 47 108 105 98
32 108 105 98 32 41 59 13 10 117 115 101 32 65 114 99 104 105 118 101 58 58 90 1
and so on …..
So let’s try reading from the stream and casting each byte to a char:
using System;
using System.IO;
public sealed class Program {
public static void Main()
{
using (Stream s = new FileStream(@"c:\Note.txt", FileMode.Open))
{
int read;
while ((read = s.ReadByte()) != -1)
{
Console.Write("{0} ", (char)read);
}
}
}
}
OUTPUT
! / u s r / b i n / p e r l - w
# c o m p u t e s a n d p r i n t s t o s t d o u t t h e C R C -
3 2 v a l u e s o f t h e g i v e n f i l e s
u s e l i b q w ( b l i b / l i b l i b ) ;
u s e A r c h i v e : : Z i p ;
u s e F i l e H a n d l e ;
and so on ..
Now with a different Note.txt (a log file created by a malware detection tool), let’s use StreamReader
to read entire lines at a time:
using System;
using System.IO;
using System.IO.Compression;
using System.Text;
public sealed class Program {
public static void Main()
{
Stream s = new FileStream(@"c:\Note.txt", FileMode.Open);
using (StreamReader sr = new StreamReader(s, Encoding.UTF8))
{
string line;
while ((line = sr.ReadLine()) != null)
{
Console.WriteLine(line);
}
}
}
}
The good old output:
Full Scan: running (events: 10, objects: 28968, time: 00:10:13)
5/24/2009 4:58:23 PM Task completed
5/24/2009 4:54:53 PM Task started
Full Scan: running (events: 10, objects: 28968, time: 00:10:13)
5/24/2009 9:52:28 PM Detected: http://www.viruslist.com/en/advisories/30285
c:\program files\microsoft office\office12\winword.exe
5/24/2009 9:52:17 PM Detected: http://www.viruslist.com/en/advisories/33954
and so on . . . .
The StreamReader class properties are BaseStream, CurrentEncoding, and EndOfStream, while the StreamWriter class properties are AutoFlush, BaseStream, Encoding, and NewLine.
The StreamReader class methods are:
Close: closes the reader and the underlying stream.
Peek: returns the next character in the stream without moving the stream’s current position.
Read: reads the next character or set of characters in the stream.
ReadBlock: reads the next block of characters in the stream.
ReadLine: reads the next line of characters in the stream.
ReadToEnd: reads all the characters through to the end of the stream.
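StreamWriter is the mirror image of StreamReader, so a round trip is a natural illustration. The sketch below writes a few lines with StreamWriter and counts them again with StreamReader, using the EndOfStream property as the loop condition. It assumes UTF-8 text and a temporary file; the class and method names are hypothetical.

```csharp
using System;
using System.IO;
using System.Text;

public static class ReaderWriterSketch
{
    // Writes the lines with StreamWriter, then counts them with StreamReader.
    public static int WriteThenCountLines(string path, string[] lines)
    {
        using (StreamWriter writer = new StreamWriter(path, false, Encoding.UTF8))
        {
            foreach (string line in lines)
            {
                writer.WriteLine(line);
            }
        }   // disposing the writer flushes its buffer and closes the file

        int count = 0;
        using (StreamReader reader = new StreamReader(path, Encoding.UTF8))
        {
            while (!reader.EndOfStream)
            {
                reader.ReadLine();
                count++;
            }
        }
        return count;
    }

    public static void Main()
    {
        string path = Path.GetTempFileName();
        string[] lines = new string[] { "alpha", "beta", "gamma" };
        Console.WriteLine("{0} lines", WriteThenCountLines(path, lines));
        File.Delete(path);
    }
}
```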
Understanding Readers and Writers
There are two general kinds of readers and writers in the .NET Framework: text and binary. StreamReader and StreamWriter are the text-based classes that enable you to read from and write to streams. They perform automatic encoding and decoding of text, converting raw bytes into characters upon reading (and doing the reverse while writing). Similarly, the binary-based classes enable you to read and write values in the underlying stream as any primitive data type. The StreamReader class derives from the abstract TextReader class, and the StreamWriter class derives from the abstract TextWriter class. These abstract classes represent the basic interface for all text-based readers and writers. For example, there are additional StringReader and StringWriter classes, the purpose of which is to read from and write to in-memory strings. To read a string using StringReader, use code like the following:
using System;
using System.IO;
using System.Text;
public sealed class Program {
public static void Main() {
string s = @"Hi there
this is a multiline
text string";
StringReader sr = new StringReader(s);
while (sr.Peek() != -1)
{
string line = sr.ReadLine();
Console.WriteLine(line); // ReadLine strips the line break, so WriteLine puts it back
}
}
}
Conversely, you can write a string using StringWriter (StringWriter uses a StringBuilder internally, so you can work with large strings):
using System;
using System.IO;
using System.Text;
public sealed class Program {
public static void Main() {
StringWriter writer = new StringWriter();
writer.WriteLine("Introducing a way to efficiently write to");
writer.WriteLine("standard output by using the StringWriter");
writer.WriteLine("class, which is backed by a StringBuilder");
Console.WriteLine(writer);
}
}
OUTPUT
Introducing a way to efficiently write to
standard output by using the StringWriter
class, which is backed by a StringBuilder
The BinaryReader and BinaryWriter classes can be used to get binary data to and from streams. For example, if you want to create a new file to store binary data, use the BinaryWriter class to write various types of data to the stream, like so:
using System;
using System.IO;
public sealed class Program {
public static void Main() {
FileStream fs = File.Create(@"c:\somefile.bin");
BinaryWriter writer = new BinaryWriter(fs);
long number = 100;
byte[] bytes = new byte[] { 10, 20, 50, 100 };
string s = "is it practical?";
writer.Write(number);
writer.Write(bytes);
writer.Write(s);
writer.Close();
}
}
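Reading the data back with BinaryReader requires calling the matching Read methods in the same order the values were written. One point to watch: Write(byte[]) stores the raw bytes with no length prefix, so the reader has to know how many to ask for, whereas Write(string) stores a length prefix that ReadString uses. The sketch below is a hedged round trip using a temporary file instead of c:\somefile.bin; the class and method names are hypothetical.

```csharp
using System;
using System.IO;

public static class BinaryRoundTripSketch
{
    // Writes a long, four raw bytes, and a string, then reads them back in order.
    public static string RoundTrip(string path)
    {
        using (BinaryWriter writer = new BinaryWriter(File.Create(path)))
        {
            writer.Write(100L);
            writer.Write(new byte[] { 10, 20, 50, 100 });
            writer.Write("is it practical?");
        }
        using (BinaryReader reader = new BinaryReader(File.OpenRead(path)))
        {
            long number = reader.ReadInt64();
            byte[] bytes = reader.ReadBytes(4);   // raw bytes carry no length prefix
            string s = reader.ReadString();       // strings are length-prefixed
            return string.Format("{0}|{1}|{2}", number, bytes.Length, s);
        }
    }

    public static void Main()
    {
        string path = Path.GetTempFileName();
        Console.WriteLine(RoundTrip(path));
        File.Delete(path);
    }
}
```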
Compressing Streams
The .NET Framework provides two methods for compressing data: GZIP and DEFLATE. Both are industry-standard compression algorithms that are free from patent protection, so you are free to use either to compress your data. I will create a file named comp.txt, place it in the root of the C: drive, and then use the GZIP algorithm to compress the data. Here is some example code that I wrote to compress this data. The result is a .gz file that an archive tool such as WinZip can open, if you have one installed on your system:
using System;
using System.IO;
using System.IO.Compression;
public sealed class Program {
    public static void Main() {
        // The using blocks close all three streams, even if an exception occurs.
        using (FileStream sfile = File.OpenRead(@"C:\comp.txt"))
        using (FileStream dfile = File.Create(@"C:\comp.txt.gz"))
        using (GZipStream cStream = new GZipStream(dfile, CompressionMode.Compress))
        {
            int theByte = sfile.ReadByte();
            while (theByte != -1)
            {
                cStream.WriteByte((byte)theByte);
                theByte = sfile.ReadByte();
            }
        } // disposing cStream writes the remaining compressed data to dfile
    }
}
The decompression of a file involves using CompressionMode.Decompress
.
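As a hedged illustration of the reverse direction, the sketch below compresses a byte array with GZipStream and then decompresses it again with CompressionMode.Decompress. It works entirely in memory with MemoryStream so that no file paths need to be assumed; the class and method names and the 4096-byte buffer are choices made for the example.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

public static class GZipRoundTripSketch
{
    // Compresses the data into a new byte array.
    public static byte[] Compress(byte[] data)
    {
        using (MemoryStream output = new MemoryStream())
        {
            using (GZipStream gz = new GZipStream(output, CompressionMode.Compress))
            {
                gz.Write(data, 0, data.Length);
            }   // closing the GZipStream writes the final compressed block
            return output.ToArray();   // ToArray still works after the stream is closed
        }
    }

    // Reverses Compress by reading through a GZipStream in Decompress mode.
    public static byte[] Decompress(byte[] compressed)
    {
        using (MemoryStream input = new MemoryStream(compressed))
        using (GZipStream gz = new GZipStream(input, CompressionMode.Decompress))
        using (MemoryStream output = new MemoryStream())
        {
            byte[] buffer = new byte[4096];
            int read;
            while ((read = gz.Read(buffer, 0, buffer.Length)) > 0)
            {
                output.Write(buffer, 0, read);
            }
            return output.ToArray();
        }
    }

    public static void Main()
    {
        byte[] original = Encoding.UTF8.GetBytes("stream, stream, stream, stream");
        byte[] packed = Compress(original);
        byte[] restored = Decompress(packed);
        Console.WriteLine(Encoding.UTF8.GetString(restored));
    }
}
```

Note that in decompress mode the data is read from the GZipStream, while in compress mode it is written to it.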
History
- 27th May, 2009: Initial post