A Faster File.Copy

23 May 2014 · CPOL · 2 min read
Improving file transfer speed for large files

Introduction

I develop and maintain a media automation system for a satellite broadcaster. I often need to transfer files of 5GB and larger from one location to another. I used File.Copy or File.Move, making the typical assumption that Microsoft had already written highly efficient code for me in .NET. I was wrong! I typically achieved transfer speeds of about 60GB/hour over a gigabit network connection, depending on other traffic and disk activity. I thought I was doing reasonably well. I wasn't actually waiting for these files while they went through the various stages of processing, so I didn't really care.

Then I had to do a mass conversion of a 50TB data store of 15,000 files. I thought: 1000 hours! That would take 40 days! Well, it was a lot of data, so I had better get started. It was a two-step process: I sent each file to a server, where a program provided by an outside vendor processed it and sent it back to the data store. I was only achieving 25GB/hour, so I was beginning to get worried. Then I discovered the vendor's program was only achieving 5GB/hour writing back. Well, I raised a stink quickly. After a lot of wrangling, they changed some settings and all of a sudden they were achieving 160GB/hour!
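The time estimate above is plain division, and it shows why the slower rates were alarming. Here is a minimal sketch of that arithmetic. It assumes 1 TB = 1024 GB, and the class and method names are hypothetical helpers, not part of the production code.

```csharp
using System;

public static class TransferEstimate
{
    // Hypothetical helper: hours needed to move a data store at a sustained rate.
    // Assumes 1 TB = 1024 GB.
    public static double Hours (double terabytes, double gbPerHour)
    {
        return terabytes * 1024.0 / gbPerHour;
    }

    // Print the estimate in hours and days.
    public static void Print (double terabytes, double gbPerHour)
    {
        double hours = Hours (terabytes, gbPerHour);
        Console.WriteLine (terabytes + "TB at " + gbPerHour + "GB/hour: "
            + hours.ToString ("F0") + " hours, "
            + (hours / 24).ToString ("F1") + " days");
    }
}
```

Calling TransferEstimate.Print (50, 50) gives 1024 hours, or roughly 43 days, which is in line with my rough guess. At 25GB/hour the same 50TB takes 2048 hours, or about 85 days.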

Well, now I was flabbergasted. How did they achieve more than I did? I wiped the dust off 45 years of software development experience and began a deeper consideration of system architecture and software processing. I did a quick search on CodeProject and MSDN without finding anything. The first thing that came to mind was using larger buffer sizes in file reads and writes to reduce overhead, since every read or write call carries some fixed per-call cost.

Wow! My first test was written in a few minutes and achieved 250GB/hour. Now that was more like it.

I need to explain the production environment so you have a better understanding of the throughput numbers. You may not be able to reproduce my numbers on your system. My network source for the files was a Harmonic MediaGrid, which is a high-density parallel network storage system. It is designed to serve multiple servers with high-speed access. The destination was a Linux-based media playout server. This system was not in production yet, so there was little contention for resources. I was working on a dual quad-core Xeon server running Windows Server 2008 with 16GB RAM. With this routine, the program used an average of 25% CPU across all eight cores. The gigabit network card averaged 555Mb/sec for simultaneous send and receive.
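To relate the GB/hour figures to the network card's Mb/sec reading, the units have to be converted. The sketch below assumes GB means 2^30 bytes (as in the timing code in this article) and Mb means 10^6 bits. The class and method names are hypothetical.

```csharp
public static class RateConvert
{
    const double BytesPerGB = 1073741824.0; // 2^30 bytes

    // GB/hour -> megabits per second (1 Mb = 10^6 bits).
    public static double GbPerHourToMbPerSec (double gbPerHour)
    {
        double bitsPerHour = gbPerHour * BytesPerGB * 8.0;
        return bitsPerHour / 3600.0 / 1e6;
    }

    // Megabits per second -> GB/hour.
    public static double MbPerSecToGbPerHour (double mbPerSec)
    {
        double bytesPerHour = mbPerSec * 1e6 / 8.0 * 3600.0;
        return bytesPerHour / BytesPerGB;
    }
}
```

Under these assumptions, 250GB/hour is about 597Mb/sec, and an average of 555Mb/sec works out to about 233GB/hour. A fully saturated gigabit link (1000Mb/sec) would top out near 419GB/hour before any protocol overhead.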

After I made some enhancements to my program, I retested File.Move and measured a shockingly slow 25GB/hour. A 1MB buffer achieved 256GB/hour. A 512KB buffer was almost as fast. With any buffer size smaller than that, throughput dropped off rapidly.
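If you want to repeat the buffer size comparison on your own system, a small harness along these lines may help. It is a sketch, not the test program I used. The class name, method names and the list of sizes are illustrative assumptions. Keep in mind that the operating system may cache a file after the first pass, so use files large enough (or different source files) that caching does not flatter the later runs.

```csharp
using System;
using System.Diagnostics;
using System.IO;

public static class BufferBench
{
    // Copy source to destination using the given buffer size; returns bytes copied.
    public static long CopyWithBuffer (string source, string destination, int bufferSize)
    {
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        using (FileStream input = new FileStream
            (source, FileMode.Open, FileAccess.Read, FileShare.Read, bufferSize))
        using (FileStream output = new FileStream
            (destination, FileMode.Create, FileAccess.Write, FileShare.None, bufferSize))
        {
            int read;
            // Read returns 0 only at end of file
            while ((read = input.Read (buffer, 0, buffer.Length)) > 0)
            {
                output.Write (buffer, 0, read);
                total += read;
            }
        }
        return total;
    }

    // Time one copy per buffer size and print the elapsed milliseconds.
    public static void Compare (string source, string destination)
    {
        int[] sizes = { 1 << 12, 1 << 16, 1 << 19, 1 << 20 }; // 4KB, 64KB, 512KB, 1MB
        foreach (int size in sizes)
        {
            Stopwatch timer = Stopwatch.StartNew ();
            long bytes = CopyWithBuffer (source, destination, size);
            timer.Stop ();
            Console.WriteLine ((size / 1024) + "KB buffer: " + bytes
                + " bytes in " + timer.ElapsedMilliseconds + " ms");
        }
    }
}
```

Unlike the move routine in this article, this harness leaves the source file in place so the same file can be copied once per buffer size.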

I hope you enjoyed my little story and it brought a smile to your face. Here is the code:

C#
using System;
using System.Diagnostics;
using System.IO;

/// <summary> Time the move and print the throughput in GB/hour
/// </summary>
/// <param name="source">Source file path</param>
/// <param name="destination">Destination file path</param>
public static void MoveTime (string source, string destination)
{
    Stopwatch timer = Stopwatch.StartNew ();
    FMove (source, destination);
    timer.Stop ();
    long size = new FileInfo (destination).Length;
    // Add 1 ms so a very fast move cannot cause a divide by zero
    long milliseconds = 1 + timer.ElapsedMilliseconds;
    // Scale the bytes moved up to one hour (3,600,000 ms)
    long tsize = size * 3600000 / milliseconds;
    // Convert bytes to GB (2^30 bytes); integer division rounds down
    tsize = tsize / (1L << 30);
    Console.WriteLine (tsize + "GB/hour");
}

/// <summary> Fast file move with big buffers
/// </summary>
/// <param name="source">Source file path</param>
/// <param name="destination">Destination file path</param>
static void FMove (string source, string destination)
{
    // 512KB (2^19 bytes) buffer, used for both the streams and the copy loop
    int array_length = 1 << 19;
    byte[] dataArray = new byte[array_length];
    // FileStream reads and writes byte arrays directly, so the
    // BinaryReader/BinaryWriter wrappers are not needed
    using (FileStream fsread = new FileStream
        (source, FileMode.Open, FileAccess.Read, FileShare.None, array_length))
    using (FileStream fswrite = new FileStream
        (destination, FileMode.Create, FileAccess.Write, FileShare.None, array_length))
    {
        for (; ; )
        {
            // Read returns 0 only at end of file; it may return fewer bytes than requested
            int read = fsread.Read (dataArray, 0, array_length);
            if (0 == read)
                break;
            fswrite.Write (dataArray, 0, read);
        }
    }
    // Both streams are closed here, so the source can be removed.
    // Note: unlike File.Move, FileMode.Create overwrites an existing destination.
    File.Delete (source);
}
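
On .NET 4.0 and later, the same large-buffer idea can be expressed with Stream.CopyTo, which takes a buffer size as its second argument. The following is a minimal sketch of that alternative, not the routine measured in this article. The class name StreamMove and the bufferSize parameter are assumptions for illustration, and the variant has not been timed against the figures above.

```csharp
using System.IO;

public static class StreamMove
{
    // Sketch: move a file using Stream.CopyTo with an explicit buffer size.
    public static void Move (string source, string destination, int bufferSize)
    {
        using (FileStream input = new FileStream
            (source, FileMode.Open, FileAccess.Read, FileShare.None, bufferSize))
        using (FileStream output = new FileStream
            (destination, FileMode.Create, FileAccess.Write, FileShare.None, bufferSize))
        {
            // CopyTo reads and writes in chunks of bufferSize until end of file
            input.CopyTo (output, bufferSize);
        }
        // Streams are closed; remove the source to complete the move.
        // FileMode.Create overwrites an existing destination.
        File.Delete (source);
    }
}
```

A call such as StreamMove.Move (source, destination, 1 << 19) uses the same 512KB buffer size as FMove.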

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)