MergeFiles

3 Jun 2016, CPOL
A C function that merges the content from an arbitrary number of text files into a Character-Separated-Variable-Width result file

Background

The other day, the following question was posted in QA:
How do I merge two files with two columns side by side in one file?

In my opinion, this description can be abstracted to something like "form a Character-Separated-Variable-Width file from multiple text files, using the content of each file to populate each column".

Introduction

I love a good exercise, particularly if it gives me an opportunity to flex my rusty C-fu. So, in response, I wrote such a function my way.

  • Mine accepts an array containing an arbitrary number of file names.
  • Rather than failing, it reports file open errors, but otherwise treats unreadable files as empty.
  • It accepts a sequence of characters to use as the column separator.
  • A parameter specifies whether or not a "header line" containing the names of the files should be included.
  • It does not require that the input files all have the same number of lines -- "empty" values will be written to the output when input files run out of data.
  • The return value is the total number of lines written to the output, or the negated errno value if the output file cannot be opened.

I'm also considering adding the ability for it to put QUOTEs around values if specified. Another potential feature is the ability to filter out empty lines.
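As a rough idea of what the quoting feature might look like, here is a minimal sketch. WriteQuoted is a hypothetical helper, not part of MergeFiles, and doubling embedded quotes is an assumption borrowed from the usual CSV convention. MergeFiles copies character by character rather than holding whole values, so a real version would more likely write the opening quote before the copy loop and the closing quote after it.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not part of MergeFiles): writes Value to Out
   wrapped in double quotes, doubling any embedded double quote.
   Returns the number of characters written, or -1 on a write error. */
int
WriteQuoted
(
  FILE*       Out
,
  const char* Value
)
{
  int written = 0 ;

  assert ( Out != NULL ) ;

  if ( Value == NULL )
  {
    Value = "" ; /* treat a missing value as empty, as MergeFiles does */
  }

  if ( fputc ( '"' , Out ) == EOF ) return ( -1 ) ;
  written++ ;

  for ( ; *Value != '\0' ; Value++ )
  {
    if ( *Value == '"' )
    {
      /* escape an embedded quote by writing it twice */
      if ( fputc ( '"' , Out ) == EOF ) return ( -1 ) ;
      written++ ;
    }

    if ( fputc ( *Value , Out ) == EOF ) return ( -1 ) ;
    written++ ;
  }

  if ( fputc ( '"' , Out ) == EOF ) return ( -1 ) ;
  written++ ;

  return ( written ) ;
}
```

Calling WriteQuoted ( stdout , "AAAAA" ) would write the value with a quote on either side.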

Using the Code

C
result = MergeFiles ( argc - 2 , argv [ 1 ] , argv + 2 , "\t" , true ) ;

The result could be something like:

f:\>FileMerge CON A.txt B.txt C.txt
A.txt   B.txt   C.txt
AAAAA   BBBBB   CCCCC
AAAAA   BBBBB
AAAAA           CCCCC
AAAAA   BBBBB
AAAAA   BBBBB   CCCCC
        BBBBB
                CCCCC

                CCCCC

                CCCCC

f:\>

MergeFiles

Here's the function:

C
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

int
MergeFiles
(
  int   Count
,
  char* Dest
,
  char* Source[]
,
  char* Delimiter
,
  bool  Headers
)      
{
  int result = 0 ;
  
  FILE* dst ;
  
  if ( ( ( dst = fopen ( Dest , "w" ) ) ) == NULL )
  { 
    int err = errno ; /* save errno first: printf may overwrite it */
    
    printf ( "\nError opening %s %d" , Dest , err ) ;
    
    result = 0 - err ;
  } 
  else
  {     
    int i ;      
    int j = 0 ;      
    
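    FILE** src = (FILE**) calloc ( Count , sizeof(FILE*) ) ;
    
    /* calloc may legitimately return NULL when Count is 0 */
    if ( ( Count > 0 ) && ( src == NULL ) )
    {
      printf ( "\nError allocating memory for %d files" , Count ) ;
      
      fclose ( dst ) ;
      
      return ( 0 - ENOMEM ) ;
    }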
    
    if ( Delimiter == NULL )
    {
      Delimiter = "" ;
    }
                       
    for ( i = 0 ; i < Count ; i++ )
    {                  
      if ( Headers )
      {
        if ( i > 0 )
        {     
          fprintf ( dst , "%s" , Delimiter ) ;      
        }

        fprintf ( dst , "%s" , Source [ i ] ) ; 
      }
      
      if ( ( ( src [ i ] = fopen ( Source [ i ] , "r" ) ) ) == NULL )
      {
        printf ( "\nError opening %s %d" , Source [ i ] , errno ) ;
      }
      else
      {
        j++ ;
      }
    }  

    if ( Headers )
    {
      fputc ( '\n' , dst ) ;      
      
      result++ ;
    }
    
    while ( j > 0 )                    
    {                                    
      for ( i = 0 ; i < Count ; i++ )    
      {                                  
        if ( i > 0 )
        {
          fprintf ( dst , "%s" , Delimiter ) ;      
        }
         
        if ( src [ i ] != NULL ) 
        {                                
          while ( 1 )
          {            
            int c = getc ( src [ i ] ) ;
                      
            if ( c == '\n' )
            {
              break ;
            }
            else if ( c == EOF )
            {   
              fclose ( src [ i ] ) ;
              
              src [ i ] = NULL ;
              
              j-- ;
              
              break ;
            }   
            else
            {
              fputc ( c , dst ) ;
            }
          }
        }     
      }
      
      fputc ( '\n' , dst ) ;      
      
      result++ ;
    }

    free ( src ) ;

    fclose ( dst ) ;
  } 
    
  return ( result ) ;
}   
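
The innermost loop of the function copies one line from a source file and notices when that file runs out. If it helps to see that step in isolation, it could be factored out roughly as below. CopyLine is a hypothetical name and this is only a sketch of the same technique, not a change to the function above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper: copies characters from Src to Dst up to, but not
   including, the next '\n'. Returns true if a newline ended the line,
   false if end-of-file was reached first. In the false case the caller
   would close Src and stop reading from it, as MergeFiles does. */
bool
CopyLine
(
  FILE* Src
,
  FILE* Dst
)
{
  int c ;

  assert ( Src != NULL && Dst != NULL ) ;

  while ( ( c = getc ( Src ) ) != EOF )
  {
    if ( c == '\n' )
    {
      return ( true ) ;
    }

    fputc ( c , Dst ) ;
  }

  return ( false ) ;
}
```

Note that a file ending in a newline returns true for its last line and false only on the following call, having copied nothing. The same thing happens in MergeFiles, which is why an input with a trailing newline contributes one extra empty value before it is closed.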

The only C compiler I have handy that supports bool is:

gcc version 3.2 (mingw special 20020817-1)

I used the -std=gnu99 switch.
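
For a compiler that doesn't provide bool at all, one common workaround is a small fallback along these lines. This is a sketch I have not tried against any particular pre-C99 compiler; it assumes the compiler defines __STDC_VERSION__ only when it supports C99 or later.

```c
/* Use <stdbool.h> when the compiler reports C99 or later; otherwise
   fall back to a minimal substitute (assumption: int is good enough). */
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
#include <stdbool.h>
#else
#ifndef __cplusplus
typedef int bool ;
#define true  1
#define false 0
#endif
#endif
```

With this at the top of the file, the Headers parameter can stay declared as bool either way.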

History

  • 2016-06-03: First published

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)