
Column Definition Control

9 Jul 2002
Presents a control used to specify column locations in a text file to be imported into your application.

Introduction

One of the principal requirements of most applications is to be able to load data from persistent storage. In general, this is accomplished through the use of files of a predefined format, possibly XML. However, an important feature of many applications is the ability to import data that is stored in a format other than the one used primarily by the application, and in most cases this will be a ubiquitous format such as plain text (as is discussed in this article).

In general, text files containing record-based data come in two forms, either where the columns are of a fixed width, or where they are separated by a predefined character.  This article presents a control (pictured above) that allows the user to insert column markers over a text file that has tabular data defined in columns of fixed width (as in the first case), and then converts this data into an array-like data structure from where it can be manipulated by your application.

There are two main parts to this project:

  1. The fixed-width column definition control, CFixedWidthColsFileCtrl
  2. The CFileContents class, used to do the actual processing

Using the code

There are two main ways in which this code can be used:

  1. With a full user interface provided by the column definition control presented in this article.
  2. Without a user interface, for instance importing a file again without redefining the columns, using the CFileContents class that is described towards the end of this article.

Since the focus of this article is the control, we'll be looking more closely at the first usage; however, we'll also briefly discuss how the code can be used without the user interface.

The control can be used however you wish, as part of a dialog for example, using standard techniques. Since importing a text file can be quite confusing to the user, we feel that a wizard would be the best way to approach the problem, as is demonstrated by the demo project. The details of implementing a wizard-structured dialog are outside the scope of this article, but you might find the article 'blah blah blah' by A N Other useful. The method used to place the control on a dialog or property page is shown below:

Using the control, a la Method 1

  1. First of all, you will need to retrieve the filename of the text file the user wishes to import. This can be done using the method demonstrated in the demo project, or you may already know the filename. You will need to add the following files to your project:

    • FileContents.cpp and FileContents.h
    • FixedWidthColsFileCtrl.cpp and FixedWidthColsFileCtrl.h

    This will add two classes to your project:

    • CFileContents: This class is used to do the actual processing of the file, and can be used without the CFixedWidthColsFileCtrl class in order to do batch processing or avoid the user interface component.
    • CFixedWidthColsFileCtrl: This class provides the user interface as shown on the second page of the demo project's wizard. It displays a given file in an edit control-like manner, and allows the user to click in the display to insert column markers which are then used to process the file and tabulate the data.

  2. Next, you need to create a dialog or property page where you place the control, and insert a Custom Control object that will be replaced with the CFixedWidthColsFileCtrl when the application is run. You will need to set the 'Class' property of the control to MFCFixedWidthColsFileCtrl

    Another important stage is to set the Style attribute of the custom control to 0x50b30000, which corresponds to the following flags: WS_VISIBLE | WS_CHILD | WS_BORDER | WS_TABSTOP | WS_GROUP | WS_VSCROLL | WS_HSCROLL. Omitting this doesn't seem to cause problems if you are only using a standard dialog, but there are issues with a property page or wizard that are resolved by setting the style bits correctly.

    Hard-coding the numeric value is, of course, not the correct way to do this. However, attempting to use ModifyStyle or ModifyStyleEx in PreCreateWindow, for some reason, doesn't seem to solve the problem. This leaves the possibility that the values of these constants may change (though this is unlikely), but there seems to be no other way out.

  3. Now, using ClassWizard, we can create a class for this dialog (simply double-click on the dialog to open ClassWizard):

  4. The next stage is to create a member variable in your dialog class that represents the column definition control. To do this, first add the following line to the top of the .h file for your dialog:

    #include "FixedWidthColsFileCtrl.h"

    Finally, in your class definition, add the following declaration:

    CFixedWidthColsFileCtrl m_ctrlFixedWidthColsFile;

  5. We need to make sure that the control placeholder is replaced with the above instance of the control by adding the following to the DoDataExchange function (changing IDC_CUSTOM_FIXED_WIDTH_COLS_FILE to the ID of the control you defined in stage 2):

    DDX_Control(pDX, IDC_CUSTOM_FIXED_WIDTH_COLS_FILE, m_ctrlFixedWidthColsFile);

  6. The CFileContents class is used to perform the actual tabulation operation, and is used by the control, so either an instance of this class needs to be created as a member variable of the dialog class, or a pointer/reference to an instance needs to be passed in when the dialog is created. The latter method is generally the better option, and can be achieved by altering the constructor declaration to the following:

    CMyDialogClass(CFileContents &fileContents, CWnd *pParent = NULL);

    Now, a member variable is needed to store the object passed in (in the dialog class definition):

    CFileContents &m_FileContents;

    Finally, in the constructor definition we need to initialise this variable:

    // if you've created a dialog:
    
    CMyDialogClass::CMyDialogClass(CFileContents &fileContents, CWnd *pParent) : 
                                 CDialog(...), m_FileContents(fileContents)
    
    // if you've created a property page:
    
    CMyPropPageClass::CMyPropPageClass(CFileContents &fileContents) :
                                 CPropertyPage(...), m_FileContents(fileContents)

  7. Before the LoadFile method of the CFixedWidthColsFileCtrl class is called, the file you wish to display has to have been read into the CFileContents object that is passed in. Code similar to the following can be used to achieve this:

    CFileContents fileContents;
    CMyDialogClass dlgMyPage(fileContents);

    // load the text file given by csFilename into the FileContents class:
    if (!fileContents.ReadFile(csFilename))
    {
        // an error has occurred.
    }
    else
    {
        // display the dialog:
        dlgMyPage.DoModal();
    }

    For a full example, see the demo project, which uses a wizard. In this situation, the CFileContents object is created as part of the application class (though in a full application it would obviously be either a local variable or part of some other class) and is then passed into the wizard property page. The first page of the wizard then gets a filename from the user and reads the selected file into the CFileContents object.

  8. When either the OnInitDialog (in the case of a dialog) or the OnSetActive (in the case of a property page) function is called, the control needs to process the file that has been read by our m_FileContents object. This is achieved via the LoadFile function:

    m_ctrlFixedWidthColsFile.LoadFile(&m_FileContents);

    Note: Before the CFixedWidthColsFileCtrl::LoadFile method is called, the file you wish to display has to have been read into the CFileContents object that is passed in.

  9. When the dialog closes, we need to tabulate the data according to the columns that have been marked by the user using the control. We can do this through the PopulateFields function using just a single line of code placed in the OnOK (in the case of a dialog) or OnWizardNext (in the case of a wizard) function:

    m_ctrlFixedWidthColsFile.PopulateFields();

In terms of putting the control onto the dialog, and loading a file into it, that's all we need to do. When the dialog or wizard page ends, the CFileContents object that was passed in during initialization will now contain a fully tabulated data-set that can be accessed using the GetField (see reference) function and converted into a format that is more suitable for your particular application.
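The style value quoted in step 2 can be checked against its flags at compile time. The sketch below reproduces the flag values from winuser.h under hypothetical names (kWsChild and so on) so that it builds without the Windows headers; in a real MFC project you would use the WS_* macros directly.

```cpp
#include <cstdint>

// Window style values as defined in winuser.h, copied here under
// hypothetical names so the check compiles on any platform.
constexpr std::uint32_t kWsChild   = 0x40000000;
constexpr std::uint32_t kWsVisible = 0x10000000;
constexpr std::uint32_t kWsBorder  = 0x00800000;
constexpr std::uint32_t kWsVScroll = 0x00200000;
constexpr std::uint32_t kWsHScroll = 0x00100000;
constexpr std::uint32_t kWsGroup   = 0x00020000;
constexpr std::uint32_t kWsTabStop = 0x00010000;

// Combines the seven flags listed in step 2.
constexpr std::uint32_t ControlStyle()
{
    return kWsVisible | kWsChild | kWsBorder | kWsTabStop |
           kWsGroup | kWsVScroll | kWsHScroll;
}

// Fails to compile if the flags no longer add up to the hard-coded value.
static_assert(ControlStyle() == 0x50B30000,
              "style flags do not match the value set in the resource editor");
```

A check like this does not remove the hard-coded number from the dialog resource, but it would flag a mismatch at build time if the constants ever changed.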

One slight limitation of this class is that it does not offer typed fields. As a result, all data is held in CStrings within the CFileContents object, meaning that you will have to perform type conversions yourself, though this shouldn't cause many problems.
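As a rough illustration of the kind of conversion you might write, here is a minimal sketch in standard C++. It uses std::string as a stand-in for CString, and the helper names (Trim, FieldToInt, FieldToDouble) are hypothetical, not part of the classes in this article. It assumes that a field may still carry padding spaces, which is common in fixed-width data but is not something the article states about GetField.

```cpp
#include <cerrno>
#include <climits>
#include <cstdlib>
#include <string>

// Removes leading and trailing spaces and tabs.
std::string Trim(const std::string& text)
{
    const char* blanks = " \t";
    const std::string::size_type first = text.find_first_not_of(blanks);
    if (first == std::string::npos)
        return std::string();
    const std::string::size_type last = text.find_last_not_of(blanks);
    return text.substr(first, last - first + 1);
}

// Converts a field to int; returns false if the field is empty,
// contains trailing junk, or is out of range.
bool FieldToInt(const std::string& field, int& value)
{
    const std::string trimmed = Trim(field);
    if (trimmed.empty())
        return false;
    errno = 0;
    char* end = nullptr;
    const long parsed = std::strtol(trimmed.c_str(), &end, 10);
    if (errno == ERANGE || *end != '\0' || parsed < INT_MIN || parsed > INT_MAX)
        return false;
    value = static_cast<int>(parsed);
    return true;
}

// Converts a field to double with the same validation rules.
bool FieldToDouble(const std::string& field, double& value)
{
    const std::string trimmed = Trim(field);
    if (trimmed.empty())
        return false;
    errno = 0;
    char* end = nullptr;
    const double parsed = std::strtod(trimmed.c_str(), &end);
    if (errno == ERANGE || *end != '\0')
        return false;
    value = parsed;
    return true;
}
```

Returning a bool rather than a default value lets the caller decide how to treat a malformed field, which matters when the column markers were placed by hand.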

Using the CFileContents class without the UI, a la Method 2

An alternative to the method just described is to use the CFileContents class alone. This provides a way of converting fixed-width column text files into a tabulated data structure without any user interaction.

When used in this mode, the method of use is similar to that described above, except for two main factors:

  1. The CFixedWidthColsFileCtrl class is not involved, therefore you don't need to add the corresponding .h/.cpp files to your project.
  2. All of the 'action' usually takes place in a single function, and may even (in some circumstances) occur within a separate worker thread, though the class has not been tested in this scenario and does not contain inherent thread safety.

Here is the basic method:

  1. First you need to either create or select a function where the necessary processing will take place. Within its scope you need access to the name of the file to be processed, and to the character positions of the column divides that separate the data into columns.

    At the top of your file, you need to include the FileContents.h file:

    #include "FileContents.h"

  2. Next, you need to create an instance of the CFileContents class and into this instance read the desired file:

    CFileContents fileContents;
    fileContents.ReadFile(csMyFilename);

  3. Finally the file needs to be processed. For processing to occur, you need to build a list of the positions of the column separators. This is passed to the PopulateFields function of the CFileContents class by means of a CDWordArray. Here is some sample code that marks columns as starting at character positions 0, 12, 30 and 60:

    CDWordArray colPositions;
    colPositions.SetSize(3);     // note that this will imply FOUR columns

    colPositions.SetAt(0, 12);
    colPositions.SetAt(1, 30);
    colPositions.SetAt(2, 60);

    This array can now be passed to the PopulateFields function:

    fileContents.PopulateFields(&colPositions);

As before, the data has now been processed and is ready to be accessed by the remainder of the function using the GetField (see reference) function of CFileContents.
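To make the column-position convention concrete, the following standard C++ sketch splits a single line the way the reference describes for PopulateFields: n positions produce n + 1 fields, and each position is the first character of the next column. It is an illustration only and not the class's actual implementation; SplitFixedWidth is a hypothetical name, std::string and std::vector stand in for CString and CDWordArray, and the handling of short lines (empty fields) and the absence of trimming are assumptions.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Splits one line into fields. columnStarts holds the zero-based position of
// the first character of every column except the first, in ascending order.
std::vector<std::string> SplitFixedWidth(const std::string& line,
                                         const std::vector<std::size_t>& columnStarts)
{
    std::vector<std::string> fields;
    std::size_t begin = 0;
    for (std::size_t start : columnStarts)
    {
        // A line that is too short yields empty fields for the columns
        // it does not reach.
        fields.push_back(begin < line.size()
                             ? line.substr(begin, start - begin)
                             : std::string());
        begin = start;
    }
    // The last column runs from the final marker to the end of the line.
    fields.push_back(begin < line.size() ? line.substr(begin) : std::string());
    return fields;
}
```

With the positions 12, 30 and 60 from the sample above, every line would produce four fields, the first covering characters 0 to 11.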

Reference

The public interfaces of both the CFixedWidthColsFileCtrl class and the CFileContents class are described below.

CFixedWidthColsFileCtrl : CWnd

Function Description
CFixedWidthColsFileCtrl( ) Constructs a CFixedWidthColsFileCtrl object.

Parameters: None

Return Value: None

LoadFile(...) Loads a file from a CFileContents object into the control ready for display. This is the file over which the user will select where the columns should be located.

Parameters: CFileContents *pFileContents - the object that represents the file to be loaded into the control. The pointer must be non-NULL and point to a CFileContents class whose ReadFile() function has already been called and a file successfully loaded.

Return Value: None

PopulateFields( ) This method calls the PopulateFields(...) function of the CFileContents object that is being manipulated by the control, passing it the columns defined by the user. It causes the data in the CFileContents object to be tabulated. If this function is called multiple times, the data is simply re-tabulated.

Parameters: None

Return Value: None

CFileContents

Function Description
CFileContents( ) Constructs a CFileContents object.

Parameters: None

Return Value: None

ChangeTabSize(...) Sets the tab size to use when processing files. When the PopulateFields method is called, all tabs are expanded; this function sets the number of spaces that is equivalent to one full tab.

Parameters: int nTabSize - the new tab size

Return Value: None

GetField(...) Retrieves the data stored in a field.

Parameters: int nLine - the line number of the field to retrieve (analogous to the row in a table), zero-based.
int nField - the number of the field belonging to the line to retrieve (analogous to the column in a table), zero-based.

Return Value: A CString containing the data retrieved.

GetLine(...) Retrieves an entire line as initially read in. Note that if the class is in the fixed-width columns mode (the only one currently available), this function will also expand the tabs in the line.

Parameters: int nLine - the line number to retrieve - zero-based.

Return Value: A CString containing the retrieved line with tabs expanded.

GetMaxLineLength() Returns the length of the longest line imported from the text file.

Parameters: None

Return Value: An int containing the length of the longest line read in from the text file.

GetNumberOfLines() Returns the number of lines in the file that was read in by ReadFile(...).

Parameters: None

Return Value: An int containing the number of lines read in from the file.

PopulateFields(...) This function is one of the most important parts of the class. It tabulates the data in the file using the location of the column delimiters as passed in via the pColumns parameter.

Parameters: CDWordArray *pColumns - an array containing the positions of column lines within the file. The values should be zero-based, and should not include a 0 value for the first column. As a result, if there are n values within the array, a table with n + 1 columns will be generated. Each value in the array is the first character position of the next column, so for example if the array contains a single item of value 5, the first column will span positions 0 to 4, and the second will run from 5 onwards.

Return Value: None

ReadFile(...) This function is, again, one of the most important parts of the class. It takes a filename and reads the data from the file into memory for processing.

Parameters: CString csFileName - the filename of the file to load.
bool bFixedWidthCols = true - this parameter has a default value of true and should be omitted as its intended functionality has not yet been implemented (see below).

Return Value: A bool indicating success or failure.
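Because column positions are character positions in the tab-expanded line, the tab size set through ChangeTabSize affects where the columns fall. The sketch below shows one common way to expand tabs, padding to the next multiple of the tab size. Whether CFileContents uses tab stops like this or simply replaces each tab with a fixed number of spaces is not stated in this article, so treat the behaviour and the name ExpandTabs as assumptions.

```cpp
#include <cstddef>
#include <string>

// Expands each tab to spaces up to the next tab stop. A tabSize of zero
// leaves tabs untouched.
std::string ExpandTabs(const std::string& line, std::size_t tabSize)
{
    std::string expanded;
    for (char ch : line)
    {
        if (ch == '\t' && tabSize > 0)
        {
            // Pad with spaces up to the next multiple of tabSize.
            const std::size_t padding = tabSize - (expanded.size() % tabSize);
            expanded.append(padding, ' ');
        }
        else
        {
            expanded.push_back(ch);
        }
    }
    return expanded;
}
```

Under this scheme a tab following two characters expands to two spaces when the tab size is 4, so the same file can line up differently under different tab sizes.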

Future improvements / To-Do

One feature that could be added to the classes provided here is the ability to tabulate a file based upon a character delimiter rather than by fixed-width columns. This is the reason for the bFixedWidthCols parameter in the ReadFile method of the CFileContents class. The class was previously used in a similar way; a screenshot from a program that uses the classes in this way gives an idea of a possible layout for your wizard/dialog page.
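For comparison with the fixed-width approach, here is a minimal sketch of what delimiter-based tabulation of a single line could look like in standard C++. It is not part of the classes in this article; SplitDelimited is a hypothetical name, and the sketch deliberately ignores quoting and escaped delimiters.

```cpp
#include <string>
#include <vector>

// Splits a line on every occurrence of delimiter. Adjacent delimiters
// produce empty fields, and an empty line produces one empty field.
std::vector<std::string> SplitDelimited(const std::string& line, char delimiter)
{
    std::vector<std::string> fields;
    std::string::size_type begin = 0;
    for (;;)
    {
        const std::string::size_type end = line.find(delimiter, begin);
        if (end == std::string::npos)
        {
            // No more delimiters: the rest of the line is the last field.
            fields.push_back(line.substr(begin));
            break;
        }
        fields.push_back(line.substr(begin, end - begin));
        begin = end + 1;
    }
    return fields;
}
```

Unlike the fixed-width case, the number of fields here can vary from line to line, which any delimiter-based mode would need to account for.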

Credits

The source code for this article was written entirely by Cathy. The following people have helped or contributed in some way:

  • CFECFileDialog, CFileEditCtrl, and CPJAImage are classes written by PJ Arends, who also helped solve some initial problems with the code.
  • CDynamicItems was written by David Excoffer.

Conclusion

We hope that this code is useful. As far as we know there are no problems/bugs with the code, but please feel free to inform us of bugs/fixes via the comments section at the bottom of this article or by emailing Cathy at CatherinePelham@msn.com - constructive criticism is welcomed.

Updates

  • 12 July 2002 - Source files updated.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.
