One of the principal requirements of most applications is to be able to load
data from persistent storage. In general, this is accomplished through the
use of files of a predefined format, possibly XML. However, an important feature of many
applications is the ability to import data that is stored in a format other than
the one used primarily by the application, and in most cases this may be from
ubiquitous formats such as plain text (as is discussed in this article).
In general, text files containing record-based data come in two forms, either
where the columns are of a fixed width, or where they are separated by a
predefined character. This article presents a control (pictured above)
that allows the user to insert column markers over a text file that has tabular
data defined in columns of fixed width (as in the first case), and then converts
this data into an array-like data structure from where it can be manipulated by
your application.
There are two main parts to this project:
- The fixed-width column definition control, CFixedWidthColsFileCtrl
- The CFileContents class, used to do the actual processing
There are two main ways in which this code can be used:
- With a full user interface provided by the column definition control
presented in this article.
- Without a user interface, for instance importing a file again without
redefining the columns, using the CFileContents class that is described
towards the end of this article.
Since the focus of this article is the control, we'll be
looking more closely at the first usage, but we'll also briefly discuss how the
code can be used without the user interface.
The control can be used however you wish, as part of a dialog for example,
using standard techniques. Since importing a text file can be
quite confusing to the user, we feel that a wizard would be the best way to
approach the problem, as is demonstrated by the demo project. The details
of implementing a wizard-structured dialog are outside the scope of this article,
but you might find the article 'blah blah blah' by A N Other useful. The
method used to place the control on a dialog or property page is shown below:
-
First of all, you will need to retrieve the filename of the text file the
user wishes to import. This can be done using the method demonstrated in
the demo project, or you may already know the filename. You will need to
add the following files to your project:
- FileContents.cpp and FileContents.h
- FixedWidthColsFileCtrl.cpp and FixedWidthColsFileCtrl.h
This will add two classes to your project:
CFileContents |
This class is used to do the actual processing of the file,
and can be used without the CFixedWidthColsFileCtrl class in order to
do batch processing or avoid the user interface component. |
|
CFixedWidthColsFileCtrl |
This class provides the user interface as shown on the
second page of the demo project's wizard. It displays a given file in
an edit control-like manner, and allows the user to click in the display to
insert column markers which are then used to process the file and tabulate
the data. |
-
Next, you need to create a dialog or property page where you place the control, and
insert a Custom Control object that will be replaced with the CFixedWidthColsFileCtrl
when the application is run.
You will need to set the 'Class' property of the control to MFCFixedWidthColsFileCtrl.
You should also set the Style attribute of the custom control to 0x50b30000,
which corresponds to the following flags: WS_VISIBLE | WS_CHILD | WS_BORDER | WS_TABSTOP |
WS_GROUP | WS_VSCROLL | WS_HSCROLL. Omitting this doesn't seem to
cause problems if you are only using a standard dialog, but there are issues with a
property page or wizard that are resolved by setting the style bits correctly.
Hard-coding the value is, of course, not the ideal way to do this. However, attempting to
use ModifyStyle or ModifyStyleEx in PreCreateWindow, for some reason, doesn't seem to
solve the problem. This leaves the possibility that the values of these
constants may change (though this is unlikely), but there seems to be no other way out.
-
Now, using ClassWizard, we can create a class for this dialog (simply double-click
on the dialog to open ClassWizard):
-
The next stage is to create a member variable in your dialog class that represents the
column definition control. To do this, first add the following line to the top of the
.h file for your dialog:
#include "FixedWidthColsFileCtrl.h"
Finally, in your class definition, add the following declaration:
CFixedWidthColsFileCtrl m_ctrlFixedWidthColsFile;
-
We need to make sure that the control placeholder is replaced with the above instance of
the control by adding the following to the DoDataExchange function
(changing IDC_CUSTOM_FIXED_WIDTH_COLS_FILE to the ID of the control you
defined in stage 2):
DDX_Control(pDX, IDC_CUSTOM_FIXED_WIDTH_COLS_FILE, m_ctrlFixedWidthColsFile);
-
The CFileContents class is used to perform the actual tabulation operation,
and is used by the control, so either an instance of this class needs to be created as
a member variable of the dialog class, or a pointer/reference to an instance needs to
be passed in when the dialog is created. The latter method is generally the better option,
and can be achieved by altering the constructor declaration to the following:
CMyDialogClass(CFileContents &fileContents, CWnd *pParent = NULL);
Now, a member variable is needed to store the object passed in (in the dialog class
definition):
CFileContents &m_FileContents;
Finally, in the constructor definition we need to initialise this variable:
// For a dialog:
CMyDialogClass::CMyDialogClass(CFileContents &fileContents, CWnd *pParent) :
CDialog(...), m_FileContents(fileContents)
// For a property page:
CMyPropPageClass::CMyPropPageClass(CFileContents &fileContents) :
CPropertyPage(...), m_FileContents(fileContents)
-
Before the LoadFile method of the CFixedWidthColsFileCtrl class is
called, the file you wish to display has to have been read into the
CFileContents object that is passed in. Code similar to the following
can be used to achieve this:
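CFileContents fileContents;
CMyDialogClass dlgMyPage(fileContents);
// Read the file first and only show the dialog if the read succeeded
if (!fileContents.ReadFile(csFilename))
{
    // The file could not be read: report the error to the user here
}
else
{
    dlgMyPage.DoModal();
}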
For a full example, see the demo project, which uses a wizard. In this situation,
the CFileContents object is created as part of the application class (though obviously in a
full application it would either be a local variable or part of some other class) and then
passed into the Wizard property page. Then, the first page of the Wizard gets a filename
from the user and reads the selected file into the CFileContents object.
-
When either the OnInitDialog (in the case of a dialog) or the OnSetActive
(in the case of a property page) function is called, the control needs to process the file that
has been read into our m_FileContents object. This is achieved via the LoadFile function:
m_ctrlFixedWidthColsFile.LoadFile(&m_FileContents);
Note: Before the CFixedWidthColsFileCtrl::LoadFile
method is called, the file you wish to display has to have been read into the
CFileContents object that is passed in.
-
When the dialog closes, we need to tabulate the data according to the columns that have been
marked by the user using the control. We can do this through the PopulateFields function
using just a single line of code placed in the OnOK (in the case of a dialog) or
OnWizardNext (in the case of a wizard) function:
m_ctrlFixedWidthColsFile.PopulateFields();
In terms of putting the control onto the dialog, and loading a file into it, that's all we need to
do. When the dialog or wizard page ends, the CFileContents object that was passed in during
initialization will now contain a fully tabulated data-set that can be accessed using the
GetField (see reference) function and converted into a format that is more suitable for your
particular application.
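As a quick check of the style value given in step 2, the following standalone sketch recombines the listed flags and confirms that they add up to 0x50b30000. The numeric values are copied from the Windows SDK header winuser.h so that the sketch compiles without the SDK. The k-prefixed names and the HasStyle helper are illustrative assumptions and are not part of the article's code.

```cpp
#include <cassert>
#include <cstdint>

// Values of the WS_* constants as defined in winuser.h, repeated here under
// illustrative names so that the sketch compiles without the Windows SDK.
constexpr std::uint32_t kWsChild   = 0x40000000;  // WS_CHILD
constexpr std::uint32_t kWsVisible = 0x10000000;  // WS_VISIBLE
constexpr std::uint32_t kWsBorder  = 0x00800000;  // WS_BORDER
constexpr std::uint32_t kWsVScroll = 0x00200000;  // WS_VSCROLL
constexpr std::uint32_t kWsHScroll = 0x00100000;  // WS_HSCROLL
constexpr std::uint32_t kWsGroup   = 0x00020000;  // WS_GROUP
constexpr std::uint32_t kWsTabStop = 0x00010000;  // WS_TABSTOP

// The combination listed in step 2.
constexpr std::uint32_t kControlStyle =
    kWsVisible | kWsChild | kWsBorder | kWsTabStop |
    kWsGroup | kWsVScroll | kWsHScroll;

// Compile-time check that the flags combine to the value typed into the
// resource editor.
static_assert(kControlStyle == 0x50B30000,
              "style flags do not combine to 0x50b30000");

// Returns true if every bit of 'flag' is set in 'style'.
inline bool HasStyle(std::uint32_t style, std::uint32_t flag)
{
    assert(flag != 0);  // an empty flag would trivially match
    return (style & flag) == flag;
}
```

If the control misbehaves on a property page, a helper like HasStyle can be used on the style read back from the window to see which of the expected bits is missing.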
One slight limitation of this class is that its fields are not typed. As a
result, all data is held in CStrings within the CFileContents
object, meaning that you will have to perform type conversions yourself, though
this shouldn't cause many problems.
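Since every field comes back as text, the conversion step might look something like the sketch below. It uses std::string rather than CString so that it compiles without MFC, and adapting it to CString is left as an assumption. Trim, ParseLong and ParseDouble are hypothetical helpers, not part of CFileContents. Fixed-width fields are usually padded with spaces, so the helpers trim before parsing and report failure rather than guessing at a value.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdlib>
#include <string>

// Removes leading and trailing whitespace (fixed-width fields are padded).
std::string Trim(const std::string &s)
{
    const char *whitespace = " \t\r\n";
    const std::string::size_type first = s.find_first_not_of(whitespace);
    if (first == std::string::npos)
        return std::string();  // the field was blank
    const std::string::size_type last = s.find_last_not_of(whitespace);
    return s.substr(first, last - first + 1);
}

// Converts a field to a long. Returns false, leaving *value untouched, if
// the field is blank, contains trailing junk, or is out of range.
bool ParseLong(const std::string &field, long *value)
{
    assert(value != nullptr);
    const std::string text = Trim(field);
    if (text.empty())
        return false;
    errno = 0;
    char *end = nullptr;
    const long parsed = std::strtol(text.c_str(), &end, 10);
    if (errno == ERANGE || *end != '\0')
        return false;
    *value = parsed;
    return true;
}

// Same idea for floating-point fields.
bool ParseDouble(const std::string &field, double *value)
{
    assert(value != nullptr);
    const std::string text = Trim(field);
    if (text.empty())
        return false;
    errno = 0;
    char *end = nullptr;
    const double parsed = std::strtod(text.c_str(), &end);
    if (errno == ERANGE || *end != '\0')
        return false;
    *value = parsed;
    return true;
}
```

Returning a bool keeps the decision about bad data (skip the row, warn the user, substitute a default) with the calling application, which is where the article leaves it.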
An alternative to the method just described is to use the CFileContents class alone. This
would provide a way of converting fixed-width column text files into a tabulated data structure
without any user interaction.
When used in this mode, the method of use is similar to that described above, except for two
main factors:
- The CFixedWidthColsFileCtrl class is not involved, therefore you don't need to add the
corresponding .h/.cpp files to your project.
- All of the 'action' usually takes place in a single function, and may even (in some
circumstances) occur within a separate worker thread, though the class has not been tested in
this scenario and is not inherently thread-safe.
Here is the basic method:
First you need to either create or select a function where the necessary processing
will take place. That function needs access to the filename of the file to be processed,
and to the character positions of the dividers that separate the data into columns.
At the top of your file, you need to include the FileContents.h file:
#include "FileContents.h"
-
Next, you need to create an instance of the CFileContents class and read the desired
file into this instance:
CFileContents fileContents;
// ReadFile returns a bool indicating success; check it in real code
fileContents.ReadFile(csMyFilename);
-
Finally the file needs to be processed. For processing to occur, you need to build a list of the
positions of the column separators. This is passed to the PopulateFields function of the
CFileContents class by means of a CDWordArray. Here is some sample code that marks
columns as starting at character positions 0, 12, 30 and 60:
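// The first column starts implicitly at position 0, so only the three
// later boundaries are stored
CDWordArray colPositions;
colPositions.SetSize(3);
colPositions.SetAt(0, 12);
colPositions.SetAt(1, 30);
colPositions.SetAt(2, 60);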
This array can now be passed to the PopulateFields function:
fileContents.PopulateFields(&colPositions);
As before, the data has now been processed and is ready to be accessed by the remainder of the
function using the GetField (see reference) function of CFileContents.
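To make the column-position convention concrete, here is a standalone sketch of the same kind of split in standard C++. It is not the implementation of PopulateFields, only an illustration of the documented convention that each value is the first character position of the next column, so n positions give n + 1 fields. SplitFixedWidth and Slice are hypothetical names, and std::vector<std::size_t> stands in for CDWordArray.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Returns the characters in [start, end) of 'line', clamped to its length.
static std::string Slice(const std::string &line,
                         std::size_t start, std::size_t end)
{
    if (start >= line.size())
        return std::string();  // the line is too short to reach this column
    end = std::min(end, line.size());
    return line.substr(start, end - start);
}

// Splits one line into fields. Each entry of 'cuts' is the first character
// position of the next column; position 0 is implied and must not be listed.
std::vector<std::string> SplitFixedWidth(const std::string &line,
                                         const std::vector<std::size_t> &cuts)
{
    assert(std::is_sorted(cuts.begin(), cuts.end()));

    std::vector<std::string> fields;
    std::size_t start = 0;
    for (std::size_t cut : cuts) {
        fields.push_back(Slice(line, start, cut));
        start = cut;
    }
    // The last column runs from the final cut to the end of the line.
    fields.push_back(Slice(line, start, std::string::npos));
    return fields;
}
```

With the positions 12, 30 and 60 from the sample above, every line yields four fields. In this sketch a line that is too short simply produces empty trailing fields; how CFileContents treats short lines is not covered here.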
The public interfaces of the CFixedWidthColsFileCtrl class and
the CFileContents class are described below, in that order.
Function |
Description |
CFixedWidthColsFileCtrl( ) |
Constructs a CFixedWidthColsFileCtrl object.
Parameters: None
Return Value: None |
LoadFile(...) |
Loads a file from a CFileContents object into the control ready
for display. This is the file over which the user will select where
the columns should be located.
Parameters: CFileContents *pFileContents
- the object that represents the file to be loaded into the control. The pointer must be
non-NULL and point to a CFileContents object whose ReadFile()
function has already been called and a file successfully loaded.
Return Value: None |
PopulateFields( ) |
This method calls the PopulateFields(...) function of the
CFileContents object that is being manipulated by the control, passing
it the columns defined by the user. It causes the data in the
CFileContents object to be tabulated. If this function is called
multiple times, the data is simply re-tabulated.
Parameters: None
Return Value: None |
Function |
Description |
CFileContents( ) |
Constructs a CFileContents object.
Parameters: None
Return Value: None |
ChangeTabSize(...) |
Sets the tab size to use when processing files. When the
PopulateFields method is called, all tabs are expanded; this function
sets the number of spaces that is equivalent to one full tab.
Parameters: int nTabSize - the new tab size
Return Value: None |
GetField(...) |
Retrieves the data stored in a field.
Parameters: int nLine - the line number of the field to retrieve (analogous to the row
in a table), zero-based.
int nField - the number of the field belonging to the line to
retrieve (analogous to the column in a table), zero-based.
Return Value: A CString containing the data retrieved. |
GetLine(...) |
Retrieves an entire line as initially read in. Note that if the
class is in the fixed-width columns mode (the only one currently available),
this function will also expand the tabs in the line.
Parameters: int nLine - the line number to retrieve - zero-based.
Return Value: A CString containing the retrieved line with
tabs expanded. |
GetMaxLineLength() |
Returns the length of the longest line imported from the text file.
Parameters: None
Return Value: An int containing the length of the longest
line read in from the text file. |
GetNumberOfLines() |
Returns the number of lines in the file that was read in by ReadFile(...) .
Parameters: None
Return Value: An int containing the number of lines read in
from the file. |
PopulateFields(...) |
This function is one of the most important parts of the class. It
tabulates the data in the file using the location of the column delimiters
as passed in via the pColumns parameter.
Parameters: CDWordArray *pColumns - an array containing the positions of column
lines within the file. The values should be zero-based, and should not
include a 0 value for the first column. As a result, if there are n
values within the array, a table with n + 1 columns will be
generated. The values in the array are the first character position of
the next column, so for example if the array contains a single item of value
5, the first column will span positions 0 to 4, and the second will run from
position 5 onwards.
Return Value: None |
ReadFile(...) |
This function is, again, one of the most important parts of the class.
It takes a filename and reads the data from the file into memory for
processing.
Parameters: CString csFileName - the filename of
the file to load.
bool bFixedWidthCols = true - this parameter has a default value of
true and should be omitted as its intended functionality has not yet
been implemented (see below).
Return Value: A bool indicating success or failure. |
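Tab expansion matters because the column positions passed to PopulateFields refer to character positions in the expanded text. The sketch below shows one common way to expand tabs, advancing to the next tab stop. Whether CFileContents uses tab stops or always inserts a fixed number of spaces is not stated in this reference, so treat that choice as an assumption. ExpandTabs is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Replaces each tab with enough spaces to reach the next tab stop, where
// tab stops fall at multiples of 'tabSize' (an assumption, see above).
std::string ExpandTabs(const std::string &line, std::size_t tabSize)
{
    assert(tabSize > 0);  // a zero tab size would divide by zero below

    std::string expanded;
    for (char ch : line) {
        if (ch == '\t') {
            // Number of spaces needed to reach the next multiple of tabSize.
            const std::size_t pad = tabSize - (expanded.size() % tabSize);
            expanded.append(pad, ' ');
        } else {
            expanded.push_back(ch);
        }
    }
    return expanded;
}
```

Because the result depends on the tab size, any change to it should be made before the column positions are chosen; otherwise the same positions may fall in different places in the text.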
One feature that could be added to the classes provided here is the ability
to tabulate a file based upon a character delimiter rather than by fixed width
columns. This is the reason for the bFixedWidthCols parameter in
the ReadFile method of the CFileContents class.
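For comparison with the fixed-width case, a delimiter-based split of a single line could be sketched as follows. This is not part of the classes presented here, which do not implement this mode. SplitDelimited is a hypothetical name, and the sketch deliberately ignores quoting and escaped delimiters, which a real importer for comma-separated files would have to handle.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Splits 'line' at every occurrence of 'delimiter'. Adjacent delimiters
// produce empty fields, and a line with no delimiter yields one field.
std::vector<std::string> SplitDelimited(const std::string &line, char delimiter)
{
    assert(delimiter != '\n');  // lines are assumed to be split already

    std::vector<std::string> fields;
    std::string::size_type start = 0;
    for (;;) {
        const std::string::size_type pos = line.find(delimiter, start);
        if (pos == std::string::npos) {
            fields.push_back(line.substr(start));  // last field
            break;
        }
        fields.push_back(line.substr(start, pos - start));
        start = pos + 1;  // continue after the delimiter
    }
    return fields;
}
```

Unlike the fixed-width case, the number of fields here can vary from line to line, so a caller would need to decide how to treat short or long rows.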
The classes have previously been used in a similar way to this. Here is a screenshot
from a program that uses them in this way, to give you an idea of a
possible layout for your wizard/dialog page:
The source code for this article was completely written by Cathy. The
following people have helped or contributed in some way:
CFECFileDialog, CFileEditCtrl, and CPJAImage are classes
written by PJ Arends, who also helped solve some initial problems with the
code.
CDynamicItems was written by David Excoffer.
We hope that this code is useful. As far as we know there are no
problems/bugs with the code, but please feel free to inform us of bugs/fixes via
the comments section at the bottom of this article or by emailing Cathy at
CatherinePelham@msn.com - constructive criticism is
welcomed.
- 12 July 2002 - Source files updated.