Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles
(untagged)

Excel user defined functions unlimited

0.00/5 (No votes)
30 Nov 2006 2  
Describes how to return tables of values as the result of user defined functions (UDF) in Excel.

Introduction

This article demonstrates how to overcome the limitations of user defined functions (UDF) in Excel:

  • When used conventionally (i.e., not as an array function), UDFs can return only a single value.
  • When more than one value should be returned, this is achieved by selecting the correctly sized range in advance and concurrently pressing <CTRL><SHIFT><ENTER>. This tells Excel that the formula should be used as an Array Formula.

The example shows how to return tables of values (e.g., results of a database query) as a result of a single function entry that was entered conventionally (i.e., without pressing <CTRL><SHIFT><ENTER>). The results stay linked with the data source, and therefore are updated automatically. This way, it is possible to create functions that always return the current result of a database query or a calculation that returns more than a single value!

Background

Most of the end users of database applications conduct further data processing, preferably in Excel. Standard ways of exporting data (file exports) always involve tedious and repetitive manual steps.

User defined functions are an ideal way to accomplish that task:

  • they are easy to use and well known to Excel users,
  • parameters (e.g., date) can easily be updated,
  • updates are done automatically.

System requirements and installation

  • Microsoft Excel 2002 or higher.
  • Microsoft JET 4.0 OLE DB Provider. This provider is automatically installed with Microsoft Office, otherwise it can be downloaded from here.

Demo project installation

Download the demo project, and copy the files excelmvf.dll, excelmvf.xla, excelmvf.mdb, adortl60.bpl, dbrtl60.bpl, and rtl60.bpl to C:\excelmvf. It is important to strictly stick to that path since the absolute path is used in the declaration of the DLL-functions within excelmvf.xla (the alternative would be to copy the files into your system directories, but that makes cleaning up more difficult).

To start the add-in, you can either open the excelmvf.xla file via the FileOpen-dialog or use the add-in manager.

Excel Security Settings have to be set to allow execution of macros. If the functions getCustomers and getOrderItems show up in the Excel's function input dialog within the section "User defined functions", then the functions are ready to be used.

Function syntax and parameters

  • getCustomers(fieldList as String, sortingCriteria as String, header as Boolean)
    • fieldList: String with comma or ; delimited list of columns to be displayed.
    • sortingCriteria: String with comma or ; delimited list of sorting criteria, optionally including sorting direction asc or desc.
    • header: If True (default): column headers are displayed, and if False: no column headers are displayed.
  • getOrderItems(startdate as Date, enddate as Date, fieldlist as String, sortingCriteria as String, header as Boolean)
    • fieldList, sortingCriteria, and header as explained above.
    • Startdate and enddate: Date values to confine results of orders processed within the specified range.

Examples

  • =getcustomers("firstname,lastname,address,postal code";"firstname desc").
  • =getOrderItems("2004/09/10";"2004/09/30";"customer,orderdate, description, amount,price,totalprice";"orderdate asc").

Building the sources

There are two versions of the sample sources available:

  • excelmvfVC2005.zip contains a version for Microsoft VC++ 2005 Express Edition (this edition is free and can be downloaded here).
  • excelmvf.zip contains a version for C++ Builder 6.0.

The main difference is the access to the database. Some proprietary VCL classes are used in the C++ builder version, whereas the VC version uses basic ADO.

For the Visual C++ version, follow these steps:

  • Download and install Microsoft VC++ 2005 Express Edition and the SDK (follow the instructions given in VC++ 2005 for that).
  • Unzip the contents of excelmvfVC2005.zip to a directory of your choice.
  • Open the VC solution excelmvf; you should now be ready to build the project successfully. To enable debugging of the DLL, you must set the path to your Excel installation in the project settings.

For the Borland C++ Builder Version:

  • Unzip the contents of excelmvf.zip to a directory of your choice.
  • In C++ Builder, open the project file excelmvf.bpr. You should now be ready to build the project successfully. To enable debugging of the DLL, you must set the path to your Excel installation (menu: Run, Parameters).

The example add-in is designed in a way to make adaptations for own purposes as easy as possible. For a detailed description of all the steps necessary to add a new function, see Adding function getArticles. For a description of the mechanism used to circumvent the user defined function restrictions going on here...

Full article: Excel functions unlimited

Developers of database-related applications regularly get confronted with user requests to make the data available in MS Excel for further processing. File based exports (CSV, XLS) often don't do the job because there is no way to select the data and they do not provide automatic updates. Excel queries on the contrary are difficult to set up and manage. The usage of user defined functions would provide a simple and flexible way to get the data into Excel sheets but there are some built-in limitations to Excel functions which prevent the usage of tables as return values. The main limitation is the fact that Excel does not allow functions to write values into cells outside the range that was selected upon the function call. The following example will demonstrate this:

Public Function myuserdefinedfunction() As Variant
    On Error GoTo myerr
        Application.ActiveCell.Offset(1, 0).Value = 100
        myuserdefinedfunction = myrange
        Exit Function
    myerr:
    MsgBox Err.Number & " " & Err.Description
End Function

When this function is called from inside the code editor everything will work fine and the value of 100 will be written beneath the cell which was active at call time. But if the function is called from within a cell (=myuserdefinedfunction()) an error will occur and the cell beneath will stay empty.

The second difficulty is to return more than a single value as the result of a function call. Although this is always possible with matrix formulas, the following example will show you the limitations:

Public Function myMatrixformula(base As Integer) As Variant
    Dim retval(4, 4) As Variant
    For i = 0 To 4
        For j = 0 To 4
            retval(i, j) = base ^ i
        Next j
    Next i
    myMatrixformula = retval
End Function

This function fills a two-dimensional array with values depending on the value of the parameter base. If a range of 6x6 cells is selected within Excel and the expression "=myMatrixformula(2)" is entered by simultaneously pressing <CTL><SHIFT><ENTER> the function will be used as a matrix formula and show the following result:

Although the function only returns an array of 5x5 values, Excel fills the whole range that was selected at call time. If the function does not provide enough values, the redundant cells are filled with #NA (not available) values. If the range selected is too small, Excel would only show as many values as are necessary for filling this range and suppress the rest (without an error message or any other hint). The requirement of pressing <CTRL><SHIFT><ENTER> simultaneously is another drawback of matrix formulas that does not make them user friendly and therefore should be avoided. So, by now we have defined the prerequisites that our functions should fulfill:

  • no range should be selected in advance,
  • it should not be necessary to press <CTRL><SHIFT><ENTER> to enter the function,
  • if the size of the returned data-table changes due to insertions or deletions in the database or due to changes in the functions parameters, the size of the range in Excel should be adjusted automatically.

Besides these functional requirements, there are some other functionalities that these functions should contain:

  • Number and order of the columns to be displayed should be changeable.
  • Sorting should be possible.
  • Column headers should be displayed or not.
  • Filtering should be possible.

To demonstrate the functionality of these functions, I have developed a sample add-in that returns values from a simple MS-Access database.

The getCustomers function returns all the rows from the database table, customers. The parameter fieldlist expects a list of columns to be displayed, the parameter sortingCriterias expects a list of sorting parameters including sorting directions and the parameter Headers decides whether column headers should be displayed or not. All the parameters are optional, so if you enter =getCustomers(), the function returns the whole table of customers with no sorting including column headers. Function getOrderItems is similar to getCustomers, but it additionally provides a possibility to filter the values to be displayed by entering the parameters startDate and endDate. All the parameters are optional too, so =getOrderItems() displays a table of all single items in all orders in the database:

Although the functions were entered in a single cell (getcustomers() in cell A1 and getOrderItems() in cell A10), the results look like as if they were entered as a matrix formula with the correct range selected in advance. Parameter changes can be carried out by editing a random cell in the result table and simultaneously pressing<CTRL><SHIFT><ENTER>, even if the size of the result table changes due to the parameter changes. This behavior is achieved through the following trick:

  • Within the initial function call the database query is executed to get the size of the result table. The formula and the parameters are stored for later usage and the function returns an arbitrary value (#n/a).
  • Excel automatically triggers a Calculate event after each function call. This event is caught from the add-in.
  • During the processing of the Calculate event a range big enough to hold the result table is selected and the previously stored formula is inserted as an array formula into this range. Since the original function call is already processed there is no write protection for cells outside the function scope any more.
  • The insertion of the array formula within the Calculate event immediately triggers a further call of the function. The size of the selected array now matches the size of the required array and thus the result can be inserted.

Since the processing is usually very fast, repeated function calls are not visible to the user. The functions behave just like any other Excel function, but now they have the ability to return table data.

The collaboration between the DLL functions and the calculate event handling is somewhat tricky and it took some time especially to explore the behavior of Excel, but now, as the mechanism is up and running it is quite easy to use for any view or table in any database environment. I have tried to build the example add-in in a way that it can serve as a kind of framework for the development of own add-ins. C++ development skills are still required but with the help of this article customization can be done rather easily.

The add-in consists of the following components:

  • A Microsoft Excel Add-in file (excelmvf.xla, stands for Excel Multivalue Formulas).
  • A dynamic link library (excelmvf.dll), written in C++.

Excelmvf.xla is very simple, since it mostly contains function declarations only. There are functions getCustomers and getOrderItems declared as public so that they appear within Excel as user defined functions and the functions getCustomers_dll and getOrderItems_dll respectively, which are declared as private. The reason for not immediately declaring the DLL functions as public is because of the hidden parameter currentCell, which passes an IDispatch pointer to the calling cell to the DLL function. More on this later. To adapt this xla simply add the public and the associated private declarations of your functions.

An important role is played by the function setExcelApp, which is automatically called at startup and it passes a pointer to the Excel application object to the dynamic link library. This pointer is later on used in the DLL to initialize the event handling mechanism.

The dynamic link library excelmvf.dll is the main component of the add-In. It consists of a DLL main module and several classes, of which the most important are explained below.

In the DllEntryPoint function, which is also automatically called when the DLL is loaded, the connection string for the ADO database connection is assembled from the path of the DLL module and the file name of the database, in our case excelmvf.mdb. The usage of ADO (or OLE-DB, I never look through the Microsoft notations...) makes the use of different database environments very easy since only this connection string has to be changed to work with different database products. The following is an example for a connection string to an Oracle database via the Microsoft OLE-DB Bridge for ODBC:

Provider=MSDASQL.1;Password=mypassword;Persist Security Info=True;
                           User ID=myuser;Data Source=mydatasource.

Besides this adoption of the data source, there are no further changes required for database access.

The functions getCustomers_dll and getOrderItems_dll make up the interface to Excel, since they are called when the user enters the functions getCustomers and getOrderItems respectively in Excel. These functions will be referred to as excelmvf functions. Since they are essential to understand the mode of operation, the simplest function getCustomers_dll is shown in listing 1.

VARIANT __export __stdcall getCustomers_dll( char* pfieldlist, 
              char* psortcriterias, bool pheader, VARIANT pcell)
{
    bool rangeSizechanged=false;
    VARIANT retval,result;
    VARIANT callingCellAddress,callingCellWorksheet,
        callingCellFormula,callingCellRow,callingCellCol;
    VARIANT excelColRange,excelrangeNumCols, excelRowRange, 
        excelrangeNumRows;
    HRESULT hr;

    HWND funcparamWindow = 0;
    //the function argument window is of class bosa_sdm_XL9. 

    //If a window of this class is found the user enters

    //arguments in this window. In this case we always 

    //return #NA

    funcparamWindow = FindWindow("bosa_sdm_XL9",NULL);
    if (funcparamWindow){
        retval.vt = VT_ERROR;
        retval.scode=excelmvf::VALNOTAVAILABLE;
        return retval;
    }

    //read values we need to know for storage in the 

    //FuncParam variable to later access this information

    //in the ExcelEventHandler

    TComHelper::AutoWrap(DISPATCH_PROPERTYGET,
        &callingCellAddress,pcell.pdispVal,L"Address",0);
    TComHelper::AutoWrap(DISPATCH_PROPERTYGET,
        &callingCellFormula,pcell.pdispVal,L"Formula",0);
    TComHelper::AutoWrap(DISPATCH_PROPERTYGET,
        &callingCellCol,pcell.pdispVal,L"Column",0);
    TComHelper::AutoWrap(DISPATCH_PROPERTYGET,
        &callingCellRow,pcell.pdispVal,L"Row",0);
    TComHelper::AutoWrap(DISPATCH_PROPERTYGET,
        &callingCellWorksheet,pcell.pdispVal,L"Worksheet",0);

    //query current range and determine number 

    //of cols and rows if pcell is part of an array 

    //formula, Count delivers size of array

    TComHelper::AutoWrap(DISPATCH_PROPERTYGET,
        &excelColRange,pcell.pdispVal,L"Columns",0);
    TComHelper::AutoWrap(DISPATCH_PROPERTYGET,
        &excelrangeNumCols,excelColRange.pdispVal,L"Count",0);
    TComHelper::AutoWrap(DISPATCH_PROPERTYGET,
        &excelRowRange,pcell.pdispVal,L"Rows",0);
    TComHelper::AutoWrap(DISPATCH_PROPERTYGET,
        &excelrangeNumRows,excelRowRange.pdispVal,L"Count",0);

    //fetch data

    const TCustomerDataset* dataset = requestController.getCustomers(
                      std::string(psortcriterias ? psortcriterias :""));
    //if the formula is returned in an Array of Variants 

    //(MultiArrayFormula) then read the formula from the 

    //first element of the array

    if (callingCellFormula.vt == (VT_ARRAY | VT_VARIANT)){
        //it's an 2-dimensional array of Variants (dim1=rows, dim2 cols)

        //Lbounds of Dimensions start with 1

        long indices[] = {1,1};
        hr = SafeArrayGetElement(callingCellFormula.parray,
                                indices ,&callingCellFormula);
    }
    //compare the size of the range with the required size

    if (dataset!=NULL && !dataset->isempty() && dataset->isvalid() ){
        TRangeSize neededRangeSize = dataset->getRangeSize(
                     std::string(pfieldlist ? pfieldlist:""),pheader);
        if(neededRangeSize.rows != excelrangeNumRows.lVal || 
                  neededRangeSize.cols != excelrangeNumCols.lVal)
            rangeSizechanged=true;
    }
    else
        rangeSizechanged = false;
    //if the function call was not invoked by 

    //the ExcelEventDispatchers calculate-event

    //and we are working on the active cell, 

    //store all function parameters for later

    //processing in ExcelEventDispatchers calculate event

    if (TRequestController::funcParams.getCalculating()==false && 
                                        rangeSizechanged == true){
        //store parameters for later processing 

        //in the Calculate Event

        TRequestController::funcParams.setFuncParams(NULL,
               NULL,
               std::string(pfieldlist ? pfieldlist : ""),
               std::string(psortcriterias ? psortcriterias :""),
               pheader,
               excelmvf::getCustomers,
               callingCellFormula,
               callingCellAddress,
               callingCellCol,
               callingCellRow,
               excelrangeNumCols,
               excelrangeNumRows,
               callingCellWorksheet
               );
        TRequestController::funcParams.setState(true);

        //the correct values are retrieved by a second 

        //function call triggered by the Calculate Event

        //in the meantime we return N/A

        retval.vt = VT_ERROR;
        retval.scode=excelmvf::VALNOTAVAILABLE;
        return retval;
    }

    //the function was either called by automatic calculation 

    //or after rewriting the formula in the Calculate-Event

    //which triggers a further function call. Return the 

    //results in an Array

    TRequestController::funcParams.setCalculating(false);

    if (dataset==NULL || dataset->isempty() || 
                          (dataset->isvalid() == false)){

            retval.vt=VT_ERROR;
            retval.scode=excelmvf::VALNOTAVAILABLE;

            return retval;
    }
    retval = dataset->getResultArray(
                std::string(pfieldlist ? pfieldlist : ""),
                pheader
                );

    return retval;
}

Listing 1: function getCustomers_dll

The main functions collaborate closely with the TExcelEventDispatcher::Calculate function, which is automatically called from the Excel event mechanism after each call of any Excel function. This function is implemented in the class TExcelEventDispatcher, which is derived from Borland's TEventDispatcher class and uses the pointer to the Excel-Application object passed from the .xla file at startup to establish a connection with Excel's event mechanism. Listing 2 shows this function:

void __fastcall TExcelEventDispatcher::Calculate(TVariant* params)
{
    VARIANT result, retval;
    HRESULT hresult;

    if(!deleting){
       if (TRequestController::funcParams.getState()){
           try{
                VARIANT worksheet = 
                    TRequestController::funcParams.getWorksheet();
                VARIANT address = 
                    TRequestController::funcParams.getAddress();
                VARIANT formula = 
                    TRequestController::funcParams.getFormula();
                const excelmvf::TDatasetBase* dataset;
                switch ( TRequestController::funcParams.getFormulaID()){
                   case excelmvf::getOrderItems:{

                       dataset = dynamic_cast<CONST TOrderItemDataset*>(
                           requestController.getOrderItems(
                               TRequestController::funcParams.getStartdate(),
                               TRequestController::funcParams.getEnddate(),
                               TRequestController::funcParams.getSortparams())
                           );
                       break;
                       }
                   case excelmvf::getCustomers:
                       dataset = dynamic_cast<CONST TCustomerDataset*> (
                            requestController.getCustomers(
                                TRequestController::funcParams.getSortparams())
                            );
                       break;
                }

                // Data available

                if (!(dataset==NULL ||
                      dataset->isempty() ||
                      (dataset->isvalid() == false))){

                      VARIANT formulaLocation, neededcols, neededrows;
                      VARIANT excelColRange,excelRowRange,
                                  excelrangeNumCols,excelrangeNumRows;
                      VARIANT emptyVariant;
                      emptyVariant.vt=VT_EMPTY;

                      VARIANT EnableCalculation;
                      EnableCalculation.vt=VT_BOOL;

                      //get range represented by column and row value

                      //ATTENTION: parameters must be passed to 

                      //Autowrap in reverse order!

                      hresult = TComHelper::AutoWrap(DISPATCH_PROPERTYGET,
                                    &formulaLocation,
                                    worksheet.pdispVal,
                                    L"Cells",
                                    2,
                                    TRequestController::funcParams.getCol(),
                                    TRequestController::funcParams.getRow());

                      //get required number of rows and colums

                      TRangeSize neededRangeSize = dataset->getRangeSize(
                                   TRequestController::funcParams.getFieldlist(),
                                   TRequestController::funcParams.getHeader());

                      //if the current range has exactly one column and one 

                      //row, it is a new user input

                      if (TRequestController::funcParams.getNumCols().lVal == 1 && 
                            TRequestController::funcParams.getNumRows().lVal == 1){
                         neededrows.vt = VT_I4;
                         neededcols.vt = VT_I4;
                         neededrows.lVal = neededRangeSize.rows;
                         neededcols.lVal = neededRangeSize.cols;
                         //extend range to needed size

                         //(attention: parameters in reverse 

                         //order -> limitation of AutoWrap)

                         hresult = TComHelper::AutoWrap(DISPATCH_PROPERTYGET,
                                                       &formulaLocation,
                                                       formulaLocation.pdispVal,
                                                       L"Resize",
                                                       2,
                                                       neededcols,
                                                       neededrows);

                         TRequestController::funcParams.setCalculating(false);
                         TRequestController::funcParams.setState(false);
                         //write Formula as Array

                         hresult = TComHelper::AutoWrap(DISPATCH_PROPERTYPUT,
                                                        &result,
                                                        formulaLocation.pdispVal,
                                                        L"FormulaArray",
                                                        1,
                                                        formula);

                      }
                      //if it is a multicolum range then the 

                      //request originates from an array formula 

                      //with changed parameters delete Formulaarray 

                      //and rewrite formula to restart the retrieval process

                      else{

                         //create a range with the first cell

                         hresult = TComHelper::AutoWrap(DISPATCH_PROPERTYGET,
                                        &formulaLocation,
                                        worksheet.pdispVal,
                                        L"Cells",
                                        2,
                                        TRequestController::funcParams.getCol(),
                                        TRequestController::funcParams.getRow());
                         //resize the range to the current size for deletion

                         hresult = TComHelper::AutoWrap(DISPATCH_PROPERTYGET,
                                     &formulaLocation,
                                     formulaLocation.pdispVal,
                                     L"Resize",
                                     2,
                                     TRequestController::funcParams.getNumCols(),
                                     TRequestController::funcParams.getNumRows());

                         //delete the current formula array

                         //set static variable deleting to true 

                         //to avoid reentering this code due to automatic 

                         //Excel recalculation after deleting the range

                         deleting=true;
                         hresult = TComHelper::AutoWrap(DISPATCH_PROPERTYPUT,
                                                        NULL,
                                                        formulaLocation.pdispVal,
                                                        L"Value",
                                                        1,
                                                        emptyVariant);
                         deleting=false;
                         //create a range with the first cell

                         hresult = TComHelper::AutoWrap(DISPATCH_PROPERTYGET,
                                        &formulaLocation,
                                        worksheet.pdispVal,
                                        L"Cells",
                                        2,
                                        TRequestController::funcParams.getCol(),
                                        TRequestController::funcParams.getRow());

                         TRequestController::funcParams.setCalculating(false);
                         TRequestController::funcParams.setState(false);

                         //rewrite formula to restart the whole retrieval process

                         hresult = TComHelper::AutoWrap(DISPATCH_PROPERTYPUT,
                                                        &result,
                                                        formulaLocation.pdispVal,
                                                        L"Formula",
                                                        1,
                                                        formula);
                         //select the first cell

                         //(otherwise the range from the former 

                         //delete operation would persist)

                         hresult = TComHelper::AutoWrap(DISPATCH_METHOD,
                                                        NULL,
                                                        formulaLocation.pdispVal,
                                                        L"Select",
                                                        0);
                      }
                }

            }
            catch(const TCOMException& e){
                  VARIANT excelMainWindowHWND;
                  hresult = TComHelper::AutoWrap(DISPATCH_PROPERTYGET,
                                                &excelMainWindowHWND,
                                                pExcelAppIDispatch,
                                                L"HWND",
                                                0);
                  int errcode = HRESULT_CODE(e.getErrCode());
                  if (errcode==1004)
                        //this error occurs on writing into an protected area

                        //(e.g. into an area with another formula array)

                        MessageBox(excelMainWindowHWND.plVal,
                                   excelmvf::WriteError.c_str(),
                                   excelmvf::appName.c_str(),
                                   MB_ICONEXCLAMATION);
                  else
                        MessageBox(excelMainWindowHWND.plVal,
                                   e.getErrMsg().c_str(),
                                   excelmvf::appName.c_str(),
                                   MB_ICONEXCLAMATION);
                  TRequestController::funcParams.setCalculating(false);
                  
            }
       }
    }
}

Listing 2: function Calculate is called whenever an Excel Calculate event occurs

At first glance, the frequent use of the function TComHelper::Autowrap in both listings stands out. The purpose of this function is to call Excel functions or read/set property values of Excel objects within the DLL. It needs a pointer to the IDispatch interface of the object whose methods need to be called or properties need to be read/written. In the case of excelmvf functions, we have to interfere with the range from which the function call originates. The pointer to this range is passed as the last argument pcell from the excelmvf functions.

It is helpful to distinguish two types of function calls to the excelmvf functions:

  • Type 1 calls originate from a range, which comprises of a single cell. Calls of that type arise from a new user input or, as we will see, as a result of the processing calls of type 2.
  • Type 2 calls originate from a range that is made up of at least two adjacent cells. They arise from recalculations of the already existing excelmvf functions due to Excel's recalculation mechanism or user changes in the function parameters.

The first few lines in listing 1 figure out whether Excel's function input dialog is open. Excel makes function calls during parameter validation and shows the intermediate results in this dialog. Because only single values can be displayed here excelmvf functions always return #n/a when the function call originates from this dialog.

In the next few lines properties of the calling Excel range are read out for later validation and storage. Then, a call to the requestController.getCustomers function returns a pointer to a DataSet object that represents the results of the database query.

Afterwards, the type of the calling range is analyzed. Only for Type 2 calls the calling range is of type SAFEARRAY-VARIANT and the original formula has to be read out from the first cell of the array. This is not necessary for Type 1 calls.

In the next step, the size of the calling range and the required size for the return values are compared. The size always differs for Type 1 calls, but is different for Type 2 calls when the size of the result table is changed (e.g., when new records are added or when the user changes the parameters of the function).

If size-comparison says that the current range is not properly dimensioned to hold the database query the function stores all the required values in the (static) TrequestController::funcParams struct and then returns #n/a value(s).

Excel now triggers a calculate event and therefore our TExcelEventDispatcher::Calculate function is called (Listing 2). The event has to be processed when the deleting flag is not set (this flag is explained later on) and the state of the funcParams struct is set in the previous call to the excelmvf function. If this is not the case the function deals with an event triggered by a conventional Excel function like e.g. SUM() and has nothing to do.

After deciding which of the excelmvf functions were originally called by querying the FunctionID, a pointer to the respective dataset is retrieved by calling the appropriate get function of the requestController. If a valid dataset is returned, the size of the result set is determined. At this point, it is important if the excelmvf function call was of Type 1 or Type 2.

Type 1 calls (originating from a single cell) are now processed by extending the range in Excel to the size needed to contain the results. Then the original function is copied into this extended range as a FormulaArray thus triggering an immediate recall of the original excelmvf function. In this second call the comparison between the selected range and the needed range does not return a difference any more and therefore the results can be inserted. Voil�.

A call of Type 2 needs some additional processing to turn it into a call of type 1. The range originally selected has to be deleted, since it is not correctly dimensioned to take the results any more. This is done by simply inserting an empty VARIANT value into the whole original array. Before doing so the deleting flag has to be set to true to avoid processing the triggered calculate event. Afterwards the deleting flag is reset to false and the original formula is inserted into the upper left cell of the original range. This results in a function call of Type 1, which is then processed as described above.

Before I describe the necessary adoptions to the above code to fit our own needs some important classes that deal with data handling should be introduced. As you could already see in the listings there are several calls to the methods of TDataset objects. These calls retrieve either the information about the size of the result table or the result table itself. TDataset derived classes thus represent the result of a database query which in turn are made up of records represented by classes derived from the class TRecord. To create a new function that retrieves data from a different table or view it is necessary to derive new classes from TDataset and TRecord respectively. Since TDataset expects its respective TRecord class as a template parameter upon derivation we first have to derive from TRecord. A look at the TCustomerRecord header file shows what has to be considered next.

The static array of TFieldDescriptor values contains descriptions of the fields contained in the database query. At least one TFieldDescriptor value has to be defined for each field but defining more of them with different values for the fieldNameAlias member opens the possibility to identify fields with different synonyms. In our example, the postal code field will show up in the result list when the user enters one of the following values in the fieldList parameter: Postal code, Postalcode, ZIP, Postleitzahl, PLZ. The defaultFieldList string contains a list of fields that should be displayed as default i.e. when the user does not enter a value for the fieldList parameter. In the private section of the class all the fields of the query are defined as VARIANT members. To make them accessible to the respective TDataset derivation, without having to define getter-functions for all the query fields, the TDataset class is declared as a friend class. And finally the [] operator has to be defined the same way as it is done in TCustomerRecord.cpp.

Now the new dataset class can be derived from TDataset. Besides the previously derived derivation of the TRecordset class as template parameter the following member variables and member functions are needed to be defined or overloaded: The SQL member contains the SQL query to retrieve the desired dataset from the database. For the TCustomerDataset it simply is:

Select * from customers

The retrieve function fetches the results of the query from the database and stores them in a std::set container. The implementation can strictly follow the example from TCustomerDataset, only the while (!query->Eof) loop has to be filled with the number and names of the fields retrieved from the query. Finally, it is also needed to overload the < operator. This is necessary to avoid duplicate and performance decreasing database queries within short intervals. This needs a small explanation: as mentioned above in the description of the code in listing 1 and 2 there are two calls of the getRangeSize function within a very short interval (one in the excelmvf function and one in the calculate event). To be able to tell the range size of the resulting table it is required to execute the database query when the first call is executed. To avoid a further query in the second call the results of the query are stored in a local cache made up of std::set container for a short period of time (in this implementation for 10 seconds, can be set in the static member variable TrequestController::cachingperiod). The < operator is required to find the dataset in the container within this cachingperiod.

The functions that are defined in the example add-in and all the functions that may be added are bound together by the class TRequestController. For each function to be implemented this class contains a respective getter function, in our example they are getCustomers and getOrderItems. Besides that it contains the containers to implement the local cache for avoiding dispensable database queries.

To avoid time consuming debugging sessions when trying to implement a excelmvf function I have described the whole process of adding a new function in detail. You can find this description here (Adding function getArticles)

An add-in based on this framework is in use in a 160+ employees company, where it is very well accepted. The implementation does not have any negative impacts on Excel's stability or performance. This encouraged me to publish this article and start an open source project (Excel Multivalue formula framework at sourceforge.net), which will hopefully collect all enhancements that are to be done based on the idea of Excel multivalue formulas.

Points of interest

Besides overcoming the mentioned Excel limitations, the example demonstrates:

  • how to catch and process Excel events within a DLL implemented in C++.
  • how to interact with Excel objects within a DLL implemented in C++ using their COM-interface.

History

  • 2005/08/20: Article released.
  • 2006/11/30: Version for VC++ 2005 Express Edition (available freely here) added.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.

A list of licenses authors might use can be found here