Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles / productivity / Office

Excel-addin for Database Querying by User Defined Functions

4.63/5 (19 votes)
2 Nov 2007GPL318 min read 1   3.4K  
DBFuncs is an Excel-addin for database querying by user defined functions
Image 1

Introduction

IMPORTANT: DBFuncs has been merged with DBSheets and has moved here!

DBFuncs is an Excel-addin for database querying by userdefined functions. This is opposed to the Excel integrated Microsoft-Query, which is integrated statically into the worksheet and has some limitations in terms of querying possibilities and flexibility of constructing parameterized queries (Microsoft-Query allows parameterized queries only in simple queries that can be displayed graphically).

Furthermore it includes the possibility for easy filling of so called "data bound controls" (ComboBoxes and Listboxes) with data from queries.

DBFuncs has been tested extensively (actually it's in production) only with Excel XP and Microsoft-SQL Server, other databases (MySQL, Oracle, PostgreSQL, DB2 and Sybase SQL Server) have just been tested with the associated Testworkbook DBFuncsTest.xls.

To use that Testworkbook you'll need the pubs database, where I have scripts available for Oracle, Sybase, DB2, PostgreSQL and MySql on my website (the Microsoft-SQL Server version can be downloaded here).

You can also download the latest version of DBFuncs (packed in 7-ZIP format) from my website, it is distributed under the GNU Public License.

There are two principal ways to query data with DBFuncs:

  1. A (fast) list-oriented way using DBListFetch.
    Here the values are entered into a rectangular list starting from the TargetRange cell (similar to Microsoft-Query, actually the QueryTables Object is used to fill the data into the Worksheet).
  2. A record-oriented way using DBRowFetch.
    Here the values are entered into several ranges given in the Parameter list TargetArray. Each of these ranges is filled in order of appearance with the results of the query.

These user-defined functions insert the queried data outside their calling cell context, which means that the target ranges can be put anywhere in the workbook (even outside of the workbook).

Additionally, some helper functions are available:

  • chainCells, which concatenates the values in the given range together by using "," as separator, thus making the creation of the select field clause easier
  • concatCells simply concatenating cells (making the "&" operator obsolete)
  • DBString, building a quoted string from an open ended parameter list given in the argument. This can also be used to easily build wildcards into the String
  • DBinClause, building an SQL in clause from an open ended parameter list given in the argument
  • DBDate, building a quoted Date string (format YYYYMMDD) from the date value given in the argument
  • MarktwertHolen, fetching Date-oriented market data into a single cell

Finally, there is a supporting tool available for building queries and placing them into DBListFetch or DBRowFetch functions (similar to Microsoft-Query).

Using the Code

DBListFetch

VB.NET
DBListFetch (Query, ConnectionString (optional), TargetRange,
             FormulaRange (optional), ExtendDataArea (optional),
             HeaderInfo (optional), AutoFit (optional),
             AutoFormat (optional), ShowRowNum (optional))

The select statement for querying the values is given as a text string in parameter Query. This text string can be a dynamic formula, i.e. parameters are easily given by concatenating the query together from other cells, e.g.
"select * from TestTable where TestName = "&A1

The connection string is either given in the formula, or for standard configuration can be left out and is then set globally inside the DBfuncs.xla addin (module DBfunctions, top):

VB.NET
Public Const ConstConnString = "provider=SQLOLEDB,Server=..."
Public Const CnnTimeout = 10

The returned list values are written into the Range denoted by TargetRange. This can be:

  • just any range, resulting data being copied beginning with the left-uppermost cell
  • a self-defined named range (of any size) as TargetRange, which resizes the named range to the output size. This named range can be defined (and set as function parameter) either before or after results have been queried

There is an additional FormulaRange that can be specified to fill "associated" formulas (can be put anywhere (even in other workbooks), though it is only allowed outside of the data area). This FormulaRange can be:

  • either a one dimensional row-like range or
  • a self-defined named range (of any size extent, columns have to include all calculated/filled down cells) , which resizes the named range to the output size. This named range can be defined (and set as function parameter) either before or after results have been queried. Watch out when giving just one cell as the named range, this won't work as it's not possible in VBA to retrieve another assigned name of a cell and a hidden name is used to store the last extent of the formula range. The workaround is to assign at least two cells (columns or rows) to that name.

It has formulas usually referring to cell-values fetched within the data area. All Formulas contained in this area are filled down to the bottom row of the TargetRange. In case the FormulaRange starts lower than the topmost row of TargetRange, then any formulas above are left untouched (e.g. enabling possibly different calculations from the rest of the data). If the FormulaRange starts above the TargetRange, then an error is given and no formulas are being refreshed down. If a FormulaRange is assigned within the data area, an error is given as well.

In case TargetRange is a named range and the FormulaRange is adjacent, the TargetRange is automatically extended to cover the FormulaRange as well. This is especially useful when using the compound TargetRange as a lookup reference (Vlookup).

The next parameter ExtendDataArea defines how DBListFetch should behave when the queried data extends or shortens:

  • 0: DBListFetch just overwrites any existing data below the current TargetRange.
  • 1: inserts cells of just the width of the TargetRange below the current TargetRange, thus preserving any existing data. However any data right to the target range is not shifted down along the inserted data. Beware in combination with a FormulaRange that the cells below the FormulaRange are not shifted along in the current version !!
  • 2: inserts whole rows below the current TargetRange, thus preserving any existing data. Data right to the target range is now shifted down along the inserted data. This option is working safely for cells below the FormulaRange.

The parameter headerInfo defines whether Field Headers should be displayed (TRUE) in the returned list or not (FALSE = Default).

The parameter AutoFit defines whether Rows and Columns should be autofitted to the data content (TRUE) or not (FALSE = Default). There is an issue with multiple autofitted target ranges below each other, here the autofitting is not predictable (due to the unpredictable nature of the calculation order), resulting in not fitted columns sometimes.

The parameter AutoFormat defines whether the first data row's format information should be autofilled down to be reflected in all rows (TRUE) or not (FALSE = Default).

The parameter ShowRowNums defines whether Row numbers should be displayed in the first column (TRUE) or not (FALSE = Default).

DBRowFetch

VB.NET
DBRowFetch (Query, ConnectionString (optional),
    headerInfo(optional/ contained in paramArray), TargetRange (paramArray))

For the query and the connection string the same applies as mentioned for DBListFetch.
The value targets are given in an open ended parameter array after the query, the connection string and an optional headerInfo parameter. These parameter arguments contain ranges (either single cells or larger ranges) that are filled sequentially in order of appearance with the result of the query.
For example:

VB.NET
DBRowFetch("select job_desc, min_lvl, max_lvl, job_id from jobs " &_
           "where job_id = 1",,A1,A8:A9,C8:D8)

This would insert the first returned field (job_desc) of the given query into A1, then min_lvl, max_lvl into A8 and A9 and finally job_id into C8.

The optional headerInfo parameter (after the query and the connection string) defines, whether field headers should be filled into the target areas before data is being filled.
For example:

VB.NET
DBRowFetch("select job_desc, min_lvl, max_lvl, job_id from jobs",_
           ,TRUE,B8:E8, B9:E20)

This would insert the headers (job_desc, min_lvl, max_lvl, job_id) of the given query into B8:E8, then the data into B9:E20, row by row.

The orientation of the filled rows is always determined by the first range within the TargetRange parameter array: if this range has more columns than rows, data is filled by rows, else data is filled by columns.
For example:

DBRowFetch("select job_desc, min_lvl, max_lvl, job_id from jobs",_
           ,TRUE,A5:A8, B5:I8)

This would fill the same data as above (including a header), however column-wise. Typically this first range is used as a header range in conjunction with the headerInfo parameter.

Beware that filling of data is much slower than with DBlistFetch, so use DBRowFetch only with smaller data-sets.

chainCells(Range)

VB.NET
chainCells(ParameterList)

chainCells "chains" the values in the given range together by using "," as separator. Its use is mainly to facilitate the creation of the select field clause in the Query parameter, e.g.

VB.NET
DBRowFetch("select " & chainCells(E1:E4) & " from jobs where job_id = 1",_
           ,A1,A8:A9,C8:D8)

Where cells E1:E4 contain job_desc, min_lvl, max_lvl, job_id respectively.

concatCells

VB.NET
concatCells(ParameterList)

concatCells concatenates the values in the given range together. Its use is mainly to facilitate the building of very long and complex queries:

VB.NET
DBRowFetch(concatCells(E1:E4),,A1,A8:A9,C8:D8)

Here, cells E1:E4 contain the constituents of the query respectively.

Both chainCells and concatCells work with matrix conditionals, i.e. matrix functions of the form: {=chainCells(IF(C2:C65535="Value";A2:A65535;""))} that only chain/concat values from column A if the respective cell in column C contains Value.

DBinClause

VB.NET
DBinClause(ParameterList)

Creates an in clause from cell values, strings are created with quotation marks, dates are created with DBDate (see there for details, formatting is 0).

VB.NET
DBinClause("ABC", 1, DateRange)

The above would return "('ABC',1,'20070115')", if DateRange contained <date month="1" day="15" year="2007">15/01/2007 as a date value.

DBString

VB.NET
DBString(ParameterList)

This builds a Database compliant string (quoted) from the open ended parameter list given in the argument. This can also be used to easily build wildcards into the String, like:

VB.NET
DBString("_",E1,"%")

When E1 contains "test", this results in '_test%', thus matching in a like clause the strings 'stestString', 'atestAnotherString', etc.

DBDate

VB.NET
DBDate(DateValue, formatting (optional))

This builds from the date/datetime/time value given in the argument based on parameter formatting either

  1. (default formatting = 0) A simple datestring (format 'YYYYMMDD'), datetime values are converted to 'YYYYMMDD HH:MM:SS' and time values are converted to 'HH:MM:SS'.
  2. (formatting = 1) An ANSI compliant Date string (format date 'YYYY-MM-DD'), datetime values are converted to timestamp 'YYYY-MM-DD HH:MM:SS' and time values are converted to time time 'HH:MM:SS'.
  3. (formatting = 2) An ODBC compliant Date string (format {d 'YYYY-MM-DD'}), datetime values are converted to {ts 'YYYY-MM-DD HH:MM:SS'} and time values are converted to {t 'HH:MM:SS'}.

An example is given below:

VB.NET
DBDate(E1)
  • When E1 contains the excel native date 18/04/2005, this results in : '20050418' (ANSI: date '2005-04-18', ODBC: {d '2005-04-18'}).
  • When E1 contains the excel native date/time value 10/01/2004 08:05, this results in: '20040110 08:05:00' (ANSI: timestamp '2004-01-10 08:05:00', ODBC: {ts '2004-01-10 08:05:00'})
  • When E1 contains the excel native time value 08:05:05, this results in '08:05:05' (ANSI: time '08:05:05', ODBC: {t '08:05:05'})

Of course you can also change the default setting for formatting by changing the function header in Module DBfunctions in DBfuncs.xla:

VB.NET
DBDate(datVal As Date, Optional formatting As Integer = 0) As String

Plookup

VB.NET
Plookup(inputRange, selectionRange, targetRange)

Does a pattern weighted lookup of values in range inputValues in pattern lookup area selectionRange return the values contained in found row of range targetRange (if entered as a matrix function).

In case Plookup is not entered as a matrix function, Plookup returns the first column of the matching row of targetRange. If more than one row matches, it always returns values from the first matching row.

Example:

VB.NET
selection range | target range
1 * 3 4 5       | 11  12  13
* 2 3 * 5       | 21  22  23
* * 3 * 5       | 31  32  33
1 2 3 4 *       | 41  42  43
1 2 3 4 5       | 51  52  53

input: 1 2 3 4 x > matches 4th row -> returns 41  42  43
input: 1 2 3 4 5 > matches 5th row -> returns 51  52  53
input: x y 3 z 5 > matches 3rd row -> returns 31  32  33
input: x 2 3 z 5 > matches both 2nd and 3rd row -> returns 21  22  23
                        because row 2 is more precise

Supporting Tool Query Builder

There is a supporting tool for building queries (similar to Microsoft Query, in fact Microsoft Query is used as a frontend to build the query) available in DBfuncSupport.xla.

The query builder is invoked by right clicking anywhere in an open Excel sheet and selecting "build DBfunc query". This starts Microsoft Query (see associated documentation on using Microsoft Query to define queries), building the query either in the Query-Assistant or in Microsoft-Query itself, the option "Back to Excel" ("Zurück an Excel" in the german version) lets you insert the query into the active worksheet as one of the following "DB items":

  • a DBListFetch function
  • a DBRowFetch function
  • a Dropdown Database form
  • a Listbox Database form

The following dialog is used to achieve this:

Image 2

First the DB item to be inserted has to be chosen by selecting in the "DB function/ DB bound control" choice box. Then, depending on the above choice, the target cell for the function formula, resp. the target cells for the parameters of the function can be selected:

  • Function Target: The cell where the DB function (either DBRowFetch or DBListFetch) is being placed. Available only for DBRowFetch and DBListFetch.

  • Data Target: For DB functions, the cell(s) where retrieved database data is going to be placed. For DB bound controls this corresponds to the LinkedCell property, which is the target cell where the chosen control value is put. Available for all DB items.

  • Query Target: The cell where the query is placed in case it is bigger than 255 characters. Available only for DBRowFetch and DBListFetch. If this is explicitly set then the query is always placed there, regardless of it's size! Also the query is placed there if it contains quotation marks (").

  • ConnDef Target: The cell where the connection definition is placed in case "use custom database setting" is chosen (used to override the standard connection definition in DBfuncs.xla, see DBListFetch function description above) and the connection definition string is bigger than 255 characters. Available only for DBRowFetch and DBListFetch.

  • Range Calc: The range where formulas that should be filled in along with data are going to be placed. Available only for DBListFetch.

Other possible choices are:

  • additional Data choice: The way additional Data should be treated in DBListFetch (see explanation there). Available only for DBListFetch.
  • "include Header Info?": should Field Headers be included in DBListFetch (see explanation there). Available for DBListFetch DBDropDown and DBListbox.
  • "automatic Column Fit?": should columns be autofitted for DBListFetch (see explanation there). Available only for DBListFetch.
  • "automatic Format fill?": should 1st row formats be autofilled down for DBListFetch (see explanation there). Available only for DBListFetch.
  • "show row numbers?": should row numbers be displayed in 1st column for DBListFetch? (see explanation there). Available only for DBListFetch.
  • custom database setting: if a different database from the global standard connection definition (see DBListFetch function) is used, then this is activated. Available for all DB items. Also the ODBC provider MSDASQL.1 is inserted automatically in front of the rest of the connection string.

For DB bound controls the Data target corresponds to the LinkedCell of the control. Actually only data target, "include Header Info?" and "use custom database setting?" are available, as the query and the (possible custom) connection definition are always put to the right of the data target (LinkedCell).

With DB bound controls, also a "header" label (gray background, bars as separators) is added. This is also the reason why the font in the listbox/dropdown is always a fixed width, to make calculation and alignment of header/date widths possible:

Image 3

After creating the DB bound control both a question whether all currently existing DB controls within the workbook should be filled now and a hint to fill in the auto_open procedure ("Don't forget to add 'Application.Run ...' commands (in clipboard) to your workbook_open procedure (or create one)") are displayed. A minimum auto_open procedure is added to the clipboard which can be pasted into the workbook's ThisWorkbook module (this uses code originally written by Terry Kreft).

Global Connection Definition and Query Builder

There are two possibilities of connection strings: ODBC or OLEDB. ODBC has the advantage to seamlessly work with Microsoft-Query, native OLEDB is said to be faster and more reliable (there is also a generic OLEDB over ODBC by Microsoft, which emulates OLEDB if you have just a native ODBC driver available).

Now, if using ODBC connection strings (those containing "Driver="), there is a straightforward way to redefine queries directly from the cell containing the DB function: just right click on the function cell and select "build DBfunc query". Then Microsoft-query will allow you to redefine the query which you can use to overwrite the function's query.

If using OLEDB connection strings, Microsoft-query will try to connect using a system DSN named like the database as identified after the DBidentifierCCS given in the global constant connection definition:

VB.NET
Public Const ConstConnString = "Provider=OraOLEDB.Oracle;.. ..;User ID=pubs"
Public Const DBidentifierCCS = "User ID="
Public Const DBidentifierODBC = "UID="

The DBidentifierCCS is used to identify the database within the global constant connection definition, The DBidentifierODBC is used to identify the database within the connection definition returned by Microsoft-Query (to compare and possibly allow to insert a custom connection definition within the DB function/control). Usually these identifiers are called "Database=" (all SQLservers, MySQL), "location=" (PostgreSQL), "User ID/UID" (oracle), "Data source=" (DB2)

DB Bound Controls

Creating DB Bound Controls

DB bound controls can be created using the supporting tool described above, but can also be created manually following conventions:

  • The name property of the control must start with "DB_", required for the filling procedure (see below)
  • The LinkedCell property must be filled, this is needed to locate the associated query (right to the linked cell)
  • Correspondingly, the cell containing the query must be exactly one cell right to the LinkedCell
  • The optional connection definition string must be exactly two cells right to the LinkedCell
  • The ColumnCount property should reflect the expected column count from the query.

Filling DB Bound Controls

The DB bound controls are filled by calling the Macro initDBforms in the DBfuncSupport.xla addin the following way (preferably in an Workbook_open event proc):

VB.NET
Private Sub Workbook_Open()
 Application.Run "DBFuncs.xla!initDBforms", ThisWorkbook.Name
End Sub

This walks through the supplied workbooks sheets, refreshing all contained controls having names starting with "DB_" with the queries contained in the cell directly right to the LinkedCell. Additionally it checks whether the next cell right to the query cell contains something other than the empty string and in that case takes the content as a custom connection definition.

The DB bound control retains the selection during a refresh (e.g. saving/reopening the workbook). Also the height and width of the DB bound control are maintained.

Installation

Installation is simply done by copying the two Excel addins DBFuncs.xla and DBFuncSupport.xla into an XLStart folder of your choice (there are basically three of them:

  • one in C:\Programs\Microsoft Office\Office<X>\XLStart
  • one in C:\Documents and Settings\<username>\Application Data\Microsoft\Excel\XLStart
  • and the last in the Alternative start folder you specified in Options/General).

Then you'd want to adapt the standard connection string that is globally applied if no function-specific connection string is given. This is done in DBFuncs.xla, module DBfunctions, on top:

VB.NET
Public Const ConstConnString = "provider=SQLOLEDB,Server=…. "
Public Const CnnTimeout = 10

When starting the Testworkbook, after waiting for the – probable – connection error, you have to change the connection string(s) to suit your needs (see below for explanations).

Image 4

The connection information in DBFuncsTest.xls is stored to the right of the black line, the actual connection is then selected by choosing the appropriate shortname (dropdown) in the yellow input field. After the connection has been changed, don't forget to refresh the queries/DBforms by right clicking and selecting "refresh data".

Points of Interest

The basic principle behind returning results into an area external to the Database query functions, is the utilisation of the calculation event (as mentioned in and inspired by the excelmvf project, see here for further details), as Excel won't allow ANY side-effects inside a UDF.

There is lots of information to be carried between the function call and the event (and back for status information). This is achieved by utilising a so-called calcContainer and a statusMsgContainer, basically being VBA classes abused as a simple structure that are stored into global collections called allCalcContainers and allStatusContainers. The references of the correct calcContainers and statusMsgContainers are the Workbook-name, the table name and the cell address of the calling functions which is quite a unique description of a function call (this description is called the callID in the code).

The diagram given below should clarify the process:

Image 5

The real trick is to find out when resp. where to get rid of the calc containers, considering Excel's intricate way of invoking functions and the calc event handler (the above diagram is simplifying matters a bit as the chain of invocation is by no way linear in the calculations in the dependency tree).

Excel sometimes does additional calculations to take shortcuts and this makes the order of invocation basically unpredictable, so you have to take great care to just work on every function once and then remove the calcContainer (if you're interested in the actual invocations being done, set #Const DebugMe = True in both class module calcClass and module DBfunctions to see what's going on in the direct window).

After every calculation event the working calcContainers are removed, if there are no more calcContainers left, then allCalcContainers is reset to Nothing, being ready for changes in input data or function layout. Good resources for more details on the calculation order/backgrounds is Decision Model's Excel Pages, especially Calculation Secrets.

The information for resizing the list areas in DBListFetch is stored in hidden named ranges assigned to the calling function cell.

History

  • 2006: First versions using CopyFromRecordSet, Version 1.5 is the last one utilizing Range.CopyFromRecordSet for DBListFetch, as I found this function to be highly unreliable.
    From 2.0 on, I use QueryTables instead, which copies (at least until now) all data correctly into the sheet.
  • 31/01/2007: Version 2.0: Codeproject article posted
  • 03/02/2007: Version 2.1: Separate autosized naming of FormulaRange is now possible
  • 27/02/2007: Version 2.2
    • Bigfixes:
      • Check for non-range parameters in DBRowFetch's parameter list
      • chainCells & concatCells now working with general parameters
      • Remaining names from Querytables are deleted now
      • Autosizing names works now with single cells, too
  • 12/03/2007: Version 2.3
    • Bugfixes:
      • Single cell naming the TargetRange now correctly resizes to full extent
      • Autoformatting down now also works for formula cells
      • Worked around the validation list bug (calculation couldn't be set to manual in the event procedure)
      • Format filling now works correctly also with headers
      • Removed a very subtle error with multiple resizing TargetRanges beneath one another leading to DBlistFetch losing track of the upper TargetRange
      • Formatting didn't work correctly down to end of data areas
    • Enhancements:
      • a named TargetRange extends to the FormulaRange if the FormulaRange is adjacent to the TargetRange
      • Added DBinClause Function for creating bracketed in clauses from parameter arrays (ranges, etc.)
      • Included Option explicit for better compiler support
  • 25/03/2007: Version 2.4
    • Bugfixes:
      • Filtering now works correctly again
      • verketteZellen didn't work with non-ranges
    • Enhancements:
      • Added function help for all public functions
  • 09/04/2007: Version 2.5
    • Bugfixes:
      • Auto-Formatting works now for general formats (not only numerical)
      • error values (#NV!...) possible in formula range of DBListFetch
    • Enhancements:
      • Added Plookup function for weighted pattern lookup
  • 02/11/2007: DBAddin Version 1.0 now available at sourceforge

License

This article, along with any associated source code and files, is licensed under The GNU General Public License (GPLv3)