Introduction
In certain situations, it makes sense to store all application data in memory on the Web server, even within a Web farm. If your application relies on read-only state that is updated infrequently, keeping that state in memory on the Web server offloads the database machine.
To make this work, all of the Web servers within the Web farm must load the application state into memory from the same physical data store. Once loaded, all Web servers within the farm will have identical information in their in-memory caches. Propagating updates between the different Web servers in the farm is not a concern, since the data is read-only. If the application state has to be updated externally by an administrator, all of the Web servers will reload their cache when told to do so.
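The "reload when told to do so" step can be sketched in portable C++. This is only an illustration under assumptions: StateCache, State and the loader callback are hypothetical names, not part of the article's code, and the real component holds XMLDOM documents rather than strings. The idea shown is to build the new state off to the side and swap it in, so requests that are already running keep reading the old snapshot.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <mutex>
#include <string>
#include <utility>

// Read-only application state: document name -> document text (hypothetical shape).
using State = std::map<std::string, std::string>;

class StateCache {
public:
    // The loader reads the shared data store; the caller supplies it.
    explicit StateCache(std::function<State()> loader)
        : loader_(std::move(loader)) {
        Reload();
    }

    // Build the new state first, then swap it in under the lock.
    void Reload() {
        std::shared_ptr<const State> fresh = std::make_shared<State>(loader_());
        std::lock_guard<std::mutex> lock(mutex_);
        current_ = std::move(fresh);
    }

    // Readers get an immutable snapshot that stays valid across reloads.
    std::shared_ptr<const State> Snapshot() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return current_;
    }

private:
    std::function<State()> loader_;
    mutable std::mutex mutex_;
    std::shared_ptr<const State> current_;
};
```

A request handler would call Snapshot() once and read from that copy of the pointer; the administrator's reload command would call Reload() on every server in the farm.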
Microsoft employs this technique on some of its very large Web sites, including MSN™. Since much of the information on home.msn.com doesn't change very often, it makes sense to store as much of that information as possible in memory.
Even though the actual data may exist in a relational database, it makes sense to pull this information directly into memory, because reading it from there is so fast and inexpensive. Why overload the database server with multiple read-only queries?
XML makes state easier to manage on the server as well as on the client machine. Using XML, developers can maintain the structural meaning of state information, while making it easy to manipulate through standard XML handler APIs.
If the application state is maintained in memory as XML, performing a simple retrieval such as "get all headlines" is a piece of cake. With XML, you can take advantage of XSL patterns to query the application data and retrieve the pieces of information you want. You can display portions of the application state to the user by performing XSL transformations on the XML data.
This approach works well for read-only data that doesn't change very often. You'll have to reload the in-memory XML cache periodically, to account for any changes in the underlying XML. It is also possible to update the in-memory XML data directly and immediately when changes happen on the database server. By using XML in-memory on the Web server, your data is much more accessible and easier to deal with.
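The periodic reload mentioned above can be sketched as an age check, again in portable C++ and under assumptions: TimedCache and its loader are hypothetical names, the cached value is a plain string instead of an XMLDOM document, and the class is not thread-safe as written. The current time is passed in so the caller controls the clock.

```cpp
#include <chrono>
#include <functional>
#include <string>
#include <utility>

class TimedCache {
public:
    using Clock = std::chrono::steady_clock;

    TimedCache(std::function<std::string()> loader, Clock::duration maxAge)
        : loader_(std::move(loader)), maxAge_(maxAge) {}

    // Returns the cached XML, reloading first if it is missing or too old.
    const std::string& Get(Clock::time_point now) {
        if (!loaded_ || now - loadedAt_ >= maxAge_) {
            xml_ = loader_();
            loadedAt_ = now;
            loaded_ = true;
        }
        return xml_;
    }

private:
    std::function<std::string()> loader_;
    Clock::duration maxAge_;
    Clock::time_point loadedAt_{};
    std::string xml_;
    bool loaded_ = false;
};
```

In a real server the call would be something like cache.Get(TimedCache::Clock::now()), guarded by a lock if several threads share the cache.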
Functionality
The application architecture is simple. It has 3 parts: an ISAPI extension (WriteHeadlines), a COM+ component (ArticleNews) and a data repository (files, Access MDB, SQL Server).
- The WriteHeadlines ISAPI extension is responsible for the HTTP output returned to Web browser requests.
- All business logic and data are kept in the ArticleNews COM+ component, which holds a collection of XMLDOM documents. The WriteHeadlines extension asks the ArticleNews component for HTML output. Here, we take advantage of XSL patterns to query the application XMLDOM and retrieve the pieces of information we want. We can display portions of the application state to the user by performing XSL transformations on the XML data.
- The data is loaded from a data repository into the XMLDOM collection at system initialization. Because the data is then kept in memory, it matters little where it is actually stored: system files, Access MDB, SQL Server, Oracle, etc.
Because we use an ISAPI extension for the HTML output, no ASP scripting is needed. That makes this approach well suited to a site that needs the fastest response, the highest number of requests per second, and a large amount of HTML transferred between the Web server(s) and the clients' browsers.
Administration
First of all, we need to load the data into the COM+ component from the data repository (files, Access, SQL Server) using the Control Panel.
The finished web page should look like this:
The web page contains left, top, body and right sections. The whole page is in fact a static HTML page, which is held in memory in an XMLDOM XSL object. Using XML+XSL=HTML, the body of the XSL is filled with the needed data from the XMLDOM XML object. For now we have only 4 pages: the first page with all headlines, a hot stories page, and 2 more that return the stories (by affiliation or category) chosen by a user search. It is possible to keep more XSL documents in our XMLDOM collection, corresponding to more HTML pages.
So, in order to manage and extend that web, a powerful administration system is needed.
Using the Control Panel, we have access to all the XML and XSL documents kept in the collection. After the data is loaded, we can see all the documents in an IE preview, with nodes and syntax colors:
If we want to extend the web functionality, or add new pages, we will need to add new XSL patterns to query the application data and retrieve the pieces of information we want. Because we already have all the XML data, an online pattern query against it is provided:
After entering the XSL pattern string, just click the SelectNodes or SelectNode button and the query result will be shown!
Put that new query string in a new method of the ArticleNews component, make a new link from WriteHeadlines that calls that method, and we will have a new page in the site!
Of course, once the structure of the site is ready, we need to change the HTML of each page to add or change colors, words, pictures, style sheets, etc. That is possible by accessing the data repository (files, Access or SQL Server) or by directly accessing the XMLDOM collection. That way, the changes appear on the web site instantly, with no interruption for users, and the corresponding changes are made in the data repository.
Given below is a preview of this module. We have the source of the XML document and the sources of all the XSL documents.
Here, it is possible to make changes to the XSL documents and save them directly in memory and in the data repository using the SaveXSL button. The document must be well-formed XML; whether it is or not is signaled with red/green colors.
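To give an idea of what a red/green signal could be based on, here is a deliberately crude tag-balance check in portable C++. It is not how the Control Panel does it (the article does not show that code, and a real check would simply let the XML parser load the document) and it is not a full well-formedness test: TagsBalanced is a hypothetical helper that ignores comments, declarations and self-closing tags, does not require a single root element, and is confused by CDATA sections or a '>' inside an attribute value.

```cpp
#include <string>
#include <vector>

// Returns true when every <name ...> has a matching, properly nested </name>.
bool TagsBalanced(const std::string& xml) {
    std::vector<std::string> open;  // stack of currently open element names
    std::string::size_type pos = 0;
    while ((pos = xml.find('<', pos)) != std::string::npos) {
        if (xml.compare(pos, 4, "<!--") == 0) {  // skip a whole comment
            const std::string::size_type end = xml.find("-->", pos + 4);
            if (end == std::string::npos) return false;
            pos = end + 3;
            continue;
        }
        const std::string::size_type close = xml.find('>', pos);
        if (close == std::string::npos) return false;  // unterminated tag
        const std::string tag = xml.substr(pos + 1, close - pos - 1);
        pos = close + 1;
        if (tag.empty()) return false;
        // <?xml ... ?>, <!DOCTYPE ...> and <br/> neither open nor close anything.
        if (tag[0] == '?' || tag[0] == '!' || tag.back() == '/') continue;

        const bool closing = (tag[0] == '/');
        const std::string::size_type nameStart = closing ? 1 : 0;
        const std::string::size_type nameEnd = tag.find_first_of(" \t\r\n", nameStart);
        const std::string name = (nameEnd == std::string::npos)
            ? tag.substr(nameStart)
            : tag.substr(nameStart, nameEnd - nameStart);
        if (name.empty()) return false;

        if (!closing) {
            open.push_back(name);
        } else {
            if (open.empty() || open.back() != name) return false;
            open.pop_back();
        }
    }
    return open.empty();
}
```

The caller would map true to green and false to red.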
In fact, each XSL is an HTML document, so a PreviewHTML button is provided (which will obtain only the static HTML data, without XML).
To obtain a preview of the page behind the AllHeadlines link, we must combine that XSL (static HTML) page with the XML document. Yep, XML+XSL=HTML! Of course, we must click the TransformNode button:
Coding
WriteHeadlines Extension
First of all, we need a pointer to the ArticleNews component. That is done with a public variable:
IHeadlinePtr pItfHeadline;
After that, we must instantiate the pointer and initialize the ArticleNews XMLDOM collection by loading the data. The administrator can choose the data repository: files, Access MDB or SQL Server. Just run LoadFiles, LoadDBmdb or LoadDBsql from the respective links. The work is done in the fLoadDB or fLoadFiles functions:
// Create the ArticleNews COM+ component.
hr = pItfHeadline.CreateInstance( L"ArticleNews.Headline.1" );
if ( FAILED( hr ) )
{
    *bstrError = L"Error. Attempt to create instance of "
                 L"TierXmlDom object failed with error";
    return false;
}
else
{
    // Load the XMLDOM collection from the repository chosen by the administrator.
    switch (iType)
    {
    case DB_SQL:
        pItfHeadline->LoadDBsql( L"File Name=" +
            MapPath(pCtxt, L"Data\\Connection.udl") );
        break;
    case DB_ACCESS:
        pItfHeadline->LoadDBmdb( MapPath(pCtxt,
            L"Data\\data.mdb") );
        break;
    }
    // Report a failed load to the caller.
    if ( ! pItfHeadline->DomLoaded() )
    {
        *bstrError =
            L"Component ArticleNews.Headline.1. XMLDOM is not loaded.";
        return false;
    }
}
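The switch above does nothing for a repository type it does not list. One way to make that case explicit is a small lookup table of loaders, sketched here in portable C++. Everything in it is an assumption made for illustration: Repository, Loader and LoadFrom are hypothetical names, and the loaders are plain callbacks standing in for the component's load methods.

```cpp
#include <functional>
#include <map>
#include <string>

enum class Repository { Files, AccessMdb, SqlServer };

// A loader returns true when the in-memory collection was filled.
using Loader = std::function<bool()>;

// Runs the loader registered for `type`. On failure it fills `error` and
// returns false, in the same spirit as the bstrError handling above.
bool LoadFrom(Repository type, const std::map<Repository, Loader>& loaders,
              std::string& error) {
    const auto it = loaders.find(type);
    if (it == loaders.end()) {
        error = "No loader registered for the selected repository.";
        return false;
    }
    if (!it->second()) {
        error = "XMLDOM is not loaded.";
        return false;
    }
    error.clear();
    return true;
}
```

Adding a fourth repository then means registering one more entry in the map rather than editing the switch.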
It uses the LoadDBsql, LoadDBmdb or LoadFiles method of ArticleNews, depending on the user's choice.
Printall, printhotstories, printbycategory and printbyaffiliation are site methods that return the needed pages to the client browser. They are only wrappers over the respective ArticleNews methods:
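// Refuse to answer until the component exists and its data is loaded.
if (pItfHeadline == NULL)
{
    *pCtxt << _T("COM component not loaded. ")
           << _T("Try Load XMLDOM options in Control Panel.");
    return;
}
if (!pItfHeadline->DomLoaded())
{
    *pCtxt << _T("XMLDOM not loaded in COM component.");
    return;
}
// Write the HTML produced by the component to the response.
*pCtxt << pItfHeadline->PrintHeadlines();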
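Since every page method repeats the same two guards, the wrappers could share them through a small dispatch table. The following portable C++ sketch shows the idea; it is not the extension's actual code. IHeadlineSource is a hypothetical stand-in for the component interface, PrintHotStories is an assumed method name (only PrintHeadlines appears in the snippet above), and the result is returned as a string instead of being streamed to pCtxt.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for the ArticleNews interface.
struct IHeadlineSource {
    virtual ~IHeadlineSource() = default;
    virtual bool DomLoaded() const = 0;
    virtual std::string PrintHeadlines() const = 0;
    virtual std::string PrintHotStories() const = 0;  // assumed name
};

using PageMethod = std::string (IHeadlineSource::*)() const;

// Applies the shared guards once, then forwards to the method for `command`.
std::string RenderPage(const IHeadlineSource* source, const std::string& command) {
    static const std::map<std::string, PageMethod> pages = {
        {"printall", &IHeadlineSource::PrintHeadlines},
        {"printhotstories", &IHeadlineSource::PrintHotStories},
    };
    if (source == nullptr) return "COM component not loaded.";
    if (!source->DomLoaded()) return "XMLDOM not loaded in COM component.";
    const auto it = pages.find(command);
    if (it == pages.end()) return "Unknown command.";
    return (source->*(it->second))();
}
```

With this shape, a new page needs one new table entry plus the corresponding component method.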
Addheadline, printData, transformNode, buildXPath and saveXSL are WriteHeadlines methods used by the Control Panel administration system. They use the AddHeadlineBatch, GetXML and SaveXSL methods of the ArticleNews component, plus some specific client-side JavaScript code.
An interesting function is GetPath, which automatically gives us the actual path of the web (exactly like ASP MapPath):
pCtxt->GetServerVariable( "URL", szPath, &size );
pCtxt->ServerSupportFunction( HSE_REQ_MAP_URL_TO_PATH, szPath, &size, 0 );
Another helper function is LoadLongResource, which quickly gives us encoded HTML code.
ArticleNews Component
The basic unit of in-memory storage for the XML information (both XML and XSL) is the DOM document:
CComPtr<IXMLDOMDocument2> m_spDomDocHeadlines;
hr = m_spDomDocHeadlines.CoCreateInstance(__uuidof(DOMDocument30));
Because our program supports 3 types of data stores, it has 3 pairs of Load/Save methods: LoadFiles/SaveFiles, LoadDBsql/SaveSQL, LoadDBmdb/SaveMDB.
We have 3 possible ways in our program to store the XML/XSL information. The first way is to use a CComPtr variable for each document. So we would have m_spDomDocHeadlines for the XMLDOM document with the XML data, and m_spDomDocStylesheet, m_spDomDocStylesheetNode and m_spDomDocStylesheetIE for the XSL documents needed. That approach is not very good, because we must add more code each time we need a new XSL (HTML) page. The second way is better. All XSL documents are hosted in an STL collection, mapDOM. It has Put and Get methods to store and retrieve the XSLs.
mapDOM.Put(L"xslAll", m_spDomDocStylesheet);
BSTR bstrOut;
hr = mapDOM.Get(bstrName)->get_xml(&bstrOut);
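The article does not show how mapDOM is implemented, so the following is only a guess at its shape, written in portable C++: a thin wrapper over std::map with the same Put and Get names. NamedCollection is a hypothetical name, and returning a pointer from Get is a design choice of this sketch so that a lookup of an unknown name can be detected before it is dereferenced.

```cpp
#include <map>
#include <string>
#include <utility>

template <typename T>
class NamedCollection {
public:
    // Adds the entry, or replaces the one already stored under `name`.
    void Put(const std::string& name, T value) {
        items_[name] = std::move(value);
    }

    // Returns a pointer to the entry, or nullptr if the name is unknown.
    const T* Get(const std::string& name) const {
        const auto it = items_.find(name);
        return it == items_.end() ? nullptr : &it->second;
    }

private:
    std::map<std::string, T> items_;
};
```

In the component, T would be a smart pointer to a DOM document; here any default-constructible type works, for example NamedCollection<std::string>.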
That way, it is possible to add XSL pages to the DB and, without any change to or recompilation of the component, all new XSL (HTML) pages will be loaded into the in-memory collection. The third way is just an optimization of the second approach. The idea concerns the TransformNode method, which produces the XML+XSL=HTML page. At each request, the XMLDOM processes the specific XSL, an operation that takes some time. Why not use the XSL processor and keep all these XSL documents already compiled? We will have a new STL collection, mapProcessor, with Put and Get methods:
hr = spDomDoc.CoCreateInstance(__uuidof(FreeThreadedDOMDocument30));
hr = spDomDoc->loadXML(Value, &bSuccess);
CComPtr<IXSLTemplate> spDOMXSLTpl;
CComPtr<IXSLProcessor> spDOMXSLProc;
spDOMXSLTpl.CoCreateInstance(__uuidof(XSLTemplate30));
spDOMXSLTpl->putref_stylesheet(spDomDoc);
hr = spDOMXSLTpl->createProcessor(&spDOMXSLProc);
mapProcessor.Put( Name, spDOMXSLProc);
To update a specific XSL page directly in memory, I used the SaveDOM function, which accesses our mapProcessor collection of preprocessed XSLs:
CComPtr<IXSLTemplate> spDOMXSLTpl;
CComPtr<IXSLProcessor> spDOMXSLProc;
spDOMXSLTpl.CoCreateInstance(__uuidof(XSLTemplate30));
spDOMXSLTpl->putref_stylesheet(spDomDoc);
hr = spDOMXSLTpl->createProcessor(&spDOMXSLProc);
mapProcessor.Put( Name, spDOMXSLProc);
So, to output an HTML page, we must run an XMLDOM "transform" of the XML with the specified XSL. That is now very easy:
bstrName = L"xslAll";
mapProcessor.Get( bstrName )->put_input(_variant_t(m_spDomDocHeadlines));
mapProcessor.Get( bstrName )->transform( &iResult );
mapProcessor.Get( bstrName )->get_output( &vOut );
*bstrHeadlines = vOut.bstrVal;
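The snippet above sets the input, transforms, and reads the output on a processor fetched from a shared collection. If several requests can run at the same time (an assumption about how the extension is hosted), they would share that processor's input and output. One alternative is to cache only the compiled stylesheet and create a short-lived processor per request, which is the split that createProcessor, used earlier, makes possible. The portable C++ sketch below shows that shape with hypothetical stand-ins: CompiledTemplate simply wraps a fixed head and tail around the input and is not an XSLT engine.

```cpp
#include <memory>
#include <string>
#include <utility>

// Stand-in for a compiled stylesheet: built once, never modified afterwards,
// so many requests can share it safely.
class CompiledTemplate {
public:
    CompiledTemplate(std::string head, std::string tail)
        : head_(std::move(head)), tail_(std::move(tail)) {}

    std::string Render(const std::string& input) const {
        return head_ + input + tail_;
    }

private:
    std::string head_;
    std::string tail_;
};

// Per-request state: input and output live here, not in the shared template.
class Processor {
public:
    explicit Processor(std::shared_ptr<const CompiledTemplate> tpl)
        : tpl_(std::move(tpl)) {}

    void PutInput(std::string input) { input_ = std::move(input); }
    void Transform() { output_ = tpl_->Render(input_); }
    const std::string& Output() const { return output_; }

private:
    std::shared_ptr<const CompiledTemplate> tpl_;
    std::string input_;
    std::string output_;
};
```

Each request would build its own Processor from the cached template, so two transforms running side by side cannot overwrite each other's output.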
Conclusion
Using XML to manage application state makes it possible to maintain the structure of application data and perform more sophisticated queries (like printing headlines by category, affiliation, and so on). The XML cache approach can greatly improve application performance and offload the database server. The ISAPI approach avoids the slower ASP scripting, which makes it a strong candidate for a web site that needs the best performance from the start.
This article was inspired by Aaron Skonnard's excellent article, "XML-based Persistence Behaviors Fix Web Farm Headaches".