What is it about
The initial idea of this article was to bring you a simple example of using the Scripting Runtime Library: "click here and here, blah-blah-blah, thank you". I began writing it because my colleagues and I needed to make our scripts file-system-aware. This ability proved very useful for software prototyping and for building small utilities; of course, it shouldn't be used on the web (for security and privacy reasons we'll discuss later).
Back to the tool. Prior to writing it, I was deep in XML for several months, so what you see here is an XML/XSLT viewer/browser, enveloped in the form of an HTML application. It helped me a lot when I was learning XML/XSL; now it aids <some other people> in rapidly checking and debugging large numbers of XSL templates. I hope it helps you too.
Of course, this little browser (I'll call it "Xbrowser" further on) is in no way a replacement for any enterprise-grade development tool. It is just:
- an interactive learning tool that illustrates the basics of XML handling with JScript/MSXML - for beginners in XML development; and, maybe -
- an example of using the Microsoft Scripting Runtime Object Library - for developers of Office tools and solutions; and, of course -
- a simple utility for validating XML documents (for well-formedness or against a schema) and viewing XSLT output.
The last minute additions to this article were XML/XSL transformation and XML/XSD validation tools, which use (most of) the techniques described here.
Requirements
For this piece of code to work properly, you'll need the following code packs:
- Common Dialog ActiveX Control - provides a standard set of dialog boxes for operations such as opening and saving files, setting print options, and selecting colors and fonts. It is shipped with MS Visual Basic and the MS Office 2000/XP products, or can be downloaded from the Microsoft website.
- Scripting Runtime Object Library is a Microsoft file system management solution, designed for use with scripting languages; it is an integral part of Microsoft Office 2000/XP. This library is also available for download at the Microsoft website.
- Microsoft XML Core Services and/or SDK (versions 3.0 or, preferably, 4.0). Can be downloaded from the Microsoft website.
- For installing some of the previously mentioned packages, you'll probably need a CAB extraction utility. You can download it from the Microsoft site.
How stuff works
If you look inside the attached archive, you'll see that the "Xbrowser" is no more than an HTML form. Let's see how to use it, and how the code works behind the scenes, step by step.
- Folder browsing
Step 1: choose a folder where your XML files are located.
This part uses the Shell object, specifically its BrowseForFolder method.
function BrowseFolder()
{
    var objShell = new ActiveXObject("Shell.Application");
    var objFolder = objShell.BrowseForFolder(0, "Select a folder:", 0);
    if (objFolder == null) return "";
    var objFolderItem = objFolder.Items().Item();
    var objPath = objFolderItem.Path;
    var foldername = objPath;
    if (foldername.substr(foldername.length-1, 1) != "\\")
        foldername = foldername + "\\";
    ...
}
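The trailing-backslash check at the end of BrowseFolder can be pulled out into a small helper, so that a folder name and a file name can always be concatenated safely. The following is only a minimal sketch: the name ensureTrailingBackslash is hypothetical and does not appear in the Xbrowser source.

```javascript
// Hypothetical helper (not part of the original script): make sure a
// folder path ends with a backslash so a file name can be appended to it.
function ensureTrailingBackslash(path)
{
    if (path == "") return "";                 // nothing was selected
    if (path.charAt(path.length - 1) != "\\")
        path = path + "\\";
    return path;
}
```

With such a helper, a call like ensureTrailingBackslash(objFolderItem.Path) would replace the last three lines of the check shown above.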
- File browsing and enumeration.
Step 2: choose a file.
Two interesting things here:
- The Scripting.FileSystemObject is the main point of access to the file system. In short:
  FileSystemObject contains:
      Drives collection
      Folders collection
      Files collection
      GetDrive method (access a particular drive).
      GetFolder method (access a particular folder).
      GetFile method (access a particular file).
  Drives collection contains:
      Item property (used to access the drive).
      Count property (number of drives in a system).
  Folders collection contains:
      Item property (used to access the folder).
      Count property (number of folders in a collection).
      Add method (create new folder).
  Folder object contains:
      SubFolders collection (subfolders of a folder, including those with hidden and system file attributes set).
      Files collection (access all files in a folder).
  Files collection contains:
      Item property (used to access the file).
      Count property (number of files in a folder).
  File object contains:
      Name property (file name).
      Size property (file size).
      DateCreated property (file creation date and time).
The FSO has lots of collections, methods, and properties; I've just pointed out the most commonly used ones.
- The Enumerator object is a simple iterator, used to cycle through the collection of objects:
  Enumerator object contains:
      item method (returns a reference to the current object in a collection).
      atEnd method (returns true if the iterator has reached the end of the collection).
      moveFirst method (iterates to the first object in a collection).
      moveNext method (iterates to the next object in a collection).
var fc = new Enumerator(colFiles);
for (; !fc.atEnd(); fc.moveNext())
{
    var objFile = fc.item();
    ...
}
Actual code:
var objFSO = new ActiveXObject("Scripting.FileSystemObject");
var objFolder = objFSO.GetFolder(curXMLfolder);
var colFiles = objFolder.Files;
var xmlcount = 0, xslcount = 0;
var fc = new Enumerator(colFiles);
for (; !fc.atEnd(); fc.moveNext())
{
    var objFile = fc.item();
    var ftext = objFile.Name.toLowerCase();
    // Pick up *.xml and *.rdf files
    if ((ftext.substr(ftext.length-3, 3)=="xml") ||
        (ftext.substr(ftext.length-3, 3)=="rdf"))
    {
        xmlcount = xmlcount + 1;
        if (xmlcount == 1)
            xmlsel="<SELECT id='xmlselection' onchange='refresh()'>";
        // Quote the value so that file names with spaces stay intact
        xmlsel=xmlsel+"<OPTION value=\""+ftext+"\">"+
            ftext+"</OPTION>";
    }
}
// Close the list after the loop: atEnd() is never true inside the loop body
if (xmlcount > 0) xmlsel=xmlsel+"</SELECT>";
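The list-building logic can also be separated from the file system access, which makes it easier to check on its own. The sketch below assumes the file names have already been collected into a plain array (for instance, by pushing fc.item().Name inside the Enumerator loop). The names hasExtension, escapeHtml and buildSelect are hypothetical and are not taken from the Xbrowser source.

```javascript
// Hypothetical helpers, sketched for illustration only.

// Return true if the file name ends with one of the given extensions
// (compared case-insensitively, dot included).
function hasExtension(name, extensions)
{
    var lower = name.toLowerCase();
    for (var i = 0; i < extensions.length; i++)
    {
        var ext = "." + extensions[i].toLowerCase();
        if (lower.length > ext.length &&
            lower.substr(lower.length - ext.length) == ext)
            return true;
    }
    return false;
}

// Escape characters that are special in HTML text and attribute values.
function escapeHtml(text)
{
    return text.replace(/&/g, "&amp;").replace(/</g, "&lt;")
               .replace(/>/g, "&gt;").replace(/"/g, "&quot;");
}

// Build a <SELECT> element from an array of file names; returns "" when
// no name matches, so the caller can fall back to a default.
function buildSelect(id, names, extensions)
{
    var options = "";
    for (var i = 0; i < names.length; i++)
    {
        if (!hasExtension(names[i], extensions)) continue;
        var safe = escapeHtml(names[i]);
        options += "<OPTION value=\"" + safe + "\">" + safe + "</OPTION>";
    }
    if (options == "") return "";
    return "<SELECT id='" + id + "' onchange='refresh()'>" + options + "</SELECT>";
}
```

A call such as buildSelect("xmlselection", names, ["xml", "rdf"]) would then produce the same kind of markup as the loop above, with the closing tag always in place.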
- Loading XML from a file.
This is MSXML's part:
var xml = new ActiveXObject("MSXML2.DOMDOCUMENT");
xml.async = false;
xml.load(curXMLfolder + xmlselection.value);
Doing the same for the stylesheet:
var xsl = new ActiveXObject("MSXML2.DOMDOCUMENT");
xsl.async = false;
xsl.load(curXSLfolder + xslselection.value);
- Loading XML from a string.
Loading XML data from a string is a bit different from loading a file. No files, no options; all you need to do is write a string that contains your XML code. Then you parse that string with a single call to the loadXML method:
var defsheet="<?xml version=\"1.0\"?>";
...
defsheet += "</xsl:stylesheet>";
if(!defSheetCache)
{
    defSheetCache = new ActiveXObject("MSXML2.DOMDocument");
    defSheetCache.async = false;
    defSheetCache.resolveExternals = false;
    defSheetCache.loadXML(defsheet);
}
Here, loadXML is used for loading a default stylesheet (hard-coded in a string), which is applied when no XSL files are found in the appropriate folder.
- Document validation.
Step 3: review the validation result.
The actual validation takes place immediately after the XML document has finished loading:
...
xml.load(curXMLfolder + xmlselection.value);
...
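So all you need to do is check the document's parseError object:
if (xml.parseError.errorCode != 0)
{
    // Loading failed: xml.parseError.reason describes the problem
}
else
{
    // The document was loaded without errors
}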
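When the check fails, the parseError object carries more than the error code. The sketch below formats the properties that MSXML exposes on it (errorCode, reason, line and linepos) into a single message. The function name describeParseError is hypothetical, and the message layout is just one possible choice.

```javascript
// Hypothetical helper: turn an MSXML parseError-like object into one
// readable line. Only errorCode, reason, line and linepos are read.
function describeParseError(err)
{
    if (err.errorCode == 0) return "OK";
    // The reason text may end with a line break, so trim trailing whitespace
    var reason = String(err.reason).replace(/\s+$/, "");
    var text = "Error " + err.errorCode + ": " + reason;
    if (err.line > 0)
        text += " (line " + err.line + ", column " + err.linepos + ")";
    return text;
}
```

In the browser, the result of describeParseError(xml.parseError) could be shown in place of the transformation output.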
- Validating an XML document against an arbitrary XSD schema.
To validate your XML document against a schema, the script does the following:
...
if(xslFile.substr(xslFile.length - 3, 3) == "xsd")
{
    // Load the schema itself as an ordinary XML document
    var schemaSource = new ActiveXObject("MSXML2.DOMDocument.4.0");
    schemaSource.async = false; // load synchronously, so the result can be tested at once
    if(!schemaSource.load(curXSLfolder + xslFile))
    {
        xslErrorCache = schemaSource.parseError.errorCode +
            ": " + schemaSource.parseError.reason;
        passedXSL.innerHTML = "... Schema is corrupt ...";
        result.innerHTML = "";
        return;
    }
    // Read the schema's target namespace (empty if there is none)
    schemaSource.setProperty("SelectionLanguage", "XPath");
    schemaSource.setProperty("SelectionNamespaces",
        "xmlns:xs='http://www.w3.org/2001/XMLSchema'");
    var tnsattr = schemaSource.selectSingleNode("/*[local-name()" +
        "='schema']/@targetNamespace");
    var nsuri = tnsattr ? tnsattr.nodeValue : "";
    if(!schemaCache)
        schemaCache = new ActiveXObject("Msxml2.XMLSchemaCache.4.0");
    else
    {
        // Empty the cache from the end, so that removals don't shift
        // the indexes that are still to be visited
        for(var i = schemaCache.length - 1; i >= 0; i--)
        {
            schemaCache.remove(schemaCache.namespaceURI(i));
        }
    }
    schemaCache.add(nsuri, schemaSource);
    ...
    // Attach the cache to a new document; loading it triggers validation
    var xmlSource = new ActiveXObject("MSXML2.DOMDocument.4.0");
    xmlSource.schemas = schemaCache;
    xmlSource.async = false;
    if(!xmlSource.load(curXMLfolder + xmlFile))
    {
        xslErrorCache = xmlSource.parseError.reason;
        passedXSL.innerHTML =
            "... XML document doesn't conform to schema ...";
        result.innerHTML = "";
        return;
    }
    else
    {
        result.innerHTML = xml.transformNode(xsl.documentElement);
        passedXSL.innerHTML =
            "... XML document conforms to schema ...";
    }
}
Please note: this validation procedure requires MSXML 4.0 to be installed.
- Transforming the XML with an XSL stylesheet.
Once the XML and the XSL files are loaded into DOM trees, transforming the XML data with a stylesheet takes a single call:
resultCache = xml.transformNode(xsl.documentElement);
- Reconnecting CSS.
One problem arises with resultCache: if the input XSLT document generates embedded stylesheet (<STYLE>) blocks, these will be stripped from the resulting HTML after we display it through result.innerHTML. This problem can be solved by extracting the style definition from the result and incorporating it into the browser's document:
var elem = document.createStyleSheet();
elem.cssText =
trim(resultCache.substring(resultCache.indexOf("<style>") + 7,
resultCache.indexOf("</style>")));
elem.title = "user_styles";
To avoid style conflicts, we must "collect garbage" immediately before every transformation:
var sheets = document.styleSheets;
for(var k = 0; k < sheets.length; k++)
{
    var stl = sheets[k];
    if (stl.title == "user_styles")
    {
        // Delete every rule; the first remaining one is removed on each pass
        var r = stl.rules.length;
        for(var j = 0; j < r; j++)
            stl.removeRule(0);
        break;
    }
}
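Note that the indexOf-based extraction shown earlier in this step only finds a lowercase <style> tag without attributes, and only the first one. If your stylesheets emit <STYLE type="text/css"> or several style blocks, a regular expression is more forgiving. The following is a sketch under that assumption; the function name extractStyles is hypothetical and is not part of the Xbrowser source.

```javascript
// Hypothetical helper: collect the contents of every <style> block in an
// HTML string, regardless of tag case or attributes.
function extractStyles(html)
{
    var re = /<style\b[^>]*>([\s\S]*?)<\/style\s*>/gi;
    var css = [];
    var match;
    while ((match = re.exec(html)) != null)
        css.push(match[1].replace(/^\s+|\s+$/g, ""));   // trim each block
    return css.join("\n");
}
```

The returned text could then be assigned to elem.cssText instead of the substring expression.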
- Saving the result of the transformation to a file.
Step 4: save the XSLT output to a file.
Two points of interest here: the "Save" dialog, and the file creation process itself. For the "Save" dialog to work, you must register and obtain a design-time license for the following ActiveX component:
<object id="cmdlg"
classid="clsid:F9043C85-F6F2-101A-A3C9-08002B2F49FB"
codebase="http://activex.microsoft.com/controls/vb6/comdlg32.cab">
</object>
Then, you can use it:
function fileSave()
{
    cmdlg.CancelError = false;
    cmdlg.FilterIndex = 1;
    cmdlg.DialogTitle = "Save file as";
    cmdlg.Filter = "HTML file (*.html)|*.html|XML file (*.xml)|*.xml";
    cmdlg.ShowSave();
    return cmdlg.FileName;
}
To save the XSLT output, we simply take resultCache and write it to a file. There is no need to check for errors here, because we don't show the "Save..." button if either of the two documents (XML or XSL) hasn't passed validation.
function Save()
{
    var filename = fileSave();
    if (filename != "")
    {
        var objFSO = new ActiveXObject("Scripting.FileSystemObject");
        var objFile = objFSO.CreateTextFile(filename);
        objFile.Write(resultCache);
        objFile.Close();
    }
}
All inclusive
As a test bed for the presented utility, the accompanying Zip file contains the NASDAQ historical price data of Sun Microsystems Inc., along with three XSLT stylesheets I've written:
- Plain. This one shows the original XML data, with IE's color scheme.
- Table. This is a simple example of an XSL transformation. The green and red rows, which show increases and decreases of the stock price, illustrate the <xsl:choose> element.
- Bar graph. A more complex stylesheet. This one features: "for-next"-style cycles (implemented using recursive calls of named templates); searching for the maximum in a row of values (with the use of the <xsl:sort> element); and, of course, an algorithm for building stylish bar charts.
Make it quick
The last little things included are:
- transfrm script - a simple WSH tool for applying XSLT stylesheets to XML documents. The script works in two modes:
    - Batch mode is ideal for performing numerous XSL transformations in one go. The script takes a single file name as an argument:
          transfrm.js batch.list
      The file passed as an argument may contain an arbitrary number of lines (one transformation per line) in the following format:
          <xml_file_name>,<xsl_file_name>,<result_file_name>
      Sample batch list:
          stock.xml,plain.xsl,result1.html
          stock.xml,table.xsl,result2.html
          stock.xml,bargraph.xsl,result3.html
      Aside from the resulting files, the script creates, or appends to, the "transfrm.log" file, which contains a transformation log.
    - In the single transformation mode, the script accepts three arguments:
          transfrm.js input.xml input.xsl output_file.html
      The "transfrm.log" file will be populated with the information on the last transformation.
- validate.js script - a WSH tool for validating a single XML document against an arbitrary XSD schema.
    - Single validation mode:
          validate.js input.xml input.xsd
      The script shows a message box that says whether input.xml conforms to input.xsd.
    - Batch mode:
          validate.js batch.file
      The batch file contains the list of input XML files and XSD schemas:
          <xml_file_1_name>,<xsd_file_1_name>
          <xml_file_2_name>,<xsd_file_2_name>
      The script creates, or appends to, the "validate.log" file with the details of the last validation.
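Both batch formats are plain comma-separated lines, so reading them comes down to splitting text. The sketch below shows one way to do it; it is not the code of transfrm.js or validate.js. The function name parseBatchList, the trimming of whitespace and the reporting of malformed lines are all my assumptions.

```javascript
// Hypothetical parser for the batch list formats described above.
// Each non-empty line is split on commas; lines with the wrong number of
// fields are reported instead of being silently skipped.
function parseBatchList(text, fieldCount)
{
    // Normalize line endings first, then split on plain "\n"
    var lines = text.replace(/\r\n?/g, "\n").split("\n");
    var jobs = [], errors = [];
    for (var i = 0; i < lines.length; i++)
    {
        var line = lines[i].replace(/^\s+|\s+$/g, "");
        if (line == "") continue;                 // ignore blank lines
        var fields = line.split(",");
        if (fields.length != fieldCount)
        {
            errors.push("line " + (i + 1) + ": expected " + fieldCount + " fields");
            continue;
        }
        for (var j = 0; j < fields.length; j++)
            fields[j] = fields[j].replace(/^\s+|\s+$/g, "");
        jobs.push(fields);
    }
    return { jobs: jobs, errors: errors };
}
```

A transformation list would be parsed with a fieldCount of 3, a validation list with a fieldCount of 2. In a WSH script, the text itself could be read with the FileSystemObject's OpenTextFile method and the ReadAll method of the returned stream.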
Security
Looking at the previous sections, one can guess: a script that writes arbitrary data to arbitrary files can be a big source of headache and security problems. Moreover, HTML applications are not subject to IE's security restrictions (see the appropriate introduction), so third-party (or just erroneous) scripts that use the FileSystemObject can be a major security threat.
This dictates two primary uses of the Scripting Runtime: "local" (non-web) utilities, and server-side scripting. As MSDN says, "because the use of the FSO on the client side raises serious security issues about providing potentially unwelcome access to a client's local file system, this documentation assumes the use of the FSO object model to create scripts executed by Internet Web pages on the server side. Since the server side is used, the Internet Explorer default security settings do not allow the client-side use of the FileSystemObject object. Overriding those defaults could subject a local computer to unwelcome access to the file system, which could result in total destruction of the file system's integrity, causing loss of data, or worse."
Sometimes, the only option is to turn stand-alone HTAs into corporate web pages (simply renaming the .hta file to .html and removing the HTA:APPLICATION tag), thus using the FSO on the client side. This raises problems with component licensing and execution permissions; furthermore, you must be sure that your intranet is extremely secure. In this case (i.e., if you're returning to the "ordinary web"), in order to protect yourself from unexpected behavior while still using the power of advanced scriptable objects (like Scripting.FileSystemObject or MSXML2.DOMDocument), please consider the following:
- Never allow non-secure and unsigned ActiveX components to run without your explicit approval. Set the "Initialize and script ActiveX controls not marked as safe" option in IE's Security tab to "prompt".
- Never allow Java components to be downloaded and run without your explicit approval. Set the Java Virtual Machine security level in IE's Security tab to "Medium" or "High".
- Scripts downloaded from home or corporate intranets are usually trustworthy; so you may wish to set the security level of "Local Intranet" to "Medium" or even "Low", while setting "Internet" security level to "High".
- Add the servers where your scripts reside to the "Trusted sites" zone.
Please note that these are just basic rules; you should probably consult IT professionals to bring your intranet security up to the appropriate level.
Alternatives
Other downsides of the "Xbrowser" are: it is strictly IE-bound, and it depends too much on external code libraries. This is the flipside of the power that the IE engine provides; however, dependencies can be reduced in a number of ways.
- XML/XSL/XPath processing:
XML for <SCRIPT> - a cross-platform, standards-compliant XML parser in JavaScript. Pros: W3C DOM (level 2)/SAX parsers included, together with an XPath processor. Cons: if you need a schema/DTD-aware parser, this is not your choice. (Almost) perfect for cross-browser work.
Sarissa - not a parser, but a JavaScript wrapper for native XML APIs. DOM operations, XSL transformations, and XPath queries can be performed; all popular browsers are supported. This can (also) help you build cross-browser XML solutions.
- I/O:
Unfortunately (or fortunately?), there is no standard alternative to the Scripting Runtime Object Library for file I/O. Other browsers (Mozilla Firefox, Opera, etc.) don't tolerate any deviations from the ECMAScript standard (Microsoft's JScript is one implementation of it), so you can be sure that no code is capable of tearing your file system apart.
Links
Common Dialog Controls
Scripting
Script Security
XML
Tools
For developing serious XML applications, you'll need something more than Notepad.
Officials speak
Formal technical specifications, written, as usual, in W3C's heavy language:
- XML - home of XML and XML-based technologies.
- XML Schema - general info on XML schemas; links to XSD/DTD helper utilities.
- XSL - home of the Extensible Stylesheet Language. Specs, software news and links, tutorials.
Articles & code samples
The Web has more papers on XML/XSLT than any ordinary human can read in a lifetime. Here are just a few links:
History
- November 10th, 2005 - initial release.
- March 15th, 2006 - serious CSS issue fixed, words on common controls licensing added.
- April 25th, 2006 - validation of an XML document against an XSD schema is now available; minor improvements and bug-fixes.
- May 15th, 2006 - minor optimizations and bug-fixes.
- May 25th, 2006 - transformation and validation tools rewritten.