Introduction
Currently, MySQL only provides very basic read functionality, ExtractValue()
. This only returns a concatenated string
of node values without a delimiter and has no capability to return the actual child node tree of the target node. Therefore, I created several UDFs for reading XML for MySQL.
MyXml_InnerXml
- Retrieve inner XML from an XML fragment
- This will also return an attribute value if that is the XPath target
MyXml_OuterXml
- Retrieve full outer XML from an XML fragment
- This will also return an attribute value if that is the XPath target
MyXml_XQuery
- This currently only supports the XQuery count as this was what I needed at the time
Background
When working with stored procedures in MySQL, there is only one option for reading XML, ExtractValue()
.
MySQL ExtractValue()
function is limited at best as it only reads node values and does not return XML children nodes. After much searching, I could not find any UDF that would return XML fragments as do InnerXml
and OuterXml
. In addition, there is no way to count nodes of an XPath query for looping purposes.
Using the Code
Depending on your version of MySQL, external DLLs for UDFs either go in lib/plugins or the bin directory. Place MyXml_Functions.dll there and libxml2.dll in the bin directory and then restart MySQL.
To figure out where to place MyXml_Functions.dll, run the following in SQLYog.
SHOW VARIABLES LIKE 'plugin_dir'
To Create/Add the UDFs to MySQL:
CREATE FUNCTION MyXml_InnerXml RETURNS STRING SONAME 'MyXml_Functions.dll';
CREATE FUNCTION MyXml_OuterXml RETURNS STRING SONAME 'MyXml_Functions.dll';
CREATE FUNCTION MyXml_XQuery RETURNS STRING SONAME 'MyXml_Functions.dll';
To remove the UDFs from MySQL:
DROP FUNCTION MyXml_InnerXml;
DROP FUNCTION MyXml_OuterXml;
DROP FUNCTION MyXml_XQuery;
Validate that the UDFs loaded properly, try any/all of the following examples:
SELECT MyXml_InnerXml( "<Children><Child Age='7'>Nicole</Child>_
<Child Age='4'>Aaron</Child></Children>", "//Children");
SELECT MyXml_OuterXml( "<Children><Child Age='7'>Nicole</Child>_
<Child Age='4'>Aaron</Child></Children>", "//Children");
SELECT MyXml_XQuery( "<Children><Child Age='7'>Nicole</Child>_
<Child Age='4'>Aaron</Child></Children>", "count(//Children/node())");
SELECT MyXml_XQuery( "<Children><Child Age='7'>Nicole</Child>_
<Child Age='4'>Aaron</Child></Children>", "count(//Children/Child/node())");
SELECT MyXml_InnerXml( "<Children><Child Age='7'>Nicole</Child>_
<Child Age='4'>Aaron</Child></Children>", "//Children/Child");
SELECT MyXml_InnerXml( "<Children><Child Age='7'>Nicole</Child>_
<Child Age='4'>Aaron</Child></Children>", "//Children/Child/@Age");
SELECT MyXml_InnerXml( "<Children><Child Age='7'>Nicole</Child>_
<Child Age='4'>Aaron</Child></Children>", "//Children/Child[1]");
SELECT MyXml_InnerXml( "<Children><Child Age='7'>Nicole</Child>_
<Child Age='4'>Aaron</Child></Children>", "//Children/Child[2]");
SELECT MyXml_OuterXml( "<Children><Child Age='7'>Nicole</Child>_
<Child Age='4'>Aaron</Child></Children>", "//Children/Child");
SELECT MyXml_OuterXml( "<Children><Child Age='7'>Nicole</Child>_
<Child Age='4'>Aaron</Child></Children>", "//Children/Child[2]/@Age");
SELECT MyXml_OuterXml( "<Children><Child Age='7'>Nicole</Child>_
<Child Age='4'>Aaron</Child></Children>", "//Children/Child[1]");
SELECT MyXml_OuterXml( "<Children><Child Age='7'>Nicole</Child>_
<Child Age='4'>Aaron</Child></Children>", "//Children/Child[2]");
Some error handling examples that I used for testing this are:
SELECT MyXml_InnerXml( "<Children><Child Age='7'>Nicole</Child>_
<Child Age='4'>Aaron</Child></Children>", NULL);
SELECT MyXml_InnerXml( NULL, "//Children/Child[2]/@Age");
SELECT MyXml_InnerXml( NULL, NULL);
SELECT MyXml_OuterXml( "<Children><Child Age='7'>Nicole</Child>_
<Child Age='4'>Aaron</Child></Children>", NULL);
SELECT MyXml_OuterXml( NULL, "//Children/Child[2]/@Age");
SELECT MyXml_OuterXml( NULL, NULL);
SELECT MyXml_InnerXml( 1, "<Children><Child Age='7'>Nicole</Child>_
<Child Age='4'>Aaron</Child></Children>");
SELECT MyXml_InnerXml( 2, 1);
SELECT MyXml_InnerXml( "//Children/Child[2]/@Age", 1);
SELECT MyXml_OuterXml( 1, "<Children><Child Age='7'>Nicole</Child>_
<Child Age='4'>Aaron</Child></Children>");
SELECT MyXml_OuterXml( 2, 1);
SELECT MyXml_OuterXml( "//Children/Child[2]/@Age", 1);
Traversing an XML Fragment
The following code is a stored procedure I wrote that walks an XML fragment and theoretically acts on all or the nth node inline. This example simply returns a CSV string from the XPath "//Children/Child". This stored procedure can easily be modified to take an XPath
parameter. I use hundreds of functions similar to this for doing CRUD operations using XML fragments and ifnull()
constructs.
DELIMITER $$
DROP PROCEDURE IF EXISTS `MyXml_XmlWalker`$$
CREATE PROCEDURE `MyXml_XmlWalker`(IN testXml TEXT, IN verbose BIT, _
OUT children VARCHAR(1024))
BEGIN
DECLARE _child_age INT(11);
DECLARE _child_name VARCHAR(100);
DECLARE v_row_count INT UNSIGNED;
DECLARE _column_id INT(11) DEFAULT 1;
IF (verbose = 1) THEN
SELECT 'MyXml_XmlWalker', testXml;
END IF;
SET children = NULL;
IF (testXml IS NOT NULL) THEN
SET v_row_count := MyXml_XQuery_
( testXml, 'count(//Children/Child/node())');
IF (verbose = 1) THEN
SELECT 'MyXml_XmlWalker', v_row_count;
END IF;
WHILE _column_id <= v_row_count DO
SELECT MyXml_InnerXml_
( testXml, CONCAT('//Children/Child[', _column_id, ']'))
INTO _child_name;
IF (verbose = 1) THEN
SELECT 'MyXml_XmlWalker', _child_name;
END IF;
IF (_child_name IS NOT NULL) THEN
IF (children IS NULL) THEN
SET children = _child_name;
ELSE
SET children = CONCAT_
( children, ', ', _child_name);
END IF;
END IF;
SET _column_id = _column_id + 1;
END WHILE;
END IF;
END$$
DELIMITER ;
Points of Interest
These UDFs rely on libXml2
and have been built against libxml2-2.4.12
. I have included a stripped down version (to make the download smaller and still allow for the VS2008 project to build).
Though I have tried my best, there is a multi-threading issue that I was not able to figure out. When I run these functions from two SQLYog sessions under heavy load, MySQL sometimes crashes. I found a note in the libXml2
library that says its XPath
support should be thread safe. Operative word "should". Apparently, it is not as robust as it could be. If anyone can figure out what is up with that and let me know, I would be highly appreciative.
I have used some example code from libXml2
source. I left in header definitions and whatnot from the libXml2
examples I used to create this, so that theoretically the code for these UDFs should compile under Linux. If anyone would be so kind as to take the source code and do so, that would be much appreciated.
History
- April 7, 2010: Initial submission