Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles / database / MySQL

MySQL XML User Defined Functions

0.00/5 (No votes)
7 Apr 2010CPOL3 min read 1   658  
Support for Reading XML in MySQL

Introduction

Currently, MySQL only provides very basic read functionality, ExtractValue(). This only returns a concatenated string of node values without a delimiter and has no capability to return the actual child node tree of the target node. Therefore, I created several UDFs for reading XML for MySQL.

  • MyXml_InnerXml
    • Retrieve inner XML from an XML fragment
    • This will also return an attribute value if that is the XPath target
  • MyXml_OuterXml
    • Retrieve full outer XML from an XML fragment
    • This will also return an attribute value if that is the XPath target
  • MyXml_XQuery
    • This currently only supports the XQuery count as this was what I needed at the time

Background

When working with stored procedures in MySQL, there is only one option for reading XML, ExtractValue().

MySQL ExtractValue() function is limited at best as it only reads node values and does not return XML children nodes. After much searching, I could not find any UDF that would return XML fragments as do InnerXml and OuterXml. In addition, there is no way to count nodes of an XPath query for looping purposes.

Using the Code

Depending on your version of MySQL, external DLLs for UDFs either go in lib/plugins or the bin directory. Place MyXml_Functions.dll there and libxml2.dll in the bin directory and then restart MySQL.

To figure out where to place MyXml_Functions.dll, run the following in SQLYog.

SQL
SHOW VARIABLES LIKE 'plugin_dir'

To Create/Add the UDFs to MySQL:

SQL
CREATE FUNCTION MyXml_InnerXml RETURNS STRING SONAME 'MyXml_Functions.dll';
CREATE FUNCTION MyXml_OuterXml RETURNS STRING SONAME 'MyXml_Functions.dll';
CREATE FUNCTION MyXml_XQuery   RETURNS STRING SONAME 'MyXml_Functions.dll';

To remove the UDFs from MySQL:

SQL
DROP FUNCTION MyXml_InnerXml;
DROP FUNCTION MyXml_OuterXml;
DROP FUNCTION MyXml_XQuery;

Validate that the UDFs loaded properly, try any/all of the following examples:

SQL
SELECT MyXml_InnerXml( "<Children><Child Age='7'>Nicole</Child>_
	<Child Age='4'>Aaron</Child></Children>", "//Children");
SELECT MyXml_OuterXml( "<Children><Child Age='7'>Nicole</Child>_
	<Child Age='4'>Aaron</Child></Children>", "//Children");
SELECT MyXml_XQuery(   "<Children><Child Age='7'>Nicole</Child>_
	<Child Age='4'>Aaron</Child></Children>", "count(//Children/node())");
SELECT MyXml_XQuery(   "<Children><Child Age='7'>Nicole</Child>_
	<Child Age='4'>Aaron</Child></Children>", "count(//Children/Child/node())");

SELECT MyXml_InnerXml( "<Children><Child Age='7'>Nicole</Child>_
	<Child Age='4'>Aaron</Child></Children>", "//Children/Child");
SELECT MyXml_InnerXml( "<Children><Child Age='7'>Nicole</Child>_
	<Child Age='4'>Aaron</Child></Children>", "//Children/Child/@Age");
SELECT MyXml_InnerXml( "<Children><Child Age='7'>Nicole</Child>_
	<Child Age='4'>Aaron</Child></Children>", "//Children/Child[1]");
SELECT MyXml_InnerXml( "<Children><Child Age='7'>Nicole</Child>_
	<Child Age='4'>Aaron</Child></Children>", "//Children/Child[2]");

SELECT MyXml_OuterXml( "<Children><Child Age='7'>Nicole</Child>_
	<Child Age='4'>Aaron</Child></Children>", "//Children/Child");
SELECT MyXml_OuterXml( "<Children><Child Age='7'>Nicole</Child>_
	<Child Age='4'>Aaron</Child></Children>", "//Children/Child[2]/@Age");
SELECT MyXml_OuterXml( "<Children><Child Age='7'>Nicole</Child>_
	<Child Age='4'>Aaron</Child></Children>", "//Children/Child[1]");
SELECT MyXml_OuterXml( "<Children><Child Age='7'>Nicole</Child>_
	<Child Age='4'>Aaron</Child></Children>", "//Children/Child[2]"); 

Some error handling examples that I used for testing this are:

SQL
SELECT MyXml_InnerXml( "<Children><Child Age='7'>Nicole</Child>_
	<Child Age='4'>Aaron</Child></Children>", NULL);
SELECT MyXml_InnerXml( NULL, "//Children/Child[2]/@Age");
SELECT MyXml_InnerXml( NULL, NULL);

SELECT MyXml_OuterXml( "<Children><Child Age='7'>Nicole</Child>_
	<Child Age='4'>Aaron</Child></Children>", NULL);
SELECT MyXml_OuterXml( NULL, "//Children/Child[2]/@Age");
SELECT MyXml_OuterXml( NULL, NULL);

SELECT MyXml_InnerXml( 1, "<Children><Child Age='7'>Nicole</Child>_
	<Child Age='4'>Aaron</Child></Children>");
SELECT MyXml_InnerXml( 2, 1);
SELECT MyXml_InnerXml( "//Children/Child[2]/@Age", 1);

SELECT MyXml_OuterXml( 1, "<Children><Child Age='7'>Nicole</Child>_
	<Child Age='4'>Aaron</Child></Children>");
SELECT MyXml_OuterXml( 2, 1);
SELECT MyXml_OuterXml( "//Children/Child[2]/@Age", 1);

Traversing an XML Fragment

The following code is a stored procedure I wrote that walks an XML fragment and theoretically acts on all or the nth node inline. This example simply returns a CSV string from the XPath "//Children/Child". This stored procedure can easily be modified to take an XPath parameter. I use hundreds of functions similar to this for doing CRUD operations using XML fragments and ifnull() constructs.

SQL
DELIMITER $$

DROP PROCEDURE IF EXISTS `MyXml_XmlWalker`$$

CREATE PROCEDURE `MyXml_XmlWalker`(IN testXml TEXT, IN verbose BIT, _
	OUT children VARCHAR(1024))
BEGIN
/*
	SET @testXml = CONCAT(
		"<Children><Child Age='7'>Nicole</Child>_
			<Child Age='4'>Aaron</Child></Children>"
	);
		
	CALL MyXml_XmlWalker( @testXml, 1, @children); select @children;
*/
   	DECLARE _child_age 	INT(11);
	DECLARE _child_name	VARCHAR(100);
	
   	DECLARE v_row_count 	INT UNSIGNED;
    	DECLARE _column_id 	INT(11) DEFAULT 1;
	
    	IF (verbose = 1) THEN
		SELECT 'MyXml_XmlWalker', testXml;
     	END IF;
     	
     	SET children = NULL;
     	
     	IF (testXml IS NOT NULL) THEN

		-- Get child count
		-- 
		SET v_row_count := MyXml_XQuery_
			( testXml, 'count(//Children/Child/node())');
		IF (verbose = 1) THEN
			SELECT 'MyXml_XmlWalker', v_row_count;
		END IF;
		
		-- Extract child definitions
		-- 
		WHILE _column_id <= v_row_count DO
	 
			SELECT MyXml_InnerXml_
			( testXml, CONCAT('//Children/Child[', _column_id, ']')) 
			INTO _child_name;
	
			IF (verbose = 1) THEN
				SELECT 'MyXml_XmlWalker', _child_name;
			END IF;
					
			IF (_child_name IS NOT NULL) THEN
				
				IF (children IS NULL) THEN
					SET children = _child_name;
				ELSE
					SET children = CONCAT_
					( children, ', ', _child_name);
				END IF;
				
			END IF;
			
			SET _column_id = _column_id + 1;
				
		END WHILE;	
	
	END IF;	

    END$$

DELIMITER ;

Points of Interest

These UDFs rely on libXml2 and have been built against libxml2-2.4.12. I have included a stripped down version (to make the download smaller and still allow for the VS2008 project to build).

Though I have tried my best, there is a multi-threading issue that I was not able to figure out. When I run these functions from two SQLYog sessions under heavy load, MySQL sometimes crashes. I found a note in the libXml2 library that says its XPath support should be thread safe. Operative word "should". Apparently, it is not as robust as it could be. If anyone can figure out what is up with that and let me know, I would be highly appreciative.

I have used some example code from libXml2 source. I left in header definitions and whatnot from the libXml2 examples I used to create this, so that theoretically the code for these UDFs should compile under Linux. If anyone would be so kind as to take the source code and do so, that would be much appreciated.

History

  • April 7, 2010: Initial submission

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)