Introduction
XML has much scope in transporting and persistence data to take advantages of XML capabilities in development. Generally for data synchronization XML is used to make data synchronization more verbose. In SQL server 2005 XQuery support in database engine introduced to make querying structured XML data. XQuery is generally work with XPath expressions with additional support of node Iteration, sorting and in constructing of desired XML instance for various purpose.
XQuery follows the XPath 1.0 expressions. The XPath 1.O contains general syntax to retrieve specific node information from structured XML instance.
XSLT is another language using for XML Transforms .XSLT also follows the XPath 1.0 rules to transforms data from XML data. Generally XSLT used for XML transformation to HTML. XSLT is good for sequential nodes but not work with sequence of values. It’s difficult to apply joins and functions in XSLT so in this manner XQuery wins because you can use functions and joins in XQuery with ease.
In SQL Server using Open XML is good in terms of data shredding for large XML DATA files but for small XML DATA files it’s more memory intensive than XQuery. We can apply joins and functions on XML data as per requirement. It’s providing better code manageability.
This article is just to show how XQuery work by explaining some scenarios:
There are various advantages of using XQuery instead of Open XML.
- Shorter XML syntax comparison of similar SQL syntax.
- Flexible query structure to get tree view and tabular result.
- Built in XQuery functions.
Overview: This article just for the beginners who want to start with XQuery in easy way .This demonstrates how you can query and manipulate xml data in SQL Server using XQuery. I am not going deep as its too big topic to understand technically, as in this article I just want to show how beginners can quick start with XQuery by giving some simple demonstrations.
Here we are going to understand some XQuery method by demonstrating some scenarios:
CREATE TABLE #TempXML(
[DocId] [int] IDENTITY(1,1) NOT NULL,
[Doc] [xml] NULL
)
CREATE TABLE #TempRegion(
[RegionId] [int] IDENTITY(1,1) NOT NULL,
[Region] [varchar](100) NULL
)
To start here creating two xml instance stored in sql temp table one without namespace and other with default namespace to understand how XQuery work with these types
Insert Into #TempXML Values ('<Persons>
<Person>
<Name>Demo1</Name>
<Salary>10000</Salary>
<Designation>Developer</Designation>
<RegionId>1</RegionId>
</Person>
<Person>
<Name>Demo2</Name>
<Salary>12000</Salary>
<Designation>Sr.
Developer</Designation>
<RegionId>2</RegionId>
</Person>
<Person>
<Name>Demo3</Name>
<Salary>13000</Salary>
<Designation>Sr.
Developer</Designation>
<RegionId>3</RegionId>
</Person>
<Person>
<Name>Demo4</Name>
<Salary>14000</Salary>
<Designation>Developer</Designation>
<RegionId>5</RegionId>
</Person>
</Persons>')
Insert Into #TempXML Values ('<Persons
xmlns="http://schemas.test.App">
<Person>
<Name>Demo1</Name>
<Salary>10000</Salary>
<Designation>Developer</Designation>
<RegionId>1</RegionId>
</Person>
<Person>
<Name>Demo2</Name>
<Salary>12000</Salary>
<Designation>Sr.
Developer</Designation>
<RegionId>2</RegionId>
</Person>
<Person>
<Name>Demo3</Name>
<Salary>13000</Salary>
<Designation>Sr.
Developer</Designation>
<RegionId>3</RegionId>
</Person>
<Person>
<Name>Demo4</Name>
<Salary>14000</Salary>
<Designation>Developer</Designation>
<RegionId>5</RegionId>
</Person>
</Persons>')
Insert
#TempRegion Values ('East')
Insert
#TempRegion Values ('West')
Insert
#TempRegion Values ('North')
Insert #TempRegion Values ('South')
query() Method:The query() method is for provided XML nodes and fragmented XML document by evaluating XPath expressions. Query method takes "FLOWR" expression as well to retrieve relative untyped XML instance result.
Querying Person(s) with salary >12000 from XML (without Namespace):
Select
Doc.query('/Persons/Person[Salary>12000]')
From #TempXML
WHERE DocId=1
Querying Person(s) with salary >12000 from XML instance where default namespace specified:
Here are two methods to deal with default namespace
;WITH
XMLNAMESPACES( default 'http://schemas.test.App' )
Select
Doc.query('/Persons/Person[Salary>12000]')
From #TempXML
WHERE DocId=2
Select
Doc.query('declare default element
namespace "http://schemas.test.App";
/Persons/Person[Salary>12000]')
From #TempXML
WHERE DocId=2
If XML Namespace is not Default instance and declare as
Xmlns:PER=http://schemas.test.App
By declaring XQuery Prolog
declare namespace PER="http://schemas.test.App";
- The XQuery prolog includes a namespace prefix (PER) declaration,
(Namespace PER="http://schemas.test.App"; - The declare namespace keyword defines a namespace prefix that is used later in the query body.
- /PER:Persons/PER:Person[Name="Demo"] is the query body.
Using FLOWR Expression
FLWOR is pronounced "flower". FLOWR stands for FOR , LET, ORDER BY, WHERE and RETURN. Most of these are optional only return clause is mandatory with one of FOR or LET clause at least.FlOWR expression also used for querying or iteration XML instance .Here are few example to understand how it’s work in SQL.
Querying for Person(s) with salary >=12000
Select
Doc.query('for $A in
/Persons/Person/Salary[.>="12000"]/..
return
$A ')
From #TempXML
WHERE DocId=1
Querying for Person(s) with
salary >=12000 (Use of WHERE and ORDER BY)
Select
Doc.query('for $A in /Persons/Person
where $A/Salary>=12000
order by $A/Name[1]
descending
return $A')
From #TempXML
WHERE DocId=1
Querying for Person(s) with salary <=12000 and adding 1200 to salary (Use of LET)
Select
Doc.query('for $T in /Persons/Person
let $L := 1200
where $T/Salary[1]<=12000
return
<Person
Name="{data($T/Name)}"
RestructuredSalary="{data($L)+($T/Salary/text())[1]}"/>')
FROM #TempXML
Where DocId=1
value() Method: Method is used for retrieving scalar text values from XML instance. Relevant XQuery expression which identifies singleton node and text value type returns the node text value. Node value returned cast to provided SQL type.
;WITH
XMLNAMESPACES( default 'http://schemas.test.App' )
SELECT
Name=x.value('(Name/text())[1]','varchar(100)'),
Salary=x.value('(Salary/text())[1]','varchar(100)'),
Designation=x.value('(Designation/text())[1]','varchar(100)')
From #TempXML
CROSS APPLY Doc.nodes('/Persons/Person') as e(x)
Where DocId=2
In below scenerio we are extracting Regions currently assigned to person(s) by applying JOIN between XML instance and user table #TempRegion
SELECT tR.Region
FROM #TempXML tX Left JOIN
#TempRegion tR
on
tX.Doc.exist('/Persons/Person/RegionId[.=sql:column("tR.RegionId")]') = 1
Where DocId=1
If you want to extract particular person(s) node by Zone then you can get this as below:
SELECT tX.Doc.query('/Persons/Person[RegionId=sql:column("tR.RegionId")]'),tR.Region
FROM #TempXML tX Left JOIN
#TempRegion tR
on
tX.Doc.exist('/Persons/Person/RegionId[.=sql:column("tR.RegionId")]') = 1
Where DocId=1
Using Functions in Xquery
Retrieving Person(s) count those salary >12000 (Use of count)
Select
PersonCount=Doc.value('count(/Persons/Person[Salary>12000])','int')
From #TempXML
WHERE DocId=1
Retrieving Average salary of Person(s) those salary >12000 (Use of avg)
Select
PersonCount=Doc.value('avg(/Persons/Person[Salary>12000]/Salary)','int')
From #TempXML
WHERE DocId=1
Xquery has various built in function listed below:
String Functions
- concat: Concatenate string instance for similar uses in SQL.
- contains: Searches the specified string.
- substring: Retrieves part of string from other string
- string-length: Determines the length of specified string
Aggregate Functions: min (), max (), and sum ()
Context Functions: last () and position ()
Manipulating Data in XML instance
To do the modification in XML nodes and their values, the XQuery XML query language conform an additional room known as the XML Data Modification Language(DML) by using insert, replace value of and delete in XQuery modify() method.
Inserting XML instance to Node of Person whose Name =Demo1
UPDATE #TempXML
SET
Doc.modify('replace value of
(/Persons/Person[DeploymentStatus[@IsDeployed="True"]]/Salary/text())[1]
with ("15000")')
WHERE DocId=1
Replace Salary node text with 15000 person name with Deployment Status value to True
UPDATE #TempXML
SET
Doc.modify('replace value of
(/Persons/Person[DeploymentStatus[@IsDeployed="True"]]/Salary/text())[1]
with ("15000")')
WHERE DocId=1
Deleting DeploymentStatus node from Person with name Demo1
UPDATE #TempXML
SET
Doc.modify('delete
(/Persons/Person[Name="Demo1"]/DeploymentStatus)')
WHERE DocId=1
Tip: Xquery To Insert Data into Master and refrenced Child table simultaneously:
There are n number of way to achieve this, This is a hint to insert data in master and refrenced child table using XQuery and Merge statement.
Create table #Master
(
Id int Identity(1,1),
MasterName Varchar(100)
)
Create Table #Intermediate
(MasterId int,
ChildData XML)
Create Table #Child
(
ChildId int identity(1,1),
MasterId int,
ChildName Varchar(100)
)
Declare @XML XML='<Record>
<MasterRecord>
<Master>Master1</Master>
<ChildRecord>
<Child>Child11</Child>
<Child>Child12</Child>
</ChildRecord>
</MasterRecord>
<MasterRecord>
<Master>Master2</Master>
<ChildRecord>
<Child>Child21</Child>
<Child>Child22</Child>
</ChildRecord>
</MasterRecord>
</Record>'
MERGE #Master
_MTR
USING (Select x.value('Master[1]','varchar(255)') AS masterName ,
x.query('ChildRecord/Child') AS ChildData
From
@XML.nodes('/Record/MasterRecord')
e(x)) AS _XML
ON 1=0
WHEN NOT MATCHED THEN
INSERT (MasterName)
VALUES(_XML.MasterName)
OUTPUT INSERTED.Id, _XML.ChildData
INTO
#Intermediate(MasterId,ChildData);
Insert Into #Child(MasterId,ChildName)
Select mas.Id, _data.value('(text())[1]', 'varchar(100)') as ChildName
from
#Intermediate intr Inner Join #Master Mas
On intr.MasterId=Mas.Id
CROSS APPLY ChildData.nodes('/Child') AS _chd(_data)
Select * from #Master
Select * from #Intermediate
select * from #Child