Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles / Languages / T-SQL

Cross Apply, Outer Apply some unique scenarios.

5.00/5 (1 vote)
21 Sep 2024CPOL5 min read 1.7K  
A deeper look at Cross Apply and Outer Apply
A deeper look at Cross Apply and Outer Apply trying to resolve a performance issue with my query which was challenging to get with the traditional joins. CROSS APPLY allows for more detailed control on individual rows, while INNER JOIN operates on sets of data. It becomes very useful when we need to perform row-wise operations or apply subqueries dynamically for each row.

Introduction

CROSS APPLY and OUTER APPLY can be used when working with calculated fields in complex/nested queries and make them simpler and more readable which could be challenging with traditional joins. Cross Apply works like a cursor, one record at a time not the set operations done by joins. In most of the scenarios APPLY performs as fast as a join, but in certain scenarios, it can be slow. In this article, I’ll discuss a few scenarios where we could use the APPLY clause.

Without Cross Apply and Outer Apply, we have limited options:

  • The CROSS APPLY statement behaves like to a correlated sub-query but allows us to use ORDER BY statements within the subquery. This is very useful when we want to extract the top record from a sub-query to use in an outer query.
  • SQL Server User Defined Functions tend to prevent parallel operations. It doesn't inline the function. As each cumulative update for SQL 2022 has been released more and more restrictions as to when inlining will occur have been introduced.
  • Intermediate tables - Usually works and is often a good option as they can be indexed and fast, but performance can also drop due to UPDATE statements not being parallel and not allowing to cascade or reuse results to update several fields within the same statement. Junction tables can work, but soon enough you're joining subqueries with many UNION ALL statements.
  • Nesting queries and Repeating code - You can put parenthesis on your query and use it as a subquery upon which you can manipulate source data and calculated fields alike. It can't include a COMPUTE or FOR BROWSE clause and can only include an ORDER BY clause when a TOP clause is also specified.
  • Values introduced through CROSS APPLY can be used to create one or multiple calculated fields without adding performance, complexity and thus, the CROSS APPLY is like an INNER JOIN, or, more precisely, like a CROSS JOIN with a correlated sub-query with an implicit join condition of 1=1.

Derived Values

This example shows how you can perform "procedural" calculations on the chain of derived values. JOIN subqueries cannot introduce new values in the dataset based on data in the parent query (it must stand on its own).

SELECT calc1.val1,
       ROW_NUMBER() OVER (PARTITION BY calc1.val1 ORDER BY calc5.price_gross) pos_number,
       calc1.price1,
       calc2.price2,
       calc3.price3,
       calc4.price4,
       calc5.price_gross
  FROM tbl t

CROSS APPLY (SELECT CASE t.col1 WHEN 1 THEN 'A' WHEN 2 THEN 'B' WHEN 3 THEN 'C' END val1, t.price * (100 - t.discount1) / 100 AS price1) as calc1
             CROSS APPLY (SELECT calc1.price1 * (100 - t.discount2) / 100 AS price2) as calc2
             CROSS APPLY (SELECT calc2.price2 * (100 - t.discount3) / 100 AS price3) as calc3
             CROSS APPLY (SELECT calc3.price3 * (100 - t.discount4) / 100 AS price4) as calc4
             CROSS APPLY (SELECT calc4.price4 * (100 + t.VAT) / 100 AS price_gross) as calc5

 INNER JOIN tbl2 t2  ON t2.val1 = calc1.val1
 ORDER BY calc1.val1

Top N Values

Another usage is let’s say you have two tables, Customer and Order. Customers have many Orders. If I want to create a view that gives me details about customers, and last 10 most recent order they've made. With just JOINS, this would require some self-joins and aggregation which isn't pretty. But with Cross Apply, its super easy and friendly on the eyes.

SELECT *
FROM Customer cust

CROSS APPLY (
  SELECT TOP 10 *
  FROM Order o
  WHERE o.CustomerId = cust.CustomerId
  ORDER BY OrderDate DESC
) T

DRY Principle

CROSS APPLY to prevent duplicate code, when I must use the same CASE in multiple parts of a query. Furthermore, it is nice to use multiple CROSS APPLY's when you have staggered calculations where the second value depends on the first result, the third is using the second resutld, the forth the third result and so on.

SELECT
  webTable.*,
  LEFT(noPrefix.myURL, domain_end.pos) AS domain
FROM webTable

CROSS APPLY
  (SELECT REPLACE(webTable.myURL, 'www.', '') AS myURL) AS [noPrefix]
CROSS APPLY
  (SELECT CHARINDEX('.', noPrefix.MyURL) AS pos) AS [suffix_start]
CROSS APPLY
  (SELECT CASE WHEN suffix_start.pos = 0 THEN LEN(noPrefix.myURL) ELSE suffix_start.pos - 1 END AS pos) AS [domain_end]

Using Table Valued Functions to Inner Joins

The cross apply is really a cross-product operator. The CROSS APPLY operator returns only those rows from left table expression if it matches with right table expression is not the whole story. It serves to function as inner and outer joins by chance. Cross apply sometimes enables you to do things that you cannot do with inner join.

The below query will throw a syntax error:

SELECT fn.* from sys.objects sobj 
INNER JOIN dbo.myTableFunc(sobj.name) fn
ON fn.schema_id = sobj.schema_id

This is a syntax error, because, when used with inner join, table functions can only take variables or constants as parameters. i.e. the table function parameter cannot depend on another table's column.

However we can refer to the table alias column name using a CROSS Apply

SELECT fn.* from sys.objects sobj 
CROSS APPLY dbo.myTableFunc(sobj.name) fn 
WHERE fn.schema_id= sobj.schema_id

XML Fields

CROSS APPLY has its obvious usage in allowing a set to depend on another (unlike the JOIN operator), but that doesn't come without a cost: it behaves like a function that operates over each member of the left set, so, in SQL Server terms it always perform a Loop Join, which rarely is the best way to join sets. So, use APPLY when you need to, but don't overuse it against JOIN. Below is a simple XML snippet for Orders, we use the CROSS Apply to join the OrderDetail to Order xml .

DECLARE @x1 xml =
'<Orders>
    <Order>
       <OrderID>13000</OrderID>
       <CustomerID>ALFKI</CustomerID>
       <OrderDate>2006-09-20Z</OrderDate>
       <EmployeeID>2</EmployeeID>
    </Order>
    <Order>
       <OrderID>13001</OrderID>
       <CustomerID>VINET</CustomerID>
       <OrderDate>2006-09-20Z</OrderDate>
       <EmployeeID>1</EmployeeID>
    </Order>
</Orders>'


DECLARE @x2 xml =
'<Orders>
      <OrderDetails>
           <OrderID>13000</OrderID>
           <ProductID>76</ProductID>
           <Price>123</Price>
           <Qty>10</Qty>
       </OrderDetails>
       <OrderDetails>
           <OrderID>13000</OrderID>
           <ProductID>16</ProductID>
           <Price>3.23</Price>
           <Qty>20</Qty>
       </OrderDetails>
</Orders>'

And the query to join these two xmls

SELECT a.OrderID,c.ProductID,c.Price,c.Qty FROM  
 (SELECT OrderID    = O.n.value('(OrderID/text())[1]',   'int'), 
        CustomerID = O.n.value('(CustomerID/text())[1]', 'nchar(5)'), 
        OrderDate  = O.n.value('(OrderDate/text())[1]',  'datetime'), 
        EmployeeId = O.n.value('(EmployeeID/text())[1]', 'smallint') 
 FROM   @x1.nodes('/Orders/Order') AS O(n)) a 

CROSS APPLY 
( 
       SELECT * FROM ( 
          SELECT OrderID = D.n.value('(OrderID/text())[1]', 'int'), 
                 ProductID  = D.n.value('(ProductID/text())[1]', 'int'), 
                 Price = D.n.value('(Price/text())[1]', 'decimal(10,2)'), 
                 Qty = D.n.value('(Qty/text())[1]', 'int') 
           FROM   @x2.nodes('/Orders/OrderDetails') AS D(n)) b 
       WHERE a.OrderID=b.OrderID 
) c

Replacement for UnPivot

APPLY can be used as a replacement for UNPIVOT. Either CROSS APPLY or OUTER APPLY can be used here, which are interchangeable.

Consider you have the below table.

CREATE TABLE tblOrders (OrderId int identity, DVD int, HardDisk int, RAM int)

INSERT INTO tblOrders
SELECT 1, 3, 3
UNION ALL
SELECT 2, 5, 4
UNION ALL
SELECT 1, 3, 10

/******************************************************************/

SELECT DISTINCT OrderId, ProductName, ProductQty
FROM tblOrders
CROSS APPLY(VALUES ('DVD', DVD),('HardDisk', HardDisk), ('RAM', RAM))
COLUMNNAMES(ProductName, ProductQty)

Result of the Pivoting operation with Cross Apply.

OrderId     ProductName ProductQty
----------- ----------- -----------
1           DVD         1
1           HardDisk    3
1           RAM         3
2           DVD         2
2           HardDisk    5
2           RAM         4
3           DVD         1
3           HardDisk    3
3           RAM         10

 

Performance Common Table Expression vs Cross Apply

Whether you use APPLY, IN, or EXISTS, the query optimizer does not take the query text literally. It transforms your code into logic operations, and optimizes from there, using a wide variety of transformations. It is possible to write the same logical requirement in many ways, and often they will produce the same or trivially different execution plan. Identical execution plans produce identical results all things being equal.

Using two sub queries

SELECT
  (
      SELECT MAX(to1.Qty1) as mQty1 
      FROM tblOrder to1 
      WHERE to1.OrderId = orderDetail.Id
  ),
 
  (
      SELECT MAX(to2.Qty2) as mQty2 
      FROM tblOrder to2 
      WHERE to2.OrderId = orderDetail.Id
  )
      FROM tblDetails d

Using Cross Apply for the same operation

SELECT t.mQty1, t.mQty2
FROM tblDetails d
CROSS APPLY 
(
    SELECT MAX(to1.Qty1) as mQty1, MAX(to1.Qty2) as mQty2
    FROM tblOrder to1 
    WHERE to1.OrderId = orderDetail.Id
) as t

Through the Apply operator cross or outer you will be able to write a query that return the same results as the Query that use the sub-Query. But what are the differences between these two ways to write the query? You will have just noticed that in the first Query it was necessary to use two sub-queries while using the cross apply it is possible to read data from the tblOrder table only once.
The pros is in cross apply it actually reads half of the pages of the tblOrder table. But the con is SQL Server create a temporary table in the tempdb which contains the data returned from Cross Apply query. Using tempdb may affect performance. However, if the number of subqueries increased, the performance would most likely change in favor of cross apply.

 

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)