Home > MS SQL SERVER 2005, Optimization Performance, SQL Tips, XML > Using CROSS APPLY & OUTER APPLY operators with UDFs, Derived tables & XML data

Using CROSS APPLY & OUTER APPLY operators with UDFs, Derived tables & XML data


In my previous posts CROSS APPLY vs OUTER APPLY I discussed on the same topic, but didn’t covered much on examples. Here we will see more examples on using APPLY operators, their workarounds and performance.

>> Using APPLY operator with User Define Functions:

USE [AdventureWorks]
GO

-- Create an "Inline Table-valued Function" that returns Sales Details of a Contact (Customer):
CREATE FUNCTION [dbo].[ufnGetSalesInfo](@ContactID int)
	RETURNS TABLE 
AS 
RETURN (
	SELECT SalesOrderID, OrderDate, PurchaseOrderNumber, AccountNumber, CustomerID, TotalDue
	FROM Sales.SalesOrderHeader
	WHERE ContactID = @ContactID
)
GO

-- We can use this function with SELECT FROM statement like we do for a table:
SELECT * FROM [dbo].[ufnGetSalesInfo](100)

Here, we cannot provide multiple IDs or any variable to make it work for multiple or dynamic Contacts. To return Sales details for multiple Customers we need to JOIN it with Contacts table, but JOINS are now allowed with UFDs, they only work with Tables/Views.

So to JOIN a UDF with a TABLE we need an APPLY operator, check examples below:

-- CROSS APPLY acts as INNER JOIN & will return only matching rows, returns 513 rows:
SELECT 
	con.ContactID, con.FirstName, con.LastName, con.EmailAddress, con.Phone,
	x.*
FROM Person.Contact con
CROSS APPLY [dbo].[ufnGetSalesInfo](con.ContactID) x
WHERE con.ContactID <= 100

-- OUTER APPLY acts as OUTER LEFT JOIN & will return all matching & non-matching rows, returns 529 rows:
SELECT 
	con.ContactID, con.FirstName, con.LastName, con.EmailAddress, con.Phone,
	x.*
FROM Person.Contact con
OUTER APPLY [dbo].[ufnGetSalesInfo](con.ContactID) x
WHERE con.ContactID <= 100


-- Final Cleanup
DROP FUNCTION [dbo].[ufnGetSalesInfo]
GO

>> Using APPLY operator with Derived tables:

[Scenario]: Pull upto 5 Sales Orders for each Contact (Customer):

This is not possible with JOINS, because the condition to SELECT max 5 Orders is applied after the JOIN is made, we need to add the condition before JOIN. Using APPLY operator we can do this, let’s see how:

-- With Derived Table:
SELECT 
	con.ContactID, con.FirstName, con.LastName, con.EmailAddress, con.Phone,
	x.*
FROM Person.Contact con
CROSS APPLY (SELECT TOP 5 
				SalesOrderID, OrderDate, PurchaseOrderNumber, AccountNumber, CustomerID, TotalDue 
			 FROM Sales.SalesOrderHeader soh
			 WHERE soh.ContactID = con.ContactID) x
WHERE con.ContactID <= 100

-- We can also write the above query by using ROW_NUMBER() function & JOINs instead of the APPLY operator:
;WITH CTE AS (
	SELECT 
		SalesOrderID, OrderDate, PurchaseOrderNumber, AccountNumber, ContactID, TotalDue,
		ROW_NUMBER() OVER(PARTITION BY ContactID ORDER BY ContactID, SalesOrderID) AS ROWNUM
		FROM Sales.SalesOrderHeader)
SELECT 
	con.ContactID, con.FirstName, con.LastName, con.EmailAddress, con.Phone,
	CTE.*
FROM Person.Contact con
JOIN CTE cte
ON cte.ContactID = con.ContactID
WHERE con.ContactID <= 100
AND cte.ROWNUM <= 5
GO

Both of these queries do the same task. So to check which one is more optimized we can check the cost of each query in the batch. So the cost of query using APPLY operator is 84% and query using ROW_NUMBER & JOIN is 16%. Hence the second approach is much more performant & should be considered.

>> Using APPLY operator with XML data:

APPLY operators can also we used with XML data when there is a need to query parent/child nodes, check my previous blog post on this topic.

About these ads
  1. Brian H. Lee
    December 1, 2012 at 12:51 pm

    Is it true that both of the queries that you explained in the end do the exact same task? I was not able to test both of the queries since I do not have the Person.Contact table. It seems to me that, in the first query, “SELECT TOP 5″ selects 5 random “SalesOrderID, OrderDate, …” while, in the second query, “SELECT con.ContactID, con.FirstName, …, ROW_NUMBER() OVER(PARTITION BY ContactID ORDER BY ContactID, SalesOrderID)” selects the SalesOrderID’s in ascending order for each ContactID. Then, the next SELECT statement chooses the ContactID’s on the condition that the number of rows for each ContactID must be limited to 5. I believe that the first query does not care if more than 5 rows in the Sales.SalesOrderHeader table matches a row in the Person.Contact table based on their ContactID’s, but the second query does.

  1. November 19, 2012 at 12:37 am

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

Join 389 other followers

%d bloggers like this: