Joins and subqueries are both used to query data from different tables and may even share the same query plan, but there are many differences between them. Knowing the differences and when to use either a join or subquery to search data from one or more tables is key to mastering SQL.
All the examples for this lesson are based on Microsoft SQL Server Management Studio and the AdventureWorks2012 database. You can get started with these free tools using my guide, Getting Started Using SQL Server.
Joins versus Subqueries
Joins and subqueries are both used to combine data from different tables into a single result. They share many similarities and differences.
Subqueries can be used to return either a scalar (single) value or a row set, whereas joins are used to return rows.
A common use for a subquery is to calculate a summary value for use in a query. For instance, we can use a subquery to help us obtain all products whose list price is greater than the average list price.
SELECT ProductID,
       Name,
       ListPrice,
       (SELECT AVG(ListPrice)
        FROM Production.Product) AS AvgListPrice
FROM Production.Product
WHERE ListPrice > (SELECT AVG(ListPrice)
                   FROM Production.Product)
There are two subqueries in this SELECT statement. The first displays the average list price of all products; the second filters out products whose list price is less than or equal to the average.
In the WHERE clause, the subquery returns a single value, which is then used to filter out products.
Notice how the subqueries are queries unto themselves. In this example, you could paste the subquery, without the parentheses, into a query window and run it.
Contrast this with a join, whose main purpose is to combine rows from one or more tables based on a match condition. For example, we can use a join to display product names and models.
SELECT Product.Name,
       ProductModel.Name AS ModelName
FROM Production.Product
INNER JOIN Production.ProductModel
        ON Product.ProductModelID = ProductModel.ProductModelID
In this statement, we’re using an INNER JOIN to match rows from both the Product and ProductModel tables. Notice that the column ProductModel.Name is available for use throughout the query.
The combined row set is then available to the SELECT statement, which can display, filter, or group by its columns.
This is different from the subquery, which returns a result that is immediately used.
Note that the join is an integral part of the SELECT statement. It cannot stand on its own as a subquery can.
A subquery is used to run a separate query from within the main query. In many cases, the returned value is displayed as a column or used in a filter condition such as a WHERE or HAVING clause. When a subquery incorporates a column from the main query, it is said to be correlated. In this way, a subquery is somewhat like a join in that values from two or more tables can be compared.
My article Introduction to Subqueries in the SELECT Statement provides a good explanation of correlated subqueries.
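As a rough illustration of a correlated subquery, the sketch below lists products priced above the average for their own subcategory. It assumes the ProductSubcategoryID column of Production.Product in AdventureWorks2012; the aliases P and P2 are illustrative.

```sql
-- Correlated subquery: the inner query refers to P.ProductSubcategoryID,
-- a column from the outer query, so it is evaluated per outer row.
SELECT P.ProductID,
       P.Name,
       P.ListPrice
FROM Production.Product AS P
WHERE P.ListPrice > (SELECT AVG(P2.ListPrice)
                     FROM Production.Product AS P2
                     WHERE P2.ProductSubcategoryID = P.ProductSubcategoryID);
```

Unlike the earlier average-price subquery, this inner query cannot be pasted into a query window and run by itself, because it depends on the outer alias P.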
Joins are used in the FROM clause of the SELECT statement; however, you’ll find subqueries used in most clauses, such as the:
- SELECT list – here a subquery is used to return single values
- WHERE clause – depending on the conditional operator, you’ll see single value or row based subqueries
- FROM clause – it is typical to see row based result subqueries used here
- HAVING clause – in my experience, scalar (single value) subqueries are used here
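The SELECT list and WHERE clause cases appear in the first example above. For the other two clauses, here is a minimal sketch, assuming the same Production.Product table; the alias S and the threshold of 100 are illustrative only.

```sql
-- FROM clause: a row based subquery (derived table) supplies the rows
-- that the outer query then filters.
SELECT S.ProductSubcategoryID,
       S.AvgListPrice
FROM (SELECT ProductSubcategoryID,
             AVG(ListPrice) AS AvgListPrice
      FROM Production.Product
      GROUP BY ProductSubcategoryID) AS S
WHERE S.AvgListPrice > 100;

-- HAVING clause: a scalar subquery is compared against each group's aggregate.
SELECT ProductSubcategoryID,
       AVG(ListPrice) AS AvgListPrice
FROM Production.Product
GROUP BY ProductSubcategoryID
HAVING AVG(ListPrice) > (SELECT AVG(ListPrice)
                         FROM Production.Product);
```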
Though joins and subqueries have many differences, they can be used to solve similar problems. In fact, just because you write a SQL statement as a subquery doesn’t mean the DBMS executes it as such.
Let’s look at an example.
Suppose the Sales Manager for Adventure Works wants a detailed listing of all sales orders and the number of order detail lines for each order.
Surprisingly, there are two ways to go about solving this: we can use a join or a subquery.
Here are the two statements side by side:
Side-by-Side Comparison of Join and Subquery
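One way the two statements might be written is sketched here, assuming the Sales.SalesOrderHeader and Sales.SalesOrderDetail tables of AdventureWorks2012. The selected columns and the aliases SOH, SOD, and LineCount are illustrative and may not match the statements in the comparison exactly.

```sql
-- Join version: match each order header to its detail lines, then count per order.
SELECT SOH.SalesOrderID,
       SOH.OrderDate,
       COUNT(SOD.SalesOrderDetailID) AS LineCount
FROM Sales.SalesOrderHeader AS SOH
INNER JOIN Sales.SalesOrderDetail AS SOD
        ON SOD.SalesOrderID = SOH.SalesOrderID
GROUP BY SOH.SalesOrderID,
         SOH.OrderDate;

-- Subquery version: a correlated scalar subquery counts the lines for each order.
SELECT SOH.SalesOrderID,
       SOH.OrderDate,
       (SELECT COUNT(*)
        FROM Sales.SalesOrderDetail AS SOD
        WHERE SOD.SalesOrderID = SOH.SalesOrderID) AS LineCount
FROM Sales.SalesOrderHeader AS SOH;
```

The two forms return the same rows only when every order has at least one detail line: the INNER JOIN drops an order with no lines, while the subquery version would report it with a count of zero.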
Obviously they look different, but did you know they have very similar query plans?
Here is the query plan for a subquery:
Subquery Query Plan
If you look closely, you’ll see there is a Merge Join operation. The subquery is being translated into the same set of operations used for the join. In fact, if you look at the corresponding join’s query plan, you’ll see it is very similar. You can get more detail about this in my article What is a Query Plan.
Subqueries and joins can be confusing, but they don’t have to be that way. I have put together a really great series of videos explaining subqueries and their mysteries.
The post What is the Difference between a Join and Subquery? appeared first on Essential SQL.