This is the fifth in a series of articles about subqueries. In this article, we discuss subqueries in the HAVING
clause. Other articles discuss their uses in other clauses.
All the examples for this lesson are based on Microsoft SQL Server Management Studio and the AdventureWorks2012
database.
Using Subqueries in the HAVING Clause
You can use sub queries in the HAVING
clause to filter out groups of records. Just as the WHERE
clause is used to filter rows of records, the HAVING
clause is used to filter groups. Because of this, it becomes very useful in filtering on aggregate values such as averages, summations, and count.
The power of using a subquery in the HAVING
clause is now you don’t have to hard-code values within the comparisons. You can rely on the subquery’s results to do so for you.
For example, it is now possible to compare the average of a group to the overall average. We’ve always been able to use the average of the group in the HAVING
clause, but we had no way to compute the overall average. Now, using subqueries, this is possible.
In this example, we’re selecting employee job titles having remaining vacation hours greater than the overall average for all employees.
Here is the query written without the subquery:
SELECT JobTitle,
AVG(VacationHours) AS AverageVacationHours
FROM HumanResources.Employee
GROUP BY JobTitle
HAVING AVG(VacationHours) > <span style="color: rgb(255, 0, 0);">50</span>
I’ve highlighted the value in red that will be replaced by a subquery.
Now here is the complete statement including the subquery:
SELECT JobTitle,
AVG(VacationHours) AS AverageVacationHours
FROM HumanResources.Employee
GROUP BY JobTitle
HAVING AVG(VacationHours) > <span style="color: rgb(255, 0, 0);">(SELECT AVG(VacationHours)</span>
<span style="color: rgb(255, 0, 0);"> FROM HumanResources.Employee)</span>
This query is executed as:
- Compute the remaining average vacation hours for all employees (subquery)
- Group records by
JobTitle
and compute the average vacation hours. - Only keep groups whose average vacation hours are greater than the overall average.
Correlated Subqueries in HAVING Clause
As with any other subquery, subqueries in the HAVING
clause can be correlated with fields from the outer query.
Suppose we further group the job titles by marital status and only want to keep those combinations of job titles and martial statuses whose vacation hours are greater than those for their corresponding overall marital status?
In other words, we want to answer a question similar to “do married accountants have, on average, more remaining vacation, than married employees in general?”
One way to find out is to use the following query:
SELECT JobTitle,
MaritalStatus,
AVG(VacationHours)
FROM HumanResources.Employee AS E
GROUP BY JobTitle, MaritalStatus
HAVING AVG(VacationHours) >
(SELECT AVG(VacationHours)
FROM HumanResources.Employee
WHERE HumanResources.Employee. MaritalStatus = E.MaritalStatus)
There are a couple of things to point out. First, notice that I aliased the Employee
as “E
” in the outer query. This allows me to reference the outer table within the inner query.
Also, with the correlated query, only fields used in the GROUP BY
can be used in the inner query. For instance, for kicks and grins, I tried replacing MaritalStatus
with Gender
and got an error.
SELECT JobTitle,
MaritalStatus,
AVG(VacationHours)
FROM HumanResources.Employee AS E
GROUP BY JobTitle, MaritalStatus
HAVING AVG(VacationHours) >
(SELECT AVG(VacationHours)
FROM HumanResources.Employee
WHERE HumanResources.Employee. Gender = E. Gender)
Is a broken query. If you try to run it, you’ll get the following error:
Column ‘HumanResources.Employee.Gender
’ is invalid in the HAVING
clause because it is not contained in either an aggregate function or the GROUP BY
clause.
Summary
One advantage of using a subquery in the HAVING
clause is to avoid hard coding values, such as an overall average, which can change and is easily computed.
As with other queries, it is possible to build correlated subqueries in the HAVING
clause. This can be useful when the subquery is dependent on the outer query’s column values, and may make it easier to initially understand a query; however, care should be taken! As with all SQL, there are usually many ways to write a query to return the same result. If performance is a concern, then use query plans to understand performance and explore alternatives.
Hello my name is Kris. I’m here because I am passionate about helping non-techie people to overcome their fear of learning SQL.
I know what it is like to not know where to start or whether the time spent learning is worth the effort. That is why I am here to help you to:
- Get started in an easy to follow step-by-step manner.
- Use your time wisely so you focus on what is important to learn to get the most value from your time.
- Answer your questions. Really! Just post a comment and I’ll respond. I’m here to help.
It wasn’t long ago that I was helping a colleague with some reporting. She didn’t know where to start and soon got overwhelmed and lost as she didn’t know SQL.
I felt really bad, as she was under pressure to get some summary information to her boss, the built-in reports were falling short, and to make them better would require her to know SQL. At that time that seemed impossible! It in dawned on me, it doesn’t have to be that way.
Then I discovered a way for anyone with the desire to easily learn SQL. I worked with my co-worker, started to teach her what I learned and soon she was able to write reports and answer her boss’ questions without getting stressed or ploughing hours into manipulating data in Excel.
It hasn’t always been easy. Sometimes the information seems abstract or too conceptual. In this case I’ve found out that a visual explanation is best. I really like to use diagrams or videos to explain hard-to-grasp ideas.
Having video, pictures, and text really help to reinforce the point and enable learning.
And now I want to help you get the same results.
The first step is simple, click here http://www.essentialsql.com/get-started-with-sql-server/