People ask me a lot about subquery, so I thought let’s answer with an article touching a bit of everything about subquery.
Before discussing SubQuery, there are few things which we should know like...
What is a Query?
I will go with its English Definition first. As a verb, “Put a question or questions to”.
Technical definition w.r.t. SQL “These are the commands issued to a database for retrieval of required information.” If we look closely, both the definitions complement each other. For example:
Select all data in the employees
table.
select * from employees
Let's put a condition like- select
all employees with salary>5000
.
select * from employee where salary>5000
What is a SubQuery?
SubQuery can be treated as a ‘query on query’. A subquery is the inner query which provides a targeted result to the outer main query. We can try few examples to learn it.
Example:
Select
employee name with its manager name:
select emp.name,(select mgr.name from employee AS mgr
where emp.mgrid=mgr.empid) from employee AS emp
We mostly see subqueries in where
clause like – select employee
s having average salary
:
select * from employee where salary=(select AVG(salary) from employee)
What is Correlated SubQuery?
A correlated sub-query is a sub-query that uses values from the outer query in its WHERE
clause. Let’s try with an example.
select
employees having salary
greater than average salary
of employee
s of department ‘IT
’.
select * from employee where salary=(select AVG(salary) from employee
where department=’IT’)
The main difference is that the subquery will be executed for each row before the result can be used by outer query.
Why Do We Require SubQuery or Advantages of SubQuery?
- SubQuery holds the results like a temporary table which can be used by outer query
- SubQuery is easier to understand
- SubQuery breaks down a complex query into small and simple queries
- SubQuery is easy to use as a replacement of joins. There is no major difference in performance.
SubQuery Rules
A subquery is subject to the following restrictions:
- Up to 32 levels of nesting is possible, although the limit varies based on available memory and the complexity of other expressions in the query
- If a table appears only in a subquery and not in the outer query, then columns from that table cannot be included in the output
- The select list of a subquery introduced with a comparison operator can include only one expression or column name (except that
EXISTS
and IN
operate on SELECT *
or a list, respectively). - If the
WHERE
clause of an outer query includes a column name, it must be join
-compatible with the column in the subquery select list. - The
ntext
, text
, and image
data types cannot be used in the select list of subqueries. - Because they must return a single value, subqueries introduced by an unmodified comparison operator (one not followed by the keyword
ANY
or ALL
) cannot include GROUP BY
and HAVING
clauses. - The
DISTINCT
keyword cannot be used with subqueries that include GROUP BY
. - The
COMPUTE
and INTO
clauses cannot be specified. ORDER BY
can only be specified when TOP
is also specified. - A view created by using a subquery cannot be updated.
- The select list of a subquery introduced with
EXISTS
, by convention, has an asterisk (*) instead of a single column name. The rules for a subquery introduced with EXISTS
are the same as those for a standard select list, because a subquery introduced with EXISTS
creates an existence test and returns TRUE
or FALSE
, instead of data.
Reference: http://msdn.microsoft.com/en-us/library/ms189543(v=sql.105).aspx
Join Vs SubQueries
I was looking for this answer and though it’s not a verified answer but yes, it’s true in most cases. Refer to this: http://stackoverflow.com/questions/2577174/join-vs-subquery
In most cases, JOIN
s are faster than sub-queries and it is very rare for a sub-query to be faster.
In JOIN
s, RDBMS can create an execution plan that is better for your query and can predict what data should be loaded to be processed and save time, unlike the sub-query where it will run all the queries and load all their data to do the processing.
The good thing in sub-queries is that they are more readable than JOIN
s: that’s why most new SQL people prefer them; it is the easy way; but when it comes to performance, JOIN
S are better in most cases even though they are not hard to read too.
Conclusion
I am not building something new but assembled all the questions and answers related to subquery which keeps on bugging me day and night. I hope this will be useful for people looking for answers in one place. Please email me at admin@codespread.com
CodeProject
Related Posts
- Nice reference for C# evolution Part 2
- SQL Bitwise operator can simplify DB design
- How to make a Connection between C# and mysql?