Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles / Languages / SQL

SQL: Target a SubQuery

4.50/5 (2 votes)
11 Feb 2013CPOL4 min read 14.4K  
SQL: Target a SubQuery

People ask me a lot about subquery, so I thought let’s answer with an article touching a bit of everything about subquery.

Before discussing SubQuery, there are few things which we should know like...

What is a Query?

I will go with its English Definition first. As a verb, “Put a question or questions to”.

Technical definition w.r.t. SQL “These are the commands issued to a database for retrieval of required information.” If we look closely, both the definitions complement each other. For example:

Select all data in the employees table.

SQL
select * from employees

Let's put a condition like- select all employees with salary>5000.

SQL
select * from employee where salary>5000

What is a SubQuery?

SubQuery can be treated as a ‘query on query’. A subquery is the inner query which provides a targeted result to the outer main query. We can try few examples to learn it.

Example:

Select employee name with its manager name:

SQL
select emp.name,(select mgr.name from employee AS mgr
where emp.mgrid=mgr.empid) from employee AS emp

We mostly see subqueries in where clause like – select employees having average salary:

SQL
select * from employee where salary=(select AVG(salary) from employee)

What is Correlated SubQuery?

A correlated sub-query  is a sub-query that uses values from the outer query in its WHERE clause. Let’s try with an example.

select employees having salary greater than average salary of employees of department ‘IT’.

SQL
select * from employee where salary=(select AVG(salary) from employee
where department=’IT’)

The main difference is that the subquery will be executed for each row before the result can be used by outer query.

Why Do We Require SubQuery or Advantages of SubQuery?

  1. SubQuery holds the results like a temporary table which can be used by outer query
  2. SubQuery is easier to understand
  3. SubQuery breaks down a complex query into small and simple queries
  4. SubQuery is easy to use as a replacement of joins. There is no major difference in performance.

SubQuery Rules

A subquery is subject to the following restrictions:

  • Up to 32 levels of nesting is possible, although the limit varies based on available memory and the complexity of other expressions in the query
  • If a table appears only in a subquery and not in the outer query, then columns from that table cannot be included in the output
  • The select list of a subquery introduced with a comparison operator can include only one expression or column name (except that EXISTS and IN operate on SELECT * or a list, respectively).
  • If the WHERE clause of an outer query includes a column name, it must be join-compatible with the column in the subquery select list.
  • The ntext, text, and image data types cannot be used in the select list of subqueries.
  • Because they must return a single value, subqueries introduced by an unmodified comparison operator (one not followed by the keyword ANY or ALL) cannot include GROUP BY and HAVING clauses.
  • The DISTINCT keyword cannot be used with subqueries that include GROUP BY.
  • The COMPUTE and INTO clauses cannot be specified.
  • ORDER BY can only be specified when TOP is also specified.
  • A view created by using a subquery cannot be updated.
  • The select list of a subquery introduced with EXISTS, by convention, has an asterisk (*) instead of a single column name. The rules for a subquery introduced with EXISTS are the same as those for a standard select list, because a subquery introduced with EXISTS creates an existence test and returns TRUE or FALSE, instead of data.

Reference: http://msdn.microsoft.com/en-us/library/ms189543(v=sql.105).aspx

Join Vs SubQueries

I was looking for this answer and though it’s not a verified answer but yes, it’s true in most cases. Refer to this: http://stackoverflow.com/questions/2577174/join-vs-subquery

In most cases, JOINs are faster than sub-queries and it is very rare for a sub-query to be faster.

In JOINs, RDBMS can create an execution plan that is better for your query and can predict what data should be loaded to be processed and save time, unlike the sub-query where it will run all the queries and load all their data to do the processing.

The good thing in sub-queries is that they are more readable than JOINs: that’s why most new SQL people prefer them; it is the easy way; but when it comes to performance, JOINS are better in most cases even though they are not hard to read too.

Conclusion

I am not building something new but assembled all the questions and answers related to subquery which keeps on bugging me day and night. I hope this will be useful for people looking for answers in one place. Please email me at admin@codespread.com

Related Posts

  1. Nice reference for C# evolution Part 2
  2. SQL Bitwise operator can simplify DB design
  3. How to make a Connection between C# and mysql?

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)