In the last blog post, we discussed about COALESCE
function in SQL Server. You can read that article here. In this article, we will go over UNION
and UNION ALL
Operators in SQL Server.
In this article, we will discuss about the following:
- Purpose of
UNION
and UNION ALL
- Differences between
UNION
and UNION ALL
- Differences between
JOIN
and UNION
UNION and UNION ALL operators in SQL Server are used to combine the result sets of two or more SELECT queries.
Let’s understand what is meant by this with an example. We have 2 tables here, tblIndiaCustomers
and tblUSCustomers
. Both of these tables have identical columns – Id
, Name
and Email
.
Now if we issue a select
query against these 2 tables and execute them, we will get 2 separate result sets. We will get 2 rows each from tblIndiaCustomers
and tblUSCustomers
.
Now we want to combine both of these result sets into one result set. How do we do that? We can use UNION
or UNION ALL
Operators. Let’s first use UNION ALL
and see what will happen. When we execute the query, it is going to combine the rows from tblndiaCustomers
with rows from tbUSCustomers
. So while using UNION ALL
, we are combining the result sets from these two queries. Here in output, we will get 4 rows.
Then instead of using UNION ALL
, just use UNION
and see what will happen. Now we will get only 3 rows. While looking at the output, we can see that the duplicate record – Name: Thomas, Email: T@T.Com
is eliminated. While using UNION ALL, we will get all the rows including duplicates. While using UNION Operator, the output is sorted as well.
What are the Differences Between UNION and UNION ALL Operators?
UNION
removes duplicate rows, whereas UNION ALL
doesn’t. UNION
has to perform distinct sort to remove duplicates, which makes it less faster than UNION ALL
.
The distinct sort operation happening in UNION is time consuming. If you want to see that, you can turn on the Estimated Execution Plan in SQL Server Management Studio(SSMS). To do that, you can either press CTRL+L or you can click on Display Estimated Execution Plan icon in SSMS.
When we use UNION
and click on this icon, we can see that the distinct sort is actually taking 63% of time.
On the other hand, when we use UNION ALL
and click on Display Estimated Execution Plan, there is no distinct sort at all.
So for this reason, UNION
is a little bit slower than UNION ALL
.
Another important point to ponder is that for UNION and UNION ALL to work, the Number, Data types and the Order of the columns in the SELECT statements should be same. This makes sense as well. For example, write the query like below.
SELECT Id, Name From tblIndiaCustomers
UNION
SELECT Id, Name, Email From tblUSCustomers
In the above query, the first SELECT
statement is giving 2 columns and the second SELECT
statement is giving 3 columns. How can we combine them?! We can’t do that. So while executing the query, we will get an error.
It is not enough to have equal number of columns, but they should be in the right order as well. If we write a query like below, it doesn’t make any sense.
SELECT Name, Email, Id From tblIndiaCustomers
UNION
SELECT Id, Name, Email From tblUSCustomers
How can we combine Id
with Email
?! So while executing the query, it is trying to convert Name to integer and fails.
So while using UNION
and UNION ALL
Operators, the number of columns have to be the same, data types have to be the same and they have to be in the same order.
Sorting the Results of UNION and UNION ALL
ORDER BY
clause should be used only on the last SELECT
statement in the UNION
query.
While executing the above query, we will get the results sorted by Name
. If we use ORDER BY
clause in the first query, let’s see what is going to happen. While executing the query, it will give an error stating Incorrect syntax near the keyword ‘UNION’.
This makes sense as well. How does UNION ALL
work? It will take the rows from the first table and combine them with the rows from the second table. It doesn’t make sense to take sorted rows from the first table and combine. When they are combined, the result will again become unsorted. That’s why, after getting all the results, sort them using ORDER BY
clause. So ORDER BY should be with the last SELECT statement.
Differences Between JOIN and UNION
UNION
combines the result set of two or more select
queries into a single result set which includes all the rows from all the queries in the UNION
, whereas JOIN
retrieves data from two or more tables based on logical relationships between the tables. - In short, UNION combines rows from 2 or more tables, where JOIN combines columns from 2 or more tables.
Reference: Arun Ramachandran (http://BestTEchnologyBlog.Com)
CodeProject