Introduction
This document explains how joining two tables works. To help explain this I created two temp tables
with specific rows in them to prove the point. The SQL to create the sample temp tables is in Appendix A. Table TEMP_A
has four rows in it; the IDs of these four rows are unique and numbered 1, 2, 3 and 4. Table TEMP_B has five rows in it,
with IDs 1, 2, 3, 3 and 5. Note that rows 1 and 2 from table A have one reference each in table B.
Row 3 has two references in table B, row 4 has no references in table B at all, and there is an orphan row in table B (row 5) that has no parent row in table A.
Also note that the reserved words inner and outer are optional: left outer join and left join mean exactly the same thing.
OK, now on to the fun stuff.
Normal Join (Or Inner Join)
Joining (or inner joining) the two tables on the ID fields returns all rows in the intersection of the two sets, meaning the rows where both tables have the same ID value.
Using the sample data created in Appendix A we get the following result:
select *
from TEMP_A
INNER JOIN TEMP_B
ON TEMP_A.Tbl_ID = TEMP_B.Tbl_ID
Tbl_ID | Tbl_Data | Tbl_ID | Tbl_Data |
1 | Tbl A Row 1 | 1 | Tbl B Row 1 |
2 | Tbl A Row 2 | 2 | Tbl B Row 2 |
3 | Tbl A Row 3 | 3 | Tbl B Row 3a |
3 | Tbl A Row 3 | 3 | Tbl B Row 3b |
Note that row 3 in table A appears twice, once for each corresponding row in table B.
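Because select * returns two columns called Tbl_ID and two called Tbl_Data, it can help to name the columns explicitly. The following is only a sketch of the same inner join; the aliases a and b and the output names A_ID, A_Data, B_ID and B_Data are made up for illustration and are not part of the sample data.

```sql
-- Same inner join as above, written with table aliases and named columns.
-- The aliases (a, b) and output column names are illustrative assumptions.
select a.Tbl_ID   as A_ID,
       a.Tbl_Data as A_Data,
       b.Tbl_ID   as B_ID,
       b.Tbl_Data as B_Data
from TEMP_A a
JOIN TEMP_B b              -- JOIN on its own means INNER JOIN
  ON a.Tbl_ID = b.Tbl_ID
```

It should return the same four rows as the query above, only with column names that tell you which table each value came from.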
Left Join (Or Left Outer Join)
The left join returns the intersection of the two tables and, in addition, the rows from the left
table that do not have corresponding rows in the right table. What is left and what is right?
The left table is the first table specified and the right table is the second table specified.
Put another way, the right table is the table named after the join keyword, and the left side is the rest of the data SQL is working with.
Using the sample data created in Appendix A we get the following result:
select *
from TEMP_A
LEFT OUTER JOIN TEMP_B
ON TEMP_A.Tbl_ID = TEMP_B.Tbl_ID
Tbl_ID | Tbl_Data | Tbl_ID | Tbl_Data |
1 | Tbl A Row 1 | 1 | Tbl B Row 1 |
2 | Tbl A Row 2 | 2 | Tbl B Row 2 |
3 | Tbl A Row 3 | 3 | Tbl B Row 3a |
3 | Tbl A Row 3 | 3 | Tbl B Row 3b |
4 | Tbl A Row 4 | NULL | NULL |
Note that row 4 from table A is now included, but since there is no corresponding row in table B
all the fields from table B contain NULLs.
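One way to see the effect of those NULLs is to count the matching rows per row in table A. This is a rough sketch, not the only way to do it; the aliases and the B_Rows column name are assumptions. It relies on count(column) skipping NULL values, so an unmatched row is counted as zero.

```sql
-- Count how many rows in TEMP_B reference each row in TEMP_A.
-- count(b.Tbl_ID) ignores NULLs, so rows without a match get 0.
select a.Tbl_ID,
       count(b.Tbl_ID) as B_Rows   -- B_Rows is an illustrative name
from TEMP_A a
LEFT JOIN TEMP_B b
  ON a.Tbl_ID = b.Tbl_ID
group by a.Tbl_ID
order by a.Tbl_ID
```

With the sample data this should give counts of 1, 1, 2 and 0 for IDs 1 to 4. Using count(*) instead would report 1 for ID 4, because the row itself still exists in the result.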
Right Join (Or Right Outer Join)
The right join is very much like the left join, but in addition to the intersection it returns the
rows from the right table that have no corresponding rows in the left table.
Using the sample data created in Appendix A we get the following result:
select *
from TEMP_A
RIGHT OUTER JOIN TEMP_B
ON TEMP_A.Tbl_ID = TEMP_B.Tbl_ID
Tbl_ID | Tbl_Data | Tbl_ID | Tbl_Data |
1 | Tbl A Row 1 | 1 | Tbl B Row 1 |
2 | Tbl A Row 2 | 2 | Tbl B Row 2 |
3 | Tbl A Row 3 | 3 | Tbl B Row 3a |
3 | Tbl A Row 3 | 3 | Tbl B Row 3b |
NULL | NULL | 5 | Tbl B Row 5 |
This set also contains 5 rows, but this time the last row contains data from table B
and all data from table A is NULL.
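Since left and right only depend on the order in which the tables are named, the same result can be had from a left join with the tables swapped. A minimal sketch, assuming the aliases a and b; the columns are listed explicitly so that table A's columns still come first.

```sql
-- Equivalent to the right join above: TEMP_B is now the left table.
select a.Tbl_ID, a.Tbl_Data, b.Tbl_ID, b.Tbl_Data
from TEMP_B b
LEFT OUTER JOIN TEMP_A a
  ON a.Tbl_ID = b.Tbl_ID
```

With select * the columns of TEMP_B would be listed first instead, but the rows would be the same.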
Full Join (Or Full Outer Join)
This is like a left and a right join combined. It returns the intersection of the
two tables, plus all the rows from table A that have no corresponding rows in B and all the
rows from B that have no corresponding rows in A.
Using the sample data created in Appendix A we get the following result:
select *
from TEMP_A
FULL OUTER JOIN TEMP_B
ON TEMP_A.Tbl_ID = TEMP_B.Tbl_ID
Tbl_ID | Tbl_Data | Tbl_ID | Tbl_Data |
1 | Tbl A Row 1 | 1 | Tbl B Row 1 |
2 | Tbl A Row 2 | 2 | Tbl B Row 2 |
3 | Tbl A Row 3 | 3 | Tbl B Row 3a |
3 | Tbl A Row 3 | 3 | Tbl B Row 3b |
4 | Tbl A Row 4 | NULL | NULL |
NULL | NULL | 5 | Tbl B Row 5 |
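Not every database supports full outer join. Where it is missing, a common workaround is to glue a left join to the rows that only exist in the right table. The sketch below assumes the aliases and output column names shown, and it relies on Tbl_ID being declared not null, so that a NULL in a.Tbl_ID can only mean "no match".

```sql
-- Part 1: the intersection plus the rows that only exist in TEMP_A.
select a.Tbl_ID as A_ID, a.Tbl_Data as A_Data,
       b.Tbl_ID as B_ID, b.Tbl_Data as B_Data
from TEMP_A a
LEFT JOIN TEMP_B b
  ON a.Tbl_ID = b.Tbl_ID
union all
-- Part 2: only the rows from TEMP_B that have no match in TEMP_A.
select a.Tbl_ID, a.Tbl_Data,
       b.Tbl_ID, b.Tbl_Data
from TEMP_B b
LEFT JOIN TEMP_A a
  ON a.Tbl_ID = b.Tbl_ID
where a.Tbl_ID is null
```

With the sample data the first part contributes five rows and the second part one row, giving the same six rows as the full join. union all is used rather than union so that genuine duplicate rows are not silently removed.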
Cross Join
A cross join is not really a join: you do not specify the fields to join on, you
just specify the names of the tables. It returns every row from table A matched
up with every row in table B, so the result has as many rows as the two row counts multiplied together (4 x 5 = 20 for the sample data).
Using the sample data created in Appendix A we get the following result:
select *
from TEMP_A, TEMP_B
Tbl_ID | Tbl_Data | Tbl_ID | Tbl_Data |
1 | Tbl A Row 1 | 1 | Tbl B Row 1 |
1 | Tbl A Row 1 | 2 | Tbl B Row 2 |
1 | Tbl A Row 1 | 3 | Tbl B Row 3a |
1 | Tbl A Row 1 | 3 | Tbl B Row 3b |
1 | Tbl A Row 1 | 5 | Tbl B Row 5 |
2 | Tbl A Row 2 | 1 | Tbl B Row 1 |
2 | Tbl A Row 2 | 2 | Tbl B Row 2 |
2 | Tbl A Row 2 | 3 | Tbl B Row 3a |
2 | Tbl A Row 2 | 3 | Tbl B Row 3b |
2 | Tbl A Row 2 | 5 | Tbl B Row 5 |
3 | Tbl A Row 3 | 1 | Tbl B Row 1 |
3 | Tbl A Row 3 | 2 | Tbl B Row 2 |
3 | Tbl A Row 3 | 3 | Tbl B Row 3a |
3 | Tbl A Row 3 | 3 | Tbl B Row 3b |
3 | Tbl A Row 3 | 5 | Tbl B Row 5 |
4 | Tbl A Row 4 | 1 | Tbl B Row 1 |
4 | Tbl A Row 4 | 2 | Tbl B Row 2 |
4 | Tbl A Row 4 | 3 | Tbl B Row 3a |
4 | Tbl A Row 4 | 3 | Tbl B Row 3b |
4 | Tbl A Row 4 | 5 | Tbl B Row 5 |
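The comma syntax above is the older way of writing this. The same thing can be spelled out with the cross join keywords, which makes it obvious that the missing join condition is intentional. The sketch below also counts the rows; Row_Count is just an illustrative column name.

```sql
-- Explicit form of the query above.
select *
from TEMP_A
CROSS JOIN TEMP_B;

-- Rows in the cross join = rows in TEMP_A times rows in TEMP_B.
select count(*) as Row_Count
from TEMP_A
CROSS JOIN TEMP_B;
```

With the four rows in TEMP_A and five rows in TEMP_B from Appendix A, the count should be 20.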
Why a non-unique key is bad
During this explanation I used table A with unique values in the Tbl_ID field. It is not desirable
to join tables where the joining key is not unique in at least one of the two tables. If
neither side is unique, the rows that share a duplicated key value are effectively cross joined
with each other. Let's add a row to table A to show this.
insert into TEMP_A values (3, 'Tbl A Row 3 dup')
Now run a normal join as before and watch the results.
select *
from TEMP_A
INNER JOIN TEMP_B
ON TEMP_A.Tbl_ID = TEMP_B.Tbl_ID
Tbl_ID | Tbl_Data | Tbl_ID | Tbl_Data |
1 | Tbl A Row 1 | 1 | Tbl B Row 1 |
2 | Tbl A Row 2 | 2 | Tbl B Row 2 |
3 | Tbl A Row 3 | 3 | Tbl B Row 3a |
3 | Tbl A Row 3 | 3 | Tbl B Row 3b |
3 | Tbl A Row 3 dup | 3 | Tbl B Row 3a |
3 | Tbl A Row 3 dup | 3 | Tbl B Row 3b |
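See how we now returned 6 rows: each of the two rows with ID 3 in table A is paired with each of the two rows with ID 3 in table B.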
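Before joining on a column, it can be worth checking whether its values really are unique. One possible check, sketched here with the illustrative column name Key_Count, is to group by the key and keep only the values that occur more than once.

```sql
-- List key values that appear more than once in TEMP_A.
select Tbl_ID,
       count(*) as Key_Count
from TEMP_A
group by Tbl_ID
having count(*) > 1
```

After the extra insert above this should return a single row, ID 3 with a count of 2. The same query against TEMP_B also reports ID 3, which is why the join multiplies those rows.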
Using Join to select orphans
So how do I get all the rows in one table that do not have corresponding rows in the
other table? Simple: use a left or right join and a where clause that removes the unwanted rows.
Using the sample data:
select *
from TEMP_A
LEFT OUTER JOIN TEMP_B
ON TEMP_A.Tbl_ID = TEMP_B.Tbl_ID
where TEMP_B.Tbl_ID is null
Tbl_ID | Tbl_Data | Tbl_ID | Tbl_Data |
4 | Tbl A Row 4 | NULL | NULL |
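The same idea works in the other direction, to find the orphan rows in table B. As an alternative to the outer join, the sketch below uses not exists, which many people find reads closer to the question being asked. The aliases a and b are assumptions for illustration.

```sql
-- Rows in TEMP_B that have no parent row in TEMP_A.
select b.Tbl_ID, b.Tbl_Data
from TEMP_B b
where not exists (
    select 1
    from TEMP_A a
    where a.Tbl_ID = b.Tbl_ID
)
```

With the sample data this should return only row 5 of table B. Unlike the join version, it returns just the columns of the table being checked, without the extra NULL columns.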
Appendix A (Sample Data)
create table TEMP_A (
Tbl_ID int not null,
Tbl_Data varchar(50) not null
)
insert into TEMP_A values (1, 'Tbl A Row 1')
insert into TEMP_A values (2, 'Tbl A Row 2')
insert into TEMP_A values (3, 'Tbl A Row 3')
insert into TEMP_A values (4, 'Tbl A Row 4')
create table TEMP_B (
Tbl_ID int not null,
Tbl_Data varchar(50) not null
)
insert into TEMP_B values (1, 'Tbl B Row 1')
insert into TEMP_B values (2, 'Tbl B Row 2')
insert into TEMP_B values (3, 'Tbl B Row 3a')
insert into TEMP_B values (3, 'Tbl B Row 3b')
insert into TEMP_B values (5, 'Tbl B Row 5')