When I write joins in SQL statements, 99% of the time I write inner and left joins. Why such a high frequency? Because these two join types cover almost all the conceivable table interactions one might need. But every once in a while, I have an unconventional task to do, and sometimes a different join type is warranted.
In a future post, I’ll cover full joins and why they could be helpful in some [albeit rare] circumstances. For now, I’ll cover an example of using a Cartesian [cross] join.
At Boston Public Schools, we assign students MBTA passes, and the passes need to be in a certain range. I needed to come up with some logic to populate a table with a list of numbers. Logic afterwards is used to assign passes to students as appropriate. The SQL below is similar to what I came up with.
The example assumes 800,000 numbers need to be inserted into the [already existing but empty] TPassNumbers table. While this is far more records than would be needed for the number of students at BPS, having such a high number illustrates the speed improvement using a cross join.
Here’s the conventional solution:
declare @counter intset @counter = 0
While @counter < 800000
begin
insert into TpassNumbers (tpassNumber) values(@counter)
set @counter = @counter + 1
end
It is simple and gets the job done. Unfortunately, it takes over a minute to run. While this isn’t a terribly long time, this solution doesn’t seem to scale very well.
What does scale well? Joins scale. So here’s an example using cross joins that functionally does the same thing as the While loop above.
—– Use a table variable
declare @numbers table( num int)
—– Populate a table with numbers 0 to 9
declare @counter int
set @counter = 0
While @counter < 10
begin
insert into @numbers values(@counter)
set @counter = @counter + 1
end
insert into TpassNumbers (tpassNumber)
select
(
n100000place.num*100000 +
n10000place.num*10000 +
n1000place.num*1000 +
n100place.num*100 +
n10place.num*10 +
n1place.num
) as ‘number’
From @numbers n1place
Cross join @numbers n10place
Cross join @numbers n100place
Cross join @numbers n1000place
Cross join @numbers n10000place
Cross join @numbers n100000place
where n100000place.num < 8 —– < 800000
order by number
It is a bit more complicated than the While loop. But it scales – it runs in 3 seconds. That’s a 20 fold speed improvement. I’ll take that any day.