Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles / Languages / SQL

Learn What is Cartesian Product in SQL

2.40/5 (3 votes)
6 Jul 2021CPOL5 min read 5.6K  
This article is about understanding the Cartesian Product in SQL followed by some examples.
The readers of this article are going to get some professional life tips related the subject matter. This article also highlights the importance of choosing the correct use of join in your SQL queries to meet business objectives.

About Cartesian Product

It would not be much fun for beginners if we just start SQL examples of Cartesian Product without first developing some understanding of it.

Let us explore it first with the help of some simple definitions and examples.

What is Cartesian Product?

A Cartesian Product is a product of two sets of elements where each element in one set is multiplied by every other element of the other set.

In simple words, if we have the following sets:

A={1,2}, B={3,4}

Their Cartesian Product will be as follows:

C=AXB={(1,3),(1,4),(2,3),(2,4)}

As you can see, the element 1 in set A is multiplied with all the elements 3,4 in set B to become (1,3) and (1,4) and similarly the second element 2 is multiplied with all the elements in set B to from (2,3) and (2,4).

What is Cross Product?

Cartesian Product is also known as Cross Product as the multiplication is applied across all the elements in one set with all the elements of the other set.

Difference between Cartesian Product and Cross Join

A Cross Join is actually the SQL name for Cartesian Product where the elements can be thought of columns and sets can be thought of rows.

Example of Cartesian Product

If we have two different tables in a SQL database with each having 3 rows, we will get 9 rows in total after applying the Cartesian Product.

Applying Cartesian Product (Cross Join) in SQL

Let us go through a couple of examples to understand how it works and why it is needed in SQL.

This article assumes that readers are familiar with basic relational database concepts and comfortable to write and run SQL scripts on a locally or remoted installed SQL instance.

Handling Scattered Examination Data

How would you handle a professional life scenario where you are handed over a disconnected or rather scattered data of examination bundled into a database to be refined further in order to make it more meaningful and understandable?

Well, perhaps you don’t know the solution until you analyze the data to choose the best suited option and that is what we are trying to understand.

Business Requirements

An Examination services provider company maintains the data in a SQL database but they source the data from different places in such a way that you have the following two tables:

  • Student
  • Exam Type

Now there is a table (Student) that contains the information about the registered students while the type of exam each of the students has to take once registered is stored in another table called Exam Type.

The problem is that these two tables are completely unlinked and you have to come up with a solution that each individual student should be allocated all the available exam types without manually entering the data.

Proposed Solution

The proposed solution is to get Cartesian Product of the two unlinked tables such that they become associated with each other the way the business wants it to see.

Setup Sample Exam Database

Let us setup a sample database called ExamCart by running the following script against master database:

SQL
-- Create sample database ExamCart
USE MASTER
GO
 
CREATE DATABASE ExamCart
GO
 
USE ExamCart
 
-- Creating a reference table ExamType
CREATE TABLE dbo.ExamType
(
    ExamTypeId INT IDENTITY(1,1),
    Name VARCHAR(50)NOT NULL,
    Detail VARCHAR(200)NULL 
    CONSTRAINT PK_ExamType_ExamTypeId PRIMARY KEY (ExamTypeId)
)
GO
 
-- Populating (adding rows to the) table ExamType
SET IDENTITY_INSERT dbo.ExamType ON
GO
INSERT INTO dbo.ExamType
(
  ExamTypeId
 ,Name
 ,Detail
)
VALUES
(
  1  -- ID - INT Primary Key
 ,'Mid-Term Exam' -- Name - varchar(50) NOT NULL
 ,'This is Mid-Term Exam' -- Detail - varchar(200)
),
(
  2  -- ID - INT Primary Key
 ,'Final Term Exam' -- Name - varchar(50) NOT NULL
 ,'This is Final Term Exam' -- Detail - varchar(200)
),
(
  3  -- ID - INT Primary Key
 ,'First Term Exam' -- Name - varchar(50) NOT NULL
 ,'This is First Term Exam' -- Detail - varchar(200)
);
GO
SET IDENTITY_INSERT dbo.ExamType OFF
GO

Now just to impersonate the professional life scenario soon after some time, the Student table about the registered students was added to the database.

So, please add a Student table by running the script against ExamCart database this time as follows:

SQL
-- Connect to ExamCart database
USE ExamCart
 
-- Creating a reference table Student
DROP TABLE if EXISTS dbo.Student
CREATE TABLE dbo.Student
(
    StudentId INT IDENTITY(1,1),
    Name VARCHAR(50)NOT NULL,
    Detail VARCHAR(200)NULL 
    CONSTRAINT PK_Student_StudentId PRIMARY KEY (StudentId)
)
GO
 
-- Populating (adding rows to the) table Student
SET IDENTITY_INSERT dbo.Student ON
GO
INSERT INTO dbo.Student
(
  StudentId
 ,Name
 ,Detail
)
VALUES
(
  1  -- ID - INT Primary Key
 ,'Atif' -- Name - varchar(50) NOT NULL
 ,'This is Atif' -- Detail - varchar(200)
),
(
  2  -- ID - INT Primary Key
 ,'Sam' -- Name - varchar(50) NOT NULL
 ,'This is Sam' -- Detail - varchar(200)
),
(
  3  -- ID - INT Primary Key
 ,'Sarah' -- Name - varchar(50) NOT NULL
 ,'This is Sarah' -- Detail - varchar(200)
);
GO
SET IDENTITY_INSERT dbo.Student OFF
GO

Quick Check

Let us take a quick look at the student table by running the following script against the sample database:

SQL
-- View Student table
SELECT
  s.StudentId
 ,s.Name
 ,s.Detail
FROM dbo.Student s

The output is as follows:

Student table in ExamCart database

Please note that I am using dbForge Studio for SQL Server to write and run the T-SQL scripts, so the output look (display) may be different but query results remain unaffected (are not going to be different) if you are using any other tool like SQL Server Management Studio (SSMS).

Similarly, let us see the contents of ExamType table as follows:

SQL
-- View ExamType tabl edata
SELECT et.ExamTypeId, et.Name, et.Detail FROM dbo.ExamType et

The output is as follows:

Examination Types

Applying the Cartesian Product of Both Tables

Now in order to assign each examination type to each student, we need a Cartesian Product of both tables which can be successfully achieved by the following query:

SQL
-- View Cartesian Product of Student and ExamType
SELECT s.StudentId,s.Name,s.Detail,et.ExamTypeId,et.Name AS Exam_Name, et.Detail AS Exam_Detail
FROM dbo.Student s,dbo.ExamType et 

The results are as follows:

Cartesian Product of Student and Exam Table

Applying Cartesian Product Using Cross Join

The same results can also be achieved using Cross Join keyword:

SQL
-- View Product of Student and Exam table rows using Cross Join
SELECT 
et.ExamTypeId,et.Name AS Exam_Name,et.Detail AS Exam_Detail,
s.StudentId,s.Name,s.Detail
FROM dbo.Student s CROSS JOIN dbo.ExamType et
order BY et.ExamTypeId 

The results are as follows:

exam and student table product using cross join

Creating a Cartesian Product StudentExam Table

A better way to store your results physically is to create a table on the fly that contains the result of the product of the two tables.

However, we need to create a distinct id column to get it qualified for a primary key later on as per standard practice of creating tables in a SQL database.

To create a unique id, we are going to use ROW_NUMBER() function that assigns a new id starting from 1.

This is achieved as follows:

SQL
--Creating a Cartesian Product StudentExam Table
SELECT 
ROW_NUMBER() OVER (ORDER BY et.Name) as StudentExamId,
et.ExamTypeId,et.Name AS Exam_Name,et.Detail AS Exam_Detail,
s.StudentId,s.Name,s.Detail INTO StudentExam
FROM dbo.Student s CROSS JOIN dbo.ExamType et

Let us expand the ExamCart databases node to locate the newly created StudentExam table as a result of Cartesian Product (Cross Join):

StudentExam table created by Cross Join of Student and ExamType tables

Professional Life Tip

Please remember that Cross Join or Cartesian Product of two large tables (having many rows) could become a very resource intensive operation which means your SQL Server may be needing a lot of power to perform this operation and that can cause performance issues and in order to resolve the performance issue, you may have to scale up (add more to) your Server resources (which is often easily possible in cloud environment) but that may also come up with some more cost.

However, it is a safe operation that does the job it is supposed to do and can also be used to denormalize the data (a standard practice) in a typical data warehouse business intelligence solution.

Congratulations! You have successfully learned about Cartesian Product in SQL along with the steps to implement it by working out your solution.

History

  • 6th July, 2021: Initial version

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)