Introduction
This article is about the different between ROW_NUMBER()
, RANK()
and DENSE_RANK()
SQL functions.
Background
Most of the beginners mix up basic SQL functions or they don't know in which situation/scenario which functions are useful. They write messy code that work fine but compromise the efficiency of SQL queries. This tip is about using the window function build in many databases. So I want to write this post keeping basic thing basic. Hope this will help you out in improving your SQL queries.
For this tip, I have a database with table name "Employees
" with columns (Id
, EmpName
, EmpDob
, EmpSalary
). Here is the complete SQL script to generate Employees
Table with dummy data.
For executing these queries, use www.sqlfiddle.com if you don't have any databases installed on your system. :)
CREATE TABLE [Employees] (
[Id] INTEGER NOT NULL IDENTITY(1, 1),
[EmpName] VARCHAR(255) NULL,
[EmpDob] VARCHAR(255),
[EmpSalary] INTEGER NULL,
PRIMARY KEY ([Id])
);
GO
INSERT INTO Employees([EmpName],[EmpDob],[EmpSalary]) _
VALUES('Sasha Haynes','11/30/16',80000),('Wing Ryan','01/09/17',5200),_
('Evangeline Fitzpatrick','07/30/16',80000),('Brenden Saunders','05/21/17',8183),_
('Barrett Allison','03/10/17',25000),('Emily Garrett','08/11/16',63711),_
('Piper Chen','06/14/17',68525),('Kristen Juarez','07/30/16',65000),_
('Colton Johns','12/20/16',80000),('Cameron Massey','04/27/17',45000);
INSERT INTO Employees([EmpName],[EmpDob],[EmpSalary]) VALUES('Aimee Jacobson','04/17/17',4500),_
('Rashad Valencia','02/27/17',36000),('Aurelia Morrison','03/29/17',5700),_
('Beck Wood','12/23/16',36754),('Evan Gould','05/14/17',36000),_
('Moana Travis','03/08/17',65252),('Shelly Barron','12/26/16',65047),_
('Quamar Navarro','03/15/17',65000),('Gil Roach','04/18/17',65000),_
('Rosalyn Nieves','07/01/17',36382);
1. Row_Number()
This function assigns a unique number to each row fetched by the order by
clause. Order by
clause in necessary for Row_number()
to execute properly otherwise it will give syntax error.
SELECT TOP 10
[EmpName]
,[EmpSalary],ROW_NUMBER()
OVER (ORDER BY EMPSALARY) AS RowNum
FROM [TestDb].[dbo].[Employees]
The output of the query is given by. Here you can see the Row_Number ()
has assigned a unique index to each row.
| EmpName | EmpSalary | RowNum |
|------------------|-----------|--------|
| Aimee Jacobson | 4500 | 1 |
| Wing Ryan | 5200 | 2 |
| Aurelia Morrison | 5700 | 3 |
| Brenden Saunders | 8183 | 4 |
| Barrett Allison | 25000 | 5 |
2. RANK()
This function assigns a Rank
to each row in the result set based on the mentioned column in Over
clause of query.
SELECT EmpName,EmpSalary,RANK() OVER(ORDER BY EmpSalary Desc) as Rank
FROM employees
Here, you can see the Rank()
function has assigned a rank to each unique value of EmpSalary
column. There is a gap between rank, i.e., rank 2,3 are skipped by the Rank
function. If you want to remove the gap between ranks, we have to use DENSE_RANK()
function.
| EmpName | EmpSalary | Rank |
|------------------------|-----------|------|
| Sasha Haynes | 80000 | 1 |
| Evangeline Fitzpatrick | 80000 | 1 |
| Colton Johns | 80000 | 1 |
| Piper Chen | 68525 | 4 |
| Moana Travis | 65252 | 5 |
| Shelly Barron | 65047 | 6 |
3. DENSE_RANK()
This function performs same as Rank()
function. Only difference is returns rank without gaps.
SELECT EmpName,EmpSalary,DENSE_RANK() OVER(ORDER BY EmpSalary Desc) as Rank
FROM employees
The output of the query is given by. Here you can see the DENSE_RANK()
has removed the gaps between the ranks. Each unique value of the specified column (EmpSalary
) has a unique rank.
| EmpName | EmpSalary | Rank |
|------------------------|-----------|------|
| Sasha Haynes | 80000 | 1 |
| Evangeline Fitzpatrick | 80000 | 1 |
| Colton Johns | 80000 | 1 |
| Piper Chen | 68525 | 2 |
| Moana Travis | 65252 | 3 |
| Shelly Barron | 65047 | 4 |
| Quamar Navarro | 65000 | 5 |
I hope this tip will help you in understanding the difference between Row_Number()
, Rank()
and Dense_Rank()
functions. Feel free to comment if you want any amendments in this post.