About this Post
This post is designed to help you understand how to use the EXISTS
keyword in your SQL query environment. All of the examples will be based on a fictional three table database for posting messages to an Internet message board. The examples will be generic so that they can apply to any database (Oracle, SQL Server, etc.) that supports the EXISTS
syntax.
As part of my job, I regularly deal with SQL coding. My current focus is on Oracle databases, but I occasionally get to play in DB2 and other platforms. I recently encountered some SQL code that I was not familiar with. In the WHERE
clause of the code was an EXISTS
statement with a sub query listed inside. It looked something like the following:
SELECT *
FROM Users U
WHERE ID EXISTS (SELECT NULL
FROM PostRating PR
WHERE PR.UserID = U.ID
AND MarkedAsSpam >= 5)
Hard as I tried, I could not make sense of the query listed above. Why would you select NULL
from the table and what does it mean. In order to explore this further, we need to have a simple problem to solve.
First, there will be two database tables for this example. I'm going to write generic syntax to get the point across. You'll have to modify this to your platform. The tables only have the columns that are relevant to the example. Were this a production table, they would have many more columns.
CREATE TABLE Users AS
ID Number
Name String
INSERT INTO Users (ID, Name) VALUES (1, 'Nice Guy')
INSERT INTO Users (ID, Name) VALUES (2, 'Bad Guy')
CREATE TABLE PostRating AS
UserID Number
PostID Number
SpamVotes Number
INSERT INTO PostRating (UserID, PostID, SpamVotes) VALUES (1, 1, 1)
INSERT INTO PostRating (UserID, PostID, SpamVotes) VALUES (2, 2, 20)
For the tables above, the Users
table lists the valid users in the system. PostRatings
table is used for people on the message board to mark a message as SPAM. Using the community to mark messages as spam can help clean up an open message board from unwanted content.
Before digging into the original query above, let's explore with some basic testing how EXISTS
works.
SELECT *
FROM Users
SELECT *
FROM Users
WHERE EXISTS(SELECT null
FROM PostRating)
SELECT *
FROM Users
WHERE EXISTS(SELECT null
FROM PostRating
UserID = 999)
SELECT *
FROM Users
WHERE EXISTS(SELECT null
FROM PostRating
WHERE UserID = 1)
SELECT *
FROM Users
WHERE EXISTS (SELECT null
FROM dual)
SELECT *
FROM Users
WHERE EXISTS (NULL)
Hopefully at this point, you have come to the same conclusion that I did after all the testing. If the query listed inside the EXISTS
returns anything [including NULL
], then the statement is true
. If no records come back, then the record is false
and doesn't return anything in the whole query.
Now you might be asking why anybody would use this syntax. The power of this statement comes when you join the sub query of the EXISTS
to the main query. You can quickly exclude records that don't meet the criteria. In the original example above, I wanted to list all users that have had one or more posts listed as spam 5 or more times.
SELECT *
FROM Users U
WHERE ID EXISTS (SELECT NULL
FROM PostRating PR
WHERE PR.UserID = U.ID
AND MarkedAsSpam >= 5)
By joining the Users
Table to the PostRating
table in the WHERE
statement of the sub query inside the Exists
, I'm essentially asking, "Does this user have their MarkedAsSpam
for any post listed 5 or more times?" The result is that only 1 record will return for the 'Bad Guy' user, but it will ignore 'Good Guy' as their count is only at 1.
This would be great for an admin report to help know which users might be removed from the system for abuse, and it protects the good users are not abusing the system.
One final thought. Upon my initial research for EXISTS
, I could only think that there are other ways to write the same query and get the result you're looking for. However, I inherited a large code base where the developers preferred the EXISTS
syntax. I don't have the time to re-write and test working code, so I have learned to work with it. Now that I understand it, I have added it to my skills and even find it quite useful!
Let me know if you have questions or comments.
Hogan Haake
CodeProject