An easy way to remove duplicate rows from a table in SQL Server 2005 is to use undocumented feature called
%%lockres%%
. This pseudo column shows the physical location of a row.
Note that
this feature is undocumented and unsupported so use at your own risk!
A simple test-case. Create a test table:
CREATE TABLE TestTable (
Column1 varchar(1),
Column2 int
)
Add some rows with few duplicates:
INSERT INTO TestTable VALUES ('A', 1);
INSERT INTO TestTable VALUES ('A', 1);
INSERT INTO TestTable VALUES ('A', 2);
INSERT INTO TestTable VALUES ('B', 1);
INSERT INTO TestTable VALUES ('B', 2);
INSERT INTO TestTable VALUES ('B', 2);
INSERT INTO TestTable VALUES ('C', 2)
You can select the data to see that all seven rows are present:
SELECT *
FROM TestTable a
ORDER BY a.Column1, a.Column2
Now let's delete the two duplicates using the
%%lockres%%
:
DELETE
FROM TestTable
WHERE TestTable.%%lockres%%
NOT IN (SELECT MIN(b.%%lockres%%)
FROM TestTable b
GROUP BY b.column1, b.Column2)
And if you run the query again, you'll see that only five rows remain and duplicates have been deleted.
SELECT *
FROM TestTable a
ORDER BY a.Column1, a.Column2
For more information about
%%physloc%%
, corresponding SQL Server 2008 pseudo column, see:
Physical location of a row in SQL Server[
^].
For SQL Server 2008 equivalent, see:
How to remove duplicate rows in SQL Server 2008 when no key is present[
^].