I have many data in a table in which some rows are duplicated. How can I, for all duplicated rows, delete the extra rows and leave only one? You may assume checking one column is enough to tell if a row is duplicated.
Thanks
Step one: create a second table with identical structure: yourSecondTable;
Step two: CREATE UNIQUE INDEX removedups ON yourSecondTable (col1) WITH IGNORE_DUP_KEY
Step three:
INSERT yourSecondTable
SELECT * FROM yourFirstTable
--Duplicate key was ignored.
|||DECLARE @.FirstName varchar(20),
@.LastName varchar(20),
@.Age int --declare all fields in table
DECLARE c1 CURSOR FOR
--Find Dupes
SELECT FirstName, LastName, Age
FROM MyTable
GROUP BY FirstName, LastName, Age
HAVING COUNT(FirstName) > 1
OPEN c1
FETCH NEXT FROM c1
INTO @.FirstName, @.LastName, @.Age
WHILE @.@.FETCH_STATUS = 0
BEGIN
--Delete all dupes...the cursor remembers the current record
DELETE FROM MyTable
WHERE FirstName IN (SELECT FirstName FROM MyTable GROUP BY FirstName HAVING COUNT(FirstName) > 1)
--insert the current record back into the table
INSERT INTO MyTable(FirstName, LastName, Age) VALUES(@.FirstName, @.LastName, @.Age)
FETCH NEXT FROM c1
INTO @.FirstName, @.LastName, @.Age
END
CLOSE c1
DEALLOCATE c1
Adamus
|||Please look at my reply in the thread below for some solutions. You can do this with a single DELETE statement and batch it using SET ROWCOUNT or TOP clause if you have large number of rows to delete. And if you don't have any primary key or unique key on the table then you can use a cursor based approach.
http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=666011&SiteID=1&PageID=1
|||Question. Why would the absence of a primary key be a determinant in selecting a cursor based approach?Umachandar Jayachandran - MS wrote:
Please look at my reply in the thread below for some solutions. You can do this with a single DELETE statement and batch it using SET ROWCOUNT or TOP clause if you have large number of rows to delete. And if you don't have any primary key or unique key on the table then you can use a cursor based approach.
http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=666011&SiteID=1&PageID=1
Adamus
|||Because there is no easy way to determine which row to keep out of the duplicates which you can do easily in a single DELETE statement without writing procedural code.
No comments:
Post a Comment