Showing posts with label toimplement. Show all posts
Showing posts with label toimplement. Show all posts
Thursday, March 22, 2012
Deleting Duplicate Data
Hi to All!
I have a table with 60 columns and more than one million rows. i have to
implement composite primary key but it gives me error of duplicate data.
i have used following query to detect the duplicate rows
select NID,output_No from tbl_Data
group by NID,output_No
having count(*) > 1
it gives me 2526 duplicate dows. Now i wants to delete the duplicate
rows what will be the query for deleting the duplicate records.
Thanx
*** Sent via Developersdex http://www.examnotes.net ***This script has written by Itzik Ben-Gan
CREATE TABLE #Demo (
idNo int identity(1,1),
colA int,
colB int
)
INSERT INTO #Demo(colA,colB) VALUES (1,6)
INSERT INTO #Demo(colA,colB) VALUES (1,6)
INSERT INTO #Demo(colA,colB) VALUES (2,4)
INSERT INTO #Demo(colA,colB) VALUES (3,3)
INSERT INTO #Demo(colA,colB) VALUES (4,2)
INSERT INTO #Demo(colA,colB) VALUES (3,3)
INSERT INTO #Demo(colA,colB) VALUES (5,1)
INSERT INTO #Demo(colA,colB) VALUES (8,1)
PRINT 'Table'
SELECT * FROM #Demo
PRINT 'Duplicates in Table'
SELECT * FROM #Demo
WHERE idNo IN
(SELECT B.idNo
FROM #Demo A JOIN #Demo B
ON A.idNo <> B.idNo
AND A.colA = B.colA
AND A.colB = B.colB)
PRINT 'Duplicates to Delete'
SELECT * FROM #Demo
WHERE idNo IN
(SELECT B.idNo
FROM #Demo A JOIN #Demo B
ON A.idNo < B.idNo -- < this time, not <>
AND A.colA = B.colA
AND A.colB = B.colB)
DELETE FROM #Demo
WHERE idNo IN
(SELECT B.idNo
FROM #Demo A JOIN #Demo B
ON A.idNo < B.idNo -- < this time, not <>
AND A.colA = B.colA
AND A.colB = B.colB)
PRINT 'Cleaned-up Table'
SELECT * FROM #Demo
DROP TABLE #Demo
"Ghulam Farid" <gfaryd@.yahoo.com> wrote in message
news:umVO5durFHA.3444@.TK2MSFTNGP12.phx.gbl...
>
> Hi to All!
> I have a table with 60 columns and more than one million rows. i have to
> implement composite primary key but it gives me error of duplicate data.
> i have used following query to detect the duplicate rows
> select NID,output_No from tbl_Data
> group by NID,output_No
> having count(*) > 1
> it gives me 2526 duplicate dows. Now i wants to delete the duplicate
> rows what will be the query for deleting the duplicate records.
>
> Thanx
>
> *** Sent via Developersdex http://www.examnotes.net ***|||Try this:
(1) SELECT columnList INTO workTable FROM tableName GROUP BY keyColumns
HAVING COUNT(*) > 1
(2) DELETE tableName FROM workTable WHERE tableName.keyColumns =
workTable.keyColumns
(3) INSERT tableName (columnList) SELECT columnList FROM workTable
(4) DROP workTable
You might want to wrap this in a transaction, but if you don't then a temp
table for the work table is contraindicated because if power goes out
between steps 2 and 3, you will lose the duplicate rows altogether.
If the table were tiny, you could use something like:
SET ROWCOUNT 1
AGAIN:
DELETE tableName FROM (SELECT keyColumns FROM tableName GROUP BY keyColumns
HAVING COUNT(*) > 1) a WHERE tableName.keyColumns = a.keyColumns
IF @.@.ROWCOUNT > 0 GOTO AGAIN
SET ROWCOUNT 0
"Ghulam Farid" <gfaryd@.yahoo.com> wrote in message
news:umVO5durFHA.3444@.TK2MSFTNGP12.phx.gbl...
>
> Hi to All!
> I have a table with 60 columns and more than one million rows. i have to
> implement composite primary key but it gives me error of duplicate data.
> i have used following query to detect the duplicate rows
> select NID,output_No from tbl_Data
> group by NID,output_No
> having count(*) > 1
> it gives me 2526 duplicate dows. Now i wants to delete the duplicate
> rows what will be the query for deleting the duplicate records.
>
> Thanx
>
> *** Sent via Developersdex http://www.examnotes.net ***
Subscribe to:
Posts (Atom)