Showing posts with label coding. Show all posts
Showing posts with label coding. Show all posts

Thursday, March 22, 2012

Deleting Duplicated Rows

Hi,
I have a table named "std_attn", where, by some bad coding, lots of duplicated rows have been created. And the table don't have any PK. So Now tell me the way to remove the duplicaies..............
thnxDROP DATABASE database_name

:)

just kidding

do you want to remove the duplicates, or remove all but one duplicate?

and if all but one, how do you pick which one?

is there some column that is incrementing, like an IDENTITY or a dateadded column or something? if so, you could keep the last one of each group...|||Since there is no PK, insert the duplicate records into a temp table with an identity column. You can then delete the records from the source table and insert a SELECT DISTINCT from the temp table back into your source table. After this is done, put a constraint on the table and fix the application. :)|||derrick, how does this isolate which one of the duplicates to keep?

personally, i like the approach offered in this article:
Advanced SQL Techniques - Part 1: Analyzing Duplicate Records (http://www.15seconds.com/issue/011009.htm)|||? They're duplicates. Who cares which one you keep. This gets rid of all of them, then inserts one back in.|||Hi,
No Derrick, its not that I don't want any of the duplicate rows. I need one. I will tell you, what actually i was trying..........

1. I made a temporary table which was the copy of the original one
2. Then copied all the rows from original to the temporary one.
3. Then deleted all rows from the original one
4. Defind the PK in the original table
5. Added a column in temporary table with allow null property. The data type was boolean. It was made for the reason to update the it with the value "yes" if the rows was having any duplicate row and "no" in the row is unique.
6. Then insert into the original from temporary where the column value is equal to "no"

But unfortunetly I failed at the 5th step only. I could not get the correct SQL stetement...

So it'll be a great help, if any one can get me the correct SQL statement

thnx|||did you try the method in the article i gave you the link for?|||Ok guys, this isn't that hard. The article you referred to is in reference to tables that have an identity column, which means there's something in the row that makes it intrinsically unique even though it's just an ID field. This works great, except for the fact his table does not have an identity column. This means the idea in the article does not work.

That's why he needs to do what I told him to do, which does work. There's really not a "better" way to do it if there's nothing to distinguish between the rows in the table. You were doing fine until you tried to get fancy.

1. Insert the duplicate records into a temp table with an identity column.
2. Delete the duplicate records only from the source table.
3. Insert a SELECT DISTINCT from the temp table back into your source table.
>>This part is really important and you kind of missed it last time around.
4. Put a constraint on the table and fix the application.|||? They're duplicates. Who cares which one you keep. This gets rid of all of them, then inserts one back in.

Wanna bet...

U$1000.00 That they're not......betcha something is different|||Then he needs to ask the right question. I'm answering the one he asked, so I really don't care if something is different. :) How ya' doing Brett. I just keep seeing you on these forums. lol|||But just like the day to day job you're in...is the question asked EVER what the real question is, and what the real answer is suppose to be?

Ever?

OK, sometimes...but rarley....

And yeah...an addiction is an addiction...sorry...I just can't help myself...|||If he did ask the question appropriately though, the answer is correct. If not, then the article suggested should work. We both know the real issue is the fact he got the rows in the first place. The reality is that the application needs fixed and constraints need to be created so it doesn't happen in the first place. :)|||Hi,
where, by some bad coding, lots of duplicated rows have been created. And the table don't have any PK.

Yeah Derrick...I'm not disagreeing with you ...the answers solve the the question asked...

I just don't believe the question...

Some bad coding...puuuuuuleeeeeeeeeeze...

Some bad db design...

A good db design FORCES the developers to do the right thing...

@.@.ERROR <> 0

and to be TOTALLY politically incorrect...

EDIT: ...nope...uncalled for...sorry

Damn...it's Monday again...

Sorry for the foul mood...|||You were doing fine until you tried to get fancy. me? fancy? that'll be the day

1. Insert the duplicate records into a temp table with an identity column.why does the temp table need an identity column?|||Literacy is a wonderful thing. :) You don't. I'm smoking crack and up too late. You do need the rest though.

It's just that Monday thing. (We'll all use that as an excuse today.)|||(We'll all use that as an excuse today.)

i even use it on other days as well

;)|||There is nothing to laugh brett.........

Database can be wroungly designed....................After all its human act n instinct. Else there won't b any difference between u n god. And I assume that, ur nt GOD. Just a human........................who is bound to do the mistakes................n that is wht it hapnd with th table i was talking about.
But it will more more disgusting, if that wrong design is not corrected in time. And that is why we are here to discuss. Don't make fuss of the issue. Even you may had mistakes in your early days of career. And lemme tell you one thing...... the person who has designed the databse is from genetics background. So these are the expected errors. And now I have been assigned to remove the duplicated rows, and make the table perfect, in the sense that, no more dulicated rows should be created.|||Brett ... Should we call that a second one :p

http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=35538|||And now I have been assigned to remove the duplicated rows, and make the table perfect, in the sense that, no more dulicated rows should be created.

What's with getting all bent out of shape about?

My POINT...is that you do not have duplicates...Where's the table DDL?

What do you call a duplicate?

Where's the sample data?

Where's ANYTHING you can supply the good people of dbforums

And why are you so offended...can you point out what I said that caused so much angst?

So I'll know to be more careful next time?

And hey...didn't you notice my label?|||After all its human act n instinct. .

Well it was a human that started all of this business...more dba's should understand his concepts

http://en.wikipedia.org/wiki/Edgar_F._Codd

Just a human who is bound to do the mistakes

Sure enough...kinda like answering questions

the person who has designed the databse is from genetics background. So these are the expected errors.

Watch your genome...if he asks you to help in a research project...look out...

And now I have been assigned to remove the duplicated rows, and make the table perfect, in the sense that, no more dulicated rows should be created.

So now back to the real issue.

Please

Give us the DDL of the table

Like

CREATE TABLE myTable99(Col1 int, Col2 char(10), ect...

Some sample "duplicate" data

INSERT INTO myTable99(Col1, Col2, ect..
SELECT data,data,data UNION ALL
SELECT data,data,data UNION ALL
SELECT data,data,data UNION ALL
SELECT data,data,data UNION ALL
...
SELECT data,data,data

And what the expected results should be...

OK?|||this is like pushing a rope

never mind the ddl, brett

ask them merely to define what a duplicate is

remind them that autonumber/identity columns used as the pk merely masks the real issue

the real issue is, what's a duplicate?

let them do the work, not you :cool:|||OK...

Define a Duplicate

(shameful, isn't it?)

:rolleyes:|||Haven't we already defined what a duplicate is. lol I'm sooooo confused now. All these thoughts duplicating in my heads. I mean it's so confusing. All those thoughts duplicating in my headers...my heads...my my my

I'm confused.

Really.

Friday, February 17, 2012

Delete Query ?

>> Does the OP want portability? Why spend extra effort in coding for someth
ing that may never be required (ever), lets face it an extremely small propo
rtion of dev's need SQL portability. <<
I totally disagree with that last statement, based on 35 years in the
trade. Proprietary code bites you in ass from product to product and
even from release to release of the same product. To think that
writing Standard SQL is an "extra effort" is simply wrong. Even for
good programmers, trying to remember a dozen different syntaxes is more
effort than to use the real language.
Run the code; read the standards.
I also used a fixed length CHAR(n) column because the fixed length is a
good check on the data element that can put into a regular expression
and make maintaining the table easier. VARCHAR(n) can save storage,
but at the expense of access time and indexing.> I totally disagree with that last statement, based on 35 years in the
> trade. Proprietary code bites you in ass from product to product and
> even from release to release of the same product. To think that
> writing Standard SQL is an "extra effort" is simply wrong. Even for
> good programmers, trying to remember a dozen different syntaxes is more
> effort than to use the real language.
35 years, 20 years makes no difference, its what you have done in that time,
for the past 20 years i've been cutting code using a number of different
databases and languages so i can speak from developer experience - can you
(really) or is it just SQL?
Propritary code does not bite you in the ass, standard SQL is a myth, it
behaves different between products anyway - locking behaviour, query
behaviour etc...
Tools between products are different as well, the way products work etc...
Writing standard code involves significantly more effort, you need to test
for each platform you tell your customers you support - is that effort
really worth it? There are only a few companies that think so.
Good programmers know to take advantage of the tools that are in front of
them to get the most efficient business solution for the business problem
they are coding for, they do not indulge themselves in IT solutions for the
sake of a long lost and forgotten portability standard.
Products are diverging release by release.

> I also used a fixed length CHAR(n) column because the fixed length is a
> good check on the data element that can put into a regular expression
> and make maintaining the table easier. VARCHAR(n) can save storage,
> but at the expense of access time and indexing.
That is just complete rubbish - check the product specs as you put it, using
varchar is not at the expense of access time nor indexing, in fact its
better because the data is on less pages so you need less IO's, less locking
etc...
Tony Rogerson
SQL Server MVP
http://sqlserverfaq.com - free video tutorials
"--CELKO--" <jcelko212@.earthlink.net> wrote in message
news:1132863499.248876.91150@.g44g2000cwa.googlegroups.com...
> I totally disagree with that last statement, based on 35 years in the
> trade. Proprietary code bites you in ass from product to product and
> even from release to release of the same product. To think that
> writing Standard SQL is an "extra effort" is simply wrong. Even for
> good programmers, trying to remember a dozen different syntaxes is more
> effort than to use the real language.
>
> Run the code; read the standards.
> I also used a fixed length CHAR(n) column because the fixed length is a
> good check on the data element that can put into a regular expression
> and make maintaining the table easier. VARCHAR(n) can save storage,
> but at the expense of access time and indexing.
>