Showing posts with label address. Show all posts
Showing posts with label address. Show all posts

Thursday, March 22, 2012

Deleting dupes in special cases

I need to delete all rows that match at least one of the account_id values of another row *and* that has the same email address. However, if they have the same email address and none of the account_id values then I need to keep it. I've attached a sample dataset along with the expected results.

I have this:
DELETE [acctID_emailAddress_tmp] FROM [acctID_emailAddress_tmp]
JOIN
(select emailaddress, account_id, max(contact_id_tmp) max_cid
from [acctID_emailAddress_tmp]
group by emailaddress, account_id) AS tempImportTable
ON tempImportTable.[emailaddress] = [acctID_emailAddress_tmp].[emailaddress]
WHERE [acctID_emailAddress_tmp].[contact_id_tmp] < tempImportTable.[max_cid]
AND tempImportTable.[account_id] = [acctID_emailAddress_tmp].[account_id];

but it doesn't work since it's keeping the subset of the dupe row(s).

Can someone shed some light?

TIASo why do you keep 2 instead of 1?

EDIT: You won't be able to answer that one..unless you say that because it comes "first"...is that it?|||Because it comes first :)|||OK

USE Northwind
GO

SET NOCOUNT ON
CREATE TABLE myTable99(temp_id int, contact_id_tmp int, account_id int, emailAddress varchar(255))
GO

INSERT INTO myTable99(temp_id, contact_id_tmp, account_id, emailAddress)
SELECT 1, 1, 3, 'test@.acme.com' UNION ALL
SELECT 2, 1, 5, 'test@.acme.com' UNION ALL
SELECT 3, 1, 10, 'test@.acme.com' UNION ALL
SELECT 4, 2, 12, 'test@.acme.com' UNION ALL
SELECT 5, 2, 3, 'test@.acme.com' UNION ALL
SELECT 6, 2, 9, 'test@.acme.com'
GO

SELECT *
FROM ( SELECT * FROM myTable99 l
WHERE contact_id_tmp IN (
SELECT MAX(contact_id_tmp)
FROM (
SELECT contact_id_tmp, emailAddress
FROM myTable99
GROUP BY contact_id_tmp, emailAddress
HAVING COUNT(*) > 1) AS XXX)
) AS L
JOIN ( SELECT * FROM myTable99 l
WHERE contact_id_tmp NOT IN (
SELECT MAX(contact_id_tmp)
FROM (
SELECT contact_id_tmp, emailAddress
FROM myTable99
GROUP BY contact_id_tmp, emailAddress
HAVING COUNT(*) > 1) AS XXX)
) AS R
ON L.account_id = R.account_Id
GO

SET NOCOUNT OFF
DROP TABLE myTable99
GO

You can convert that in to a delete...let me know if you can't|||Thanks very much. Looks like this will do the trick if I ever figure out the delete statement for it =\|||Did you cut and paste the example in to QA and run it?

Also, I would use that as SELECT to see the population BEFORE you unleash the delete...oh and make sure you back up the table BEFORE you do the DELETE|||Yes, I ran it in QA and got the expected results. I need to delete R.contact_id_tmp

Currently trying to figure out the delete statement...|||Come on...that's too easy

DELETE FROM myTable99 WHERE contact_id_tmp IN (
SELECT R.contact_id_tmp
FROM ( SELECT * FROM myTable99 l
WHERE contact_id_tmp IN (
SELECT MAX(contact_id_tmp)
FROM (
SELECT contact_id_tmp, emailAddress
FROM myTable99
GROUP BY contact_id_tmp, emailAddress
HAVING COUNT(*) > 1) AS XXX)
) AS L
JOIN ( SELECT * FROM myTable99 l
WHERE contact_id_tmp NOT IN (
SELECT MAX(contact_id_tmp)
FROM (
SELECT contact_id_tmp, emailAddress
FROM myTable99
GROUP BY contact_id_tmp, emailAddress
HAVING COUNT(*) > 1) AS XXX)
) AS R
ON L.account_id = R.account_Id
)

GO

SELECT * FROM myTable99
GO|||OK. This just got uglier... Turns out accounts will be stored as a comma separated list in one field. See the attachment for an example.

Now, is it possible to de-dupe from a comma separated list all in one field? I know I could create a new table and loop over each account-emailAddress combo, but I'd like to know if it's possible to avoid doing this (since it will be much slower) and de-dupe straight from the source table.

Thanks again|||Read this

http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=25830&SearchTerms=udf,csv,string

Then create a new table where all the values are on 1 row, then use the query in this thread|||I tried the UDF but I'm getting 0 rows back. I've attached the sample dataset (1 row) and the UDF call.|||You really should normalized your data...I feel I'm doing more harm than good...

EDIT: This is pure theft from this Article (http://www.sqlteam.com/item.asp?ItemID=2652) written by A most Valuable Yak (http://www.sqlteam.com/forums/pop_profile.asp?mode=display&id=1176) , Rob Volk (http://weblogs.sqlteam.com/robv/)

USE Northwind
GO

SET NOCOUNT ON
SET NOCOUNT ON
CREATE TABLE myTable99(temp_id int IDENTITY(1,1), contact_id_tmp int, account_id int, emailAddress varchar(255))
CREATE TABLE myTable00(contact_id int, accounts varchar(200), emailAddress varchar(255))
CREATE TABLE Tally(ID int)
GO

INSERT INTO myTable00(contact_id, accounts, emailAddress)
SELECT 1, '1,2,3,10,15', 'test@.acme.com' UNION ALL
SELECT 2, '5,10,20,30', 'test@.acme.com' UNION ALL
SELECT 3, '88,42', 'test@.acme.com' UNION ALL
SELECT 4, '1,2,3', 'test2@.acme.com'
GO

DECLARE @.x int
SELECT @.x = 1
WHILE @.x < 100
BEGIN
INSERT INTO Tally(ID) SELECT @.x
SELECT @.x = @.x + 1
END
GO

INSERT INTO myTable99(contact_id_tmp, Account_Id, emailAddress)
SELECT contact_id AS contact_id_tmp,
NullIf(SubString(',' + accounts + ',' , ID , CharIndex(',' , ',' + accounts + ',' , ID) - ID) , '') AS Account_Id,
emailAddress
FROM Tally, myTable00
WHERE ID <= Len(',' + accounts + ',') AND SubString(',' + accounts + ',' , ID - 1, 1) = ','
AND CharIndex(',' , ',' + accounts + ',' , ID) - ID > 0 --remove this line to keep NULL rows

SELECT * FROM myTable99
GO

-- Now just add the other SQL

SET NOCOUNT OFF
DROP TABLE myTable99
DROP TABLE myTable00
DROP TABLE Tally
GO|||This is exactly what I needed, but...users have decided to go back and separate each account in individual rows!!

Thanks a lot for your help. I guess I'll check out sqlteam.com from now on before posting here. :)|||Even if they do...the data is still not normalized if you need an operation like this...anyway, it was a neat exercise...|||This is just 1/3 of the entire process. The data is normalized in its final state.

Monday, March 19, 2012

Deleting a record

I have three tables

1. membership table(aspnet_membership table)

2. User Contacts table

3. Address table

These three table have relation ship with one another through a UserId field.

How will you set up cascaded delete on these tables, Like if I delete a user in the membership table I want the related records in the other tables to be deleted as well. Cascaded delete is it something done through code, or is it definde when the tables are created.

Please advice.

bendJoe:

is it definde when the tables are created.

Yes, you should define it when you create the tables. You can also use theALTER TABLE command if you want to add to existing tables.