I need to delete all rows that match at least one of the account_id values of another row *and* that has the same email address. However, if they have the same email address and none of the account_id values then I need to keep it. I've attached a sample dataset along with the expected results.
I have this:
DELETE [acctID_emailAddress_tmp] FROM [acctID_emailAddress_tmp]
JOIN
(select emailaddress, account_id, max(contact_id_tmp) max_cid
from [acctID_emailAddress_tmp]
group by emailaddress, account_id) AS tempImportTable
ON tempImportTable.[emailaddress] = [acctID_emailAddress_tmp].[emailaddress]
WHERE [acctID_emailAddress_tmp].[contact_id_tmp] < tempImportTable.[max_cid]
AND tempImportTable.[account_id] = [acctID_emailAddress_tmp].[account_id];
but it doesn't work since it's keeping the subset of the dupe row(s).
Can someone shed some light?
TIASo why do you keep 2 instead of 1?
EDIT: You won't be able to answer that one..unless you say that because it comes "first"...is that it?|||Because it comes first :)|||OK
USE Northwind
GO
SET NOCOUNT ON
CREATE TABLE myTable99(temp_id int, contact_id_tmp int, account_id int, emailAddress varchar(255))
GO
INSERT INTO myTable99(temp_id, contact_id_tmp, account_id, emailAddress)
SELECT 1, 1, 3, 'test@.acme.com' UNION ALL
SELECT 2, 1, 5, 'test@.acme.com' UNION ALL
SELECT 3, 1, 10, 'test@.acme.com' UNION ALL
SELECT 4, 2, 12, 'test@.acme.com' UNION ALL
SELECT 5, 2, 3, 'test@.acme.com' UNION ALL
SELECT 6, 2, 9, 'test@.acme.com'
GO
SELECT *
FROM ( SELECT * FROM myTable99 l
WHERE contact_id_tmp IN (
SELECT MAX(contact_id_tmp)
FROM (
SELECT contact_id_tmp, emailAddress
FROM myTable99
GROUP BY contact_id_tmp, emailAddress
HAVING COUNT(*) > 1) AS XXX)
) AS L
JOIN ( SELECT * FROM myTable99 l
WHERE contact_id_tmp NOT IN (
SELECT MAX(contact_id_tmp)
FROM (
SELECT contact_id_tmp, emailAddress
FROM myTable99
GROUP BY contact_id_tmp, emailAddress
HAVING COUNT(*) > 1) AS XXX)
) AS R
ON L.account_id = R.account_Id
GO
SET NOCOUNT OFF
DROP TABLE myTable99
GO
You can convert that in to a delete...let me know if you can't|||Thanks very much. Looks like this will do the trick if I ever figure out the delete statement for it =\|||Did you cut and paste the example in to QA and run it?
Also, I would use that as SELECT to see the population BEFORE you unleash the delete...oh and make sure you back up the table BEFORE you do the DELETE|||Yes, I ran it in QA and got the expected results. I need to delete R.contact_id_tmp
Currently trying to figure out the delete statement...|||Come on...that's too easy
DELETE FROM myTable99 WHERE contact_id_tmp IN (
SELECT R.contact_id_tmp
FROM ( SELECT * FROM myTable99 l
WHERE contact_id_tmp IN (
SELECT MAX(contact_id_tmp)
FROM (
SELECT contact_id_tmp, emailAddress
FROM myTable99
GROUP BY contact_id_tmp, emailAddress
HAVING COUNT(*) > 1) AS XXX)
) AS L
JOIN ( SELECT * FROM myTable99 l
WHERE contact_id_tmp NOT IN (
SELECT MAX(contact_id_tmp)
FROM (
SELECT contact_id_tmp, emailAddress
FROM myTable99
GROUP BY contact_id_tmp, emailAddress
HAVING COUNT(*) > 1) AS XXX)
) AS R
ON L.account_id = R.account_Id
)
GO
SELECT * FROM myTable99
GO|||OK. This just got uglier... Turns out accounts will be stored as a comma separated list in one field. See the attachment for an example.
Now, is it possible to de-dupe from a comma separated list all in one field? I know I could create a new table and loop over each account-emailAddress combo, but I'd like to know if it's possible to avoid doing this (since it will be much slower) and de-dupe straight from the source table.
Thanks again|||Read this
http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=25830&SearchTerms=udf,csv,string
Then create a new table where all the values are on 1 row, then use the query in this thread|||I tried the UDF but I'm getting 0 rows back. I've attached the sample dataset (1 row) and the UDF call.|||You really should normalized your data...I feel I'm doing more harm than good...
EDIT: This is pure theft from this Article (http://www.sqlteam.com/item.asp?ItemID=2652) written by A most Valuable Yak (http://www.sqlteam.com/forums/pop_profile.asp?mode=display&id=1176) , Rob Volk (http://weblogs.sqlteam.com/robv/)
USE Northwind
GO
SET NOCOUNT ON
SET NOCOUNT ON
CREATE TABLE myTable99(temp_id int IDENTITY(1,1), contact_id_tmp int, account_id int, emailAddress varchar(255))
CREATE TABLE myTable00(contact_id int, accounts varchar(200), emailAddress varchar(255))
CREATE TABLE Tally(ID int)
GO
INSERT INTO myTable00(contact_id, accounts, emailAddress)
SELECT 1, '1,2,3,10,15', 'test@.acme.com' UNION ALL
SELECT 2, '5,10,20,30', 'test@.acme.com' UNION ALL
SELECT 3, '88,42', 'test@.acme.com' UNION ALL
SELECT 4, '1,2,3', 'test2@.acme.com'
GO
DECLARE @.x int
SELECT @.x = 1
WHILE @.x < 100
BEGIN
INSERT INTO Tally(ID) SELECT @.x
SELECT @.x = @.x + 1
END
GO
INSERT INTO myTable99(contact_id_tmp, Account_Id, emailAddress)
SELECT contact_id AS contact_id_tmp,
NullIf(SubString(',' + accounts + ',' , ID , CharIndex(',' , ',' + accounts + ',' , ID) - ID) , '') AS Account_Id,
emailAddress
FROM Tally, myTable00
WHERE ID <= Len(',' + accounts + ',') AND SubString(',' + accounts + ',' , ID - 1, 1) = ','
AND CharIndex(',' , ',' + accounts + ',' , ID) - ID > 0 --remove this line to keep NULL rows
SELECT * FROM myTable99
GO
-- Now just add the other SQL
SET NOCOUNT OFF
DROP TABLE myTable99
DROP TABLE myTable00
DROP TABLE Tally
GO|||This is exactly what I needed, but...users have decided to go back and separate each account in individual rows!!
Thanks a lot for your help. I guess I'll check out sqlteam.com from now on before posting here. :)|||Even if they do...the data is still not normalized if you need an operation like this...anyway, it was a neat exercise...|||This is just 1/3 of the entire process. The data is normalized in its final state.
Showing posts with label match. Show all posts
Showing posts with label match. Show all posts
Thursday, March 22, 2012
Wednesday, March 7, 2012
Delete where 2 fields do not match
I need to be able to delete records from 1 table that do not match 2 fields
on a 2nd table. I was trying to get the SELECT to work first to find the
records that are not full matches. Below is my code and would appreciate
any help. I thought of maybe a NOT IN() or NOT EXISTS() kind of code, but
was not sure. Thanks.
SELECT dbo.ClientWorkerStatus.ClientID, dbo.ClientWorkerStatus.WorkerID
FROM dbo.ClientWorkerStatus, ClientWorkerLink
WHERE (dbo.ClientWorkerStatus.ClientID = dbo.ClientWorkerLink.ClientID AND
dbo.ClientWorkerStatus.WorkerID <> dbo.ClientWorkerLink.WorkerID)
The problem with this is it gives me every record that does not match on
WorkerID. What I want is only those records that are on the
ClientWorkerStatus table that do not have exact match between ClientID and
WorkerID.David,
Can you post the DDL, sample data (INSERTs) and the desired resultset?
HTH
Jerry
"David" <dlchase@.lifetimeinc.com> wrote in message
news:OYqCKc2yFHA.3180@.TK2MSFTNGP14.phx.gbl...
>I need to be able to delete records from 1 table that do not match 2 fields
>on a 2nd table. I was trying to get the SELECT to work first to find the
>records that are not full matches. Below is my code and would appreciate
>any help. I thought of maybe a NOT IN() or NOT EXISTS() kind of code, but
>was not sure. Thanks.
> SELECT dbo.ClientWorkerStatus.ClientID, dbo.ClientWorkerStatus.WorkerID
> FROM dbo.ClientWorkerStatus, ClientWorkerLink
> WHERE (dbo.ClientWorkerStatus.ClientID = dbo.ClientWorkerLink.ClientID AND
> dbo.ClientWorkerStatus.WorkerID <> dbo.ClientWorkerLink.WorkerID)
> The problem with this is it gives me every record that does not match on
> WorkerID. What I want is only those records that are on the
> ClientWorkerStatus table that do not have exact match between ClientID and
> WorkerID.
>|||Try this,
SELECT ClientID, WorkerID
FROM ClientWorkerStatus C
WHERE (ClientID IN (SELECT ClientID
FROM ClientWorkerLink
WHERE (ClientID = C.ClientID) AND (WorkerID <>
C.WorkerID)))
"David" wrote:
> I need to be able to delete records from 1 table that do not match 2 field
s
> on a 2nd table. I was trying to get the SELECT to work first to find the
> records that are not full matches. Below is my code and would appreciate
> any help. I thought of maybe a NOT IN() or NOT EXISTS() kind of code, but
> was not sure. Thanks.
> SELECT dbo.ClientWorkerStatus.ClientID, dbo.ClientWorkerStatus.WorkerID
> FROM dbo.ClientWorkerStatus, ClientWorkerLink
> WHERE (dbo.ClientWorkerStatus.ClientID = dbo.ClientWorkerLink.ClientID AND
> dbo.ClientWorkerStatus.WorkerID <> dbo.ClientWorkerLink.WorkerID)
> The problem with this is it gives me every record that does not match on
> WorkerID. What I want is only those records that are on the
> ClientWorkerStatus table that do not have exact match between ClientID and
> WorkerID.
>
>|||Try,
delete dbo.ClientWorkerStatus
where not exists (
select *
from dbo.ClientWorkerLink as a
where a.ClientID = dbo.ClientWorkerStatus.ClientID
and a.WorkerID = dbo.ClientWorkerStatus.WorkerID
)
AMB
"David" wrote:
> I need to be able to delete records from 1 table that do not match 2 field
s
> on a 2nd table. I was trying to get the SELECT to work first to find the
> records that are not full matches. Below is my code and would appreciate
> any help. I thought of maybe a NOT IN() or NOT EXISTS() kind of code, but
> was not sure. Thanks.
> SELECT dbo.ClientWorkerStatus.ClientID, dbo.ClientWorkerStatus.WorkerID
> FROM dbo.ClientWorkerStatus, ClientWorkerLink
> WHERE (dbo.ClientWorkerStatus.ClientID = dbo.ClientWorkerLink.ClientID AND
> dbo.ClientWorkerStatus.WorkerID <> dbo.ClientWorkerLink.WorkerID)
> The problem with this is it gives me every record that does not match on
> WorkerID. What I want is only those records that are on the
> ClientWorkerStatus table that do not have exact match between ClientID and
> WorkerID.
>
>|||Got it! Thanks.
*** Sent via Developersdex http://www.examnotes.net ***
on a 2nd table. I was trying to get the SELECT to work first to find the
records that are not full matches. Below is my code and would appreciate
any help. I thought of maybe a NOT IN() or NOT EXISTS() kind of code, but
was not sure. Thanks.
SELECT dbo.ClientWorkerStatus.ClientID, dbo.ClientWorkerStatus.WorkerID
FROM dbo.ClientWorkerStatus, ClientWorkerLink
WHERE (dbo.ClientWorkerStatus.ClientID = dbo.ClientWorkerLink.ClientID AND
dbo.ClientWorkerStatus.WorkerID <> dbo.ClientWorkerLink.WorkerID)
The problem with this is it gives me every record that does not match on
WorkerID. What I want is only those records that are on the
ClientWorkerStatus table that do not have exact match between ClientID and
WorkerID.David,
Can you post the DDL, sample data (INSERTs) and the desired resultset?
HTH
Jerry
"David" <dlchase@.lifetimeinc.com> wrote in message
news:OYqCKc2yFHA.3180@.TK2MSFTNGP14.phx.gbl...
>I need to be able to delete records from 1 table that do not match 2 fields
>on a 2nd table. I was trying to get the SELECT to work first to find the
>records that are not full matches. Below is my code and would appreciate
>any help. I thought of maybe a NOT IN() or NOT EXISTS() kind of code, but
>was not sure. Thanks.
> SELECT dbo.ClientWorkerStatus.ClientID, dbo.ClientWorkerStatus.WorkerID
> FROM dbo.ClientWorkerStatus, ClientWorkerLink
> WHERE (dbo.ClientWorkerStatus.ClientID = dbo.ClientWorkerLink.ClientID AND
> dbo.ClientWorkerStatus.WorkerID <> dbo.ClientWorkerLink.WorkerID)
> The problem with this is it gives me every record that does not match on
> WorkerID. What I want is only those records that are on the
> ClientWorkerStatus table that do not have exact match between ClientID and
> WorkerID.
>|||Try this,
SELECT ClientID, WorkerID
FROM ClientWorkerStatus C
WHERE (ClientID IN (SELECT ClientID
FROM ClientWorkerLink
WHERE (ClientID = C.ClientID) AND (WorkerID <>
C.WorkerID)))
"David" wrote:
> I need to be able to delete records from 1 table that do not match 2 field
s
> on a 2nd table. I was trying to get the SELECT to work first to find the
> records that are not full matches. Below is my code and would appreciate
> any help. I thought of maybe a NOT IN() or NOT EXISTS() kind of code, but
> was not sure. Thanks.
> SELECT dbo.ClientWorkerStatus.ClientID, dbo.ClientWorkerStatus.WorkerID
> FROM dbo.ClientWorkerStatus, ClientWorkerLink
> WHERE (dbo.ClientWorkerStatus.ClientID = dbo.ClientWorkerLink.ClientID AND
> dbo.ClientWorkerStatus.WorkerID <> dbo.ClientWorkerLink.WorkerID)
> The problem with this is it gives me every record that does not match on
> WorkerID. What I want is only those records that are on the
> ClientWorkerStatus table that do not have exact match between ClientID and
> WorkerID.
>
>|||Try,
delete dbo.ClientWorkerStatus
where not exists (
select *
from dbo.ClientWorkerLink as a
where a.ClientID = dbo.ClientWorkerStatus.ClientID
and a.WorkerID = dbo.ClientWorkerStatus.WorkerID
)
AMB
"David" wrote:
> I need to be able to delete records from 1 table that do not match 2 field
s
> on a 2nd table. I was trying to get the SELECT to work first to find the
> records that are not full matches. Below is my code and would appreciate
> any help. I thought of maybe a NOT IN() or NOT EXISTS() kind of code, but
> was not sure. Thanks.
> SELECT dbo.ClientWorkerStatus.ClientID, dbo.ClientWorkerStatus.WorkerID
> FROM dbo.ClientWorkerStatus, ClientWorkerLink
> WHERE (dbo.ClientWorkerStatus.ClientID = dbo.ClientWorkerLink.ClientID AND
> dbo.ClientWorkerStatus.WorkerID <> dbo.ClientWorkerLink.WorkerID)
> The problem with this is it gives me every record that does not match on
> WorkerID. What I want is only those records that are on the
> ClientWorkerStatus table that do not have exact match between ClientID and
> WorkerID.
>
>|||Got it! Thanks.
*** Sent via Developersdex http://www.examnotes.net ***
Subscribe to:
Posts (Atom)