Showing posts with label millions. Show all posts

Sunday, March 25, 2012

Deleting a large volume of data

Hi,
Currently we are integrating with a POS system. We would be importing
millions of records into our system from a SQL Server 2005 database. After
the import, we should clear (i.e. delete all records in) the SQL Server
database.
Ideally I want to:
- write stored procs for deleting the data, and
- keep system and transaction log resource usage as low as possible.
My query is - as a general rule, what would be the best way to delete data
of this size?
Thanks,
Harish Mohanbabu
--
Microsoft Dynamics Ax [MVP]
http://www.harishm.com/

Two ways:
1) If the table is not referenced by other tables via foreign keys, use
TRUNCATE TABLE.
2) Delete in chunks - say, 10,000 rows at a time - using DELETE TOP (10000).
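Both options can be sketched in T-SQL for SQL Server 2005; the table name dbo.abc is illustrative:

```sql
-- Option 1: no foreign-key references, so TRUNCATE is allowed.
-- It deallocates pages rather than logging row deletes, so it is
-- fast and barely touches the transaction log.
TRUNCATE TABLE dbo.abc;

-- Option 2: delete in chunks so each batch is its own small
-- transaction, keeping log growth under control.
WHILE 1 = 1
BEGIN
    DELETE TOP (10000) FROM dbo.abc;
    IF @@ROWCOUNT < 10000 BREAK;
END
```

With option 2, running regular transaction log backups between batches (or using the SIMPLE recovery model) lets the log space be reused as the loop progresses.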
--
Tom
----
Thomas A. Moreau, BSc, PhD, MCSE, MCDBA, MCITP, MCTS
SQL Server MVP
Toronto, ON Canada
https://mvp.support.microsoft.com/profile/Tom.Moreau
Hi Tom,
Thanks for your suggestion. The good news is there is no FK reference, so I
guess TRUNCATE would be the obvious choice.
Thanks once again,
Harish Mohanbabu
--
Microsoft Dynamics Ax [MVP]
http://www.harishm.com/
And TRUNCATE won't increase your transaction log size as well...

Sunday, February 19, 2012

Delete Records Fast

I run a query;

DELETE FROM abc WHERE date >= @start AND date < @end

This query deletes millions of records but takes a very long time. Is there a faster way to delete these records than the query above? Further, for your information, the table 'abc' is used as an article in merge replication, so the deletion is also propagated to the replicated tables.

Thanks,

A deletion of "millions of records" causes a lot of transaction log activity. This could be done in smaller batches, which might relieve some of the transaction log pressure - but I'm not sure how that would work out with replication.

The only way I know:

Code Snippet

WHILE 1 = 1
BEGIN
    DELETE a
    FROM dbo.abc a
    INNER JOIN (
        SELECT TOP 10000 Id
        FROM dbo.abc
        WHERE date >= @start AND date < @end
    ) b ON a.Id = b.Id

    IF @@ROWCOUNT < 10000 BREAK
END


It's not much, but there's not much you can do in this case, since you cannot use TRUNCATE TABLE when replication is attached to your DB.

Please take a look at the link below for a technique using TOP clause in SQL Server 2005:

http://blogs.msdn.com/sqltips/archive/2005/05/31/423627.aspx

Note that if you enclose the batched DELETE in a user transaction, you get the same behavior as a single DELETE statement - in which case performance will be as bad as what you have now. Otherwise, you can delete N rows at a time. Alternatively, if the number of rows being deleted is close to the size of the table, simply do a SELECT...INTO to create a new table and swap the old and new tables. This approach gets complicated if there are other transactions on the table, or if there are many references to the table, etc.
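A rough sketch of the SELECT...INTO swap approach (all object names are illustrative; this assumes no foreign keys, no concurrent writers, and - as noted above - it would not play well with a table published for merge replication):

```sql
-- Copy only the rows that should SURVIVE into a new table.
-- SELECT...INTO is minimally logged under the SIMPLE or
-- BULK_LOGGED recovery models.
SELECT *
INTO dbo.abc_new
FROM dbo.abc
WHERE date < @start OR date >= @end;

-- Recreate indexes/constraints on dbo.abc_new here, then swap.
BEGIN TRANSACTION;
EXEC sp_rename 'dbo.abc', 'abc_old';
EXEC sp_rename 'dbo.abc_new', 'abc';
COMMIT;

-- Dropping the old table deallocates its pages without logging
-- individual row deletes.
DROP TABLE dbo.abc_old;
```

The big win is that you pay for the rows you keep, not the rows you delete, which is why it only makes sense when most of the table is going away.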


Thanks for the code. I believe this will run the delete in batches of 10,000, but will it affect the existing merge replication? What will be the effect on the subscriber once this deletion occurs?

Thanks,

Well, the merge replication will repeat the same action on the subscriber, if I remember right - that is, consecutive deletes of groups of records, 10,000 at a time. As long as your logic expects the deletion to happen on both the publisher and the subscriber, nothing is wrong here.

Just don't try to wrap the code above in a single transaction, or the batching becomes pointless :-). Stick with the default autocommit mode.