Sunday, March 11, 2012

Deleting 100 million rows from a table weekly (SQL Server 2000)

Hi All,
We have a table in SQL Server 2000 that has about 250 million records,
and it grows by 100 million every week. At any time the table should
contain just 13 weeks of data: when the 14th week's data needs to be
loaded, the first week's data has to be deleted.
This means deleting 100 million rows every week, and since the delete
takes a lot of transaction log space, the job is not successful.
Can you please suggest approaches we can take to fix this problem?
Performance and the transaction log are the issues we are facing. We
tried deleting in steps too, but that also takes a long time. What are
the different ways we can address this quickly?
Please reply at the earliest.
Thanks,
Harish

You could try inserting the clean records, i.e. the current 13 weeks of
data, into a temporary table.
Then truncate the table holding all the data (bear in mind any identity
columns).
Reinsert the clean records into the table.
Again, bear in mind any identity columns if you need to keep the IDs in
sync, as TRUNCATE will reset the identity seed. Also, you can't truncate
a table that is referenced by a foreign key, so you'd need to work around
that too.
Not the best method, but it may be quicker than your current process.
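A minimal sketch of that approach, assuming a hypothetical table
dbo.BigTable with a LoadDate column:

-- Copy the 13 weeks to keep; under the SIMPLE or BULK_LOGGED recovery
-- model, SELECT INTO is minimally logged on SQL Server 2000.
SELECT *
INTO dbo.BigTable_Keep
FROM dbo.BigTable
WHERE LoadDate >= DATEADD(ww, -13, GETDATE())

-- TRUNCATE deallocates whole pages and logs very little, but it resets
-- any identity seed and fails if the table is referenced by a foreign key.
TRUNCATE TABLE dbo.BigTable

-- Reinsert the retained rows. With an identity column you would need
-- SET IDENTITY_INSERT dbo.BigTable ON and an explicit column list.
INSERT INTO dbo.BigTable
SELECT * FROM dbo.BigTable_Keep

DROP TABLE dbo.BigTable_Keep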
Immy
"harish" <harish.prabhala@.gmail.com> wrote in message
news:1131163435.343059.179520@.f14g2000cwb.googlegroups.com...
> DELETING 100 million from a table weekly SQl SERVER 2000
> Hi All
> We have a table in SQL SERVER 2000 which has about 250 million records
> and this will be growing by 100 million every week. At a time the table
> should contain just 13 weeks of data. when the 14th week data needs to
> be loaded the first week's data has to be deleted.
> And this deletes 100 million every week, since the delete is taking lot
> of transaction log space the job is not successful.
> Can you please help with what are the approaches we can take to fix
> this problem?
> Performance and transaction log are the issues we are facing. We tried
> deletion in steps too but that also is taking time. What are the
> different ways we can address this quickly.
> Please reply at the earliest.
> Thanks
> Harish

> Performance and the transaction log are the issues we are facing. We
> tried deleting in steps too, but that also takes a long time. What are
> the different ways we can address this quickly?
Deleting in smaller separate transaction batches will reduce transaction
log space requirements and improve performance. You'll need to experiment
to determine the best batch size for your environment. In my experience,
the optimum size is somewhere around the amount of SQL Server memory. For
example, with 2GB RAM and a 100-byte row size, delete no more than 20M
rows at a time. If you are using the FULL or BULK_LOGGED recovery model,
you'll also need to back up the log between batches to keep the log size
manageable. Alternatively, you can change to the SIMPLE model for the
duration of the delete script and change back to your normal recovery
model afterward. Don't forget to perform a full database backup following
the change from SIMPLE recovery.
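If you do switch recovery models around the delete, the wrapper might look
like this (the database name and backup path are hypothetical):

ALTER DATABASE MyDatabase SET RECOVERY SIMPLE
GO
-- run the batched delete script here
GO
ALTER DATABASE MyDatabase SET RECOVERY FULL
GO
-- restart the log backup chain with a full database backup
BACKUP DATABASE MyDatabase TO DISK = 'D:\Backup\MyDatabase_full.bak'
GO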
Use the TABLOCKX hint if possible. Ideally, the table's clustered index
should be on the column(s) used in your delete criteria.
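For illustration, here is a minimal sketch of such a batched delete on
SQL Server 2000, combining the batch loop, the TABLOCKX hint, and a log
backup between batches; the table, column, database, and backup path are
hypothetical, and the batch size needs tuning for your system:

SET ROWCOUNT 100000  -- tune the batch size for your environment

WHILE 1 = 1
BEGIN
    DELETE FROM dbo.BigTable WITH (TABLOCKX)
    WHERE LoadDate < '20050801'

    IF @@ROWCOUNT = 0 BREAK

    -- Under FULL or BULK_LOGGED recovery, back up the log between
    -- batches so the space can be reused and the log does not grow.
    BACKUP LOG MyDatabase TO DISK = 'D:\Backup\MyDatabase_log.bak'
END

SET ROWCOUNT 0  -- reset so later statements are unaffected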
Another option is to partition data based on your delete criteria (separate
table for each week). This will allow you to simply drop the table
containing the oldest data. The partitioning implementation can be made
transparent to applications by using a UNION ALL view. After dropping the
oldest table, you create a new table for the latest data and change the view
accordingly. If you can adhere to the rules for local partitioned views as
described in the Books Online, there are performance advantages with the
partitioned view and the view is updatable as well. See the example below.
CREATE TABLE OrderDetails_20051030
(
OrderID int NOT NULL,
ProductID int NOT NULL,
Quantity int NOT NULL,
UnitPrice decimal(9, 2) NOT NULL,
OrderDate smalldatetime NOT NULL
CONSTRAINT CK_OrderDetails_20051030_OrderDate
CHECK (OrderDate > '20051023' AND OrderDate <= '20051030')
)
ALTER TABLE OrderDetails_20051030
ADD CONSTRAINT PK_OrderDetails_20051030
PRIMARY KEY(OrderDate, OrderID, ProductID)
GO
CREATE TABLE OrderDetails_20051106
(
OrderID int NOT NULL,
ProductID int NOT NULL,
Quantity int NOT NULL,
UnitPrice decimal(9, 2) NOT NULL,
OrderDate smalldatetime NOT NULL
CONSTRAINT CK_OrderDetails_20051106_OrderDate
CHECK (OrderDate > '20051030' AND OrderDate <= '20051106')
)
ALTER TABLE OrderDetails_20051106
ADD CONSTRAINT PK_OrderDetails_20051106
PRIMARY KEY(OrderDate, OrderID, ProductID)
GO
CREATE TABLE OrderDetails_20051113
(
OrderID int NOT NULL,
ProductID int NOT NULL,
Quantity int NOT NULL,
UnitPrice decimal(9, 2) NOT NULL,
OrderDate smalldatetime NOT NULL
CONSTRAINT CK_OrderDetails_20051113_OrderDate
CHECK (OrderDate > '20051106' AND OrderDate <= '20051113')
)
ALTER TABLE OrderDetails_20051113
ADD CONSTRAINT PK_OrderDetails_20051113
PRIMARY KEY(OrderDate, OrderID, ProductID)
GO
CREATE VIEW OrderDetails AS
SELECT
OrderID,
ProductID,
Quantity,
UnitPrice,
OrderDate
FROM OrderDetails_20051030
UNION ALL
SELECT
OrderID,
ProductID,
Quantity,
UnitPrice,
OrderDate
FROM OrderDetails_20051106
UNION ALL
SELECT
OrderID,
ProductID,
Quantity,
UnitPrice,
OrderDate
FROM OrderDetails_20051113
GO
INSERT INTO OrderDetails VALUES(1, 1, 1, 1, '20051030')
INSERT INTO OrderDetails VALUES(2, 1, 1, 1, '20051106')
INSERT INTO OrderDetails VALUES(3, 1, 1, 1, '20051113')
GO
--execution plan shows only OrderDetails_20051106 is accessed because
--all PK columns are referenced in this query
SELECT
a.OrderID,
a.ProductID,
a.Quantity,
a.UnitPrice,
a.OrderDate
FROM OrderDetails a
WHERE a.OrderDate = '20051106' AND a.OrderID = 2 AND a.ProductID = 1
GO
--to remove oldest week:
DROP VIEW OrderDetails
GO
--drop oldest table
DROP TABLE OrderDetails_20051030
GO
--create table for new data
CREATE TABLE OrderDetails_20051120
(
OrderID int NOT NULL,
ProductID int NOT NULL,
Quantity int NOT NULL,
UnitPrice decimal(9, 2) NOT NULL,
OrderDate smalldatetime NOT NULL
CONSTRAINT CK_OrderDetails_20051120_OrderDate
CHECK (OrderDate > '20051113' AND OrderDate <= '20051120')
)
ALTER TABLE OrderDetails_20051120
ADD CONSTRAINT PK_OrderDetails_20051120
PRIMARY KEY(OrderDate, OrderID, ProductID)
GO
--create view with current tables
CREATE VIEW OrderDetails AS
SELECT
OrderID,
ProductID,
Quantity,
UnitPrice,
OrderDate
FROM OrderDetails_20051106
UNION ALL
SELECT
OrderID,
ProductID,
Quantity,
UnitPrice,
OrderDate
FROM OrderDetails_20051113
UNION ALL
SELECT
OrderID,
ProductID,
Quantity,
UnitPrice,
OrderDate
FROM OrderDetails_20051120
GO
Hope this helps.
Dan Guzman
SQL Server MVP
"harish" <harish.prabhala@.gmail.com> wrote in message
news:1131163435.343059.179520@.f14g2000cwb.googlegroups.com...
> DELETING 100 million from a table weekly SQl SERVER 2000
> Hi All
> We have a table in SQL SERVER 2000 which has about 250 million records
> and this will be growing by 100 million every week. At a time the table
> should contain just 13 weeks of data. when the 14th week data needs to
> be loaded the first week's data has to be deleted.
> And this deletes 100 million every week, since the delete is taking lot
> of transaction log space the job is not successful.
> Can you please help with what are the approaches we can take to fix
> this problem?
> Performance and transaction log are the issues we are facing. We tried
> deletion in steps too but that also is taking time. What are the
> different ways we can address this quickly.
> Please reply at the earliest.
> Thanks
> Harish

I forgot to mention that SQL Server 2005 Enterprise introduces new table and index
partitioning features that make it easier to partition large tables.
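As a rough sketch of the SQL Server 2005 syntax (the names and boundary
dates are hypothetical):

-- One partition per week; RANGE RIGHT puts each boundary date into the
-- partition on its right.
CREATE PARTITION FUNCTION pf_OrderWeek (smalldatetime)
AS RANGE RIGHT FOR VALUES ('20051030', '20051106', '20051113')
GO
-- Map all partitions to one filegroup here; production systems might
-- spread them across several filegroups.
CREATE PARTITION SCHEME ps_OrderWeek
AS PARTITION pf_OrderWeek ALL TO ([PRIMARY])
GO
-- The table is created directly on the partition scheme.
CREATE TABLE dbo.OrderDetails
(
OrderID int NOT NULL,
ProductID int NOT NULL,
Quantity int NOT NULL,
UnitPrice decimal(9, 2) NOT NULL,
OrderDate smalldatetime NOT NULL
) ON ps_OrderWeek (OrderDate)
GO
-- The oldest week can then be removed almost instantly by switching the
-- partition into a staging table (ALTER TABLE ... SWITCH PARTITION) and
-- dropping it, instead of deleting rows one by one.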
--
Hope this helps.
Dan Guzman
SQL Server MVP
"Dan Guzman" <guzmanda@.nospam-online.sbcglobal.net> wrote in message
news:%23$vpN7h4FHA.3000@.TK2MSFTNGP12.phx.gbl...
>> Performance and transaction log are the issues we are facing. We tried
>> deletion in steps too but that also is taking time. What are the
>> different ways we can address this quickly.
> Deleting in smaller separate transaction batches will reduce transaction
> log space requirements and improve performance You'll need to experiment
> to determine the best batch size for your environment. In my experience,
> the optimum size is somewhere around the amount of SQL Server memory. For
> example, with 2GB RAM and a 100 byte row size, delete no more than 20M
> rows at a time. If you are using the FULL are BULK_LOGGED recovery model,
> you'll also need to backup the log between batches to keep the log size
> manageable. Alternatively, you can use change to the SIMPLE model for the
> duration of the delete script and change back to your normal recovery
> model afterward. Don't forget to perform a full database backup following
> the change from SIMPLE recovery.
> Use the TABLOCKX hint if possible. Ideally, the table's clustered index
> should be the column(s) used for your delete criteria.
> Another option is to partition data based on your delete criteria
> (separate table for each week). This will allow you to simply drop the
> table containing the oldest data. The partitioning implementation can be
> made transparent to applications by using a UNION ALL view. After
> dropping the oldest table, you create a new table for the latest data and
> change the view accordingly. If you can adhere to the rules for local
> partitioned views as described in the Books Online, there are performance
> advantages with the partitioned view and the view is updatable as well.
> See the example below.
> CREATE TABLE OrderDetails_20051030
> (
> OrderID int NOT NULL,
> ProductID int NOT NULL,
> Quantity int NOT NULL,
> UnitPrice decimal(9, 2) NOT NULL,
> OrderDate smalldatetime NOT NULL
> CONSTRAINT CK_OrderDetails_20051030
> CHECK (OrderDate > '20051023' AND OrderDate <= '20051030')
> )
> ALTER TABLE OrderDetails_20051030
> ADD CONSTRAINT PK_OrderDetails_20051030
> PRIMARY KEY(OrderDate, OrderID, ProductID)
> GO
> CREATE TABLE OrderDetails_20051106
> (
> OrderID int NOT NULL,
> ProductID int NOT NULL,
> Quantity int NOT NULL,
> UnitPrice decimal(9, 2) NOT NULL,
> OrderDate smalldatetime NOT NULL
> CONSTRAINT CK_OrderDetails_20051106
> CHECK (OrderDate > '20051030' AND OrderDate <= '20051106')
> )
> ALTER TABLE OrderDetails_20051106
> ADD CONSTRAINT PK_OrderDetailsOrderDetails_20051106
> PRIMARY KEY(OrderDate, OrderID, ProductID)
> GO
> CREATE TABLE OrderDetails_20051113
> (
> OrderID int NOT NULL,
> ProductID int NOT NULL,
> Quantity int NOT NULL,
> UnitPrice decimal(9, 2) NOT NULL,
> OrderDate smalldatetime NOT NULL
> CONSTRAINT CK_OrderDetails_20051113_OrderDate
> CHECK (OrderDate > '20051106' AND OrderDate <= '20051113')
> )
> ALTER TABLE OrderDetails_20051113
> ADD CONSTRAINT PK_OrderDetails_20051113
> PRIMARY KEY(OrderDate, OrderID, ProductID)
> GO
> CREATE VIEW OrderDetails AS
> SELECT
> OrderID,
> ProductID,
> Quantity,
> UnitPrice,
> OrderDate
> FROM OrderDetails_20051030
> UNION ALL
> SELECT
> OrderID,
> ProductID,
> Quantity,
> UnitPrice,
> OrderDate
> FROM OrderDetails_20051106
> UNION ALL
> SELECT
> OrderID,
> ProductID,
> Quantity,
> UnitPrice,
> OrderDate
> FROM OrderDetails_20051113
> GO
> INSERT INTO OrderDetails VALUES(1, 1, 1, 1, '20051030')
> INSERT INTO OrderDetails VALUES(2, 1, 1, 1, '20051106')
> INSERT INTO OrderDetails VALUES(3, 1, 1, 1, '20051113')
> GO
> --execution plan shows only OrderDetails_20051106 is accessed because
> --all PK columns are referenced in this query
> SELECT
> a.OrderID,
> a.ProductID,
> a.Quantity,
> a.UnitPrice,
> a.OrderDate
> FROM OrderDetails a
> WHERE a.OrderDate = '20051106' AND OrderID = 2 AND ProductID = 1
> GO
> --to remove oldest week:
> DROP VIEW OrderDetails
> GO
> --drop oldest table
> DROP TABLE OrderDetails_20051030
> GO
> --create table for new data
> CREATE TABLE OrderDetails_20051120
> (
> OrderID int NOT NULL,
> ProductID int NOT NULL,
> Quantity int NOT NULL,
> UnitPrice decimal(9, 2) NOT NULL,
> OrderDate smalldatetime NOT NULL
> CONSTRAINT CK_OrderDetails_20051120_OrderDate
> CHECK (OrderDate > '20051113' AND OrderDate <= '20051120')
> )
> ALTER TABLE OrderDetails_20051120
> ADD CONSTRAINT PK_OrderDetails_20051120
> PRIMARY KEY(OrderDate, OrderID, ProductID)
> GO
> --create view with current tables
> CREATE VIEW OrderDetails AS
> SELECT
> OrderID,
> ProductID,
> Quantity,
> UnitPrice,
> OrderDate
> FROM OrderDetails_20051106
> UNION ALL
> SELECT
> OrderID,
> ProductID,
> Quantity,
> UnitPrice,
> OrderDate
> FROM OrderDetails_20051113
> UNION ALL
> SELECT
> OrderID,
> ProductID,
> Quantity,
> UnitPrice,
> OrderDate
> FROM OrderDetails_20051120
> GO
>
> --
> Hope this helps.
> Dan Guzman
> SQL Server MVP
> "harish" <harish.prabhala@.gmail.com> wrote in message
> news:1131163435.343059.179520@.f14g2000cwb.googlegroups.com...
>> DELETING 100 million from a table weekly SQl SERVER 2000
>> Hi All
>> We have a table in SQL SERVER 2000 which has about 250 million records
>> and this will be growing by 100 million every week. At a time the table
>> should contain just 13 weeks of data. when the 14th week data needs to
>> be loaded the first week's data has to be deleted.
>> And this deletes 100 million every week, since the delete is taking lot
>> of transaction log space the job is not successful.
>> Can you please help with what are the approaches we can take to fix
>> this problem?
>> Performance and transaction log are the issues we are facing. We tried
>> deletion in steps too but that also is taking time. What are the
>> different ways we can address this quickly.
>> Please reply at the earliest.
>> Thanks
>> Harish

Hey,
Thanks. We have an index on four columns in this table, for example A, B,
C, and D.
One delete statement's WHERE clause has conditions on A, B, and C;
another delete statement's WHERE clause has a condition on just A.
Which of the two will perform faster?
We tried something like this:
SET ROWCOUNT 5000;
WHILE 1 = 1
BEGIN
DELETE FROM T1 WHERE dt < '20030101' -- original delete
IF @@ROWCOUNT < 5000 BREAK;
END
SET ROWCOUNT 0;
1) Does setting ROWCOUNT first sort the table and then delete?
2) The above query deletes all records satisfying the condition in
batches of 5000 until the delete is complete.
How can I stop it after just one batch of 5000?

On 5 Nov 2005 11:01:30 -0800, harish wrote:
(snip)
Hi Harish,
I just replied to the same question in another thread.
Could you please ask your questions in JUST ONE place, and in JUST ONE
group? I've seen your messages scattered over several groups, and
several different messages in just this group. Many of them have
attracted replies. It's very hard to keep track of what is going on in
all these threads, and it's a waste of other people's time if someone
posts a reply to you that you have already received in another group.
Best, Hugo
--
(Remove _NO_ and _SPAM_ to get my e-mail address)

Hi
Sorry about that. I was new to the group and wasn't sure which group was
active.
Thanks a lot for the information
Regards
Harish
