Will increasing the precision of a column with ALTER affect database performance while executing on bulk data?

I have a table in a live SQL Server database with more than 6 million rows, and I want to increase the precision of a particular column:

ALTER TABLE sales ALTER COLUMN amount DECIMAL(8,4)

Now my question is: if I execute the above query, will SQL Server recalculate each value and write the new value back into the same cell? (If yes, that means it will definitely take a long time to execute and our other activities on the database will be affected.) Or will the statement be executed some other way?

asked Sep 22 '14 by CIPHER


2 Answers

A quick correction

A conversion of decimal(8, 2) to decimal(8, 4) isn't actually increasing the precision; it's increasing the scale.

From decimal/numeric - TSQL:

  • precision The maximum total number of decimal digits that can be stored, both to the left and to the right of the decimal point.
  • scale The maximum number of decimal digits that can be stored to the right of the decimal point. Scale must be a value from 0 through p.

So the overall number of digits (and therefore the storage requirement) has not changed.

The answer

If you have any value over 9999.9999, you're going to have a bad time in the form of an arithmetic overflow error. To accommodate all possible (8, 2) values, you would need to increase the column to decimal(10, 4) instead.

This, however, will increase the storage requirements for your column, going from 5 to 9 bytes. As a result, this is equivalent to an update statement in terms of its impact on availability and the transaction log.

What I have found through testing is that, at least on SQL Server 2008 R2, the increased precision only incurs an update of the data if the storage requirements change. Essentially, if the column's new precision falls in the same storage size category as the previous one (or a smaller one, with no truncation), the table data is untouched.
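If you want to verify the storage size categories yourself, DATALENGTH reports the number of bytes a decimal value occupies; the query below is only an illustration of the 5-byte versus 9-byte classes:

-- precision 1-9 uses 5 bytes, 10-19 uses 9, 20-28 uses 13, 29-38 uses 17
select
    datalength(cast(1234.56 as decimal(8,2)))  as bytes_8_2,   -- 5
    datalength(cast(1234.56 as decimal(8,4)))  as bytes_8_4,   -- 5, same storage class
    datalength(cast(1234.56 as decimal(10,4))) as bytes_10_4;  -- 9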

For a 6-million-row table, the impact on the transaction log will be around 2.5 GB. The log won't necessarily grow by that amount, but that's how much space the operation will consume. My testing used around 2 million rows, and the ALTER statement caused the log to grow from 1 MB to roughly 850 MB.

As for the impact on performance (how long it will take), without knowing anything about your server's hardware and load it's impossible to say. If you were sufficiently concerned to want to avoid modifying the table in place, your best approach is probably a table swap:

Create a new table (sales_tmp) with the desired schema, and copy the data:

insert sales_tmp
select * from sales;

If you can ensure that the sales table won't be modified during the operation, you don't have to worry about preventing it with transactions and locks. Otherwise, a repeatable read transaction should suffice, and at least won't block reads to the sales table during the operation.

Then:

  1. Replicate permissions on the new table
  2. Recreate indexes & foreign key references
  3. Drop old table
  4. Rename new (sales_tmp to sales)
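A minimal sketch of that sequence, assuming for brevity that sales has only an id and an amount column (the real table will have more columns, indexes and constraints), and using sp_rename for the final swap:

-- 1. new table with the widened column
create table sales_tmp (
    id     int           not null,
    amount decimal(10,4) not null
);

-- 2. copy the data; repeatable read keeps already-copied rows from changing before commit
set transaction isolation level repeatable read;
begin transaction;
    insert into sales_tmp (id, amount)
    select id, amount from sales;
commit;

-- 3. recreate permissions, indexes and foreign key references on sales_tmp here

-- 4. swap the tables
drop table sales;
exec sp_rename 'sales_tmp', 'sales';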

This can have some issues if you have replication or other fancy things set up on the affected table. Those aren't trivial to disable and re-enable, sadly.

Transaction log

If you are worried about the impact on the transaction log of updating 6 million records, you will need to update the records in batches, the size of which will vary depending on your needs; I would recommend 1,000-10,000 rows per batch.

IMO, you don't really need to worry about this unless your database server is really strapped for free space, but this information might prove useful in the future.

If your recovery model is SIMPLE, so too is log containment. If it is FULL, things are harder.

In spite of what Blam claims in his answer, his loop implementation will absolutely not guarantee any containment of the transaction log file.

SIMPLE recovery

Issue a checkpoint statement at the end of the loop body to ensure log data is flushed. SQL Server periodically does this anyway, and in most cases you won't notice, but in some cases you will. Below is highly simplified pseudocode:

while @rows_left > 0
begin
    -- update/copy rows
    checkpoint;
end
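A more concrete version of that loop, as a sketch only: it assumes a hypothetical nullable amount_new column has been added alongside a NOT NULL amount column (the add-a-column approach from the second answer), and a batch size of 10,000:

declare @batch int = 10000;
declare @rows  int = 1;

while @rows > 0
begin
    -- copy one batch of values into the wider column
    update top (@batch) sales
    set amount_new = amount
    where amount_new is null;

    set @rows = @@rowcount;

    -- under SIMPLE recovery, the checkpoint lets the log space used by this batch be reused
    checkpoint;
end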

FULL recovery

This is trickier because the log grows indefinitely until you take a log backup, which marks the log pages as inactive and enables SQL Server to re-use the space already allocated for the log files. I won't go into this in detail, other than to recommend a few resources:

  • Managing the Transaction Log
  • BACKUP (TSQL)
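Under FULL recovery, the equivalent step inside such a batching loop is a log backup rather than (or in addition to) a checkpoint; the database name and backup path below are placeholders only:

-- inside the batching loop, after each batch:
backup log SalesDb                              -- placeholder database name
    to disk = N'D:\Backups\SalesDb_log.trn';    -- placeholder path; backups append to this file by default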
answered Nov 14 '22 by md4

I do not come to the same conclusion/findings as Matt

decimal and numeric (Transact-SQL)

For the decimal and numeric data types, SQL Server considers each specific combination of precision and scale as a different data type. For example, decimal(5,5) and decimal(5,0) are considered different data types.

(8,4) is a different type from (8,2) and must be cast.

And you are not increasing the precision. Same precision - different scale.

The decimal(8,2) value 123456.78 will NOT cast to decimal(8,4).
I tested on a decimal(8,2) column that contained that value; SSMS would not let me change the column to decimal(8,4).

You will need to go to decimal(10,4), and then it will cast / convert.

This fails on the set @dec84 = @dec82;

declare @dec82 decimal(8,2);
declare @dec84 decimal(8,4);
set @dec82 = 123456.78;   -- fits: 8 total digits, 2 of them after the decimal point
set @dec84 = 1234.5678;
print @dec82;
print @dec84;
set @dec84 = @dec82;      -- arithmetic overflow: 123456.78 has 6 integer digits, decimal(8,4) allows only 4
print @dec84;

Add a column to the existing table with the proper precision.
No index.
And add it as the last column!
If it is not the last column, SQL Server may try to move data around.

To protect the transaction log, update in batches (100 is just an example):

update top (100) sales
set newCol = oldCol
where newCol is null;

If it is indexed, then add that here.
Then finish with:

update sales
set newCol = oldCol
where newCol <> oldCol;

Then rename newCol to oldCol
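A minimal sketch of that sequence, using the sales table from the question with a hypothetical amount_new column standing in for newCol:

-- add the wider column last and nullable, so the ALTER is a metadata-only change
alter table sales add amount_new decimal(10,4) null;

-- ... batched update plus the final catch-up update shown above ...

-- once amount_new is fully populated, swap the names
alter table sales drop column amount;
exec sp_rename 'sales.amount_new', 'amount', 'COLUMN';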

This is how I do loops to contain the transaction log, usually in chunks of 10,000:

declare @rowCount Int;
Set @rowCount = 1;
While @rowCount > 0
Begin
   -- top (1) here is just for this test table; use top (10000) or similar in practice
   update top (1) [test].[dbo].[DateDateTime]
   set [ddateTimeNoTime] = '2014-11-12'
   where [ddateTimeNoTime] <> '2014-11-12';
   set @rowCount = @@rowcount;
End
answered Nov 14 '22 by paparazzo