I have an SQL table in a live database with more than 6 million rows, and I want to increase the precision of a particular column:
ALTER TABLE sales ALTER COLUMN amount DECIMAL(8,4)
Now my question is: if I execute the above query, will SQL Server recalculate each value and write the new values back into the same cells? (If yes, that means it will definitely take a long time to execute, and our other activities on the database will be affected.) Or will the statement be executed some other way?
There will be no performance difference based on the column position.
Just put decimal(precision, scale), replacing the precision and scale with your desired values. I haven't done any testing with data in the table, but if you alter the precision, you could lose data if the new precision is lower.
The way to make a query run faster is to reduce the number of calculations that the software (and therefore hardware) must perform. To do this, you'll need some understanding of how SQL actually performs calculations.
By setting the scale, you decrease the available precision. Try NUMBER(16,2). Or can't you create a new table definition that duplicates the existing one except for the NUMBER(14,2) column, perform an "insert into newtable select ... from oldtable", and when you are sure of success, drop the old table and rename the new one?
A conversion of decimal(8, 2) to decimal(8, 4) isn't actually increasing the precision; it's increasing the scale.
From decimal and numeric (Transact-SQL), storage depends only on precision: precision 1-9 uses 5 bytes, precision 10-19 uses 9 bytes, precision 20-28 uses 13 bytes, and precision 29-38 uses 17 bytes.
So the overall number of digits (and therefore, the storage requirement) has not changed.
If you have any value over 9999.9999, you're going to have a bad time in the form of an arithmetic overflow error. To accommodate all possible (8, 2) values, you would need to increase the column to decimal(10, 4) instead.
This, however, will increase the storage requirements for your column, going from 5 to 9 bytes. As a result, it is equivalent to an update statement in terms of its impact on availability and the transaction log.
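In other words, the in-place change that accommodates all existing (8, 2) values would look like this (the NOT NULL here is an assumption; keep whatever nullability the column currently has):
alter table sales
    alter column amount decimal(10,4) not null;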
What I have found based on testing (with SQL Server 2008 R2, at least) is that the increased precision only incurs an update of the data if the storage requirements change. Essentially, if the column's new precision falls into the same storage size category as the old one (or a smaller one, and no truncation occurs), the table data is untouched.
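For instance, based on that testing, a change that stays inside the 5-byte storage band for decimal (precision 1-9) should be a metadata-only operation (again assuming a NOT NULL column):
alter table sales
    alter column amount decimal(9,2) not null;   -- still precision <= 9, so still 5 bytes per value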
For a 6-million row table, the impact on the transaction log will be around 2.5 GB. It won't necessarily grow by this amount, but that's how much space it will consume. My testing used around 2 million rows, and the alter statement caused the log to grow from 1 MB to ~850 MB.
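If you want to keep an eye on the log while the statement runs, DBCC SQLPERF gives a quick per-database view of log file size and percent used:
dbcc sqlperf(logspace);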
As for the impact on performance (how long it will take), without knowing anything about your server's hardware and load, it's impossible to say. If you are sufficiently concerned to avoid modifying the table in place, your best approach is probably a table swap:
Create a new table (sales_tmp) with the desired schema, and copy the data:
insert sales_tmp
select * from sales;
If you can ensure that the sales table won't be modified during the operation, you don't have to worry about preventing it with transactions and locks. Otherwise, a repeatable read transaction should suffice, and at least won't block reads to the sales table during the operation.
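For example, a minimal sketch of the copy under repeatable read (this assumes sales_tmp has already been created with the new column definition):
set transaction isolation level repeatable read;

begin transaction;

insert sales_tmp
select * from sales;

commit transaction;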
Then drop (or rename) the original sales table and rename sales_tmp to sales. (This can have some issues if you have replication or other fancy things set up on the affected table. Those aren't trivial to disable and re-enable, sadly.)
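One way to do the swap, keeping the original around until you're confident (sales_old is just an illustrative name, and sp_rename will warn that renaming an object can break scripts and stored procedures):
exec sp_rename 'sales', 'sales_old';
exec sp_rename 'sales_tmp', 'sales';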
If you are worried about the impact on the transaction log of updating 6 million records, you will need to update the records in batches. The batch size will vary depending on your needs; I would recommend 1,000-10,000.
IMO, you don't really need to worry about this unless your database server is strapped for free space, but this information might prove useful in the future.
If your recovery model is SIMPLE, so too is log containment. If it is FULL, things are harder.
In spite of what Blam claims in his answer, his loop implementation will absolutely not guarantee any containment of the transaction log file.
Issue a checkpoint statement at the end of the loop body to ensure log data is flushed. SQL Server periodically does this anyway, and in most cases you won't notice, but in some cases you will. Below is highly simplified pseudocode:
while @rows_left > 0
begin
    -- update/copy a batch of rows
    checkpoint;
end
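For example, a minimal runnable version of the loop under the SIMPLE recovery model, assuming a hypothetical amount_new column is being back-filled in batches (both the column name and the batch size are placeholders):
declare @rowCount int = 1;

while @rowCount > 0
begin
    update top (10000) sales
    set amount_new = amount
    where amount_new is null;

    set @rowCount = @@rowcount;

    checkpoint;   -- lets SQL Server reuse inactive log space between batches under SIMPLE recovery
end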
Under the FULL recovery model this is trickier, because the log grows indefinitely until you take a log backup, which marks the inactive portion of the log as reusable and enables SQL Server to re-use the space already allocated for the log files. I won't go into this in detail, other than to recommend reading up on transaction log management and log backups.
I do not come to the same conclusions/findings as Matt.
From decimal and numeric (Transact-SQL):
For the decimal and numeric data types, SQL Server considers each specific combination of precision and scale as a different data type. For example, decimal(5,5) and decimal(5,0) are considered different data types.
(8,4) is a different type from (8,2) and must be cast.
And you are not increasing the precision. Same precision, different scale.
A decimal(8,2) value of 12356.78 will NOT cast to decimal(8,4).
I tested on a decimal(8,2) column that contained that value; SSMS would not let me change it to decimal(8,4).
You will need to go to decimal(10,4), and then it will cast/convert.
This fails on the set @dec84 = @dec82; line with an arithmetic overflow error:
declare @dec82 decimal(8,2);
declare @dec84 decimal(8,4);
set @dec82 = 123456.78;   -- fits: decimal(8,2) allows up to 999999.99
set @dec84 = 1234.5678;   -- fits: decimal(8,4) allows up to 9999.9999
print @dec82;
print @dec84;
set @dec84 = @dec82;      -- fails: 123456.78 does not fit in decimal(8,4)
print @dec84;
Add a column to the existing table with the proper precision. No index. And add it as the last column! If it is not the last column, SQL Server may try to move data around.
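For example, using the question's table (amount_new is a hypothetical name; pick a nullability that suits the back-fill):
alter table sales
    add amount_new decimal(10,4) null;   -- nullable, no index; ADD appends it as the last column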
To protect the transaction log, update in batches (100 is just an example):
update top (100) table
set newCol = oldCol
where newCol is null
If it is indexed, then add that here.
Then finish with a final update to catch any rows that changed in the meantime:
update table
set newCol = oldCol
where newCol <> oldCol
Then rename newCol to oldCol
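A sketch of that final swap, mapping newCol/oldCol onto the question's sales.amount column (check for constraints, defaults, and other dependencies on the old column before dropping it):
alter table sales drop column amount;                     -- drop the old decimal(8,2) column
exec sp_rename 'sales.amount_new', 'amount', 'COLUMN';    -- move the new column into its place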
This is how I do loops to contain the transaction log, usually in chunks of 10,000:
declare @rowCount int;
set @rowCount = 1;

while @rowCount > 0
begin
    update top (1) [test].[dbo].[DateDateTime]   -- batch size of 1 here for illustration; use e.g. 10000 in practice
    set [ddateTimeNoTime] = '2014-11-12'
    where [ddateTimeNoTime] <> '2014-11-12';

    set @rowCount = @@rowcount;
end