Wednesday 18 December 2013

Snapshot Isolation in SQL Server

Snapshot Isolation can be used to reduce deadlocks in your SQL Server. It works by marking records with a Row Versioning identifier which is stored in TempDB. When you query a database, the database returns all the data as a consistent dataset but still allows data updates while the query is in progress. Any subsequent queries will return the newly updated record, even if the previous query had yet to complete.

Snapshot Isolation is best used with a second setting which changes the Snapshot Isolation to Read Committed Snapshot. This tells SQL Server to observe the row versioning indicator when making a query.

Without Snapshot Isolation, you may see this behaviour:

Large query 1 begins
Small update begins
Small query 2 begins
[pause until query 1 completes]
Large query 1 ends
Small update ends
Small query 2 ends

The first large query is holding a lock on all the records it is trying to return until the query is complete. This stops the update occurring, which in turn stops the second query executing.  Only when that lock from Query 1 is released is SQL able to perform the update, and only once the update is complete can query 2 execute. Of course I could have used a (NOLOCK) table hint in both queries, which would have allowed both the update and second query to run straightaway, but there are two issues with this. Firstly NOLOCK is deprecated in SQL Server 2012, and secondly in both queries it would have been pot luck whether I got the version of the record pre or post update. In the example above, a particularly large query could have caused the update to be chosen as a deadlock victim and resulted in a failed update.

With Snapshot Isolation, the row versioning would have allowed the update to progress, and the subsequent query to progress also. The example above would now behave like this:

Large query 1 begins
Small update begins – notes that row is in use and marks the update accordingly.
Small query 2 begins
Small update ends
Small query 2 ends – returning post-update value
[pause until query 1 completes]
Large query 1 ends – returning pre-update value
Pre-update value is now removed completely

Of course this could still be a slightly confusing result: the first query to complete returns the new value, and the second query to complete contains the old value! It is important to remember that the result of your query that you see is the result as the database was when then query started, not when it completed.

Because one of the databases I look after is very heavily queried, we were seeing a lot of issues with the inserts and updates of records. We were constantly seeing errors in our CRM logs reporting deadlocks, and the performance of some of the inserts was poor. On changing the snapshot isolation settings, the difference was a 500x improvement in the average time to insert a record into one of our core tables, and our deadlocks vanished overnight. Inserts and deletions benefit particularly from this technology, as updates may affect one of two data pages and create a local lock, but an insert and associated index reorganisation can end up with many more data pages being affected, and therefore many more pages requiring a lock to progress.

The penalty, if you can call it that,  of changing the implementationwas that we found our TempDB usage went up from almost zero in size to around 1GB. The technique heavily uses TempDB, so before you enable it make sure that your TempDB is on a decent disk and there is plenty of memory available otherwise you may have just transferred the bottleneck from one place to another.

Snapshot Isolation causes the underlying database structure to be altered as a few extra bytes are added to each row to hold a version indicator. Unfortunately if your database is mirrored, you must break the mirror to make the change and recreate it as the change is not supported by mirroring. Also you must kill all the user sessions on the database. If the database is used by Dynamics, that will mean stopping IIS too on the CRM Web Servers to prevent connections re-establishing themselves.

To make the change, run the following commands:

ALTER DATABASE YOURDB
SET READ_COMMITTED_SNAPSHOT ON;
GO
ALTER DATABASE YOURDB
SET ALLOW_SNAPSHOT_ISOLATION ON
GO


I struggle to see any downsides to changing the database this way. The performance gains we saw were impressive, and the price of a little extra activity in TempDB was a very small price to pay. Of course, there may be reasons why your transactions must be kept atomic, but for every database I create in the future, I will be using Snapshot Isolation.

No comments:

Post a Comment