We recently migrated a database from SQL Server 2005 X86 to 2008 R2 X64. Running an Enterprise Edition License, we were able to take advantage of partitioning. It had several tables with similar structures; row count north of 2 Billion rows.
Old Structure:
-
3 fields
- TypeId INT
- ContentData VARCHAR(100)
- insertionDate DATETIME
- Unique constraint: TypeId, ContentData. Added InsertionDate to comply with partitioning requirements.
- Table had a partition per month.
Pros
- On the first week of each new month we marked the previous month partition as read-only.
- A one-time File backup was performed to the previous month partition.
- Deltas (new data) would always be located on the last partition
- The current months’ partition would be the only piece left in Read-Write mode.
Cons
- Searching for ContentData for a particular TypeId scanned as many times as partitions used
- Bulk Data comparison between new and old
TypeId + ContentData performed an index seek on each partition
1 2 3 4 | SELECT ContentData INTO #temp FROM sampletable WHERE TypeId = 41; |
1 | Table 'sampletable'. Scan count 92, logical reads 7613, physical reads 223, read-ahead reads 7808, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. |
New Structure:
-
3 fields
- TypeId INT
- ContentData VARCHAR(100)
- insertionDate DATETIME
- Unique constraint: TypeId, ContentData.
- Table has a partition per TypeId (range of 50 items each, e.g. 1,2, … ,50).
- Index on insertionDate with TypeId as include
- Page Compression enabled.
Pros
- Searching for a particular TypeId + ContentData generates a unique index seek plan, 1 scan count (1 partition).
- Scanning for ContentData for a particular TypeId generates an index scan for only 1 partition
- Bulk Data comparison between new and old TypeId + ContentData performs an index seek in one partition only
- Compression reduced I/O reads by 50%
Cons
- All datafiles are in read-write mode. Every file needs to considered in the backup maintenance plan
- Any partition can suffer block update even if it holds data from 4 years ago (cannot be tagged as historical)
- Retrieving deltas performs an index seek on every partition
1 2 3 4 | SELECT ContentData INTO #temp FROM sampletable WHERE TypeId = 41; |
1 | Table 'sampletable'. Scan count 1, logical reads 5449, physical reads 66, read-ahead reads 5445, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. |
Conclusion:
The table is mostly used for retrieving and comparing data. The retrieval call will always pass in TypeId as parameter. The system now performs an index scan for 1 partition (1 scan count). For data comparison, an index seek is performed for a unique value under 1 partition only. Data compression reduced I/O reads by 50%.
In the past, data extraction and comparison scanned through all the partitions generating excessive I/O calls, but we did not need to worry about continuously backing up older partitions as they were already flagged as read-only and backed up.
With this exercise we have effectively increased performance at the expense of maintenance. Well worth the change.