Summarizing Performance issues and workarounds after migrating from 2005 to 2008 R2 [SQL Server]

Based on our experience, having Hyper-Threading (HT) enabled on one particular node led to I/O operations taking a very long time. This was on a Windows 2008 R2 cluster running SQL Server 2008 R2. Interestingly, the problem was reflected neither in the wait stats nor in the pssdiag we ran for Microsoft support.

The way we noticed the slow I/O was simply by watching the OS counters for physical disk. I wrote about it here and here.
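
Besides the Perfmon counters, a quick way to gauge I/O latency from inside SQL Server is sys.dm_io_virtual_file_stats. This is a minimal sketch rather than the exact check we used:

-- Average read/write latency (ms) per database file since instance startup
SELECT DB_NAME(vfs.database_id) AS database_name,
    vfs.file_id,
    vfs.io_stall_read_ms / NULLIF(vfs.num_of_reads, 0) AS avg_read_ms,
    vfs.io_stall_write_ms / NULLIF(vfs.num_of_writes, 0) AS avg_write_ms
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
ORDER BY avg_read_ms DESC;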

After fixing that issue by disabling HT, we started experiencing very high CPU utilization due to an excessive amount of logical reads (20 million per query). This was due to a really bad plan. Our processes were performing anti semi joins against partitioned tables, and the code performed extremely badly in 2008 R2 while doing just fine in 2005. I wrote about it here.

We pinpointed it by running Adam Machanic’s sp_whoisactive while under high load (it can be downloaded from here).
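
For reference, a bare-bones invocation looks like this (the optional @get_plans parameter pulls in the query plans as well):

-- Show currently active requests, including their plans
EXEC sp_WhoIsActive @get_plans = 1;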

We also ran server-side traces to find the most expensive operations, sorting by the highest I/O and CPU utilization metrics.
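
On 2005 and later, the plan cache offers a quicker first pass than a trace. A minimal sketch (not the exact query we ran) that surfaces the top statements by logical reads:

-- Top 10 cached statements by total logical reads
SELECT TOP (10)
    qs.total_logical_reads,
    qs.total_worker_time,
    qs.execution_count,
    SUBSTRING(st.text, qs.statement_start_offset / 2 + 1,
        (CASE qs.statement_end_offset WHEN -1 THEN DATALENGTH(st.text)
              ELSE qs.statement_end_offset END - qs.statement_start_offset) / 2 + 1) AS statement_text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY qs.total_logical_reads DESC;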

With the steps above we were able to tune the offending processes and go from 85% sustained CPU utilization to almost nil.

[Chart: High CPU Utilization]

Kerberos Double Hop and Delegation between different Active Directory Accounts [SQL Server]

In the past, when I had only basic knowledge of Active Directory, I dealt with double hop queries by using SQL Server Authentication.

What is the problem with SQL Authentication? Maintenance. The DBA team can literally spend hours maintaining usernames/passwords, linked servers, and of course granting the appropriate permissions/roles to a new user added across several database instances. There are also security concerns, which are outside the scope of this post.

So what is a double hop query? It is a query run from machine A that requests data from machine B, which in turn requests data from machine C.

For example:

If I execute a query on my local machine (A) connecting to Server1 (B) that requests data from Server2 (C):

-- From my desktop:
SELECT a.*
FROM OPENROWSET('SQLNCLI',
    'Server=server2.domain.com\instance;Trusted_Connection=yes;',
    'SELECT * FROM master.sys.objects') AS a;

The solution: Active Directory. A user can be added to the DBA domain group, and that group can be granted the sysadmin role on every production database server, for example. Once that is defined, we can add or remove members of the group without even touching the database instances.

Now, there is a consensus that each database instance should run under its own Active Directory account as the SQL Server service account. Some shops have one account, others a few; it depends on the policy adopted.

Our challenge: Allow double hop queries between 2 database instances running different Active Directory Accounts.

The solution was similar to what we had done with a single account.

This is how we made it work for one account:

Prerequisites:

  1. You need domain admin rights, or access to Active Directory for the domain being managed.
  2. The person in charge of making the change has access to the domain controller, or has the Remote Server Administration Tools installed locally.

Steps:

  • Via Command Prompt, execute the SETSPN command for the server, port, and domain account assigned to the SQL Server instance.
setspn -a MSSQLSvc/server1.domain.com:1433 DOMAIN\srvc_sqlaccount
  • For clustered solutions, run it both with and without the port number. It does not matter whether it is a named instance.
setspn -a MSSQLSvc/serverclust1.domain.com DOMAIN\srvc_sqlaccount
setspn -a MSSQLSvc/serverclust1.domain.com:1433 DOMAIN\srvc_sqlaccount
  • This needs to be repeated for all the SQL Server instances that you need to trust for delegation.
  • Open Active Directory Users and Computers, search for the domain account just used (DOMAIN\srvc_sqlaccount in the example), double-click it and open the properties window.
  • Enable “Trust this user for delegation to specified services only” and “Use Kerberos only”, then click Add, Users or Computers, type in the account used (e.g. srvc_sqlaccount), select the machine(s) you want to trust for delegation, OK, OK.

This is how we made it work with two accounts:

You need to follow the exact same process, but when enabling Kerberos delegation, you open the properties of the first domain account and add Kerberos delegation for the second domain account as well.

Assuming non-clustered instances:

Steps:

  • For the first account and the first instance: via Command Prompt, execute the SETSPN command for the server, port, and domain account assigned to the SQL Server instance.
setspn -a MSSQLSvc/server1.domain.com:1433 DOMAIN\srvc_sqlaccount_1
  • For the second account and the second instance: via Command Prompt, execute the SETSPN command for the server, port, and domain account assigned to the SQL Server instance.
setspn -a MSSQLSvc/server2.domain.com:1433 DOMAIN\srvc_sqlaccount_2
  • Open Active Directory Users and Computers, search for the first domain account (DOMAIN\srvc_sqlaccount_1 in the example), double-click it and open the properties window.
  • Enable “Trust this user for delegation to specified services only” and “Use Kerberos only”, then click Add, Users or Computers, and type in the first account used (e.g. srvc_sqlaccount_1).
  • Select the machine you want to trust for delegation (server1.domain.com in the example), OK, OK.
  • Click Add again, Users or Computers, and type in the second account used (e.g. srvc_sqlaccount_2).
  • Select the machine you want to trust for delegation (server2.domain.com in the example), OK, OK.
  • Wait long enough for the Domain Controllers to propagate the changes (15 minutes should be fine).

Verify Logins are Connecting Via Kerberos:

Run the following query, paying special attention to the auth_scheme column. If Kerberos authentication was successful, you should see the auth_scheme reflect Kerberos instead of NTLM:

SELECT
    s.session_id
    , c.connect_time
    , s.login_time
    , s.login_name
    , c.protocol_type
    , c.auth_scheme
    , s.host_name
    , s.program_name
FROM sys.dm_exec_sessions s
JOIN sys.dm_exec_connections c
    ON s.session_id = c.session_id;
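
For a quick spot check of just your own session, this one-liner (not part of the original query) does the trick:

-- Returns KERBEROS or NTLM for the current session
SELECT auth_scheme FROM sys.dm_exec_connections WHERE session_id = @@SPID;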

Now we can have multiple Domain Accounts for multiple SQL Database Instances, each one delegating for double hop.

Hope this helps.

SQL Server: Truncating Dates

TRUNC is a function I used a lot when I worked with Oracle; it helped me slice my data in several different ways while keeping the columns in date format.

That function is not available in SQL Server unless you work around it by using CONVERT or DATEPART. But both change the content’s data type (CONVERT to VARCHAR, DATEPART to INT).
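
To illustrate, both workarounds lose the date type (style 112 here is just one example format):

-- CONVERT returns a string, DATEPART returns an integer; neither stays a date
SELECT CONVERT(VARCHAR(10), GETDATE(), 112) AS day_as_string,
    DATEPART(HOUR, GETDATE()) AS hour_as_int;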

Well, several years ago I found a way to work around the issue and keep the data in DATE format (and it took me this long to post it):

-- Truncate to the current hour
SELECT DATEADD(HH, DATEDIFF(HH,0,GETDATE()), 0)
-- Truncate to the current day
SELECT DATEADD(DD, DATEDIFF(DD,0,GETDATE()), 0)
-- Truncate to the current month
SELECT DATEADD(MONTH, DATEDIFF(MONTH,0,GETDATE()), 0)

As per BOL, we can use any of these datepart arguments:

  1. year (or yy, yyyy)
  2. quarter (or qq, q)
  3. month (or mm, m)
  4. dayofyear (or dy, y)
  5. day (or dd, d)
  6. week (or wk, ww)
  7. weekday (or dw, w)
  8. hour (or hh)
  9. minute (or mi, n)
  10. second (or ss, s)
  11. millisecond (or ms)
  12. microsecond (or mcs)
  13. nanosecond (or ns)

You can come up with all kinds of GROUP BYs and filters, and keep the data in date format at the same time.
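
For example, a daily rollup against a hypothetical dbo.Orders table (the table and its columns are made up for illustration):

-- Daily order counts; the grouping key remains a DATETIME
SELECT DATEADD(DD, DATEDIFF(DD, 0, order_date), 0) AS order_day,
    COUNT(*) AS order_count
FROM dbo.Orders
GROUP BY DATEADD(DD, DATEDIFF(DD, 0, order_date), 0)
ORDER BY order_day;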

A couple of days ago, while browsing the AskSSC community forum, a question came up: “How to get quarter start date and end date”. Because I wanted to exercise my brain a little, I decided to help out and posted:

DECLARE @min_date DATETIME, @max_date DATETIME;
SET @min_date = '1/1/2009';
SET @max_date = '12/31/2010';

-- Working Vars
SET @max_date = DATEADD(QUARTER, DATEDIFF(QUARTER,0,@max_date), 0) + 1;

WITH CTE_dummy AS (
    SELECT TOP(DATEDIFF(QUARTER, @min_date, @max_date) + 1) ROW_NUMBER() OVER(ORDER BY object_id) - 1 AS sequence FROM sys.columns
)
SELECT CASE WHEN (DATEPART(QUARTER, @min_date) + sequence) % 4 = 0 THEN 4 ELSE (DATEPART(QUARTER, @min_date) + sequence) % 4 END AS [Quarter],
    DATEADD(QUARTER, sequence, DATEADD(QUARTER, DATEDIFF(QUARTER,0, @min_date), 0)) AS [Start_Date],
    DATEADD(QUARTER, sequence, DATEADD(QUARTER, 1, DATEADD(QUARTER, DATEDIFF(QUARTER,0, @min_date), 0)) - 1) AS [End_Date]
FROM CTE_dummy;

The solution uses the ROW_NUMBER() windowing function, but it relies on date truncation to tackle the core question.

Hope it helps.

SQL Server: Migrating from 2000 EE X86 to 2008 R2 SE X64

Today we had a production migration and brought over one of our legacy applications from a data center located up north to one located in South Florida.

The challenges:

1. The locations are disconnected and far from each other.
2. We are going up three releases, from 2000 to 2008 R2 (8.0 to 10.5).
3. We are downgrading from Enterprise Edition to Standard Edition.
4. We are going from a 32-bit server to a 64-bit one (x86 to x64).

The solution: we separated this into two pieces, a pre-migration process and a production migration process.

Pre-Migration Process:

First, ensure that no Enterprise Edition features are in use. We were lucky this time, as there really were none in the source database. But as an example, we have another server running 2005 EE, which will be migrated to 2008 R2 SE and has partitioned tables. That is something to take care of before migration.
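
On 2008 and later you can ask the engine directly; for older sources like our 2000 instance the check has to be manual. A minimal sketch, run inside the database in question:

-- Lists Enterprise-only features (e.g. partitioning, compression) persisted in the current database
SELECT feature_name, feature_id
FROM sys.dm_db_persisted_sku_features;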

Second, ensure that the database has no corruption. A DBCC CHECKDB needed to be performed on the source before backing it up.
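
Something along these lines, assuming a source database named LegacyDB (WITH NO_INFOMSGS keeps the output down to actual errors):

-- Full consistency check of the source database before backup
DBCC CHECKDB ('LegacyDB') WITH NO_INFOMSGS, ALL_ERRORMSGS;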

Third, prepare for a test with the latest full backup. We FTP’d the last full backup over during a weekend and restored it onto the target server running 2008 R2. We gzipped the files before transferring, but it was still painfully slow.

Fourth, restore and ensure that the databases open with no issues whatsoever. Update statistics and reindex any important table that is heavily fragmented. This is a challenge in Standard Edition, as reindexing cannot be performed online.
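
A rough sketch of that post-restore pass; dbo.BigTable is a placeholder for whichever tables matter:

-- Refresh statistics database-wide, then rebuild a fragmented table's indexes
EXEC sp_updatestats;
ALTER INDEX ALL ON dbo.BigTable REBUILD; -- offline operation in Standard Edition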

Fifth, test and retest. This requires a lot of help from the Application Developers and QA Team. Ensure that functionality is intact and most importantly, that the database is always in a healthy condition (no excessive I/O reads, no high CPU utilization, etc).

Sixth, let the QA team sign off.

Production Migration Process:

First. Plan the work. Create checklists with detailed steps to be performed before, during and after migration.

Second. Perform a full backup on the source server. We chose Saturday night. Disable the full backup job on the source server. Ensure that transaction log backups are still running at the source.

Third. Compress the backups, FTP to Florida, decompress and start restoring onto the target server.

Fourth. Bring over the transaction logs and continuously apply to the target server.

Fifth. In our environment we were able to disable outbound services and show a maintenance page on the web apps. This way the source database was accepting data but not processing it.

Sixth. Take a last transaction log backup, apply it to the target, and open the databases (a sketch of the restore commands appears after these steps).

Seventh. Update statistics, reindex for required objects and perform sanity check tests.

Eighth. Let the Infrastructure team make appropriate changes (like DNS) and enable services.

Ninth. Sync up any missing data. In our scenario, we were able to code an application that extracted the missing transactions (just a few thousand) from the source server and applied them to the target server. Let QA test and sign off.

Tenth. Perform a full backup and enable all maintenance jobs with appropriate schedules.
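
For reference, the restore chain from steps three through six boils down to the classic NORECOVERY sequence (database name and file paths are placeholders):

-- Restore the full backup, leaving the database able to accept more logs
RESTORE DATABASE LegacyDB FROM DISK = 'D:\backup\LegacyDB.bak' WITH NORECOVERY;
-- Apply each shipped transaction log in order
RESTORE LOG LegacyDB FROM DISK = 'D:\backup\LegacyDB_1.trn' WITH NORECOVERY;
-- The last log backup opens the database for use
RESTORE LOG LegacyDB FROM DISK = 'D:\backup\LegacyDB_final.trn' WITH RECOVERY;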

Notes:

Each migration will have its own steps, and some will be more difficult than others. I have specified very high-level bullet points of what needed to happen for this particular migration.

What if the unplanned happens:

First, every checklist should have a rollback plan. In our case, we had steps to follow in case we needed to stop the migration. On the DB side it was pretty simple: we just stop restoring onto the target server, go back to the source, and enable the maintenance jobs. Of course, there are other steps that needed to be followed by the development and infrastructure members.

But today we had an interesting scenario. While recreating indexes on a 25-million-row table, the DBA stopped the SQL Agent, which was part of the process outlined on the checklist. The problem: it was stopped through the Services panel instead of SQL Server Configuration Manager. So what happened? The cluster service freaked out and decided to perform a failover.

The instance started, but the database status was “in recovery”. As per the SQL log, it was going to take about 2 hours to roll back all the changes. We had two options: (1) just execute the rollback plan, or (2) drop the database and restore it once more.

We decided on option 2, as the transaction logs were small. The bulk of the work was restoring the full backup, but fortunately that was done in under 30 minutes; not bad for a 120 GB database.

Now, how did we drop the “in recovery” database? We opened SQL Server Configuration Manager, took the database instance offline, deleted the data files belonging to the “in recovery” database, and restarted the instance. Of course, the SQL log complained about the missing files, so we went ahead and dropped the database.

We are now live on a new 2008 R2 instance and are current with backups. It is interesting to see that with backup compression, a 120 GB database backup uses just 25 GB of disk space.
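
Enabling it is a one-word addition to the backup command (names and paths are placeholders); backup compression became available in Standard Edition with 2008 R2:

-- Compressed full backup
BACKUP DATABASE LegacyDB TO DISK = 'D:\backup\LegacyDB.bak' WITH COMPRESSION;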

This was our last SQL Server 2000 instance.

SQL Server: Can I upgrade to 2008 R2 from 2008, 2005 or 2000? #TSQL2sDay

It is known that you can go up two major releases. Let’s consider 2008 R2 a “half release”, as we can “floor” 10.5 down to 10; that is how we can go from 8.0 (2000) to 10.5 (2008 R2). Microsoft is smart, as this allows the many customers still running SQL Server 2000 to upgrade straight to 2008 R2.

But there is a caveat: you cannot upgrade from SQL Server 2008 SP2 to 2008 R2 RTM if you have enabled the option to support 15,000 partitions on 2008 SP2. So you are pretty much stuck until Microsoft releases a CU or SP that addresses it in R2.
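
For context, that option is toggled per database with sp_db_increased_partitions (introduced in 2008 SP2); a sketch, assuming the documented syntax and a placeholder database name:

-- Check whether 15,000-partition support is enabled for the database
EXEC sp_db_increased_partitions @dbname = 'MyDB';
-- Disable it before the upgrade (only works if no table actually exceeds 1,000 partitions)
EXEC sp_db_increased_partitions @dbname = 'MyDB', @increased_partitions = 'FALSE';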

Make sure you test before upgrading a production environment.