Max Memory Configuration set too low [SQL Server]

April 19th, 2011 by OZ | 4 Comments | Filed in Database, SQL Server

We have been experiencing an issue with a clustered node where instances run extremely slowly and the only process eating CPU resources is “NT Kernel & System”. We have upgraded drivers and even wiped and reinstalled Windows and SQL Server (both 2008 R2), and the issue still reappears.

In the process of troubleshooting with Microsoft, we needed to fail over instances to collect log information. We have done this several times. Because our latest cluster implementations include Active/Active nodes, we need to adjust each instance's max server memory setting whenever all instances end up on one node.

Yesterday something did not go right, and a human error caused the instances to fail. Even experienced DBAs will eventually break something, but they will do everything possible to correct it. Max server memory was set to 27 MB instead of 27000 MB, and no connections could be made to the instance.

There are 3 ways this situation can be resolved:

  1. Opening a dedicated administrator connection (DAC) to the instance and changing the max memory configuration.
  2. Shutting down the instance and starting it from a command prompt with the -f option, which fires up the instance with minimal configuration. After that, change the max memory configuration.
  3. Bouncing the instance and changing the memory configuration as soon as the instance starts. This takes advantage of a fresh instance with no allocated cache yet.

Changing the memory configuration:

sp_configure 'show advanced options', 1;
GO
RECONFIGURE;
GO
sp_configure 'max server memory', 27000;
GO
RECONFIGURE;
GO

Coaching point: double- or even triple-check before issuing a configuration command.
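One way to do that double-check is to read the setting back before and after changing it. The query below is a minimal sketch that only reads the built-in sys.configurations catalog view; it does not change anything.

```sql
-- Sketch: inspect the configured and running values of max server memory.
-- sys.configurations is a built-in catalog view; this query is read-only.
SELECT
    name
   ,value          -- configured value, in MB
   ,value_in_use   -- value currently in effect, in MB
   ,minimum
   ,maximum
FROM sys.configurations
WHERE name = 'max server memory (MB)';
```

If value and value_in_use differ, the configured value has not taken effect yet, which usually means RECONFIGURE has not been run.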

SQL Server Memory Configuration via SSMS

Windowed Functions empowering analytics [#TSQL2sday]

March 8th, 2011 by OZ | 2 Comments | Filed in Analytics, Database, SQL Server
T-SQL Tuesday #16

This blog entry is participating in the T-SQL Tuesday #16 for the month of February, hosted by Jes Schultz Borland (Blog|Twitter).

Back in 2001 I was involved in a project migrating an e-commerce site to a new platform. We were mapping and moving data from an Oracle 8.1.5 instance to an Oracle 8.1.7 instance. The framework was the beloved Dynamo running on Sun Solaris. The Dynamo engineers decided to use sequencers on each entity (e.g. product, category, etc.). I was trying to figure out the best reusable PL/SQL block to move the data while creating the correct sequence, until someone pointed out to me that Oracle 8.1.6 supported Analytical Functions. After reading and practicing, I discovered the power of those functions.

For the purpose of this blog I will refer to the Analytical Functions as Windowed Functions instead. They are very useful because they allow the user to crunch measures over subsets of data while keeping the “raw” detail-level data. For example, they allow us to get the average price of a product category while also displaying the unit price of each product. They can also help us rank a product by its price and “window” it by category.

Examples

Basic Aggregation:

-- Qty of products and average price
SELECT
    COUNT(productKey) AS [Quantity]
   ,AVG(StandardCost) AS [AverageCost]
FROM [dbo].[DimProduct]

Distribution of Data:

-- Tier by Product Cost (4 tiers)
SELECT
    NTILE(4) OVER(ORDER BY [StandardCost]) AS [Tier]
   ,productKey
   ,StandardCost
FROM [dbo].[DimProduct]

Average per Category with raw data:

-- Take the Average for each Product Subcategory and attach to the detail level data
SELECT
    AVG(StandardCost) OVER(PARTITION BY [ProductSubcategoryKey]) AS [AverageCostSubPerCat]
   ,productKey
   ,[ProductSubcategoryKey]
   ,StandardCost
FROM [dbo].[DimProduct]

Ranking:

-- Rank Products based on the minimum and maximum cost per Product Subcategory
SELECT [ProductSubcategoryKey]
   ,RANK() OVER(PARTITION BY [ProductSubcategoryKey] ORDER BY StandardCost, [ProductKey]) AS sequence_min
   ,RANK() OVER(PARTITION BY [ProductSubcategoryKey] ORDER BY StandardCost DESC, [ProductKey] DESC) AS sequence_max
   ,[ProductKey]
   ,StandardCost
FROM [dbo].[DimProduct]
WHERE StandardCost IS NOT NULL

Averages excluding certain ranks:

-- Get the Average and Qty. of the products per category excluding the minimum and maximum cost per Product Subcategory (above)
WITH CTE_sequence AS (
  SELECT [ProductSubcategoryKey]
     ,RANK() OVER(PARTITION BY [ProductSubcategoryKey] ORDER BY StandardCost, [ProductKey]) AS sequence_min
     ,RANK() OVER(PARTITION BY [ProductSubcategoryKey] ORDER BY StandardCost DESC, [ProductKey] DESC) AS sequence_max
     ,[ProductKey]
     ,StandardCost
  FROM [dbo].[DimProduct]
  WHERE StandardCost IS NOT NULL
)
SELECT [ProductSubcategoryKey]
  ,COUNT([ProductKey]) AS Quantity
  ,AVG(StandardCost) AS Average_Cost
FROM CTE_sequence
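WHERE sequence_min > 1   -- drop the lowest-cost product in each subcategory
  AND sequence_max > 1   -- drop the highest-cost product in each subcategory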
GROUP BY [ProductSubcategoryKey]

There are more functions that can be used. The user can also dice the data further with HAVING and slice it with GROUP BY CUBE | ROLLUP, which returns the data in a “pivotable” format.
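As a rough illustration of that last point, here is a minimal sketch that uses GROUP BY ROLLUP with HAVING against the same DimProduct table. The column aliases are made up for the example, and the HAVING threshold is arbitrary.

```sql
-- Sketch: one row per product subcategory plus a grand-total row via ROLLUP.
SELECT
    [ProductSubcategoryKey]
   ,GROUPING([ProductSubcategoryKey]) AS [IsGrandTotal]  -- 1 only on the rollup row
   ,COUNT([ProductKey]) AS [Quantity]
   ,AVG(StandardCost) AS [AverageCost]
FROM [dbo].[DimProduct]
WHERE StandardCost IS NOT NULL
GROUP BY ROLLUP([ProductSubcategoryKey])
HAVING COUNT([ProductKey]) > 1;  -- keep only groups with more than one product
```

GROUPING() is there to tell the grand-total row apart from products whose ProductSubcategoryKey is genuinely NULL.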

Allow me to sleep and focus on number crunching [#TSQL2sday]

January 11th, 2011 by OZ | 1 Comment | Filed in Analytics, Database, SQL Server
T-SQL Tuesday #014

It’s 2011 already and the first T-SQL Tuesday for the year has arrived.

This time the event is hosted by Jen McCown (Blog|Twitter), who just received an MVP nomination (congratulations Jen).

I have come up with a list that I want my team to cover this year, but I have two main goals: being able to sleep properly and investing time in number crunching.

What does “allow me to sleep” mean?

Last year we started migrating all production database servers to High Availability solutions. We shall continue this year and finalize it. By the same token, we will homogenize our environment so every single business unit runs on SQL Server 2008 R2. We spent a lot of time working out which processes cannot run on SQL Server 2005 (e.g. change data capture), and I need to make sure that does not happen anymore.

On the storage side, there are 2 projects running in parallel. One is to allow growth on a scalable solution which is already in place (so we stop worrying about hitting volume limits), and the second is to reduce the data file footprint. The first project is trivial, as we can hook up additional trays and expand volumes. The second one is more complex, as we need to identify tables that hold historical data, partition them (if applicable), phase out old data, and archive it without impacting production applications.

On the monitoring side, we are creating a new project to decide whether we shall stick with Management Data Warehouse or get a 3rd party tool to give us a bird’s eye view of our systems at the DB level, define KPIs, and report based on them. We already have proper hardware and service-level monitoring in place.

What does “focus on number crunching” mean?

This has 2 phases.

The first one is to work on a Master Metadata repository project and figure out the relationships between different business entities. Then we will work with Business Owners and Data Analysts to figure out what we should report on, how far back we should go, and which KPIs should be defined. This will allow us to continue building our Data Warehouse, extending ETLs, and letting Business Unit A correlate its data with Business Unit B.

This project is also tied with data file footprint reduction as we will be able to rollup information that shall be available on our DSS without the need of all the transactional data.

The second phase is analytics. This can be tied with data mining and statistics and will definitely involve some of the information we have stored in our cubes. We are still figuring this one out.

That is it for me on the technical side. On the managerial and personal side? I will write about those in future blog posts.

What are your resolutions for 2011?

Table Structure, Trading Maintenance with Performance [SQL Server]

December 9th, 2010 by OZ | No Comments | Filed in Database, SQL Server

We recently migrated a database from SQL Server 2005 x86 to 2008 R2 x64. Running an Enterprise Edition license, we were able to take advantage of partitioning. The database had several tables with similar structures and row counts north of 2 billion.

Old Structure:

  1. 3 fields
    1. TypeId INT
    2. ContentData VARCHAR(100)
    3. insertionDate DATETIME
  2. Unique constraint: TypeId, ContentData. Added InsertionDate to comply with partitioning requirements.
  3. Table had a partition per month.

Pros

  1. In the first week of each new month we marked the previous month's partition as read-only.
  2. A one-time file backup was performed on the previous month's partition.
  3. Deltas (new data) would always be located on the last partition.
  4. The current month's partition would be the only piece left in read-write mode.
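The monthly routine above can be sketched roughly as follows. This assumes each month's partition lives in its own filegroup; the database name, filegroup name, and backup path are all hypothetical.

```sql
-- Sketch: all names and the backup path are hypothetical.
-- Mark the filegroup holding the previous month's partition as read-only.
ALTER DATABASE [SampleDb]
MODIFY FILEGROUP [FG_2010_11] READ_ONLY;
GO

-- One-time backup of that filegroup; its contents will not change afterwards.
BACKUP DATABASE [SampleDb]
    FILEGROUP = 'FG_2010_11'
TO DISK = 'X:\Backup\SampleDb_FG_2010_11.bak';
GO
```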

Cons

  1. Searching for ContentData for a particular TypeId scanned once per partition in use
  2. Bulk data comparison between new and old TypeId + ContentData performed an index seek on each partition

SELECT ContentData
INTO #temp
FROM sampletable
WHERE TypeId = 41;

Table 'sampletable'. Scan count 92, logical reads 7613, physical reads 223, read-ahead reads 7808, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Original Plan

New Structure:

  1. 3 fields
    1. TypeId INT
    2. ContentData VARCHAR(100)
    3. insertionDate DATETIME
  2. Unique constraint: TypeId, ContentData.
  3. Table is partitioned by TypeId (a range of 50 values per partition, e.g. 1, 2, …, 50).
  4. Index on insertionDate with TypeId as an included column
  5. Page Compression enabled.
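A minimal sketch of what such a structure could look like is shown below. The object names (pf_TypeId, ps_TypeId, the constraint and index names) and the boundary values are hypothetical, and making the unique constraint clustered is an assumption. For brevity every partition is mapped to PRIMARY, whereas the real system spreads partitions over several data files.

```sql
-- Sketch only: names, boundaries, and filegroup mapping are assumptions.
-- RANGE LEFT with boundary 50 puts TypeId values up to 50 in partition 1.
CREATE PARTITION FUNCTION pf_TypeId (INT)
AS RANGE LEFT FOR VALUES (50, 100, 150);
GO

CREATE PARTITION SCHEME ps_TypeId
AS PARTITION pf_TypeId ALL TO ([PRIMARY]);
GO

CREATE TABLE dbo.sampletable (
    TypeId        INT          NOT NULL,
    ContentData   VARCHAR(100) NOT NULL,
    insertionDate DATETIME     NOT NULL,
    -- The partitioning column (TypeId) is part of the unique key
    CONSTRAINT UQ_sampletable_TypeId_ContentData
        UNIQUE CLUSTERED (TypeId, ContentData)
        WITH (DATA_COMPRESSION = PAGE)
        ON ps_TypeId (TypeId)
);
GO

-- Aligned index used to retrieve deltas by insertion date
CREATE NONCLUSTERED INDEX IX_sampletable_insertionDate
ON dbo.sampletable (insertionDate)
INCLUDE (TypeId)
WITH (DATA_COMPRESSION = PAGE)
ON ps_TypeId (TypeId);
GO
```

Because TypeId is now the partitioning column and already part of the unique key, insertionDate no longer has to be added to the constraint the way it was in the old structure.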

Pros

  1. Searching for a particular TypeId + ContentData generates a unique index seek plan, 1 scan count (1 partition).
  2. Scanning for ContentData for a particular TypeId generates an index scan for only 1 partition
  3. Bulk Data comparison between new and old TypeId + ContentData performs an index seek in one partition only
  4. Compression reduced I/O reads by 50%

Cons

  1. All datafiles are in read-write mode. Every file needs to be considered in the backup maintenance plan
  2. Any partition can suffer block updates even if it holds data from 4 years ago (cannot be tagged as historical)
  3. Retrieving deltas performs an index seek on every partition

SELECT ContentData
INTO #temp
FROM sampletable
WHERE TypeId = 41;

Table 'sampletable'. Scan count 1, logical reads 5449, physical reads 66, read-ahead reads 5445, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

New Plan

Conclusion:

The table is mostly used for retrieving and comparing data. The retrieval call will always pass in TypeId as parameter. The system now performs an index scan for 1 partition (1 scan count). For data comparison, an index seek is performed for a unique value under 1 partition only. Data compression reduced I/O reads by 50%.
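For anyone who wants to reproduce this kind of check, the sketch below turns on STATISTICS IO to get the scan count and read figures, then looks at partition placement. The partition function name pf_TypeId and the dbo schema are assumptions, so substitute the real names.

```sql
-- Sketch: pf_TypeId and the dbo schema are hypothetical; adjust to your objects.
SET STATISTICS IO ON;

SELECT ContentData
INTO #temp
FROM sampletable
WHERE TypeId = 41;

SET STATISTICS IO OFF;

-- Which partition does TypeId 41 map to?
SELECT $PARTITION.pf_TypeId(41) AS [PartitionNumber];

-- Row counts per partition for the table's heap or clustered index
SELECT partition_number, rows
FROM sys.partitions
WHERE object_id = OBJECT_ID('dbo.sampletable')
  AND index_id IN (0, 1)
ORDER BY partition_number;
```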

In the past, data extraction and comparison scanned through all the partitions generating excessive I/O calls, but we did not need to worry about continuously backing up older partitions as they were already flagged as read-only and backed up.

With this exercise we have effectively increased performance at the expense of maintenance. Well worth the change.

Denali, the next release [SQL Server]

November 17th, 2010 by OZ | No Comments | Filed in Database, SQL Server

After a week at the PASS Summit United 2010 and after catching up at work with several tasks, I am now able to update my blog with some exciting information. SQL Server codename “Denali” CTP1 (version 11.0) was formally released on 11/09/2010. I was able to obtain a DVD copy while at the summit, and now it can be downloaded too.

The key takeaways:

  1. AlwaysOn. This is a full-blown HADR solution that allows multiple asynchronous and synchronous mirrors of a database with a transparent failover option to a local and remote datacenter. This is a feature I am really looking forward to.
  2. VertiPaq embedded in the relational engine. The SQL Server engineers ported the same code used for PowerPivot into the RDBMS. The feature will allow column-level ultra-high compression of repetitive data, which should increase query efficiency dramatically, by at least ten times.
  3. BISM with DAX support. The Business Intelligence Semantic Model in Analysis Services will allow retrieval of data using DAX queries, the same ones used in PowerPivot.
  4. Sequence generator. It is never too late for a feature like this.
  5. Enhanced Windowed Functions.
  6. FileTable support. I have been waiting for this feature since the Oracle 9i days. It will allow us to manage files directly within SQL Server.
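
To give a feel for items 4 and 5, here is a small sketch written against the syntax as it eventually shipped in SQL Server 2012, so CTP builds may behave differently. The sequence name is hypothetical, and the query reuses the DimProduct table from the windowed functions post on this page.

```sql
-- Sketch only: syntax as shipped in SQL Server 2012; CTP builds may differ.
-- dbo.seq_ProductId is a hypothetical sequence name.
CREATE SEQUENCE dbo.seq_ProductId
    AS INT
    START WITH 1
    INCREMENT BY 1;
GO

-- Draw the next value from the sequence
SELECT NEXT VALUE FOR dbo.seq_ProductId AS [NextProductId];
GO

-- Windowed aggregate with ORDER BY and a frame: running cost per subcategory
SELECT
    [ProductSubcategoryKey]
   ,[ProductKey]
   ,StandardCost
   ,SUM(StandardCost) OVER(
        PARTITION BY [ProductSubcategoryKey]
        ORDER BY [ProductKey]
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS [RunningCost]
FROM [dbo].[DimProduct];
```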

There are certainly more enhancements, which will continue to pop up as the product matures.

Now, it is a challenge for us to learn and prepare for the new features after just upgrading to 2008 R2. Well, Microsoft is doing a pretty good job of keeping us busy and I am OK with that. They were kind enough to publish Books Online for Denali, though.
