Not in every scenario. But as a rule of thumb, test before you perform a split operation in a production environment.
Every year around this time we plan, test and deploy partitions for tables that will hold historical data from the next year. We typically create one partition per month and in one case a partition per day (which has its own sliding window schedule every month).
This has been a very busy year, and between the migrations and consolidations, a historical table did not have partitions added for October and November. I decided to keep the data as-is and just add new partitions starting on 11/12, one for each day. This table holds around 30 million rows per day.
Now, we usually partition tables using RIGHT functions so that the last partition stays empty and can be split ahead of time without any noticeable locking. The table I mention in the paragraph above uses a LEFT partition function, and all the current data is sitting in the rightmost partition.
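The difference between LEFT and RIGHT comes down to which partition owns the boundary value itself. As a minimal sketch, the two throwaway functions below (pf_demo_left and pf_demo_right are hypothetical names, not part of our environment) share one boundary, and $PARTITION shows where that boundary value lands in each case. GO is the batch separator used by SSMS and sqlcmd.

```sql
-- Hypothetical demo functions; they are dropped at the end.
CREATE PARTITION FUNCTION pf_demo_left (datetime)
    AS RANGE LEFT FOR VALUES ('20101112');
CREATE PARTITION FUNCTION pf_demo_right (datetime)
    AS RANGE RIGHT FOR VALUES ('20101112');
GO

-- RANGE LEFT: the boundary value belongs to the partition on its left (1).
-- RANGE RIGHT: the boundary value belongs to the partition on its right (2).
SELECT $PARTITION.pf_demo_left(CAST('20101112' AS datetime))  AS left_partition,
       $PARTITION.pf_demo_right(CAST('20101112' AS datetime)) AS right_partition;
GO

DROP PARTITION FUNCTION pf_demo_left;
DROP PARTITION FUNCTION pf_demo_right;
GO
```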
Splitting the rightmost partition while it holds data forces the engine to scan and move the data in that partition. With a LEFT function the new boundary value belongs to the partition on its left, so the existing rows sit on the side of the boundary that the engine treats as the new partition and have to be moved. This can be fine in systems with a few million rows, but not with a partition holding 500+ million rows. We tried it in a test environment and, as expected, it locked the table for a very long time. No DML operations could run against it until the split finished.
-- Using new filegroup
ALTER PARTITION SCHEME [ps_Left_day_sliding] NEXT USED [OEN_Lists_2010_11];
ALTER PARTITION FUNCTION [pf_Left_day_sliding] () SPLIT RANGE ('2010-11-12 00:00:00.000');
This is the table which has a partition per day.
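Before issuing a split like the one above, it helps to see how many rows each partition currently holds. The query below is one possible check, assuming the table is dbo.batchresults_buffer (substitute your own table name). The rows column in sys.partitions is an approximate count, which is enough to tell an empty partition from one holding hundreds of millions of rows.

```sql
-- Row count per partition for one table (table name is an assumption).
SELECT p.partition_number,
       p.rows
FROM sys.partitions AS p
WHERE p.object_id = OBJECT_ID(N'dbo.batchresults_buffer')
  AND p.index_id IN (0, 1)   -- heap or clustered index only, skip nonclustered indexes
ORDER BY p.partition_number;
```

If the partition that the new boundary falls into shows anything other than zero rows, test the split on a copy first.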
We are not going to pursue that route. We will create a brand new table with proper RIGHT partitions and swap it with the current one. We have that liberty because the data is purely historical, and DBAs retrieve from it when required. They will know where to look after reading the documentation they write after performing such a change.
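The swap itself can be as simple as two renames inside one transaction. This is only a sketch under assumptions: dbo.history_table and dbo.history_table_new are hypothetical names for the current table and the new RIGHT-partitioned table, and THROW requires SQL Server 2012 or later (older versions would use RAISERROR instead).

```sql
BEGIN TRY
    BEGIN TRANSACTION;

    -- Move the current table out of the way (hypothetical names).
    EXEC sp_rename N'dbo.history_table', N'history_table_old';

    -- Put the new RIGHT-partitioned table in its place.
    EXEC sp_rename N'dbo.history_table_new', N'history_table';

    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    -- Undo both renames if either one failed.
    IF @@TRANCOUNT > 0
        ROLLBACK TRANSACTION;
    THROW;
END CATCH;
```

Note that sp_rename does not rename indexes or constraints, so those keep the names they were created with.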
We went ahead and also tested with RIGHT, just to make doubly sure. It worked just fine.
-- The 2010-11-12 boundary is not created here; the SPLIT at the end of the script adds it.
CREATE PARTITION FUNCTION [pf_Left_day_sliding1](datetime) AS RANGE RIGHT FOR VALUES (N'2010-09-03T00:00:00.000', N'2010-09-04T00:00:00.000', N'2010-09-05T00:00:00.000') ;
CREATE PARTITION SCHEME [ps_Left_day_sliding1] AS PARTITION [pf_Left_day_sliding1] TO ([PRIMARY], [PRIMARY], [PRIMARY], [PRIMARY]) ;
CREATE TABLE [dbo].[batchresults_buffer1](
[column1] [bigint] NOT NULL,
[column2] [datetime] NOT NULL,
[column3] [int] NOT NULL,
[column4] [int] NOT NULL,
[column5] [int] NOT NULL,
[column6] [varchar](200) NOT NULL,
) ON [ps_Left_day_sliding1]([column2]);
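CREATE UNIQUE CLUSTERED INDEX [PK__batchresults_buf__2077C861] ON [dbo].[batchresults_buffer1]
(
[column1] ASC, [column2] ASC -- key columns are an assumption (the original list is missing); a unique index on a partitioned table must include the partitioning column
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [ps_Left_day_sliding1]([column2]) ;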
INSERT INTO [batchresults_buffer1] (column1, column2, column3, column4, column5, column6)
SELECT column1, column2, column3, column4, column5, column6
FROM batchresults_buffer; -- This table contains sample data, 50 million rows.
ALTER PARTITION SCHEME [ps_Left_day_sliding1] NEXT USED [OEN_Lists_2010_11];
ALTER PARTITION FUNCTION [pf_Left_day_sliding1] () SPLIT RANGE ('2010-11-12 00:00:00.000');
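After a split, the catalog views show whether the new boundary landed where expected. The query below is a sketch that lists each partition of the scheme [ps_Left_day_sliding1] with its filegroup and the boundary value that belongs to it. The join condition on boundary_id is my own shorthand for handling both LEFT and RIGHT functions, not something taken from our deployment scripts.

```sql
SELECT dds.destination_id AS partition_number,
       fg.name            AS filegroup_name,
       prv.value          AS boundary_value   -- NULL for the partition that owns no boundary
FROM sys.partition_schemes AS ps
JOIN sys.partition_functions AS pf
      ON pf.function_id = ps.function_id
JOIN sys.destination_data_spaces AS dds
      ON dds.partition_scheme_id = ps.data_space_id
JOIN sys.filegroups AS fg
      ON fg.data_space_id = dds.data_space_id
LEFT JOIN sys.partition_range_values AS prv
      ON prv.function_id = pf.function_id
     -- RIGHT: partition N owns boundary N-1; LEFT: partition N owns boundary N.
     AND prv.boundary_id = dds.destination_id - CAST(pf.boundary_value_on_right AS int)
WHERE ps.name = N'ps_Left_day_sliding1'
  AND dds.destination_id <= pf.fanout          -- skip a pending NEXT USED filegroup
ORDER BY dds.destination_id;
```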
I do not see any need for us to use LEFT. All future deployments will use RIGHT instead.
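With a RIGHT function and an empty last partition, the daily boundaries can be added ahead of time in a loop. The sketch below assumes the test function and scheme from the script above, a single target filegroup, and a date range picked purely for illustration. Each iteration splits the still-empty rightmost partition.

```sql
DECLARE @day datetime;
DECLARE @last_day datetime;
SET @day = '20101113';        -- first boundary to add (assumed)
SET @last_day = '20101130';   -- last boundary to add (assumed)

WHILE @day <= @last_day
BEGIN
    -- Each split consumes the NEXT USED filegroup, so set it every time.
    ALTER PARTITION SCHEME [ps_Left_day_sliding1] NEXT USED [OEN_Lists_2010_11];
    ALTER PARTITION FUNCTION [pf_Left_day_sliding1]() SPLIT RANGE (@day);

    SET @day = DATEADD(DAY, 1, @day);
END;
```

Run it only while the rightmost partition is still empty. Once data has arrived past the last boundary, the split goes back to moving rows.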