OVER clause in SQL Server 2012

Changes to the OVER clause

RANGE / ROWS clause

The new ROWS keyword in SQL Server 2012 expression allows us to specify the window frames over which our expressions will be evaluated.

The following ROWS clause, for example, allows us to restrict a calculation to all previous rows in a dataset for the defined partition and the provided ORDER BY clause
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW

Note that CURRENT ROW actually means all rows in the current partition that have the same ordering value (as defined by the ORDER BY clause) as the current one.

Self joins are no longer required to calculate running values.

As an example, the following query will return an event history from an event list. I considered a CRM system where cases/incidents/tickets can be created, closed and reopened. The raw event list should be transformed into an event history, displaying for each incident the periods over which it was opened.

WITH EventList AS (
SELECT *
FROM (
VALUES
(1, ‘NEW’, CAST(‘2012-01-04 15:00:00’ AS datetime))
,(1, ‘CLOSE’, CAST(‘2012-01-04 15:37:00’ AS datetime))
,(2, ‘NEW’, CAST(‘2012-01-05 15:00:00’ AS datetime))
,(2, ‘CLOSE’, CAST(‘2012-01-06 15:00:00’ AS datetime))
,(2, ‘REOPEN’, CAST(‘2012-01-07 15:00:00’ AS datetime))
,(2, ‘CLOSE’, CAST(‘2012-01-09 15:00:00’ AS datetime))
,(1, ‘REOPEN’, CAST(‘2012-01-18 15:00:00’ AS datetime))
)
AS DATA(ID, EventCode, EventDate)
)
SELECT
ID
, EventCode
, EventDate As StartDate
, MIN(CASE WHEN EventCode = ‘CLOSE’ THEN EventDate END)
OVER (PARTITION BY ID ORDER BY EventDate ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
AS EndDate
FROM EventList

Nice and simple.

A questionable design decision

Due to a questionable design decision, the behavior of the new extended OVER clause may be counter-intuitive.

Consider the expression:
COUNT(EventDate) OVER (PARTITION BY ID)

This gives you the number of rows for the current ID.

The following expression is now legal in SQL Server 2012 – earlier versions would have thrown an error (Incorrect syntax near ‘order’).
COUNT(EventDate)
OVER (PARTITION BY ID ORDER BY EventDate )

Since COUNT is an additive function – this implies the order of evaluation does not matter – I expected this expression to return exactly the same results. Furthermore, I expected this expression to return exactly the same result independently of the field specified in the ORDER BY clause.

However, if you use such an expression , you will find out SQL Server actually restricts the partition depending on the fields listed in the ORDER BY clause. The MSDN documentation is quite clear about this ‘If ROWS/RANGE is not specified but ORDER BY is specified, RANGE UNBOUNDED PRECEDING AND CURRENT ROW is used as default for window frame.’

If I provide no ORDER BY or ROWS/RANGE clause, SQL Server will assume RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING.

Why should I expect a different behavior, when I only explicitly modify the ORDER BY clause?

Advertisements

New functions in SQL Server 2012

A quick look at the new functions in SQL Server 2012.

PARSE

This function allows you to parse a string using an optional culture parameter. It is meant for conversions to date/datetime and numeric types. This is a CLR (.Net) function and it expects culture parameters supported by the .Net framework.

An example taken from the MSDN web page:
SELECT PARSE(‘Monday, 13 December 2010’ AS datetime2 USING ‘en-US’) AS Result

The advantages over the CONVERT function are:

  • a wider array of culture/style parameters
  • an improved readability

I noticed a performance hit the first time I ran this ‘query’. I guess the server had to load the .Net assembly first. I wonder if the assembly will stay in memory for the life-time of the DBE service, or if it may unload now and then. This will not be an issue if you do not store dates as text in your database and do your formatting in client applications, which are good practise, and only use this function to import data in a computer-unfriendly format.

TRY_CONVERT

This function is similar to the CONVERT function but it will return NULL instead of casting an error when conversion failed. I had to import data with a string column containing date information. The time stamps had a US or a German format depending on the settings of the user who had pulled the data. Using TRY_CONVERT and COALESCE, importing the data as a DATETIME would have been a breeze.

TRY_PARSE

This function is to PARSE what TRY_CONVERT is to CONVERT.

DATEFROMPARTS, DATETIME2FROMPARTS, …

DATEFROMPARTS lets you easily define a date.

An example from the MDSN web page:
SELECT DATEFROMPARTS ( 2010, 12, 31 ) AS Result

This looks slightly better than
SELECT CAST(‘20101231’ AS DATE)

However, SELECT DATEFROMPARTS ( year, month, day ) looks much better than
CAST(CAST(year* 10000 +month*100 + day AS VARCHAR(8)) AS DATE)

EOMONTH

This function returns the last day of the month for a given date or for the input date offset by a given number of month
SELECT EOMONTH ( GETDATE() ) AS LastDayOfCurrentMonth
SELECT EOMONTH ( GETDATE() ,1) AS LastDayOfNextMonth

CHOOSE

Returns the item at the specified index from a list of values.

Unless you like to hard code labels rather than create lookup tables, you will probably only use this function for application developpement, when users pick items from dropdowns.

Important stuff: Parameter expressions are evaluated only when they are needed:
SELECT CHOOSE(1, 321213  ,1/0)
SELECT CHOOSE(1, 321213  ,’zer’+ 1)

IIF

In case you are too ‘lazy’ to write a CASE statement. SQL Server will translate the IIF expression into the equivalent CASE statement.

Parameter expressions are only evaluated for the relevant branch:
SELECT IIF(1 = 1, 321213 ,1/0)
SELECT IIF(1 = 1, 321213 ,’zer’+ 1)

This is not the same behavior as in SSRS, where an expression might return an error if an irrelevant branch returns an error.

CONCAT

The + sign in SQL can have different meanings:

  1. when used with two strings, it represents a concatenation operator
  2. when used with two numeric values, it represents an addition operator
  3. when used with a string and a numeric value, it represents an addition operator (!!!)

I will make no further comments on the design decision behind point number 3, but thanks to the CONCAT function you now have a way to tell SQL server to implicitly cast numeric values to alphanumeric values, and not the other way around.

FORMAT

To return a value as a formatted string making use of the capabilities of the .Net framework, although you usually had better do the formatting on the client side, this might come handy.

Tip: avoid posting questions on this function in forums where Celko is active.

LOG

You can now specify a base argument for your logarithmic calculations.

%d bloggers like this: