Equality and trailing spaces in SQL

While reading posts on a forum, my attention was caught by a SQL oddity causing equality comparisons on strings with trailing spaces to possibly return unexpected results.

For example,
‘abc’ = ‘abc         ‘
will be considered to be true.

A fellow blogger provides more information about this.

The keyword for this behavior is “padding”.

To keep it simple, before a comparison between two strings is made, the server will add trailing spaces to the shorter one to make sure both strings have the same size. According to the post mentioned above, this is standard compliant.

More information can also be found in this article (MS knowledge base).

ARGMAX and ARGMIN in Excel

A blogger was wondering, whether it was possible to retrieve a reference to the cell where the result of the functions MAX or LARGE was found. With the proper combination of INDEX and MATCH, this is indeed possible.

Basically, the idea is to use the result of the MAX (or MIN or LARGE function) as an input for the standard INDEX-MATCH formula.

Suppose the range containing your data is called data_range.

third_largest = LARGE(data_range,3) returns the 3rd largest value in data_range.
p = MATCH(third_largest ,data_range,0) returns the relative position within data_range

The following returns a reference to the cell placed on the p-th row of the provided data range:

If your data was arranged horizontally, you would use
which tells excel to look at the first row and p-th column of the data range.

Put together, if  your data is arranged in a column, this gives:

If you enter this formula on a spreadsheet, it will look like it returns the same value as the simpler LARGE(data_range,3) formula. This only appears to be so. When applied to a range, the INDEX function really returns a reference to a cell.

Practically, this means you can nest this formula in functions that take ranges as arguments. Like OFFSET, ROW, COLUMN, …

For example,
returns the absolute row number at which the first occurrence of the value was found.

This technique has its limits of course: it only allows you to retrieve the first cell which meets the given value.

Is it possible to retrieve all values for which a MAX value is reached?

Yes, but it quickly becomes more complex:

This formula, when entered as an array formula, will return all absolute row numbers of the cells that contain the max value.

The IF part checks whether the cells in data_range contain the max value. If this is the case, it will return the absolute row number for the range, else it will return the value FALSE. The ROW-OFFSET part is a trick to dynamically build an array containing the values 1 to N, where N stands for the number of cells containing the max value The LARGE part, in our example, will return an array with the relevant values. Note the large function ignores boolean values (TRUE/FALSE).

Let me note, at this point, that before you write such a formula in your worksheets, you may want to check first whether no other option is available (writing intermediate formulas, UDF in VBA, pivot table, …)

Some interesting facts about CUBE functions in Excel

Did you know that … ?
– references to a cell containing a CUBE function return an OLAP object, just as the source cell;
– range operations on cells containing CUBE formulas return ranges of OLAP objects, that you can use in other CUBE functions;
– a copy & paste as value operation will remove the formula, but the destination cell will contain a new OLAP object containing the original definition.

It means you can:
– use your favorite functions (INDEX, VLOOKUP, …) on ranges of cells containing cube functions and use the result in a CUBEVALUE function;
– let end users choose the sets and measures they want to see with simple data validation lists that use range of cells containing CUBE formulas. You might not even need to use any lookup functions or IF statements nested within a CUBE formula to build your dashboard. Just keep it simple.

The fact that copy & paste as value operations on CUBE formula create OLAP objects may hurt you:
1. The objects remain bound to the source, as can be seen when trying such an operation in another workbook;
2. The objects will be updated with every connection refresh. The fact the target no longer contains any formula, may lead end-users to think they made a backup of earlier values, whereas the result will be updated on every connection;
3. The result of a copy & paste as value operation will retain the original definition. For example, copying a cell containing a CUBESET function will keep its references to the same members even after the original set or cells referenced have been modified. However, if the definition referred to a named set, then the result object will also change when this named set changes.

%d bloggers like this: