## Equivalent of VLOOKUP in DAX – Part II – using TOPN

In my previous entry, I presented the LOOKUPVALUE function. In this article, I will explain how you can use the TOPN function to mimic the behavior of VLOOKUP when doing “approximate” searches.

Ken Puls’s article presents a typical use case for this pattern. Let me start with briefly sketching his problem and describe how he solves it.

## The effective tax rate problem

Ken uses data representing sales at a daily level. His problem is to determine the effective tax rate for a given row, when tax rates change over time.

In his article, he uses meaningful sales data. Since the core of the problem is to find the relevant tax rate for a specific date, and I want to focus on that, my ‘Data’ table looks like this:

 Date 06/01/2010 07/01/2010 08/01/2010 03/01/2013 04/01/2013 05/01/2013

Here is the ‘Tax Rates’ table. [Start Date] represents the day when a tax rate became effective.

 Start Date Tax Rate 07/01/2010 0,12 04/01/2013 0,05

Note that Gerhard Brueckl’s problem is similar: he presents a way to assign the relevant fiscal period to a date in a dimension table. In his model, fiscal periods are defined by their start date as well.

## A solution

Ken Puls’ idea is to add a calculated column to his data. This column will contain the value of [Start Date] for the relevant tax rate. He will then create a relationship between the ‘Data’ table and ‘Tax Rates’ table based on this new column.

His calculated column contains the MAX value of [Start Date] for all rows of ‘Tax Rate’ where ‘Tax Rate’[Start Date] is not greater than the current date in the data table. In other words, he replicates the logic of the following Excel formula:

`= VLOOKUP( [Date] , 'Tax Rates', 1, TRUE)`

The DAX expression for a calculated column looks like this:

```[Tax Rate Start Date] :=
CALCULATE(  MAX( 'Tax Rates'[Start Date])
, FILTER('Tax Rates', 'Tax Rates'[Start Date] <= 'Data'[Date] )
)```

It filters the ‘Tax Rates’ table, and calculates the max value of ‘Tax Rates'[Start Date] for the filtered table.

## Extending the solution (TOPN)

We can use TOPN to mimic the behaviour of VLOOKUP.

For an approximate search, we will follow this logic:

• use a FILTER expression to eliminate irrelevant rows (as in the previous formula)
• use the TOPN function to filter the result and only return the last row in ‘Tax Rates’
• return the value in the required column for the row we just found.

Here, “the last row”  means the row with the latest [Start Date].

Let us rewrite Ken’s expression using the TOPN function:

```[Effective Tax Rate - Start Date] :=
CALCULATE (  VALUES('Tax Rates'[Start Date])
, TOPN(1
, FILTER('Tax Rates', 'Tax Rates'[Start Date] <= 'Data'[Date]  )
, 'Tax Rates'[Start Date]
)
)```

The advantage of this pattern is that we can use it to query any column in our lookup table:

```[Effective Tax Rate - Rate] :=
CALCULATE ( VALUES('Tax Rates'[Tax Rate])
, TOPN(1
, FILTER('Tax Rates', 'Tax Rates'[Start Date] <= 'Data'[Date]  )
, 'Tax Rates'[Start Date]
)
)```

## Dealing with ties

There might be a problem with the previous formula: TOPN may return more than 1 row. This will be the case, if there are ties in your data.

In our example, if the ‘Tax Rates’ table has several entries for the same [Start Date], then the VALUES expression might return more than one value. In that case, the expression will fail.

Examples:

 Start Date Tax Rate Start Date Tax Rate 07/01/2010 0,12 07/01/2010 0,12 04/01/2013 0,05 04/01/2013 0,05 07/01/2010 0,12 07/01/2010 0,13 The formula still works. The formula returns an error.

This may be totally fine. Most of the time, this is what we want: if a reference table contains contradictory information, we want to notice it.

However, in some cases, you may want your formula to ignore that, and take one of these values anyway.

In that case, you can add a sort expression to the TOPN expression, so as to guarantee that only one row (or value) is returned.

Edit: Although this is what I first wrote below, SAMPLE does not behave like a TOPN with ties. How I came to this conclusion is unclear, as I can no longer find the test data I used. Although, this might seem to be true in this specific case, you should not assume this will work every time.

However, you can wrap the TOPN expression within a SAMPLE expression to only get one row (or more if required).

You may also use the SAMPLE function instead of the TOPN function. What does it do? Basically the same as TOPN except in case of a tie. In that case, SAMPLE will exactly return the required number of rows.

• Which rows? The engine will decide by itself.
• Will it always return the same rows? This is not guaranteed.
• Will the rows returned change on each execution? This is not guaranteed.

As a side-note, if you want to write a query that returns exactly 3 rows from your ‘Data’ table, but want these rows to be randomly drawn between each execution, then you can do the following.

`EVALUATE SAMPLE( 3,'Data', RAND() )`

## Equivalent of VLOOKUP in DAX–Part I

One frequent DAX requirement is to write a formula that behaves like the VLOOKUP function in Excel.

Gehrard Brueckl recently blogged about how to map a date to a fiscal period, when fiscal periods are defined in a separate table, and each period is defined by its start date. Ken Puls recently wrote an article about calculating an effective tax rate with DAX.

In this first post, I will present the LOOKUPVALUE function.

VLOOKUP, a classic Excel function, accepts two values (TRUE or FALSE) for its last argument.

Assuming we defined names for Value, Range, i, the formula will look like this:
=VLOOKUP(Value, Range, i, FALSE)

VLOOKUP will look for the first occurrence of Value in the first column of Range. If the value cannot be found, the formula will return #NA, else it will return the value in the i-th column on the same row.

Note that VLOOKUP also works with arrays.

When the last parameter is TRUE (or omitted), then VLOOKUP requires your data to be sorted in ascending order. It will look for the last value in the first column of Range that is not greater than Value, and will return the value in the i-th column on the same row, or #NA if none exists.

More details on VLOOKUP here, there, or everywhere on the internet.

# LOOKUPVALUE

## Syntax

DAX V2 introduced a LOOKUPVALUE, that has a related behavior.

A notable difference is that LOOKUPVALUE allows you to provide criteria on several columns. In Excel formulas, this would require you to use concatenate columns, or even replace VLOOKUP with some INDEX/MATCH formula.

The syntax looks like this:
= LOOKUPVALUE( Table[OutputColumn], Table[LookupColumn1], “SearchValue1”,  Table[LookupColumn1],  “SearchValue2”)

The syntax looks like the vector form of the LOOKUP function, if you put the order of arguments aside. It actually looks like the syntax of the LOOKUPS function, which does not actually exist.

Basically, you first have to define which column contains your output, then define your first lookup column, then provide the value you are looking for, then define your second lookup column, and so on …

## Behavior

LOOKUPVALUE behaves differently from VLOOKUP:

1. whereas VLOOKUP works with arrays, LOOKUPVALUE does not work with column expressions. In other words, the columns arguments must reference columns that physically exist in your model. Also, remember that each column must belong to the same table.
2. if several rows in your table match your criteria, then VLOOKUP in Excel will only return the first (or last) match. On the contrary, LOOKUPVALUE in DAX may return an error if several rows match your criteria. This will occur if [OutputColumn] does not contain the same value for all matching rows.
3. a BLANK value is returned if there is no match.

Also, not surprisingly, you must address columns by names and cannot use indexes to do this.

## When to use LOOKUP value?

When your lookup table is a parameter of your model. By that, I mean a static table that you can import in your model.

Obviously, when the behavior described above suits your requirements.

When you want to calculate search values on the fly, or use different search values that come from different tables. In other words, when you cannot use PowerPivot relationships.