I have been thinking about this for a while, a series of blogs where I try to explain how a particular piece of DAX code works.
For this one I am going to share 2 examples and explain their workings and how they can still compute the correct result.
For First example we have ADDCOLUMNS with SAMEPERIODLASTYEAR and we will see how they work and are still able to compute the correct result even though there is no filter context, which is created in case of SUMMARIZECOLUMNS with SAMEPERIODLASTYEAR.
This is how the model looks like it only has 2 tables:
In the below example we see the code returns a table of 3 columns Year, Sales Amount and Sales Amount Previous Year, the goal here is to understand how SAMEPERIODLASTYEAR is actually able to work correctly here even though there is no Filter Context created by ADDCOLUMNS and some other minute details that are not evident if you don't pay enough attention.
I think I will start from the very basics of Context Transition performed by the [Sales Amount] measure so that even a new user can understand the SAMEPERIODLASTYEAR section easily.
So now that the working of how Sales Amount is calculated is established we can try to reason how SAMEPERIODLASTYEAR, which is a function that transforms dates into previous year's date is actually working inside ADDCOLUMNS, because when SAMEPERIODLASTYEAR is paired with CALCULATE inside SUMMARIZECOLUMNS we know it is operating under the Filter Context created by SUMMARIZECOLUMNS but in case of ADDCOLUMNS there is no Filter Context that gets created by ADDCOLUMNS, instead it gives us the Row Context for new columns.
Recap of CALCULATE's evaluation order:
Step 1: CALCULATE first evaluates its filter arguments in the original filter context where it is called, it can be empty like a Calculated Column/Table or it can have many columns in filter context like a Visual, a predicate statement like
Products[Brand] = "Contoso"
is evaluated in the filter context outside CALCULATE, even though external context on the same column won't impact it because the predicate statement is expanded into
FILTER ( ALL ( Products[Brand] ), Products[Brand] = "Contoso" )
but if your filter argument of CALCULATE was
FILTER ( VALUES ( Products[Brand] ), Products[Brand] = "Contoso" )
then it would respect the original filter context.
Step 2: Performs Context Transition on the Row Context where it is called and converts all the values of all the column of the current row into equivalent filter context which filters the FIRST argument of CALCULATE
ProductKey | Brand | Color | Category | Calculated Column |
1 | Contoso | Red | Audio | xyz |
A table like above becomes this in the code for the calculated column, this is just imaginary:
CALCULATE (
[Measures],
Products[ProductKey] = 1,
Products[Brand] = "Contoso"
Products[Color] = "Red"
Products[Category] = "Audio"
)
Well logically, the context transition generally in this exact situation depends on the primary key if that is available, but since you can use CALCULATE modifiers after context transition to alter the effect of different columns it helps in visualizing that all columns take part in the context transition.
Step 3: Application of CALCULATE modifiers take place in the next step, functions such as ALL, REMOVEFILTERS, ALLEXCEPT, ALLSELECTED, ALLNOBLANKROW, USERELATIONSHIP, CROSSFILTER, KEEPFILTERS etc, they all are CALCULATE modifiers, after context transition CALCULATE evaluates them so that they can alter the behaviour of the context transition or relationships if any, if not then alter the behaviour of the existing filter context.
Step 4: Apply the predicate statements and filter arguments evaluated in Step 1 and apply them to the filter context, either overwrite them or intersect them using KEEPFILTERS.
So now that we have a understanding of the CALCULATE's order of evaluation let's break down the following code:
ADDCOLUMNS (
VALUES ( Dates[Year] ),
"Sales Amount", [Sales Amount],
"Sales Amount PY",
CALCULATE (
[Sales Amount],
SAMEPERIODLASTYEAR ( Dates[Date] )
)
)
The first step is to evaluate the filter argument of CALCULATE which in this case is SAMEPERIODLASTYEAR, this could have been any other function like DATEADD, MIN, MAX etc.
As a user you might think SAMPERIODLASTYEAR will be or is always evaluated in the Filter Context, but if you remember I told you earlier ADDCOLUMNS doesn't create a Filter Context which is done by SUMMARIZECOLUMNS, so how it works?
SAMEPERIODLASTYEAR invokes context transition. ( In case of SUMMARIZECOLUMNS it will fetch active dates from the active filter context )
Let that sink in, actually all of the Time Intelligence functions invoke Context Transition when they are called in row context, this is their default nature. internally you could say SAMPERIODLASTYEAR looks like
SAMEPERIODLASTYEAR ( CALCULATETABLE ( DISTINCT ( Dates[Date] ) )
and since SAMPERIODLASTYEAR is just a syntax sugar for DATEADD:
DATEADD (
CALCULATETABLE ( DISTINCT ( Dates[Date] ) ),
-1,
YEAR
)
So if this function invokes context transition then it means it convert the current Year in the row context and move it into filter context under which it will get the dates of the current year and transform them into previous year, so you can think of the evaluation like this:
But here is a thing, not only SAMEPERIODLASTYEAR but the CALCULATE also performs a context transition which converts Dates[Year] into Filter context so there is an intersection of the current year and dates of the previous year, ideally in this situation the result should be blank because no date can exist in 2 years.
For Example, for 2019 SAMEPERIODLASTYEAR performs a context transition through which it gets the dates of 2018 and CALCULATE also initiates context transition on Dates[Year] and gets 2019 and creates a filter context under which the first argument of CALCULATE will be evaluated, if we think about it a calculation that returns 2 different year and dates should return a blank.
As you can see from above image only when the Dates and Years are from the same year we get a value otherwise it returns a blank, so why is that the original calculation is working fine?
This has to do with the fact that the relationships between Dates and Sales is based on a Date field and whenever Date Column is used in a relationship, the DAX Engine automatically applies a REMOVEFILTERS ( Dates ) in each calculation that you create.
To demonstrate this, I made a duplicate of the Date Column and the Date table is currently not marked as a Date Table
What happens if I use this column in the SAMEPERIODLASTYEAR? It returns BLANK as I explained above.
So what happened?
Step 1: CALCULATE evaluates its filter argument i.e. SAMEPERIODLASYEAR, which converts Dates[Year] into filter context through context transition and evaluates Dates[Date2] and returns the Dates2 of the Previous Year
Step 2: CALCULATE performs the Context Transition and converts the currently iterated Dates[Year] into filter context and keeps it aside this will only impact the first argument of CALCULATE
Step 3: Apply CALCULATE modifiers, in this case we don't have any, but in case of mark as date table there is a REMOVEFILTERS ( Dates ).
Step 4: Apply the filter context and evaluate the first argument, since this time the dates list and year are from 2 different years the result is a blank
What if I replicate the behavior of Date based Relationships by adding a REMOVEFILTERS/ALL modifier?
It works again! So what happened?
using REMOVEFILTERS I removed the context transition created by outer CALCULATE and not the context transition created by inner CALCULATE/SAMEPERIODLASTYEAR.
To demonstrate I can remove the context transition created by SAMEPERIODLASTYEAR by expanding the code and introducing a REMOVEFILTERS locally for SAMEPERIODLASTYEAR which just returns the grand total, this ignores the year transformed into filter context as shown below:
So what you should learn from this example is if a relationship between 2 tables is based on Date/DateTime then if that column is filtered either directly or through context transition like SAMEPERIODLASYEAR or any other time intelligence function, then it will ignore the filters coming from other columns of the Date table.
So what if the relationship isn't based on a Date/DateTime column how would calculations work then?
In those cases either you manually add a REMOVEFILTERS in each calculation that requires removal of filter from other columns or you use the feature "Mark as Date Table" so that engine know that it has to add REMOVEFILTERS every time the Date Column is filtered.
To demonstrate this I have added a DateKey text column on both Date and the Sales Table:
Dates:
Sales:
And the new relationship uses these 2 columns:
After this if I run the same code it returns blank:
Solution? Mark as Date Table! I marked the Dates[Date] as the Date column, because of this small change the engine will now start adding REMOVEFILTERS in each calculation that applies a filter on the Dates[Date] column.
See the red arrow pointing towards the time icon on the Dates table.
At this point some of you might think "but how is Date Column getting directly filtered?" the thing is SAMPERIODLASTYEAR is retrieving the dates of the previous year and then CALCULATE applies those dates to the Dates[Date] column under which the Measure is evaluated, this is where the Dates[Date] column is getting filtered.
Another common example of Context Transition performed by Time Intelligence function is to use FIRSTDATE/LASTDATE instead of MIN/MAX.
If I write the above calculation and try to get a count of dates that are prior to the current date then FIRSTDATE/LASTDATE just return the total count of rows in the Dates table.
The reason being LASTDATE is initiating context transition for each row of ALL ( Dates[Date] ) so each rows passes the check because it is always equal to itself, so for each cell in a visual FILTER is always going to return ALL the dates because every row passed the check.
Since LASTDATE is filtering the Date[Date] column it is ignoring the existing Filter Context on Dates[Date] created by SUMMARIZECOLUMNS because the relationship is based on the Date Column so there is an implicit REMOVEFILTERS.
And I can prove that easily by referring to the Logical Query plan that the DAX Engines generate.
On line 2 the query plan says
Scan_Vertipaq: RelLogOp DependOnCols()() 0-0 RequiredCols(0)('Dates'[Date])
Which means the DAX's Formula Engine is requesting the column Dates[Date] from Storage Engine because of the SUMMARIZECOLUMNS ( Dates[Date] part of the code.
On Line 5 There is another retrieval of the Date column but this is different from Line 2 as this time it is marked with 1 while the previous execution was marked with 0.
Scan_Vertipaq: RelLogOp DependOnCols()() 1-1 RequiredCols(1)('Dates'[Date])
On Line 6 where the <= comparison happens it says DependOnCols(1) Dates[Date] which means LASTDATE & Dates[Date] depends on ALL ( Dates[Date] ) for context Transition and this is why we get grand total.
LessThanOrEqualTo: ScaLogOp DependOnCols(1)('Dates'[Date]) Boolean DominantValue=NONE
On Line 9 you can see LastDate operator which refers to LASTDATE function and it also only Depends on (1) Dates[Date] and not (0) Dates[Date] which is outside the Measure
LastDate: RelLogOp DependOnCols(1)('Dates'[Date]) 2-2 RequiredCols(1, 2)('Dates'[Date], 'Dates'[Date])
using MAX instead of LASTDATE
When the LASTDATE is replaced with MAX not only the calculation is correct but MAX now doesn't depend on ALL but depends on SUMMARIZECOLUMNS ( Dates[Date] for its value, i.e MAX doesn't initiate context Transition but gets evaluated under Filter Context.
On line 2 Engine gets the Dates[Date] column for SUMMARIZECOLUMNS which is marked with RequiredCols(0)
Scan_Vertipaq: RelLogOp DependOnCols()() 0-0 RequiredCols(0)('Dates'[Date])
Then after Filter Operator, on Line 5 there is a Scan_Vertipaq operator that retrieves the Date Column for ALL ( Dates[Date] ) which is marked with RequiredCols(1)
Scan_Vertipaq: RelLogOp DependOnCols()() 1-1 RequiredCols(1)('Dates'[Date])
On Line 6 the comparison operation this time depends on both External and Internal Date column, this is why it says DependOnCols(0, 1)('Dates'[Date], 'Dates'[Date]), 0 = External / SUMMARIZECOLUMNS ( Dates[Date] ), 1 = Internal / ALL ( Dates[Date] ).
LessThanOrEqualTo: ScaLogOp DependOnCols(0, 1)('Dates'[Date], 'Dates'[Date]) Boolean DominantValue=NONE
Then on line 8 there is Max_Vertipaq operator which returns the value of MAX ( Dates[Date] ) and it only depends on External Dates[Date] 0 and not Internal Dates[Date] 1
Max_Vertipaq: ScaLogOp DependOnCols(0 ('Dates'[Date]) DateTime DominantValue=BLANK
So hopefully this long blog helps in understanding how both SAMERIODLASTYEAR and FIRSTDATE are actually performing the context transition and can return different results in different setups.
Let me know in the comments if this blog was useful in anyway and if there are some topics or DAX code that you have written or seen but do not understand their workings. I will see you in the next deep dive soon.
You can download the Power BI project from here: using https://download-directory.github.io/ and GitHub Folder
Any idea why this works even when the Date table isnt marked as a Date EVALUATE
ADDCOLUMNS (
VALUES ( Dates[Year] ),
"Sales Amount", [Sales Amount],
"Sales Amount PY",
CALCULATE (
[Sales Amount],
SAMEPERIODLASTYEAR(Dates[Date]),
ALL(Dates)
)
)
Thanks for sharing. I'm not sure I understood everything; I need to reread it.