As I’ve built data warehouses and SSAS cubes over the years, I’ve accumulated components that I include in every cube that I build. They exist because they cleanly solved recurring modeling problems and/or made MDX queries lore efficient. I’ll share these with you over time in the hopes that you might find it as useful as I have. The first one of these components is the time measure group.
The problem
Most sales related cubes are going to use the time dimension to try and figure things out such as:
1) Sales divided up by work weeks for a given period. The user needs to know for a given quarter how many sales orders per day were generated on average – but only counting the days the establishment was actually open.
2) Length of time between the first order placed by a customer and the next order
3) How many Fridays in a period
4) How many days a given employee has worked in a period
All of these require counting workdays, weekdays, etc. Granted this can be done via set operations in a time dimension, but I’ve found those to be clumsy and inefficient. SSAS excels in managing aggregates at the cost of being a bit slow in scrolling through individual dimensions and detail data. Let me give you an example of trying to figure out (1) using the Date dimension in AdventureWorks 2008 R2:
with member measures.CountOfWeekDays as count( filter( descendants( [Date].[Calendar].currentmember ,[Date].[Calendar].[Date] ) ,([Date].[Calendar].Properties( "Day of Week" ) < "7" and [Date].[Calendar].Properties( "Day of Week" ) > "1") ) ) member measures.SalesPerDay as [Measures].[Order Count]/measures.CountOfWeekDays select [Date].[Calendar].[CY 2008] on rows , { measures.CountOfWeekDays , [Measures].[Order Count] , measures.SalesPerDay } on columns from [Adventure Works] It’s not pretty, but this works for the day of week. If you want to track holidays and such, you’ll need to add properties to the dimension and add a filter to those as well. In fact, you can load up the dimension with all sorts of properties you’d like to filter on such as holidays, sales days, etc. - and it’ll work... as long as you only have one establishment with one calendar. What if you have multiple stores, for instance? Let’s say some stores are open on Saturday while others aren’t. Then this method won’t work anymore because now these properties no longer belong to one calendar, but to many stores (as many as one per store).
The query that would respond to the second question is downright painful using the time dimension in MDX. First we need to get the first order and the second order, get the dates for each, create a date span with those two dates and count the members. Ouch. But here goes …
with member measures.DaysBetweenFirstandNextOrder as count({filter( [Date].[Date].[All].children ,( [Customer].[Customer].currentmember ,[Measures].[Internet Order Quantity] ) > 0 ) }.item(0):{filter( [Date].[Date].[All].children ,( [Customer].[Customer].currentmember ,[Measures].[Internet Order Quantity] ) > 0 ) }.item(1)) set CustomerList as {head([Customer].[Customer].[All].children,20)} select CustomerList on rows ,measures.DaysBetweenFirstandNextOrder on columns from [Adventure Works]
This gives me the number of days between the first order and the next order for my customers. This is a very poorly performing query which may or may not be improved with some tweaking. It gets even worse if we want to count how many workdays passed between orders. I don’t even want to write that query. One could use the ETL to put in a measure containing the number of days since a previous order, but that would require knowing during the initial design phase that queries like this will come up. Also, you would lose the flexibility the cube offers to answer the question within a given context, such as product line, or specific time period. Besides, on large stores, this can add enormous overhead to an ETL load.
There is a better way to answer these questions straight from the cube: by creating a Date fact table. Essentially this is a fact table that contains the measure “1” for each day, another measure “1” for each weekday, another measure “1” for workdays, etc. This fact table obviously uses the Time dimension but can also use the Store dimension if you need to keep track of workdays by store or location.
Just to illustrate the point, consider creating your time-based fact table from the Date dimension itself (I also created a small bank holiday table to join with this one – but this is optional):
SELECT [DateKey]
,1 as DayCount
,CASE
WHEN DayNumberOfWeek between 2 and 6 then 1
ELSE 0
END as WeekDays
,CASE
WHEN DayNumberOfWeek in (1,7) then 1
ELSE 0
END as WeekEnd
,CASE
WHEN H.[Bank Holiday] = 'TRUE' then 1
ELSE 0
END as BankHoliday
,CASE
WHEN coalesce(H.[Bank Holiday],'FALSE') = 'FALSE' and (DayNumberOfWeek between 2 and 6) then 1
ELSE 0
END as WorkDay
INTO [AdventureWorksDW2008R2].[dbo].[FactDate]
FROM [AdventureWorksDW2008R2].[dbo].[DimDate] T left
JOIN dbo.bank_holidays H ON T.FullDateAlternateKey = H.DateValue
GO
Then just build your fact table based on that table with the Time dimension as the only dimension. You should then be able to create a measure group like this one:
Now let’s look at how we re-write the query for (1) with the new measure group:
with member measures.CountOfWeekDays as [Measures].[Week Days] member measures.SalesPerDay as [Measures].[Order Count]/measures.CountOfWeekDays select [Date].[Calendar].[CY 2008] on rows , { measures.CountOfWeekDays , [Measures].[Order Count] , measures.SalesPerDay } on columns from [Adventure Works]
It looks simple to understand, needs no cumbersome filtering, and offers the flexibility to examine sales by bank holidays or total work days.
Here is the second query designed to count the weekdays:
with member measures.DaysBetweenFirstandNextOrder as sum({filter( [Date].[Date].[All].children ,( [Customer].[Customer].currentmember ,[Measures].[Internet Order Quantity] ) > 0 ) }.item(0):{filter( [Date].[Date].[All].children ,( [Customer].[Customer].currentmember ,[Measures].[Internet Order Quantity] ) > 0 ) }.item(1),[Measures].[Week Days]) set CustomerList as {head([Customer].[Customer].[All].children,20)} select CustomerList on rows ,measures.DaysBetweenFirstandNextOrder on columns from [Adventure Works]
It’s of course hard to feel the full benefits of the performance gains with AdventureWorks because of the modest size of the database compared to most real-world applications, but you can certainly appreciate the ease and flexibility this system offers as compared to trying to use the Date dimension to count types of days. And if you have separate regional calendars each with their own workdays and bank holidays, then all you have to do is add a Location or Store ID to the fact table, add the dimension to it, and you’ll be able to apply the same logic by Geographical properties.
And remember, this is a small fact table that only has to be built and processed once. It requires no maintenance after it’s been built.
Categories: Data Warehouse, MDX, Performance, SSAS
Leave a Reply