Improve your MDX with “Time” Fact Tables

As I’ve built data warehouses and SSAS cubes over the years, I’ve accumulated components that I include in every cube that I build. They exist because they cleanly solved recurring modeling problems and/or made MDX queries lore efficient. I’ll share these with you over time in the hopes that you might find it as useful as I have. The first one of these components is the time measure group.

The problem

Most sales related cubes are going to use the time dimension to try and figure things out such as:

1)      Sales divided up by work weeks for a given period. The user needs to know for a given quarter how many sales orders per day were generated on average – but only counting the days the establishment was actually open.

2)      Length of time between the first order placed by a customer and the next order

3)      How many Fridays in a period

4)      How many days a given employee has worked in a period

All of these require counting workdays, weekdays, etc. Granted this can be done via set operations in a time dimension, but I’ve found those to be clumsy and inefficient. SSAS excels in managing aggregates at the cost of being a bit slow in scrolling through individual dimensions and detail data. Let me give you an example of trying to figure out (1) using the Date dimension in AdventureWorks 2008 R2:

The query that would respond to the second question is downright painful using the time dimension in MDX. First we need to get the first order and the second order, get the dates for each, create a date span with those two dates and count the members. Ouch. But here goes …

 

This gives me the number of days between the first order and the next order for my customers. This is a very poorly performing query which may or may not be improved with some tweaking. It gets even worse if we want to count how many workdays passed between orders. I don’t even want to write that query. One could use the ETL to put in a measure containing the number of days since a previous order, but that would require knowing during the initial design phase that queries like this will come up. Also, you would lose the flexibility the cube offers to answer the question within a given context, such as product line, or specific time period. Besides, on large stores, this can add enormous overhead to an ETL load.

There is a better way to answer these questions straight from the cube: by creating a Date fact table. Essentially this is a fact table that contains the measure “1” for each day, another measure “1” for each weekday, another measure “1” for workdays, etc. This fact table obviously uses the Time dimension but can also use the Store dimension if you need to keep track of workdays by store or location.

Just to illustrate the point, consider creating your time-based fact table from the Date dimension itself (I also created a small bank holiday table to join with this one – but this is optional):

Then just build your fact table based on that table with the Time dimension as the only dimension.  You should then be able to create a measure group like this one:

Measure Group

Now let’s look at how we re-write the query for (1) with the new measure group:

It looks simple to understand, needs no cumbersome filtering, and offers the flexibility to examine sales by bank holidays or total work days.

Here is the second query designed to count the weekdays:

It’s of course hard to feel the full benefits of the performance gains with AdventureWorks because of the modest size of the database compared to most real-world applications, but you can certainly appreciate the ease and flexibility this system offers as compared to trying to use the Date dimension to count types of days. And if you have separate regional calendars each with their own workdays and bank holidays, then all you have to do is add a Location or Store ID to the fact table, add the dimension to it, and you’ll be able to apply the same logic by Geographical properties.

And remember, this is a small fact table that only has to be built and processed once. It requires no maintenance after it’s been built.

 



Categories: Data Warehouse, MDX, Performance, SSAS

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

%d bloggers like this: