SSAS Musings: MDX

Posts mit dem Label MDX werden angezeigt. Alle Posts anzeigen

Freitag, 19. November 2010

Implicit Recursion

Darren Gosbell advises not to use Implicit Recursion in MDX.

Mittwoch, 20. Oktober 2010

Boyan Penev writes in his Blog about Currency Conversion in SSAS. He provides a lot of interesting links and includes also practical examples. His Blog is absolutly worth reading (and is now included into the link section of this page).

Dienstag, 26. Januar 2010

Quasi-Cursor-Reading of Large Cellsets

One annoyance especially when batch-testing Cubes/Dimensions against the relational data source is that the MDX/XMLA-Architecture has no concept of continuation or open-cursor-reading.

And we all know how soon SSMS or MDX Studio present an OutOfMemoryException!

One particularly useful construct in that regard is the

SubSet(Set,Position,Length)

MDX-function which can be easily used to technically partition large result spaces into manageable junks.

Montag, 25. Januar 2010

Grouping in SQL and MDX

In his Blog MDX equivalent of a filtered GROUP BY in SQL Darren Gosbell shows how to translate a SQL query (with a restriction in the WHERE-clause and a GROUP BY aggregating elements of a higher hierarchy level) into a corresponding MDX query. Although MDX looks like SQL it's something completly different, so every translation is beneficial.

Montag, 21. Dezember 2009

Changing results with IIF hints

Recently we saw some strange effects with calculations using a conditional IIF-logic: the calculation should show the difference between two measures if the first measure is greater than 0:

IIF([Measures].[FirstMeasure]=0,NULL,
        ([Measures].[FirstMeasure]-[Measures].[SecondMeasure])

A query using the calculation returned the expected results for the [All]-node of a given hierarchy but wrong results for other levels. Since the given measures are involved in some scoped assignments we assumed an error in our logic, and were quite suprised when we discovered that a CASE-block returned the correct results:

CASE [Measures].[FirstMeasure] 
         WHEN 0 THEN NULL
         ELSE [Measures].[FirstMeasure]-[Measures].[SecondMeasure]
    END

Further research showed that IIF also brought the correct results when we used the eager hint in the calculation. The hint is a new feature in SSAS 2008 and documented here. The SQLCAT team explains the hint in the blog IIF Function Query Hints in SQL Server Analysis Services 2008:

the IIF mdx function is a commonly used expression that can be very costly to evaluate. It takes 3 arguments:

iif(< condition >, < then branch >, < else branch >)

Where the condition evaluates to true, the value from the “then branch” is used otherwise the “else branch” expression is used.

Note the term “used” – one or both branches may be evaluated even if its value is not used. It may be cheaper for the engine to evaluate the expression over the entire space and use it when needed (termed an eager plan) rather than chop up the space into a potentially enormous number of fragments and evaluate only where needed (a strict plan).

Apparently the different execution pathes can bring different results - and that is something I would not expect...

Freitag, 20. November 2009

MDX Query Performance in SSAS

there are a few documents you should know if you want to optimize the performance of the MDX queries on your SSAS cubes:

The document SQL Server 2008: Analysis Services Performance Guide (download here) provides information on the tuning of server resources and shows strategies to enhance the performance of cube processing and query access.

The paper Identifying and Resolving MDX Query Performance Bottlenecks in SQL Server 2005 Analysis Services (download here) discusses the troubleshooting tools integrated into Microsoft Windows and SQL Server and shows how to use these tools to identify and resolve MDX query performance bottlenecks with individual MDX queries. Most of the given advices are still true for SSAS 2008 (as far as I can see...)

Some interesting papers and articles can be found at the web site of the Microsoft SQL Server Development Customer Advisory Team.

Unfortunatly there are many advices which were of interest in SSAS 2005 but are no longer relevant in SSAS 2008 - so you have to be very careful when adopting the strategies shown in various blogs and newsgroups in the Net.

Donnerstag, 19. November 2009

Centrally Managing Formatting Properties in MDX Scripts

This is a quite current goody already sketched by Marc Russo:

Formatting is indeed in the twilight zone between server/analytic logic and client display issues. Originally you would want your cube to be agnostic of all that representational stuff which may change with the eye of the beholder.

On the other hand, it is unbearable to walk through the conditinal formatting result of ~100 affected measures * 30 affected reports once a formatting rule changes.

A good compromise between the "this-is-client-stuff" and the "incorporate-it-in-each-and-every-measure' is using the power of MDX scripting. The scoping approach that Marc Russo describes (and from which I learned the invaluable [Measure].CurrentMember trick) will not work, as soon as
you need to address targetted measures (such as "all currencies" or "all indexes") across calculated and real measures (an explicit measure set in a scope may only contain non-calculated measures).

So here is how it works in these cases (and similar for FORE_COLOR, BACK_COLOR, etc):

FORMAT_STRING({
 [Measures].[REAL_MEASURE_1]
 ,[Measures].[CALCULATED_MEASURE_2]
 ,...
 })=// YOUR MDX CALCULATION using [Measures].CurrentMember and returning a format string
 ;

Mittwoch, 18. November 2009

MDX Studio

As Mosha Pasumansky states MDX Studio is a

tool which helps users of Microsoft Analysis Services to analyze complex MDX expressions, monitor performance characteristics of MDX queries and get insight into how MDX interacts with other UDM features such as attribute relationships. MDX Studio provides unique visualization representation of MDX expression and allows the user to interact with different stages of MDX execution.

It's indeed an extremly useful tool: it shows valuable information that is not included in profiler traces such as the number of cache hits, calculated cells, memory usage etc. Another important feature is the ability to show a hierarchical tree of execution steps (which Mosha describes here). Very helpful is also the ANALYZE option that (- it's not a surprise -) analyzes a given query or mdx script, shows possible issues, and provides links to wider explanations (mainly from Moshas Blog).

The current version of the tool (a work in progress) can be downloaded from here.

Dienstag, 17. November 2009

EXISTING keyword

As the documentation states the EXISTING keyword forces a specified set to be evaluated within the current context instead of being evaluated within the context of the cube that contains the members of the set. Mosha Pasumasky declares "that Exists function and EXISTING operator are generalization of hierarchy related functions such as Descendants. Indeed, the process of building an hierarchy is nothing more then applying Exists between attributes which build individual levels of that hierarchy." So EXISTING shows only elements that are actually relevant in the given context.

If we are looking for the number of cities in each country in the Adventure Works Cube and use the following query:


WITH
  MEMBER [CITY_COUNT] AS COUNT([GEOGRAPHY].[CITY].[CITY].MEMBERS)
SELECT {[CITY_COUNT]} ON 0
       ,[GEOGRAPHY].[GEOGRAPHY].[COUNTRY].MEMBERS ON 1
  FROM [ADVENTURE WORKS]

we do not get the intended result:


                CITY_COUNT
Australia              587
Canada                 587
France                 587
Germany                587
United Kingdom         587
United States          587

That's the total count of cities for every country - apparently the evaluation took place in the context of the cube and not in the current context. With EXISTING we get what we want:


WITH
  MEMBER [CITY_COUNT] AS COUNT(EXISTING([GEOGRAPHY].[CITY].[CITY].MEMBERS))
SELECT {[CITY_COUNT]} ON 0
       ,[GEOGRAPHY].[GEOGRAPHY].[COUNTRY].MEMBERS ON 1
  FROM [ADVENTURE WORKS]

provides the result:


                CITY_COUNT
Australia               40
Canada                  50
France                  36
Germany                 51
United Kingdom          35
United States          375

As already mentioned here EXISTING can also be used to solve some problems with multiselects.

In his Blog Chris Webb wrote about the performance of EXISTING in combination with NONEMPTY and stated "if you are using NonEmpty and Existing together, do the NonEmpty first and then the Existing" - but admitted that he has no sound explanation for the observed effect.

Multiselect

Multiselects are a source of constant discomfort in SSAS 2005 ff. When there are multiple members in the slice of a MDX query the currentmember function no longer defines a single element and that's the reason for many problems with calculated members. Some of these problems can be solved by the use of special syntax elements (most notably the EXISTING function) but client tools like Excel 2007 generate their own queries so the developer can not control the design (and Excel makes things even worse by using subselects which bring even more annoyance than simple where-clause multiselects).

At least there are some solutions to avoid the most unpleasent consequences - and most of these solutions come from Mosha Pasumansky:

In his blog articles Writing multiselect friendly MDX calculations and Multiselect friendly MDX for calculations looking at current coordinate Mosha Pasumansky shows a way to avoid the multiselect problems by using the function EXISTING and sum up the results. Another solution (using a stored procedure) is shown in How to detect subselect inside MDX calculations (aka Multiselect in Excel 2007) - but Mosha warns "that the solution is neither elegant nor efficient, and probably is mostly of the theoretical interest." In his article I need your help or how to make multiselect work seamlessly in MDX Mosha suggests a solution for all multiselect problems - but it seems he kept that secret...

The blog Korrekte MDX-Berechnungen mit den Analysis Services trotz Multiselect makes the suggestion to define "empty" measures (initially filled with NULL values) and then to overwrite them with scoped assignments on every relevant level - so correct values can be aggregated for higher levels.

All these solutions have an impact on performance, so Mosha writes: "Therefore, it is certainly a choice of cube designer - keep the simpler and more performance friendly expression, but accept that multiselect on the Time dimension will cause errors, or allow more functionality and enable multiselects on Time, but be prepared for slower performance in all cases."

Nested Scopes and Attribute Relationships

Nested Scopes are a convenient syntax to structure subcube/crossjoin calculations in your MDX scripts. Most often, you would want to use more than one attribute from the same dimension in such a context and, in addition, navigate the associated hierarchies in the scoped calculations:

SCOPE({[DIMENSION].[HIERARCHY_1].[LEVEL_OR_ATTRIBUTE_1].&[KEY_1]},...);
   SCOPE({[DIMENSION].[HIERARCHY_2].[LEVEL_OR_ATTRIBUTE_2].&[KEY_2]},...);
        // Your calculation involving [DIMENSION].[HIERARCHY_1].CurrentMember
        // will result in navigating [DIMENSION].[ATTRIBUTE_1].[All]!
   END SCOPE;
END SCOPE;

However, with nested scopes it seems that similar overriding rules than with concatenating ordinary tuples apply: Suppose the attribute ATTRIBUTE_2 associated with the LEVEL_OR_ATTRIBUTE_2 level is related to the attribute ATTRIBUTE_1 (i.e., KEY_2 is unique for KEY_1, in contrast to the related from relationship where KEY_1 is unique for KEY_2), the example calculation context above would result to the All member of ATTRIBUTE_1. Duh.

Writing the nesting the other way round (which would seem semantically equivalent from only superficially reading the specs, as we usually do) the delivers the expected result

SCOPE({[DIMENSION].[HIERARCHY_2].[LEVEL_OR_ATTRIBUTE_2].&[KEY_2]},...);
    SCOPE({[DIMENSION].[HIERARCHY_1].[LEVEL_OR_ATTRIBUTE_1].&[KEY_1]},...);
        // Your calculation involving [DIMENSION].[HIERARCHY_1].CurrentMember
        // will finally navigate [DIMENSION].[ATTRIBUTE_1].&[KEY_1]!
    END SCOPE;
END SCOPE;

SSAS Musings