Montag, 21. Dezember 2009

Changing results with IIF hints

Recently we saw some strange effects with calculations using a conditional IIF-logic: the calculation should show the difference between two measures if the first measure is greater than 0:
IIF([Measures].[FirstMeasure]=0,NULL,
        ([Measures].[FirstMeasure]-[Measures].[SecondMeasure])
A query using the calculation returned the expected results for the [All]-node of a given hierarchy but wrong results for other levels. Since the given measures are involved in some scoped assignments we assumed an error in our logic, and were quite suprised when we discovered that a CASE-block returned the correct results:
CASE [Measures].[FirstMeasure] 
         WHEN 0 THEN NULL
         ELSE [Measures].[FirstMeasure]-[Measures].[SecondMeasure]
    END
Further research showed that IIF also brought the correct results when we used the eager hint in the calculation. The hint is a new feature in SSAS 2008 and documented here. The SQLCAT team explains the hint in the blog IIF Function Query Hints in SQL Server Analysis Services 2008:
the IIF mdx function is a commonly used expression that can be very costly to evaluate. It takes 3 arguments:

iif(< condition >, < then branch >, < else branch >)

Where the condition evaluates to true, the value from the “then branch” is used otherwise the “else branch” expression is used.

Note the term “used” – one or both branches may be evaluated even if its value is not used. It may be cheaper for the engine to evaluate the expression over the entire space and use it when needed (termed an eager plan) rather than chop up the space into a potentially enormous number of fragments and evaluate only where needed (a strict plan).

Apparently the different execution pathes can bring different results - and that is something I would not expect...

Freitag, 18. Dezember 2009

Subtotals for Named Sets

Although still in its infancy, named sets could really help you to satisfy your numerous customers ad-hoc aggregation requirements ...

... if they only would show sub-totals in excel and if you only could mount non-disjunct instances of those beasts one after the other.

Here is a tip that makes use of the implicit property of named sets to represent intersections when confronted with members of different levels from a common hierarchy.

In its extreme application, adding the [All] member to a set will deliver you a sub-total that is at the same time correctly named "Gesamtergebnis" or "Total" when displayed in excel.

Do not call a dimension attribute "DESCRIPTION"!

At least the Office Web Components (OWC) browser integrated into BI Development Studio and Management Studio will not show the values as a member property when the attribute (Name, not key nor translation) is called "DESCRIPTION".

Donnerstag, 17. Dezember 2009

Federating a Secondary SQL Server Data Source into a Primary-Oracle Data View

It is a common, though tricky, thing to host some, but not all of your analytical data that is to be fed into the cubeware of your trust in the nearby, type-consistent and easy-to-maintain SQL Server instance.

This methodology has also the nice side-effect that it will not give the .. say Oracle-minded ... datawarehouse people much to reason about such they will not get distracted from their ... um ... daily slee.., ah, work was the political correct term.

The idea will however bite you the first time you notice that you cannot mix dimensions and facts between two seperate Oracle/SQL Server data views. So you will read diverse blog entries trying to mix the two data sources within a single view.

Oracle will be your primary data source. "Add table" in the view allows you weel to choose a registered SQL Server as a secondary data source, but the table chooser stays empty!

If you read above-mentioned thread to its end, there will be the definitive hint, which is "Add Named Query" and .... here you are ... nearly.

Be aware that this will operate on some "Remote view" feature of the Oracle ODBC driver. For example, Oracle tends to lower-case only the key columns, which can fool you when switching an existing dimension from a seperate data view to the now merged view.

I would not recommend trying it the other way round (Oracle secondary, SQL Server primary data source) unless your primary data source will become bigger than the secondary one ... ok, we still hope that ... since it is to expect that using this kind of federation for practical volumes of fact data will not be funny.

Mittwoch, 9. Dezember 2009

How to translate the [All] member

I was always hesitating on that issue because I did not want to affect a thousand lines of MDX-Code in which [Dimension].[Hierarchy].[All] is a common construct, e.g., to exclude unwanted detail-calculations from a calculation scope.

But per default, there is no translation row in the dimension editor that allows one to override the display name of that ubiquitous and ugly (though, comprehensible) node.

Here comes the (straightforward) step: Simly set the property "AttributeAllMemberName" of the relevant dimensions explicitely to the default value "All" and - voilà - you have the translation rows for your pleasure without having to reinterpret your scripts.

Please be aware that the member must be also (re-)translated for all the hierarchies (these rows will also automatically appear). Similarly, you can treat the "UnknownMemberName" ...

Montag, 7. Dezember 2009

Non-Pivot-Friendly MDX through Excel 2007 [Updated]

I am sooooo glad for Greg Galloway´s tiny little comment to this post.

Because we were looking for a solution to integrate MDX queries into Excel which are not very pivot-table friendly.

Why do you ever want to do something esoteric like that? Suppose you want to compare very specific data-points in your tuple-space with each other (a monthly real margin from one year in one store with a daily plan margin from a region in a completely different year). Then, your pivot table will necessarily cross-join all the individual coordinate components (year/month/day(s) of both years, affected region/store(s), plan & real measures) and come with a solution set that is orders of magnitude too large and cannot be reasonably filtered without going nuts.

That is where you should rather build up a small (remember: SMALL in the sense of aggregated fact data) pivot table, do a drill-though and then save the resulting query table as a nice template for your future experiments.

I have not been able to create that query table by any other means in Excel 2007!

Update:

There is another way for executing and displaying mdx described by Hilmar Buchta which might be more useful when you need full (VBA) control over the connection/command behaviour, e.g., when you first need to check whether the result is empty (and hence would destroy the layouted table structure).

Freitag, 4. Dezember 2009

Monitoring current SSAS acitivities

A question DBA's are always interested in is: who is connected to my database and what are they doing? For SSAS 2008 there are some Dynamic Management Views (DMV) to answer this question(s) (and many others). The query:
SELECT TABLE_NAME
  FROM $system.dbschema_tables
 WHERE TABLE_SCHEMA = '$SYSTEM'
   AND LEFT(TABLE_NAME,8) = 'DISCOVER'
 ORDER BY table_name
provides the following list of tables (I added links to and descriptions from the documentation; some of the linked pages are not very extensive):

- DISCOVER_COMMAND_OBJECTS: Provides resource usage and activity information about the objects in use by the referenced command.
- DISCOVER_COMMANDS: Provides resource usage and activity information about the currently executing or last executed commands in the opened connections on the server.
- DISCOVER_CONNECTIONS: Provides resource usage and activity information about the currently opened connections on the server.
- DISCOVER_DB_CONNECTIONS: Provides resource usage and activity information about the currently opened connections from the server to a database.
- DISCOVER_DIMENSION_STAT: This schema rowset returns statistics on the specified dimension.
- DISCOVER_ENUMERATORS: Returns a list of names, data types, and enumeration values of enumerators supported by the Microsoft XML for Analysis (XMLA) provider for a specific data source.
- DISCOVER_INSTANCES: Describes the instances on the server.
- DISCOVER_JOBS: Provides information about the active jobs executing on the server.
- DISCOVER_KEYWORDS: Returns information about keywords reserved by the Microsoft XML for Analysis (XMLA) provider.
- DISCOVER_LITERALS: Returns information about literals, including data types and values, supported by the Microsoft XML for Analysis (XMLA) provider.
- DISCOVER_LOCKS: Provides information about the current standing locks on the server.
- DISCOVER_MASTER_KEY: This schema rowset returns the server's master encryption key (the key that is used by the server to encrypt sensitive data on disk).
- DISCOVER_MEMORYGRANT: This rowset returns a list of internal memory quota grants that are taken by jobs that are currently running on the server.
- DISCOVER_MEMORYUSAGE: This schema rowset returns the DISCOVER_MEMORYUSAGE statistics for various objects allocated by the server.
- DISCOVER_OBJECT_ACTIVITY: This schema rowset provides resource usage per object since the start of the service. Some further information on this view can be found in Vidas Matelis' Blog.
- DISCOVER_OBJECT_MEMORY_USAGE: Provides information about memory resources used by objects.
- DISCOVER_PARTITION_DIMENSION_STAT: This schema rowset returns statistics on the dimension associated with a partition.
- DISCOVER_PARTITION_STAT: This schema rowset returns statistics on aggregations in a particular partition.
- DISCOVER_PERFORMANCE_COUNTERS: This schema rowset returns the value of one or more specified performance counters.
- DISCOVER_PROPERTIES: Returns a list of information and values about the standard and provider-specific properties that are supported by the Microsoft XML for Analysis (XMLA) provider for the specified data source.
- DISCOVER_SCHEMA_ROWSETS: Returns the names, restrictions, description, and other information for all enumeration values and any additional provider-specific enumeration values supported by the Microsoft XML for Analysis (XMLA) provider.
- DISCOVER_SESSIONS: Provides resource usage and activity information about the currently opened sessions on the server.
- DISCOVER_TRACE_COLUMNS: This schema rowset contains the DISCOVER_TRACE_COLUMNS schema rowset.
- DISCOVER_TRACE_DEFINITION_PROVIDERINFO: This schema rowset returns basic information about the trace provider, such as its name and description.
- DISCOVER_TRACE_EVENT_CATEGORIES: This schema rowset contains the DISCOVER_TRACE_EVENT_CATEGORIES schema rowset.
- DISCOVER_TRACES: This schema rowset contains the DISCOVER_TRACES schema rowset.
- DISCOVER_TRANSACTIONS: This schema rowset returns the current set of pending transactions on the system.

Some of these views have to be queried with the function SYSTEMRESTRICTSCHEMA to query restricted schema rowsets - you find some examples for the use of this function in Vidas Matelis' Blog.

To answer the initial question - who is on my db? - you could use the (simple) query
SELECT *
  FROM $system.DISCOVER_SESSIONS

Another option to get these information is to use the small GUI-application Activity Viewer (download from codeplex).

Dienstag, 1. Dezember 2009

Editions

The msdn site Features Supported by the Editions of SQL Server 2008 shows the differences between the SQL Server editions. Regarding SSAS the Standard Edition (SE) misses the following features compared to the Enterprise Edition (EE):

Data Warehouse Scale and Performance
- Auto parallel partition processing
- Partitioned cubes
- Distributed partitioned cubes

Multi-Dimensional Analytics
- Financial aggregations
- Custom rollups
- Semi-additive measures
- Writeback dimensions
- Linked measures and dimensions
- Binary and compressed XML transport
- Account intelligence
- Perspectives
- Analysis Services shared, scalable databases

Data Mining (in this category SE is quite weak)
- Cross validation
- Models on filtered subsets of mining structure data
- Time series: custom blending between ARTXP and ARIMA models
- Time series: prediction with new data
- Unlimited concurrent data mining queries
- Advanced configuration and tuning for algorithms
- Algorithm plug-in API
- Parallel model processing
- Time series: cross-series prediction
- Unlimited attributes for association rules
- Sequence prediction
- Multiple prediction targets for naïve Bayes, neural network, and logistic regression

Of course you can build cubes without these features but they are very helpful in many situations. On the other side there's of course a difference in the licence fees... - here are some additional links to licence and pricing information:

- SQL Server 2008 Pricing
- SQL Server 2008 Licensing FAQ