Explaining: BottomCount, Filter and Null in SSAS2005 and ProClarity

Thomas was recently explaining some issues he had with the BottomCount() function in MDX. It was difficult to answer this in a comment on the post, so I thought I would see if I could address some of his issues here.

One common business analytical problem is to find customers with low sales. In MDX we have the BottomCount function for that but compared to the TopCount function, that return the best customers, BottomCount can return less clear results.

If you run this MDX Select in management studio, on the Adventure Works cube, you will see the 50 customers with least Internet Sales Amount in 2003 and what they have bought in 2004.  The MDX select is generated in ProClarity Professional 6.3 and I have used the filter functionality in the client.

SELECT { [Date].[Calendar].[Calendar Year].&[2003] , [Date].[Calendar].[Calendar Year].&[2004] } ON COLUMNS , NON EMPTY { BOTTOMCOUNT( { DESCENDANTS( [Customer].[Customer Geography].[All Customers], [Customer].[Customer Geography].[Customer] ) }, 50
, ( [Date].[Calendar].[Calendar Year].&[2003], [Measures].[Internet Sales Amount] ) ) }
ON ROWS FROM [Adventure Works] WHERE ( [Measures].[Internet Sales Amount] ) CELL PROPERTIES VALUE, FORMATTED_VALUE, CELL_ORDINAL ;

The problem with this result is that I have probably received 50 customers that not were customers in 2003 but started buying in 2004. Also note that I have put the Non Empty keyword before BottomCount and it will only remove empty rows after BottomCount have returned the bottom 50 customers. Because all these customers have bought something in 2004, no rows will be removed.

In this situation we are getting exactly what was asked for, the customers were basically sorted from highest to lowest and the bottom 50 were returned, no comparison was done with any other year and customers with empty sales amounts in 2003 were not filtered out so they will be returned as part of the bottom 50. In fact in the adventure works cube there are more than 50 customers without sales in 2003 so all of the bottom 50 customers have no sales for 2003.

Now I will show you an even more strange result with BottomCount:

SELECT { [Date].[Calendar].[Calendar Year].&[2002]
, [Date].[Calendar].[Calendar Year].&[2003] } ON COLUMNS , NON EMPTY { BOTTOMCOUNT( { DESCENDANTS( [Customer].[Customer Geography].[All Customers], [Customer].[Customer Geography].[Customer] ) }
, 50
, ( [Date].[Calendar].[Calendar Year].&[2003], [Measures].[Internet Sales Amount] ) ) }
ON ROWS FROM [Adventure Works] WHERE ( [Measures].[Internet Sales Amount] ) CELL PROPERTIES VALUE, FORMATTED_VALUE, CELL_ORDINAL ;

This MDX query will return only seven records. The only difference with the first select is that I have changed the year 2004 to 2002, on columns, but the BottomCount function is unchanged. I have no explanation for this behavior.

The explanation for this is that we are returning the 50 customers that had the least Internet Sales Amount in 2003, because we have not filtered out empty cells, this bottom 50 customers is entirely made up of people that have an empty (or null) value in 2003. It turns out that from these 50 customers that did not buy anything in 2003, that 7 of them happened to buy something in 2002. So when we apply the NON EMPTY to the row axis we are only left with these 7 customers. 

<snip>

In ProClarity Professional 6.3 I have combined the BottomCount function with a second filter that should remove values less than 0, 00 . You can see this in the following MDX Select but it will not work. It is a little bit sad that this business problem did not get more attention and was not implemented correctly.

SELECT { [Measures].[Internet Sales Amount] } ON COLUMNS , { EXCEPT( { BOTTOMCOUNT( { DESCENDANTS( [Customer].[Customer Geography].[All Customers]
, [Customer].[Customer Geography].[Customer] ) }, 50, ( [Measures].[Internet Sales Amount] ) ) } , { FILTER( { BOTTOMCOUNT( { DESCENDANTS( [Customer].[Customer Geography].[All Customers]
, [Customer].[Customer Geography].[Customer] ) }, 50, ( [Measures].[Internet Sales Amount] ) ) }
, ( [Measures].[Internet Sales Amount] ) < 0.00000000000000 ) } ) } ON ROWS FROM [Adventure Works] WHERE ( [Date].[Calendar].[Calendar Year].&[2003] )

Unfortunately I don't think this last solution is quite right either. When pulled out the second Filter(BottomCount(...)) section of the above query, it did not actually return any rows, this is because the BottomCount() is returning customers with empty sales values and the filter function is then excluding them all.

I think what Thomas was after was to filter the set of customers first, so that you actually get the 50 customers with the lowest sales and exclude the customers that did not have any sales. Note that I am using the NonEmpty() function inside the BottomCount() to filter out empty cells, before getting the bottom 50.


SELECT { [Measures].[Internet Sales Amount] } ON COLUMNS ,
BOTTOMCOUNT(
    NONEMPTY(DESCENDANTS( [Customer].[Customer Geography].[All Customers]
               , [Customer].[Customer Geography].[Customer] ) 
           , ( [Measures].[Internet Sales Amount] ) )
    , 50
    , ( [Measures].[Internet Sales Amount] )
  ) ON
ROWS
FROM
[Adventure Works]
WHERE ( [Date].[Calendar].[Calendar Year].&[2003] ) ;

 

The other possible twist that I can see to this is that if the measure that you are testing with contains actual zero values that you want excluded, then you would need to explicitly use the Filter() function rather than the NonEmpty() function.

 

SELECT { [Measures].[Internet Sales Amount] } ON COLUMNS , 
 BOTTOMCOUNT(
    FILTER(DESCENDANTS( [Customer].[Customer Geography].[All Customers]
               , [Customer].[Customer Geography].[Customer] ) 
           , ([Measures].[Internet Sales Amount]) > 0
           )
    , 50
    , ( [Measures].[Internet Sales Amount] ) 
  ) ON ROWS
FROM [Adventure Works] 
WHERE ( [Date].[Calendar].[Calendar Year].&[2003] ) ;

Print | posted on Wednesday, August 15, 2007 11:57 PM

Comments on this post

# re: Explaining: BottomCount, Filter and Null in SSAS2005 and ProClarity

Requesting Gravatar...
Hello Darren! Thank's for the comment on my blog and that you have paid attention to the problem.

Your final solution looks like the one I have found if I remove the Order Function. And I have filtered out null instead of zero, but both ways works.

Select {[Measures].[Internet Sales Amount]} On Columns,
{Order(BottomCount(Filter([Customer].[Customer Geography].[Customer].members,[Measures].[Internet Sales Amount] <> Null),50,
[Measures].[Internet Sales Amount]),[Measures].[Internet Sales Amount],BDESC)} On Rows
From [Adventure Works]
Where ([Date].[Calendar].[Calendar Year].&[2002])

Kind Regards
Thomas
Left by Thomas Ivarsson on Aug 16, 2007 2:27 AM

Your comment:

 (will show your gravatar)