Tuesday, 8 November 2016

Add new partition to SSAS cube programmatically

I'm stuck for time, so this is a brief overview and doesn't go into a lot of detail

I have built a cube for a small project to report on promotional transactions.
Stats for the machine (Dev): Windows Server 2012 (virtual), SQL 2012, 50GB RAM, 4TB hard drive and 10 cores
Currently the database has 3.5 billion-ish records in 2 tables (7 billion in total) covering 3 years of transactions. Due to the size of the data and the shape of the date hierarchy, the cube partitions will be by week. As of today there are 145 partitions - with a new one being created automatically each week.

That is a lot of partitions to manually create and there is no standard task in SSIS to help with the problem, therefore a mix of ASSP library and C# custom tasks will be employed to the the dirty work for us.
Due to the nature of this data I can't go into a fully working example, but will talk through each step.

Assumption: you have already built a cube!

The first thing to do is download the ASSP (Analysis Services Stored Procedure) library: https://asstoredprocedures.codeplex.com/
Once you have this, log into Analysis Services from SSMS, right click 'Assemblies' and add the ASSP dll.
This will allow you to run 'call assp.DiscoverXmlMetadata("Partition")' as an MDX query against the cube. it will return all the partitions with lots of useful information that we'll need later.

Step 1: Create a dataflow task to populate a table to contain you partition metadata:
In 'MDX to query partitions':
Next set the desination (SSIS can create the table automatically for you):
This table will allow SSIS to navigate which partition(s) to process.

This post assumes you have loaded new data into a fact table already and so we will focus on building a new partition from this data.

Each partition uses a naming convention that includes a date referencing the data contained within it. I.e. 'PARTITION 2016-01-01'. This will allow us to compare the partition dates to the dates in our fact table. From this we can work out what new 'weeks' need to added to the cube.

Here is the query:
select
convert(varchar(10),d.WEEK_ENDING,120) as WEEK_ENDING
from Date d
inner join Sales s
on d.DATE = s.DATE
where WEEK_ENDING > (select top 1 RIGHT(Name,10) as PartitionWeek from [dbo].[PromotionsCube_Partitions] order by PartitionWeek DESC)
GROUP BY convert(varchar(10),d.WEEK_ENDING,120)
ORDER BY WEEK_ENDING


Put the results into an SSIS object variable and loop through it:


The task within the loop is an 'Analysis Services Execute DDL Task'.


This task allows us to execute xml scripts to run on a cube. In this case we are going to create a new partition using the 'week_ending' from our loop.

Now open SSMS and connect to Analysis Services.
navigate to your cube and drill down to partitions. Right-click on a partition, Script Parition As then 'Create' you will be provided with an xml script that we can modify and utilise.

<Create xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
    <ParentObject>
        <DatabaseID>PromotionsCube</DatabaseID>
        <CubeID>Promotions</CubeID>
        <MeasureGroupID>Sales</MeasureGroupID>
    </ParentObject>
    <ObjectDefinition>
        <Partition xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:ddl2="http://schemas.microsoft.com/analysisservices/2003/engine/2" xmlns:ddl2_2="http://schemas.microsoft.com/analysisservices/2003/engine/2/2" xmlns:ddl100_100="http://schemas.microsoft.com/analysisservices/2008/engine/100/100" xmlns:ddl200="http://schemas.microsoft.com/analysisservices/2010/engine/200" xmlns:ddl200_200="http://schemas.microsoft.com/analysisservices/2010/engine/200/200" xmlns:ddl300="http://schemas.microsoft.com/analysisservices/2011/engine/300" xmlns:ddl300_300="http://schemas.microsoft.com/analysisservices/2011/engine/300/300" xmlns:ddl400="http://schemas.microsoft.com/analysisservices/2012/engine/400" xmlns:ddl400_400="http://schemas.microsoft.com/analysisservices/2012/engine/400/400">
            <ID>Sales_2015-11-19</ID>
            <Name>2015-11-19</Name>
            <Source xsi:type="QueryBinding">
                <DataSourceID>Promotions</DataSourceID>
                <QueryDefinition>SELECT YOUR SQL QUERY THAT FORMS PARTITION WHERE WEEK_ENDING = '2015-11-19'</QueryDefinition>
            </Source>
            <StorageMode>Molap</StorageMode>
            <ProcessingMode>Regular</ProcessingMode>
            <Slice>[Date].[SALES HIERARCHY].[WEEK ENDING].&[2015-11-19T00:00:00]</Slice>
            <ProactiveCaching>
                <SilenceInterval>-PT1S</SilenceInterval>
                <Latency>-PT1S</Latency>
                <SilenceOverrideInterval>-PT1S</SilenceOverrideInterval>
                <ForceRebuildInterval>-PT1S</ForceRebuildInterval>
                <Source xsi:type="ProactiveCachingInheritedBinding" />
            </ProactiveCaching>
            <EstimatedRows>10782922</EstimatedRows>
            <AggregationDesignID>AggregationDesign</AggregationDesignID>
        </Partition>
    </ObjectDefinition>
</Create>



Anywhere that references '2015-11-19' will be replaced by our 'week_ending' variable from the loop. This now creates our new partitions, using the standard aggregation design. I might cover that in another post.

Now we re-run our ASSP query to look for any Unprocessed partitions and use C# custom task to process these:

public void Main()
  {
            bool fireAgain = true;

            Microsoft.AnalysisServices.Server objServer = new Microsoft.AnalysisServices.Server();
            Microsoft.AnalysisServices.Database objDatabase = new Microsoft.AnalysisServices.Database();
            Microsoft.AnalysisServices.Cube objCube = new Microsoft.AnalysisServices.Cube();
            Microsoft.AnalysisServices.MeasureGroup objMeasureGroup = new Microsoft.AnalysisServices.MeasureGroup();
            Microsoft.AnalysisServices.Partition objPartition = new Microsoft.AnalysisServices.Partition();


            objServer.Connect("SERVER");
            objDatabase = objServer.Databases["Cube"];
            objCube = objDatabase.Cubes["Promotion"];
            objMeasureGroup = objCube.MeasureGroups["Sales"];
            objPartition = objMeasureGroup.Partitions["Sales_" + Dts.Variables["WEEK_ENDING"].Value.ToString()];

            


            byte[] emptyBytes = new byte[0];
            Dts.Log("Partition: " + "Sales_" + Dts.Variables["WEEK_ENDING"].Value.ToString() + " process starting", 0, emptyBytes);

            Dts.Events.FireInformation(0, "Sales_" + Dts.Variables["WEEK_ENDING"].Value.ToString(), "process starting.", String.Empty, 0, ref fireAgain);

            objPartition.Process(Microsoft.AnalysisServices.ProcessType.ProcessFull);

            Dts.Log("Partition: " + "Sales_" + Dts.Variables["WEEK_ENDING"].Value.ToString() + " process completed", 0, emptyBytes);

            Dts.Events.FireInformation(0, "Sales_" + Dts.Variables["WEEK_ENDING"].Value.ToString(), "process completed.", String.Empty, 0, ref fireAgain);


   Dts.TaskResult = (int)ScriptResults.Success;
  }


Next process the Measure Group and Cube set as 'Default' processing.

Job done. This post took 10 mins to put together, so if someone is stuck I'll happily send more information. I might even update it with more details.

No comments:

Post a Comment