During my last project, we faced an interesting challenge that we solved rather creatively. We were developing a critical business dashboard. The dashboard contained a huge amount of data, and in order to be responsive, it had to be cached.
The dashboard was to be used by users both internal and external to the company, so we had to ensure that every user sees only the data they are authorized to see. To tackle that, we organized users into User Groups, and each User Group could only view its own filtered data. In MicroStrategy terms, we mapped each User Group to its own Security Filter. Since Security Filters generate different SQL for each User Group (the WHERE clause changes per group), each User Group had its own version of the cache.
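For illustration, the SQL generated for two User Groups differs only in the condition the Security Filter injects; the table and column names below are invented for the example:

-- SQL as generated for User Group NorthWest; for any other group,
-- only the Security Filter condition in the WHERE clause changes
SELECT month_id, SUM(revenue) AS revenue
FROM dw.fact_sales
WHERE region = 'NorthWest'
GROUP BY month_id;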
There was also a challenge with the cache update frequency and timing. The data we received was incomplete and arrived at irregular intervals, so we had to devise a caching strategy that recognized when complete data had been received, reacted right away, and, of course, was invisible to the business.
Our Solution
To put the introduction into more concrete terms, we had to figure out how to update the MicroStrategy cache right after Complete data for a User Group is loaded into the backend tables (we used SQL Server as our DBMS), and how to update it only for the User Group(s) that have Complete data. After all, we want the data to be up to date, and there is no need for the additional overhead of updating all User Groups if only one of them received new data.
Essentially, we needed to:
- Notify MicroStrategy that a specific User Group was updated in the backend and transfer that information from the DBMS server to the MicroStrategy server for further processing
- Process the information and use it to update the MicroStrategy cache only for User Groups that were updated
To overcome the first challenge, we used status tables and FTP. For the second challenge we used a combination of Windows and MicroStrategy tools and systems.
So let’s talk about the status tables and what it means for data to be Complete.
Each night we had a standard nightly ETL process that populated the backend tables with new data. For each User Group we tracked multiple metrics, such as Revenue, Profit, and similar. When we received the data for all tracked metrics for a User Group for a month, we flagged the User Group as Complete and ready for a cache update. We stored this information in the Status Table.
Simplified example of Status Table data
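For instance (group names, months, and values are illustrative):

| User Group | Month | Revenue | Profit | Complete | Updated |
|---|---|---|---|---|---|
| NorthWest | 2019-03 | 1 | 1 | 1 | 0 |
| SouthEast | 2019-03 | 1 | 1 | 1 | 0 |
| Central | 2019-03 | 1 | 0 | 0 | 0 |
| West | 2019-03 | 1 | 1 | 1 | 1 |

In this example, NorthWest and SouthEast have all metrics loaded and are awaiting a cache update, Central is still missing Profit data, and West has already had its cache updated.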
After the standard nightly ETL process, we had another ETL process that checks the Status Table and creates the Update File. The Update File lists all User Groups that are flagged as Complete and are not already Updated. The ETL process then sends the Update File via FTP to our MicroStrategy Web Server. Once transferred to the Web Server machine, the Update File is ready for further processing.
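Conceptually, the check is a simple predicate on the status flags (table and column names are illustrative):

-- pick the User Groups whose month is fully loaded
-- but whose cache hasn't been refreshed yet
SELECT UserGroup
FROM dbo.StatusTable
WHERE Complete = 1
  AND Updated = 0;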
Example of an Update File. Each line represents a User Group that received all metrics data for a month. In Status Table terms, that means that Complete = 1 and Updated = 0.
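For instance, with NorthWest and SouthEast flagged as Complete and not yet Updated, the file would contain:

NorthWest
SouthEast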
We installed an FTP server on our MicroStrategy Web Server machine. Our FTP Site contains three folders, which we'll talk about a bit later.
FTP Site setup
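The folder layout on the Web Server machine, as used by the script later in this post:

D:\FTPSite
├── InputFiles      (staging area; Update Files arrive here via FTP)
├── ProgressFiles   (files currently being processed)
└── ProcessedFiles  (archive of already processed files)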
The second challenge is to process the file’s contents and use that information to update the cache for User Groups that appear in the file. For that, we used a combination of:
- Batch scripting
- Windows Task scheduler
- MicroStrategy proprietary Command Manager tool
- A system of events, event-triggered schedules, and cache update subscriptions. All these are MicroStrategy objects.
The Update_Cache.bat batch script is scheduled to run on the MicroStrategy Web Server after the nightly ETL load is done. The job of Update_Cache.bat is to process the contents of the Update File and trigger a MicroStrategy Event for each User Group contained in the file. Events are triggered from MicroStrategy Command Manager scripts (which we call from Update_Cache.bat). The Event for an individual User Group triggers an event-triggered schedule, which in turn triggers the subscriptions based on that schedule. When the subscriptions fire, the cache for that specific User Group is finally updated.
Let’s try to make sense out of that.
Here is our Update_Cache.bat script. It’s broken down into four logical parts for easier understanding.
@echo off
echo SCRIPT START: %date% %time%
rem with echo off, only lines starting with echo will be displayed. That makes debugging easier and lets us track the execution
SETLOCAL ENABLEDELAYEDEXPANSION
rem position into InputFiles folder. /d option is required when changing drives (We're changing from C: to D: drive)
cd /d "D:\FTPSite\InputFiles"
This is the part where we start our script with a few utility/logging commands and position ourselves in the InputFiles folder, where the Update File is transferred from the ETL Server via FTP.
rem looping through the InputFiles folder. The script will now go through all Update Files it received in the InputFiles folder.
for %%i in (*) do (
echo ****************************************************
echo Currently processing %%~nxi file
echo ****************************************************
rem check if Update File exists in InputFiles but doesn't exist in ProgressFiles
if exist D:\FTPSite\InputFiles\%%~nxi (
if not exist D:\FTPSite\ProgressFiles\%%~nxi (
rem if both are true, move the file from InputFiles to ProgressFiles
echo Moving %%~nxi from InputFiles to ProgressFiles
move D:\FTPSite\InputFiles\%%~nxi D:\FTPSite\ProgressFiles
Here we start the processing. We loop through the InputFiles folder searching for Update Files. There can be multiple Update Files, so when we start processing the first one, we immediately move it from the staging area (InputFiles) into the ProgressFiles folder, where it can be processed at leisure.
rem loop through the Update File in the ProgressFiles folder and run the trigger for each User Group in the file
rem if there is no Command Manager script for a User Group, skip the timeout and go to the next User Group in the file
for /F %%A in (d:\FTPSite\ProgressFiles\%%~nxi) do (
if exist D:\Caching_CommandManager\%%A.scp (
echo Running Command Manager script for %%A
cmdmgr.exe -connlessMSTR -f "D:\Caching_CommandManager\%%A.scp" -o "D:\Caching_CommandManager_Output\CommandManager_Script_Output.txt"
rem give 5 minutes (300 seconds) of delay so the cache update can finish
echo timeout 5 mins...
timeout /t 300
) else (echo Command Manager script for %%A doesn't exist)
)
This is the essence of the script. We run individual MicroStrategy Command Manager scripts for each User Group found in the Update File. Command Manager scripts are fairly simple. We connect to the Intelligence Server and trigger appropriate Events. The example script for User Group NorthWest can be found below.
Example of a Command Manager script
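A minimal sketch of the NorthWest script; the Event name follows our assumed naming convention, and any connection statements required by the connection-less invocation are omitted:

TRIGGER EVENT "Update_Cache_NorthWest";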
After the Event is triggered, we have a 300-second timeout so the cache for one User Group can be updated without interference from another.
rem move file from Progress to Processed Files
echo Moving %%~nxi file to processed files
move D:\FTPSite\ProgressFiles\%%~nxi D:\FTPSite\ProcessedFiles
)
)
)
echo SCRIPT END: %date% %time%
This part of the script archives the Update File into the ProcessedFiles folder and records the end date and time for tracking purposes.
The last part of the solution is the system of Events, event-triggered schedules, and cache update subscriptions. This part is standard MicroStrategy. For each User Group, we created an Event, an event-triggered schedule, and finally a subscription. When a Command Manager script triggers the Event for, say, User Group NorthWest, the event-triggered schedule runs, which in turn runs the Cache Update Subscription for User Group NorthWest. That is the step in which the cache is finally updated.
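These objects can be created through the Developer GUI or scripted. As a sketch (not our exact setup), creating the NorthWest Event in Command Manager could look like this, with the name matching the trigger script above:

CREATE EVENT "Update_Cache_NorthWest" DESCRIPTION "Cache update trigger for User Group NorthWest";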
Flow Diagram
Below is a rough graphical display of the whole process.
Flow diagram
Conclusion
This blog post shows how MicroStrategy's interoperability capabilities can be used to drive value for clients. We used a combination of SSIS, MicroStrategy, and Windows tools to great effect.
In conclusion, event-driven caching is a powerful way of automating mundane tasks so we can concentrate on the creative work that we all love. MicroStrategy offers powerful ways to achieve that goal, and we'll be sure to continue in that direction.