Thursday, May 08, 2014

SSAS Cube issues ..... Incorrect Measure Values

Have you ever noticed that your measure value in SSAS does not correspond to the value in the data warehouse? This is a big hindrance.... because one ends up wasting precious development time reconciling data between the cube and the warehouse. So I am just going to put together a quick checklist of things to look at if you do come across this issue!!!

Problem Statement:
Let us consider a Fact table called FactInternetSales with a fact called internetsalesamount
select sum(internetsalesamount) from FactInternetSales;

Let us say that this value is 25000.
Now if we run the equivalent query against the cube .... let's call the cube Sales.
select [Measures].[internetsalesamount] on 0 from [Sales];
Now the value returned from this query is 510......
Why is this happening? Just work through the following checklist to rectify the issue as quickly as possible (a small comparison harness is sketched after the checklist)....
1. Go to the cube and, in the [internetsalesamount] properties, change the aggregation from Sum to Count. Then validate the measure value against the row count in the data warehouse, i.e. select count(1) from FactInternetSales, and ensure the two counts match.
2. Check the dimensions slicing the cube and ensure that every dimension attribute has a 1:1 or 1:many mapping with FactInternetSales. The majority of the time, the culprit is a member that is missing from the dimension but still referenced by the fact table; fact rows whose keys find no matching dimension member can get dropped at processing time (depending on the key-error configuration), which deflates the measure. If this is the case, figure out why the dimension value is missing and add it back in.
3. Sometimes the data type used for the measure can also lead to such problems, because the aggregated total can overflow the source column's type. A quick cheat sheet to leverage:
int (in data warehouse) --> corresponding measure should be bigint
decimal --> double
4. Constant changes to the fact table..... if the fact table changes constantly, it is best practice to change the storage mode from MOLAP to ROLAP (queries get slower, but nothing beats real-time data, especially for the analysts).
5. Check the SQL query that SSAS generates when it populates the measure group.... this might be inconclusive in the majority of scenarios, but it's still worth a shot.
6. Final step in the checklist: if all else fails, try, try and succeed........
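And to make steps 1 and 2 less of an eyeball exercise, here is a rough console-app sketch that pulls both totals side by side via ADOMD.NET. The connection strings, server names and the AdventureWorksDW/SalesOLAP catalog names below are placeholders; point them at your own environment:

using System;
using System.Data.SqlClient;
using Microsoft.AnalysisServices.AdomdClient;

class CubeVsWarehouseCheck
{
    static void Main()
    {
        // Placeholder connection strings -- replace with your own servers/databases.
        const string dwConnStr = "Data Source=localhost;Initial Catalog=AdventureWorksDW;Integrated Security=SSPI";
        const string olapConnStr = "Data Source=localhost;Initial Catalog=SalesOLAP";

        decimal dwTotal;
        using (SqlConnection conn = new SqlConnection(dwConnStr))
        using (SqlCommand cmd = new SqlCommand("select sum(internetsalesamount) from FactInternetSales", conn))
        {
            conn.Open();
            dwTotal = Convert.ToDecimal(cmd.ExecuteScalar());
        }

        decimal cubeTotal;
        using (AdomdConnection conn = new AdomdConnection(olapConnStr))
        using (AdomdCommand cmd = new AdomdCommand("select [Measures].[internetsalesamount] on 0 from [Sales]", conn))
        {
            conn.Open();
            // The single cell of the returned cellset holds the grand total of the measure.
            cubeTotal = Convert.ToDecimal(cmd.ExecuteCellSet().Cells[0].Value);
        }

        Console.WriteLine("Warehouse: {0}  Cube: {1}", dwTotal, cubeTotal);
        Console.WriteLine(dwTotal == cubeTotal ? "MATCH" : "MISMATCH -- walk through the checklist above");
    }
}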

Friday, February 07, 2014

New CEO for Microsoft

It was recently announced that Satya Nadella (a Microsoft veteran) has become the CEO of Microsoft. Even though I feel a great sense of pride that a person of Indian origin, a first-generation American immigrant, now holds probably the most powerful position in the I.T. industry today, Microsoft placed a safe bet in naming Nadella as its CEO. I feel that a better choice would have been Sundar Pichai (a VP at Google) or even Robin Li (the founder and CEO of Baidu). Of course, from an experience point of view, Nadella needs absolutely no introduction; his resume speaks for itself. He created and successfully handled a multitude of Microsoft's unique silos, or divisions. But infusing new ideas, decoupled from the earlier thought process Microsoft has always had, would have paved the way for a new era of Microsoft revolutionizing the tech industry. This factor will grind against Microsoft's cogs, because Nadella has clearly been part of the very thought process that has been detrimental to Microsoft, and he probably will not deviate from it (old habits die hard).

Another big letdown was the fact that my idol Bill Gates is back in a technical role. It shows the level of confidence the old guard at Microsoft has in the new CEO. Of course, Microsoft has pushed it across as a highlight of Nadella becoming CEO, since the legendary Bill Gates, with all his experience of building an empire, is back at the tech table. But to me this is more of a smoke screen than anything else, and in retrospect it makes Nadella look a bit weak in what should have been a tribute to the old era. Bill Gates and Steve Ballmer could have come out with their heads held high, with the great sense of achievement of having built a global entity from scratch and handed the reins over to Nadella, but that is not the case, as Mr. Gates is back in a full-time capacity and Ballmer will definitely be present for an interim period until Nadella settles down.

This is the reason Microsoft should have gone with a fresh CEO, a person who could enable Microsoft's key strengths as well as infuse new ideas. I was rooting for Sundar from the moment his name was dropped into the hat. Who knows, this might have been the handshake that the two biggest giants in the industry, Google and Microsoft, required. Another person who came to mind while Microsoft was searching for a new CEO was Mark Zuckerberg. Sure, he has Facebook to handle, but just imagine if he had been named CEO of Microsoft and Facebook no longer remained a unique identity but became a part of Microsoft. That truly would have been an earth-shattering event: two giants coming together as one integrated unit that would continue to create magic (a dream that never became reality). These are my thoughts, and I felt like expressing them; but you never know, maybe the Nadella-Gates combination might hit a "six" (as they call it in cricket).

Tuesday, January 21, 2014

SSIS SFTP handling using WinSCP

Just a follow-on to the earlier post..... the same WinSCP script-task approach, this time over SFTP, with an SSH host key fingerprint in place of the TLS certificate fingerprint:
try {
    string file = Dts.Variables["strLocalDirectory"].Value.ToString() + "\\" + Dts.Variables["strLocalFile"].Value.ToString();
    string username = Dts.Variables["strFTPUserName"].Value.ToString();
    string password = Dts.Variables["strFTPPassword"].Value.ToString();

    // Set up the session options
    SessionOptions sessionOptions = new SessionOptions {
        HostName = Dts.Variables["strFTPHost"].Value.ToString(),
        UserName = username,
        Password = password,
        Protocol = Protocol.Sftp,
        PortNumber = int.Parse(Dts.Variables["strFTPPort"].Value.ToString()),
        FtpMode = FtpMode.Active,
        FtpSecure = FtpSecure.None,
        SshHostKeyFingerprint = Dts.Variables["strFTPFingerprint"].Value.ToString()
    };

    using (Session session = new Session()) {
        session.ExecutablePath = Dts.Variables["strWinSCPExecutable"].Value.ToString();

        // Connect
        session.Open(sessionOptions);

        // Upload the file in binary mode (pass the transfer options to PutFiles,
        // otherwise the Binary setting is never applied)
        TransferOptions transferOptions = new TransferOptions();
        transferOptions.TransferMode = TransferMode.Binary;

        TransferOperationResult transferResult = session.PutFiles(file, Dts.Variables["strRemotePath"].Value.ToString(), false, transferOptions);

        // Throw on any error so a failed upload does not pass silently
        transferResult.Check();
    }
    Dts.TaskResult = (int)ScriptResults.Success;
}
catch (Exception e) {
    Dts.Variables["strExceptions"].Value = e.Message + ":" + e.StackTrace;
    Dts.TaskResult = (int)ScriptResults.Failure;
}

Friday, January 17, 2014

SSIS FTPS File handling

The following script task showcases both FTP- and FTPS-based connections for uploading a file (it can be modified to perform more operations). My earlier approach was to leverage FtpWebRequest, but the "AUTH SSL" command was taking a huge amount of time, so I decided to fall back on WinSCP instead. Also ensure that WinSCP.exe is added to the PATH environment variable and WinSCP.dll is placed in the GAC.
public partial class ScriptMain : Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase
{
public void Main()
{
/*
 * Traditional FTP
 */
/*
ConnectionManager mgr = Dts.Connections.Add("FTP");
try {
    mgr.Properties["ServerName"].SetValue(mgr, "ftp://" + Dts.Variables["strFTPHost"].Value.ToString());
    mgr.Properties["ServerUserName"].SetValue(mgr, Dts.Variables["strFTPUserName"].Value.ToString());
    mgr.Properties["ServerPassword"].SetValue(mgr, Dts.Variables["strFTPPassword"].Value.ToString());
    mgr.Properties["ServerPort"].SetValue(mgr, Dts.Variables["strFTPPort"].Value.ToString());
    mgr.Properties["Timeout"].SetValue(mgr, Dts.Variables["strFTPTimeout"].Value.ToString());
    mgr.Properties["UsePassiveMode"].SetValue(mgr, true);
    mgr.ConnectionString = Dts.Variables["strFTPHost"].Value.ToString() + ":" + Dts.Variables["strFTPPort"].Value.ToString();
    //string conn = Dts.Connections["FTP Connection Manager"].ConnectionString;
    FtpClientConnection ftpConn = new FtpClientConnection(mgr.AcquireConnection(null));
    ftpConn.Connect();
    string[] fileLocation = new string[1];
    fileLocation[0] = Dts.Variables["strLocalDirectory"].Value.ToString() + "\\" + Dts.Variables["strLocalFile"].Value.ToString();
    ftpConn.SendFiles(fileLocation, Dts.Variables["strRemotePath"].Value.ToString(), true, false);
    ftpConn.Close();
}
catch (Exception exp) {
    Dts.Variables["strExceptions"].Value = exp.Message + ":" + exp.StackTrace;
}
*/
/*
 * FTPS using WinSCP
 */
try {
    string file = Dts.Variables["strLocalDirectory"].Value.ToString() + "\\" + Dts.Variables["strLocalFile"].Value.ToString();
    string username = Dts.Variables["strFTPUserName"].Value.ToString();
    string password = Dts.Variables["strFTPPassword"].Value.ToString();

    // Set up the session options (PortNumber is an int, so the variable has to be parsed)
    SessionOptions sessionOptions = new SessionOptions {
        HostName = Dts.Variables["strFTPHost"].Value.ToString(),
        UserName = username,
        Password = password,
        Protocol = Protocol.Ftp,
        PortNumber = int.Parse(Dts.Variables["strFTPPort"].Value.ToString()),
        FtpMode = FtpMode.Passive,
        FtpSecure = FtpSecure.Implicit,
        TlsHostCertificateFingerprint = Dts.Variables["strFTPFingerprint"].Value.ToString()
    };

    using (Session session = new Session()) {
        session.ExecutablePath = Dts.Variables["strWinSCPExecutable"].Value.ToString();

        // Connect
        session.Open(sessionOptions);

        // Upload the file in binary mode (pass the transfer options to PutFiles,
        // otherwise the Binary setting is never applied)
        TransferOptions transferOptions = new TransferOptions();
        transferOptions.TransferMode = TransferMode.Binary;

        TransferOperationResult transferResult = session.PutFiles(file, Dts.Variables["strRemotePath"].Value.ToString(), false, transferOptions);

        // Throw on any error
        transferResult.Check();
    }
    // Set Success only on the happy path, so the Failure set in the catch block is not overwritten
    Dts.TaskResult = (int)ScriptResults.Success;
}
catch (Exception e) {
    Dts.Variables["strExceptions"].Value = e.Message + ":" + e.StackTrace;
    Dts.TaskResult = (int)ScriptResults.Failure;
}
}
}

The variables used in the above code are self-explanatory.... Remember to get the certificate thumbprint using WinSCP up front, or check with the FTP system admin.

Thursday, September 19, 2013

Rhino - ETL

Using Rhino-ETL (a C#-based framework) for developing standard ETLs is pretty easy, and one can do a lot of fun stuff with the underlying source data. I just wrote up a quick console app that generates data into a text file and pushes the same data into a SQL Server table as well as an external file. Here are the steps (for the external file push):
1. Create a new C# console application solution in Visual Studio.
2. Target the appropriate .NET Framework in your project properties, as shown in the screenshot below:

3. Create three sub-folders underneath your project as shown in the following screenshot:
DataObjects --> contains the class files associated with each and every table/file in your environment. Example: if your source file contains student data, you would create a class file called Student with the individual properties exposed (the nouns).
Operations --> contains the class files for the activities (the verbs) that need to be performed on the DataObjects. Example: writing the student data to a database, reading the student data from a file, etc.
WorkFolder --> contains the external file sources to interact with, e.g. a flat file, a CSV or a TSV. In this case it will be student.txt.


Let's write some code to insert a student record from a flat file into another flat file....... (as simple as it sounds)

4. Create a class file called StudentRecord.cs for the pipe-delimited source and declare the required entity attributes as shown in the following code snippet. student.txt contains records in the following manner:
StudentId|StudentName|StudentAddress|StudentClassId|StudentMarks //header
1|Ishwar|TestAddress|1|85 //row
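A sketch of what StudentRecord.cs can look like. Rhino-ETL's file handling is built on FileHelpers, so attributes drive the field layout; the IgnoreFirst attribute for skipping the header row is my assumption here:

using FileHelpers;

// Pipe-delimited record matching student.txt; field order must match the file layout.
[DelimitedRecord("|")]
[IgnoreFirst(1)] // assumed: skip the header row shown above
public class StudentRecord
{
    public int StudentId;
    public string StudentName;
    public string StudentAddress;
    public int StudentClassId;
    public int StudentMarks;
}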

5. Create a class file called NewStudentRecord.cs, which contains just the attributes that need to be transferred to the new file. The output will be tab-separated, in the following manner:
StudentId\tStudentName\tStudentAddress\tStudentClass
1\tIshwar\tTestAddress\t1
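Again, a rough sketch mirroring the header above:

using FileHelpers;

// Tab-separated output record -- only the columns we want to carry forward.
[DelimitedRecord("\t")]
public class NewStudentRecord
{
    public int StudentId;
    public string StudentName;
    public string StudentAddress;
    public int StudentClass;
}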

Let us now create the action called student write, i.e. the operation that writes the records out to an output file called studentoutput.txt. I am creating a new C# class file called StudentWriteFile, as shown in the following code snippet:
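A minimal sketch of the write operation, plus the companion read operation it needs upstream. The constructor parameters and the StudentFileRead name are my own scaffolding (the post only fixes the StudentWriteFile name); both lean on Rhino-ETL's FluentFile, which wraps FileHelpers:

using System.Collections.Generic;
using Rhino.Etl.Core;
using Rhino.Etl.Core.Files;
using Rhino.Etl.Core.Operations;

// Writes each incoming pipeline row out to studentoutput.txt as a NewStudentRecord.
public class StudentWriteFile : AbstractOperation
{
    private readonly string outputPath;

    public StudentWriteFile(string outputPath)
    {
        this.outputPath = outputPath;
    }

    public override IEnumerable<Row> Execute(IEnumerable<Row> rows)
    {
        using (FileEngine file = FluentFile.For<NewStudentRecord>().To(outputPath))
        {
            foreach (Row row in rows)
            {
                file.Write(row.ToObject(typeof(NewStudentRecord)));
                yield return row; // pass the row downstream unchanged
            }
        }
    }
}

// Hypothetical companion operation that streams StudentRecord rows out of student.txt.
public class StudentFileRead : AbstractOperation
{
    private readonly string inputPath;

    public StudentFileRead(string inputPath)
    {
        this.inputPath = inputPath;
    }

    public override IEnumerable<Row> Execute(IEnumerable<Row> rows)
    {
        using (FileEngine file = FluentFile.For<StudentRecord>().From(inputPath))
        {
            foreach (object record in file)
                yield return Row.FromObject(record);
        }
    }
}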

Now let us go about writing the main program. Create it in the following manner; the settings values point to the settings file I created, which contains the absolute paths of the student.txt and studentoutput.txt files.
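A sketch of the pipeline definition; the settings property names are placeholders for whatever you put in your settings file:

using Rhino.Etl.Core;

// The process registers the operations in order: read student.txt,
// then write the surviving columns to studentoutput.txt.
public class MainProgram : EtlProcess
{
    protected override void Initialize()
    {
        Register(new StudentFileRead(Properties.Settings.Default.StudentInputPath));
        Register(new StudentWriteFile(Properties.Settings.Default.StudentOutputPath));
    }
}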

After which, in your Main method, just kick off the MainProgram in the following manner:
 new MainProgram().Execute();
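One hedged addition: if I remember the API right, Rhino-ETL collects operation errors rather than letting them escape Execute(), so it is worth draining GetAllErrors() afterwards:

var process = new MainProgram();
process.Execute();

// Surface any errors the pipeline collected, so a failed run does not look like a success.
foreach (var error in process.GetAllErrors())
    Console.WriteLine(error);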

and you will have your first Rhino-ETL to rock and roll with......



Tuesday, July 02, 2013

JavaScript libraries and frameworks..... Which is the best??

Recently I got back into active web application development after quite a long time (4-5 years)..... It is always fun working on the UI layer, except for the fact that everyone has their own opinions when it comes to look and feel. Anyway, this article covers some of the JavaScript libraries that I am currently evaluating and trying to leverage for a new website....

1. Backbone.js --> I just loved using this library: easy construction of models (Backbone.Model.extend({ model definition })) and views (Backbone.View.extend({ view declaration })). It does not force any one pattern on you; a lot of sites claim it is a traditional MVC structure, but I disagree, because the library can cater to other popular patterns out there in the market today. I have settled on it for my website because of the small code base and the ease of writing Backbone code. It supports RESTful routing as well.... so overall, just one word for it: awesome. On a quick note, it also ties in nicely with another library I will be leveraging called D3.js (Data-Driven Documents), which is a phenomenal repository of different data-visualization techniques.

2. Knockout --> another cool library, this one built around Microsoft's ever-popular MVVM pattern. It works extremely well with SPA-based apps, binds DOM elements to JSON payloads phenomenally well, and is something Microsoft has been leveraging in most of their SPA-based apps. Cool tutorials are also in place @ http://learn.knockoutjs.com/#/?tutorial=intro

3. Dojo --> I have not played around with this framework yet, but a lot of people do leverage it (not as many as Backbone; maybe similar to Knockout's user base). Really neat name.... It has modular programming baked in, everything is neatly structured with both properties and methods exposed by the API code base, and it also has its own charting tools and a widget library (especially handy for building cards in websites).

The other popular JavaScript libraries out there are as follows:
Spine (similar in spirit to Backbone) and CanJS.
Popular frameworks out there are AngularJS, Meteor and Batman.js.

A great site for referencing the various libraries and frameworks available out there is Steve Sanderson's blog @ http://blog.stevensanderson.com/2012/08/01/rich-javascript-applications-the-seven-frameworks-throne-of-js-2012/


Just to add to the list of charting libraries out there:
1. RGraph
2. Flot
3. Google Charts
4. D3.js
5. Dojo
6. Highcharts

Check out the following site for more graphing tools out there in the market:
http://mannaio.altervista.org/design/interactive-javascript-charts/

Tuesday, May 14, 2013

An Agile process evaluation


In the recent past, I have been working as a scrum master on one of our projects. Since the project was two-fold, with a portion handled onsite and a portion handled offshore, there were initial hiccups, but they gradually faded away. But here comes the crux of the problem, and where an Agile process has to be evaluated.

        We had weekly sprints in which a subset of tasks was to be completed, and a daily scrum between the offshore and onsite teams. But the onsite team insisted on a daily mail recording the changes being made. Now, the point of the scrum is to take into account who, what and where changes have to be made based on the product backlog; bolting status mails onto an Agile process is a flaw in the fundamental flow. After pointing out this fact, I was countered with several points (some of which were valid reasons to have a status update email) and finally left to comply.

Then came the secondary flaw, something which had a major impact on the sprint: the onsite team demanded daily code reviews of the work done offshore. This caused a major butterfly effect in the scrum process. First the offshore team had to complete the tasks at hand, and then one team member had to merge all the developers' changes, which put a lot of stress on the person doing the merge. I tried to streamline this by rotating the merge duty among different members on a daily basis, but eventually that could not be sustained either; I even suggested a bi-weekly code drop, which was given a thumbs down. This led to a lot of downtime and let morale down, and even as the scrum master of the team I could not prevent it; I could not shield the team from the negative impact. The other observation I could make was that the code sent for review was not actually being reviewed on a daily basis (the icing on the cake). I, on the other hand, had to ensure that a few deliverables on my own plate met their deadlines, and could not help the offshore team rework the Agile workflow we had in place. Then came the question of managing the project with TFS (which I definitely had my eye on), since it would have made it easier to track deficits and to generate charts showing where we were heading each sprint. But this was turned down by, let's say, the new product owner, who wanted to do this via project plans. With all this, and all my say not being heeded, we struggled to get the project production-ready with minimal glitches.

                      I think the biggest mistake was my accepting responsibility as the scrum master. The scrum master must be the most strong-willed member of the team and should say "NO" when the moment calls for it. But then comes adapting to the organization's work culture, which ultimately led to the undoing of the Agile process: I gave in more to the culture of the organization than the needs of the project, which might happen in the future as well. Next time I will be more prepared, and if I see the red lights cropping up, I am going to step down as the scrum master if I were made one. Also, in the pigs-and-chickens analogy, the chickens had the major say in this case, which again is an Agile flaw. All this has left me with quite a few unanswered questions:

  1. Corporate culture vs. Agile project management: which needs to be given more priority, and why?
  2. Do all team members have equal status in the Agile process, or are there scenarios where certain team members should have more pull?
  3. If point 2 holds, what could be the scenarios where this occurs?
  4. Product owner vs. scrum master in making product design changes based on sprint timing: who should have more hold?

Based on my understanding of how Agile works and some of the books I have read, I get varying answers to these questions. I think it is something that I need to ponder and figure out an approach to myself. Stay tuned for a future article about my insights into such situations….