Using Rhino-ETL (a C#-based framework) for developing standard ETLs is pretty easy, and you can do a lot of fun stuff with the underlying source data. I just wrote a quick console app that generates data into a text file and pushes the same data into a SQL Server table as well as an external file. Here are the steps (for the external file push); a minimal sketch of the DataObjects and Operations pieces follows the list:

1. Create a new C# console application solution in Visual Studio.
2. Set the target .NET Framework in your project properties, as shown in the screenshot below.
3. Create 3 sub-folders underneath your project, as shown in the following screenshot:
   - DataObjects --> Contains the class files associated with each table/file in your environment. Example: if your source file contains student data, you would create a class file called Student with the individual fields exposed as properties (the nouns).
   - Operations --> This primarily contains the class files that contain the activities (a...
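To make that layout concrete, here is a minimal sketch of what the DataObjects and Operations pieces might look like. The names (Student, ExtractStudents, WriteStudentsToFile, students.txt) are hypothetical, and this assumes the Rhino.Etl.Core package is referenced:

```csharp
using System.Collections.Generic;
using System.IO;
using Rhino.Etl.Core;
using Rhino.Etl.Core.Operations;

// DataObjects/Student.cs -- one class per source table/file (hypothetical)
public class Student
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// Operations/ExtractStudents.cs -- produces rows; a real ETL would read the source file
public class ExtractStudents : AbstractOperation
{
    public override IEnumerable<Row> Execute(IEnumerable<Row> rows)
    {
        yield return Row.FromObject(new Student { Id = 1, Name = "Alice" });
    }
}

// Operations/WriteStudentsToFile.cs -- pushes each row to the external file
public class WriteStudentsToFile : AbstractOperation
{
    public override IEnumerable<Row> Execute(IEnumerable<Row> rows)
    {
        using (var writer = new StreamWriter("students.txt"))
        {
            foreach (var row in rows)
            {
                writer.WriteLine("{0},{1}", row["Id"], row["Name"]);
                yield return row;
            }
        }
    }
}

// The process wires the operations together in order.
public class StudentProcess : EtlProcess
{
    protected override void Initialize()
    {
        Register(new ExtractStudents());
        Register(new WriteStudentsToFile());
    }
}
```

Running it is just `new StudentProcess().Execute();` from the console app's Main method.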
On Monday 6/13/2016, Microsoft announced its acquisition of LinkedIn. This is a major game changer in the world of IT. But before we get to some of the advantages of this acquisition, it's worth noting that Microsoft was actually working on a LinkedIn killer on its Dynamics CRM platform. The idea was to generate more footprint for its CRM solution as well as create something unique with it. That effort started in early 2012, well before the actual acquisition of LinkedIn. Here are my thoughts on where this acquisition will lead Microsoft and LinkedIn:

Microsoft gains a huge database of professionals and organizations across various fields: This alone is the most massive gain for Microsoft. It could start targeting professionals and organizations to move onto, or join, the Microsoft platform, which could bolster its sales by a huge margin.

Microsoft integration of LinkedIn ads with Bing: Just imagine an organization trying to establish a marketing campaign. Now with LinkedIn a...
Big Data - the keyword given to solutions that can handle massive amounts of data, usually a petabyte or more. There are several big data solutions out there, and each has unique characteristics that are useful in different scenarios. I was looking into components of Cloudera's Hadoop distribution like Impala, Sentry, and HBase; which of them fits varies based on the use case. For some of my clients I have leveraged Amazon Redshift and Cassandra (and hopefully soon Apache Hadoop). The architectures of these systems differ, but the end goal is the same: storage and processing of vast amounts of data, with results generated down to the second or millisecond. Focusing on this aspect, I am going to give a more detailed insight into Redshift, which is a node-based, petabyte-scale database, along with a high-level overview of what I recently implemented. Note: The above diagram is from the Redshift warehousing documentation (http://docs.aws.amazon.com/redshift/latest/dg/c_high_level_system_architecture....
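To make the node-based architecture concrete, here is a minimal sketch of creating a distribution-keyed table on Redshift over ODBC from C#. The DSN, credentials, table, and columns are all hypothetical, and this assumes the Amazon Redshift ODBC driver is installed and configured:

```csharp
using System.Data.Odbc;

class RedshiftDemo
{
    static void Main()
    {
        // Hypothetical DSN pointing at a Redshift cluster via its ODBC driver.
        var connString = "DSN=MyRedshift;UID=admin;PWD=secret";

        using (var conn = new OdbcConnection(connString))
        {
            conn.Open();

            // DISTKEY spreads rows across the compute nodes by customer_id;
            // SORTKEY orders them on disk by sale_date for fast range scans.
            var ddl = @"CREATE TABLE sales (
                            sale_id     BIGINT,
                            customer_id BIGINT,
                            sale_date   DATE,
                            amount      DECIMAL(10,2))
                        DISTKEY(customer_id)
                        SORTKEY(sale_date);";

            using (var cmd = new OdbcCommand(ddl, conn))
            {
                cmd.ExecuteNonQuery();
            }
        }
    }
}
```

The leader node plans each query and the compute nodes scan only their own slice of the data, which is why the choice of distribution key matters so much for performance.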