Rhino-ETL

Using Rhino-ETL (a C#-based framework) to develop standard ETL processes is pretty easy, and you can do a lot of fun stuff with the underlying source data. I just wrote up a quick console app that generates data into a text file and pushes the same data into a table in SQL Server as well as an external file. Here are the steps for the external file push:
1. Create a new C# console application solution in Visual Studio.
2. Target the .NET Framework in your project properties, as shown in the screenshot below:

3. Create three subfolders underneath your project, as shown in the following screenshot:
DataObjects --> Contains the class files associated with each table/file in your environment. Example: if your source file contains student data, you would create a class file called Student that exposes the individual properties of a student (nouns).
Operations --> Contains the class files for the activities (verbs) that need to be performed on the DataObjects. Example: writing the student data to a database, reading the student data from a file, etc.
WorkFolder --> Contains the external file sources to interact with. Example: a flat file, a CSV or a TSV. In this case it will be student.txt.

Let's write some code to copy a student record from one flat file into another flat file (as simple as it sounds).

4. Create a class file called StudentRecord.cs and declare the required entity attributes for the pipe-delimited source file.
The source file (student.txt) contains records in the following manner:
StudentId|StudentName|StudentAddress|StudentClassId|StudentMarks //header
1|Ishwar|TestAddress|1|85 //row
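The StudentRecord class could look roughly like the sketch below. It assumes the FileHelpers attributes that Rhino-ETL's file support builds on; the field types are guesses based on the sample row, not taken from the original project.

```csharp
using FileHelpers;

// Sketch of DataObjects/StudentRecord.cs.
// Field types are assumptions inferred from the sample row
// "1|Ishwar|TestAddress|1|85".
[DelimitedRecord("|")]   // student.txt is pipe delimited
[IgnoreFirst]            // skip the header line
public class StudentRecord
{
    // FileHelpers maps fields by declaration order, so the order here
    // must match the column order in student.txt.
    public int StudentId;
    public string StudentName;
    public string StudentAddress;
    public int StudentClassId;
    public int StudentMarks;
}
```

Public fields are used rather than properties because FileHelpers reads and writes the fields of the record class.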

5. Create a class file called NewStudentRecord, which contains the attributes that need to be transferred to the new file.
The new file is written as tab-separated values.
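A minimal sketch of the output record follows. The field names (sId, sName, sAddress, sclass) are the row keys the author uses in the comments further down; the field types are assumptions.

```csharp
using FileHelpers;

// Sketch of DataObjects/NewStudentRecord.cs.
// Field names follow the row keys used later in the post;
// the types are assumed, not taken from the original project.
[DelimitedRecord("\t")]  // output file is tab separated
public class NewStudentRecord
{
    // Declaration order determines the column order in the output file.
    public int sId;
    public string sName;
    public string sAddress;
    public int sclass;
}
```

Note that StudentMarks has no counterpart here, so it is simply not carried over to the new file.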

Let us now create the write action: a new C# class file called StudentWriteFile, which writes the records out to an output file called studentoutput.txt.
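The write operation might be sketched as follows. It is modeled on the code the author posts in the comments, and it assumes a NewStudentRecord class as described in step 5 with fields sId, sName, sAddress and sclass. The header text is copied from the author's snippet.

```csharp
using System.Collections.Generic;
using Rhino.Etl.Core;
using Rhino.Etl.Core.Files;
using Rhino.Etl.Core.Operations;

// Sketch of Operations/StudentWriteFile.cs.
// Assumes NewStudentRecord (step 5) is defined elsewhere in the project.
public class StudentWriteFile : AbstractOperation
{
    private readonly string filePath;

    public StudentWriteFile(string filePath)
    {
        this.filePath = filePath;
    }

    public override IEnumerable<Row> Execute(IEnumerable<Row> rows)
    {
        FluentFile engine = FluentFile.For<NewStudentRecord>();
        engine.HeaderText = "Id\tsName\tsAddress\tsclass";

        using (FileEngine file = engine.To(filePath))
        {
            foreach (Row source in rows)
            {
                // Map the input columns onto the output record's fields.
                Row row = new Row();
                row["sId"] = source["StudentId"];
                row["sName"] = source["StudentName"];
                row["sAddress"] = source["StudentAddress"];
                row["sclass"] = source["StudentClassId"];

                file.Write(row.ToObject<NewStudentRecord>());

                // Pass the row along in case a later operation needs it.
                yield return row;
            }
        }
    }
}
```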

Now let us write the main program.
The settings values point to the settings file I created, which contains the absolute paths of the student.txt and studentoutput.txt files.
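A minimal sketch of such a main program is shown below. It assumes StudentRead and StudentWriteFile operations exist in the Operations folder and take a file path in their constructors. The setting names StudentFilePath and StudentOutputFilePath are hypothetical placeholders for whatever the settings file actually defines.

```csharp
using Rhino.Etl.Core;

// Sketch of MainProgram.cs. Operations run in the order they are registered:
// rows produced by StudentRead flow into StudentWriteFile.
public class MainProgram : EtlProcess
{
    protected override void Initialize()
    {
        // Hypothetical setting names; substitute the ones from your own
        // Settings file (holding the absolute paths described above).
        string inputPath = Properties.Settings.Default.StudentFilePath;
        string outputPath = Properties.Settings.Default.StudentOutputFilePath;

        Register(new StudentRead(inputPath));
        Register(new StudentWriteFile(outputPath));
    }
}
```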

After that, in your Main method, just initialize MainProgram in the following manner:
 new MainProgram().Execute();
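As far as I know, Rhino-ETL collects exceptions raised inside operations rather than rethrowing them from Execute, so a silent run does not necessarily mean a successful one. A hedged sketch of an entry point that also prints any collected errors, assuming the MainProgram class described above:

```csharp
using System;

// Sketch of Program.cs: run the process, then report collected errors.
public static class Program
{
    public static void Main()
    {
        var process = new MainProgram();
        process.Execute();

        // GetAllErrors returns the exceptions gathered during the run.
        foreach (Exception error in process.GetAllErrors())
        {
            Console.Error.WriteLine(error);
        }
    }
}
```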

and you will have your first Rhino-ETL process to rock and roll with.


Thanks for the blog. I've been looking for something like this to learn Rhino ETL for quite a while. I was surfing the net just before you put up this article.

Anyhow, I have a small doubt about the example you have provided.

1) "StudentRead": I can't find this class.
Ishwar Nataraj said…
Great catch, Ashwin. I must have added the class at a later stage.
public class StudentRead : AbstractOperation
{
    string filePath = null;

    public StudentRead(string filePath) { this.filePath = filePath; }

    public override IEnumerable<Row> Execute(IEnumerable<Row> rows)
    {
        // The record type NewStudentRecord is an assumption here; the type
        // argument is missing from the snippet as posted.
        FluentFile engine = FluentFile.For<NewStudentRecord>();
        engine.HeaderText = "Id\tsName\tsAddress\tsclass";
        using (FileEngine file = engine.To(filePath))
        {
            foreach (Row testRow in rows)
            {
                Row row = new Row();
                //row.Copy(leftRow); //copy over all properties not in the student records
                row["sId"] = testRow["StudentId"];
                row["sName"] = testRow["StudentName"];
                row["sAddress"] = testRow["StudentAddress"];
                row["sclass"] = testRow["StudentClassId"];
                // A file opened with To(...) is written to with Write.
                file.Write(row.ToObject<NewStudentRecord>());
                //pass through rows if needed for another later operation
                yield return row;
            }
        }
    }
}
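The reply above opens the file with To(...) and maps rows onto the output columns, which is the shape of a write step. For reading student.txt, a Rhino-ETL operation is more commonly written with From(...) and Row.FromObject. A minimal sketch, assuming a StudentRecord class as described in step 4 (pipe delimited, header line skipped):

```csharp
using System.Collections.Generic;
using Rhino.Etl.Core;
using Rhino.Etl.Core.Files;
using Rhino.Etl.Core.Operations;

// Sketch of Operations/StudentRead.cs.
// Assumes StudentRecord (step 4) is defined elsewhere in the project.
public class StudentRead : AbstractOperation
{
    private readonly string filePath;

    public StudentRead(string filePath)
    {
        this.filePath = filePath;
    }

    public override IEnumerable<Row> Execute(IEnumerable<Row> rows)
    {
        // As the first operation in the pipeline, the incoming rows are
        // ignored; rows are produced from the file instead.
        using (FileEngine file = FluentFile.For<StudentRecord>().From(filePath))
        {
            foreach (object record in file)
            {
                // Each field of StudentRecord becomes a column in the Row,
                // keyed by field name (StudentId, StudentName, ...).
                yield return Row.FromObject(record);
            }
        }
    }
}
```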
