9/07/2015

Re-Intro to Pentaho Data Integration Kettle

I used the Pentaho Data Integration product, also known as Kettle, a few years ago.

My Intro to Pentaho Big Data PDI Kettle

#Pentaho #Kettle #PDI CE Offering.

And this post, which is one of my all-time most viewed at 6,000 reads: Follow Up Post

So tonight I wanted to get re-acquainted with Kettle.  You can find the download and info page here, the project Jira here, and the Pentaho Data Integration (Kettle) Tutorial here.

Initiated the download:


Unpacked the Zip file:


Clicked on SpoonConsole.bat:


And the Windows application opens:


The best part about using this free, open source software from Pentaho is the Big Data features:


I opened a sample project called "test-job.kjb":


Project loaded in the IDE:


Modified one of the steps to include an Excel file:


Save, then Run / Execute the Job:


It produced a message box:



And the job ended:



Opened another sample, "Generate product data.ktr":


Save, Run & Execute:


The transformation generated an output file after a directory called "Output" was created above the Transformation directory:


And looking at the output file contents:


The data was generated randomly using this JavaScript code:

java;

var description=Packages.org.pentaho.di.core.util.StringUtil.generateRandomString(4, "", "-desc", false);
var code = Packages.org.pentaho.di.core.util.StringUtil.generateRandomString(3, "PRD-", "", true);
var category = Packages.org.pentaho.di.core.util.StringUtil.generateRandomString(1, "", "", true);

var price = 150.00 + ( Math.random() * 1000 );
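To see what kind of values that script produces without opening Spoon, here is a rough standalone sketch in plain JavaScript. It does not call the Kettle classes at all. The randomString function is my own stand-in, and its behavior (random letters a-z wrapped in a prefix and postfix, optionally uppercased) is only an assumption based on the arguments passed to generateRandomString in the sample. generateProductRow is likewise a hypothetical helper, not part of the sample.

```javascript
// Standalone sketch: runs in any JavaScript engine, no Kettle classes required.
// ASSUMPTION: randomString imitates what StringUtil.generateRandomString appears
// to do, judging only by the sample's arguments (length, prefix, postfix, uppercase).
function randomString(length, prefix, postfix, uppercase) {
  var letters = "";
  for (var i = 0; i < length; i++) {
    // Pick one random lowercase letter between 'a' (char code 97) and 'z'.
    letters += String.fromCharCode(97 + Math.floor(Math.random() * 26));
  }
  var result = prefix + letters + postfix;
  return uppercase ? result.toUpperCase() : result;
}

// Hypothetical helper: builds one row with the same four fields as the sample script.
function generateProductRow() {
  return {
    description: randomString(4, "", "-desc", false),
    code: randomString(3, "PRD-", "", true),
    category: randomString(1, "", "", true),
    // Same formula as the sample: 150.00 plus a random amount below 1000.
    price: 150.00 + (Math.random() * 1000)
  };
}

var row = generateProductRow();
console.log(row);
```

Under those assumptions, each row gets a description of four lowercase letters ending in "-desc", a code of "PRD-" plus three uppercase letters, a single uppercase letter for the category, and a price from 150 up to (but not including) 1150.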

And the field mapping:


Pentaho Data Integration Kettle is a nice piece of software that's open source and works with Hadoop.  Lots of nice features.



