Follow-Up Post

So yesterday I posted a blog entry where I got Pentaho Data Integration working by upgrading to the latest version, 4.4.

However, that was only a partially correct statement.

I COULD connect from a CentOS VM to the Cloudera Hadoop Cluster VM, that is true.

I could see the file structure, delete folders, etc.

However, I COULD NOT execute a Data Integration Spoon job for some reason.

So today, it was discovered that the /etc/hosts file needs to be updated on both the destination and calling VMs. (sudo vi /etc/hosts)

I had to add an entry for the calling server that maps its IP address to a host name.

I had to modify the hosts.allow file to allow certain IPs.

And I had to modify the Hadoop Cluster VM to accept incoming connections from the calling IP address.
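
For anyone repeating the hosts-file step, here is a minimal Python sketch for checking that the entries are in place. The IP addresses and host names (pentaho-client, cloudera-vm) are made up for illustration, since the real ones are not listed in this post, and the helper names parse_hosts and resolves are my own. It only reads /etc/hosts and asks the OS resolver; it does not touch hosts.allow or anything on the Hadoop side.

```python
import socket

# Hypothetical sample entries; the real IPs and host names were not given in this post.
SAMPLE_HOSTS = """\
127.0.0.1      localhost
192.168.56.101 pentaho-client    # calling CentOS VM (made-up address)
192.168.56.102 cloudera-vm       # Hadoop cluster VM (made-up address)
"""


def parse_hosts(text):
    """Return {hostname: ip} from hosts-file text, skipping comments and blank lines."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        parts = line.split()
        ip, names = parts[0], parts[1:]
        for name in names:
            # Keep the first address listed for a name.
            mapping.setdefault(name, ip)
    return mapping


def resolves(hostname):
    """True if the OS resolver can turn hostname into an address."""
    try:
        socket.gethostbyname(hostname)
        return True
    except OSError:  # socket.gaierror is a subclass of OSError
        return False


if __name__ == "__main__":
    with open("/etc/hosts") as f:
        entries = parse_hosts(f.read())
    # Replace these hypothetical names with the ones used on your VMs.
    for name in ("pentaho-client", "cloudera-vm"):
        status = "resolves" if resolves(name) else "does not resolve"
        print(name, entries.get(name, "not in /etc/hosts"), status)
```

Run it on both VMs; if a name shows up as "not in /etc/hosts" or "does not resolve" on either side, that is likely the entry still missing.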

And presto, we got files in the output directory of the wordcount folder on the HDFS cluster: