Install Mahout, RHadoop, Configure Map Reduce Job

Found a good blog post on installing Hadoop Mahout and RHadoop:


So I got out my reference for the Vim editor on Linux and started down the path:


And off we went.

Installed Mahout, then installed the streaming software. Ran into an issue because I skipped a step, adding the "export" commands for the environment variables, which took an hour to troubleshoot.
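A skipped export like that is easy to catch with a quick check before running anything. Here is a minimal sketch in Python, assuming the variables in question are HADOOP_CMD and HADOOP_STREAMING (the two that the rmr2 package reads; I did not note down exactly which exports I missed). The helper name missing_hadoop_vars is made up for illustration.

```python
import os

# Assumption: these are the variables the rmr2 package looks for.
# The post itself does not record which exports were skipped.
REQUIRED_VARS = ("HADOOP_CMD", "HADOOP_STREAMING")


def missing_hadoop_vars(env=None):
    """Return the required variable names that are unset or empty in env."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_hadoop_vars()
    if missing:
        print("Add export lines to ~/.bash_profile for:", ", ".join(missing))
    else:
        print("Hadoop environment variables are set.")
```

Running it from a fresh shell (after sourcing .bash_profile) shows whether the exports actually took effect.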

Installed files:

Ran the RHadoop example from the blog:

Found the pg100.txt file on the web, downloaded it, and then uploaded it into HDFS:

Uploaded file into File System:

And the GUI file system:
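For reference, the upload step can be sketched as a small Python helper that builds the `hadoop fs` commands. This is only an illustration: the destination directory /Data/Input is a hypothetical name (the exact input path is not recorded here), and the function hdfs_upload_commands is made up for this sketch. The actual call is left commented out so the script just prints what it would run.

```python
import shlex


def hdfs_upload_commands(local_file, hdfs_dir):
    """Build the commands to copy local_file into hdfs_dir and then list it."""
    return [
        ["hadoop", "fs", "-put", local_file, hdfs_dir],
        ["hadoop", "fs", "-ls", hdfs_dir],
    ]


if __name__ == "__main__":
    # "/Data/Input" is a hypothetical destination, not taken from the tutorial.
    for cmd in hdfs_upload_commands("pg100.txt", "/Data/Input"):
        print(shlex.join(cmd))
        # To actually run it on a machine with Hadoop installed:
        # import subprocess; subprocess.run(cmd, check=True)
```

The `-ls` at the end gives the same confirmation as the file system listing, just from the command line.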

Typed in the code for Map Reduce job in VIM:
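The job from the tutorial was written in R, so the following is not that code. It is a rough Python sketch of the same map and reduce idea, assuming the job is a word count over the text file (the usual example for this kind of input). The function names map_words, reduce_counts, and word_count are my own, and it runs locally rather than on the cluster.

```python
import re
from collections import defaultdict


def map_words(line):
    """Map step: emit a (word, 1) pair for every word on the line."""
    for word in re.findall(r"[a-z']+", line.lower()):
        yield word, 1


def reduce_counts(pairs):
    """Reduce step: sum the counts for each distinct word."""
    totals = defaultdict(int)
    for word, count in pairs:
        totals[word] += count
    return dict(totals)


def word_count(lines):
    """Run the map step over all lines, then reduce the emitted pairs."""
    pairs = (pair for line in lines for pair in map_words(line))
    return reduce_counts(pairs)


if __name__ == "__main__":
    sample = ["To be, or not to be", "that is the question"]
    for word, count in sorted(word_count(sample).items()):
        print(word, count)
```

On the cluster, Hadoop handles the grouping between the two steps; here the reduce function does the grouping itself to keep the sketch short.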

Tried running it and found a missing comma. I had also specified the incorrect JAR file in the .bash_profile, so I corrected that, removed the /Data/Output folder, and reran:

 Output folder:

And the browser view:

So it seems Streaming is working, as well as RHadoop. Thanks again for the great tutorial:
