So today is day 2 of Cloudera Hadoop.
Today I learned to set the classpath with:
export HADOOP_CLASSPATH=<path to your JAR or class files>
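Here is a rough sketch of what that export could look like with a real path filled in. The classes folder below is only an assumption for illustration (it is not from my setup), so swap in wherever your JAR or class files actually live. The odd-looking ${...:+...} part just keeps whatever was already on the classpath instead of overwriting it.

```shell
# Hypothetical folder holding the compiled job classes; adjust to your setup.
JOB_CLASSES="/home/cloudera/wordcount/classes"

# Append to any existing HADOOP_CLASSPATH rather than replacing it.
# If the variable is empty or unset, the result is just the new path.
export HADOOP_CLASSPATH="${HADOOP_CLASSPATH:+$HADOOP_CLASSPATH:}$JOB_CLASSES"

# Show the result so you can check it.
echo "$HADOOP_CLASSPATH"
```

An export typed at the prompt only lasts for the current shell session, which is why the line also goes into hadoop-env.sh.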
Learned that the file this gets saved to is called hadoop-env.sh and, going by the edit command further down, can be found in the directory /usr/lib/hadoop-0.20-mapreduce/conf
Also learned a basic Linux command: "pwd" tells you your current path in the directory structure.
To modify the contents of the file:
sudo vi /usr/lib/hadoop-0.20-mapreduce/conf/hadoop-env.sh
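In vi, press Shift-G to jump to the last line of the file, then Shift-A to start typing at the end of that line. When done, press Esc and type :wq to save and quit.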
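If vi feels awkward, the same change can probably be made from the command line. This is only a sketch: append_once is a made-up helper name (not part of Hadoop), and it adds a line to a file only when that exact line is not already there, so running it twice does not leave duplicates.

```shell
# Append a line to a config file only if that exact line is not already present.
# append_once is a hypothetical helper, not a Hadoop command.
append_once() {
    file="$1"
    line="$2"
    # -q: quiet, -x: match the whole line, -F: treat the text literally
    if ! grep -qxF -- "$line" "$file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$file"
    fi
}
```

Since hadoop-env.sh needed sudo to edit, one cautious way to use this is on a copy in your home directory first, for example append_once "$HOME/hadoop-env.sh" 'export HADOOP_CLASSPATH=/home/cloudera/wordcount/classes' (again, a hypothetical path), and then copy the file back into place with sudo cp once it looks right.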
To simulate the admin:
To kick off a Hadoop job you enter:
hadoop jar wordcount.jar org.myorg.WordCount /user/cloudera/input /user/cloudera/output
The input directory must already exist in HDFS:
hadoop dfs -mkdir /user/cloudera/input
and you must copy your test files to that location on the HDFS server:
hadoop dfs -copyFromLocal /home/cloudera/wordcount/input /user/cloudera/input
assuming you change your directory / folder name accordingly.
and you must first clear the contents of the output directory:
hadoop dfs -rm -r /user/cloudera/output
hadoop dfs -rmdir /user/cloudera/input
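The commands above are listed in the order I learned them, not the order they need to run. Here is a sketch that strings them together in running order: clear the old output, create the input directory, copy the test files, then start the job. The run helper and the DRY_RUN variable are my own additions for illustration, so you can print the commands before running them for real. The paths are the ones from this post. The extra hadoop dfs -ls step is there because where the copied files end up can depend on whether the target directory already exists, so it is worth a look before starting the job.

```shell
#!/bin/sh
# Paths taken from this post; change them to match your own folders.
LOCAL_INPUT="/home/cloudera/wordcount/input"
HDFS_INPUT="/user/cloudera/input"
HDFS_OUTPUT="/user/cloudera/output"

# Hypothetical helper: print the command instead of running it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

run_wordcount() {
    # Remove the old output; ignore the error if it does not exist yet.
    run hadoop dfs -rm -r "$HDFS_OUTPUT" || true
    # Create the input directory; ignore the error if it is already there.
    run hadoop dfs -mkdir "$HDFS_INPUT" || true
    # Copy the local test files into HDFS; stop here if the copy fails.
    run hadoop dfs -copyFromLocal "$LOCAL_INPUT" "$HDFS_INPUT" || return 1
    # List the input directory so you can check what the job will read.
    run hadoop dfs -ls "$HDFS_INPUT"
    # Start the job.
    run hadoop jar wordcount.jar org.myorg.WordCount "$HDFS_INPUT" "$HDFS_OUTPUT"
}
```

To preview the commands without touching HDFS, run DRY_RUN=1 run_wordcount. To run them for real, call run_wordcount from the folder that contains wordcount.jar.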
If you open the Hadoop web interface on localhost, you will see your input folder as follows:
Here's a posting from day 1 http://www.bloomconsultingbi.com/2013/01/first-try-at-cloudera-hadoop.html
And I got the tutorial example to work on my virtual machine! Yippie!!!!