I’ve just begun with Apache Hadoop. Getting started isn’t as straightforward as one might wish. There is quite a bit of work to do after the initial installation (“brew install hadoop” in my case). Expect tweaks and workarounds. Fortunately, there is plenty of advice available, such as this excellent blog post.

After installing the latest stable version (2.7.1) and going through the initial setup, I was still unable to use the damn thing! Strange as it sounds, some of the key modules were not on the classpath. Here is the failure I got when trying to format HDFS:

$ hdfs namenode -format
Could not find or load main class org.apache.hadoop.hdfs.server.namenode.NameNode

I tried altering hadoop-env.sh, but after a while I realised it was too much of a hassle. I ended up looking for a way to add ALL of the JARs in the Hadoop package to the classpath. It didn’t take long to find the answer on Stack Overflow.

So, here is my take on how to add all Hadoop JARs to the classpath.

Open your bash profile (~/.profile or ~/.bash_profile) for editing and add the following:

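export HADOOP_HOME="/usr/local/Cellar/hadoop" # Replace with your own path
# Collect every JAR under HADOOP_HOME and join the paths with ':'
# (quoting and paste keep paths that contain spaces intact)
export HADOOP_CLASSPATH="$(find "$HADOOP_HOME" -name '*.jar' | paste -sd: -)"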
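
If you would rather not find out about a mistyped path the hard way, the same idea can be wrapped in a small function that refuses to run on a directory that doesn't exist. This is only a sketch: the function name jars_to_classpath is my own invention, not something Hadoop provides, and the sort is there just to make the result predictable.

```shell
# Hypothetical helper (not part of Hadoop): print all JARs under a
# directory as a single colon-separated classpath string.
jars_to_classpath() {
  # Fail loudly instead of silently exporting an empty classpath.
  if [ ! -d "$1" ]; then
    echo "jars_to_classpath: not a directory: $1" >&2
    return 1
  fi
  # find lists the JARs, sort makes the order stable, paste joins with ':'.
  find "$1" -name '*.jar' | sort | paste -sd: -
}

# Usage in your profile (assumes HADOOP_HOME is already set):
#   HADOOP_CLASSPATH="$(jars_to_classpath "$HADOOP_HOME")" && export HADOOP_CLASSPATH
```

With this in place, a wrong HADOOP_HOME produces an error message when the profile is loaded rather than a mysterious "Could not find or load main class" later on.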

Save the changes and reload the profile (use ~/.bash_profile here if that is the file you edited):

source ~/.profile

Finally, check the classpath. I’ve truncated the output, but you get the idea.

$ hadoop classpath
hadoop/yarn/hadoop-yarn-server-common-2.7.1.jar:/usr/local/Cellar/hadoop/2.7.1/libexec/share/hadoop/yarn/hadoop-yarn-server-nodemanager-2.7.1.jar:/usr/local/Cellar/hadoop/2.7.1/libexec/share/hadoop/yarn/hadoop-yarn-server-resourcemanager-2.7.1.jar:/usr/local/Cellar/hadoop/2.7.1/libexec/share/hadoop/yarn/hadoop-yarn-server-sharedcachemanager-2.7.1.jar:...
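
Scanning that wall of text by eye is no fun, so here is a rough sketch of a helper that tells you whether a particular JAR made it onto the classpath. The name classpath_has_jar is hypothetical, and the check only looks at file names in the classpath string; it will not see JARs that are pulled in through wildcard entries such as some/dir/*.

```shell
# Hypothetical helper: succeed if any entry of a colon-separated
# classpath (first argument) matches a grep pattern (second argument).
classpath_has_jar() {
  # Put each classpath entry on its own line, then search quietly.
  printf '%s\n' "$1" | tr ':' '\n' | grep -q -- "$2"
}

# Usage (assumes the hadoop command is on your PATH):
#   classpath_has_jar "$(hadoop classpath)" 'hadoop-hdfs-.*\.jar' \
#     || echo "HDFS JAR is missing from the classpath"
```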

From this point on, all commands should work. Let’s get back to hdfs:

$ hdfs namenode -format
SLF4J: Class path contains multiple SLF4J bindings.
...
15/08/28 09:53:33 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = toms-macbook-pro.home/192.168.1.99
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 2.7.1
...
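
The SLF4J warning at the top of that output is most likely a side effect of throwing every JAR onto the classpath: the same library ends up there more than once. If you are curious which ones, the sketch below lists JAR file names that occur in more than one location. The function name duplicate_jar_names is made up for this post, and matching by file name alone is an assumption; it won't catch the same library under different version numbers.

```shell
# Hypothetical helper: print JAR file names that appear more than once
# in a colon-separated classpath string (first argument).
duplicate_jar_names() {
  printf '%s\n' "$1" \
    | tr ':' '\n' \
    | sed 's|.*/||' \
    | sort \
    | uniq -d
}

# Usage (assumes HADOOP_CLASSPATH was exported as described earlier):
#   duplicate_jar_names "$HADOOP_CLASSPATH"
```

The sed call strips the directory part of each entry, so only the file names are compared, and uniq -d prints each repeated name once.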

I would love to say that this is it, but you are likely to run into other issues. As usual, patience and Google are your best friends.


Tomas Zezula
