Configuration of Hadoop
Hadoop is supported on the GNU/Linux platform and its flavors. Configurations are specified by resources. A resource contains a set of name/value pairs as XML data. Each resource is named by either a String or a Path. If named by a String, the classpath is examined for a file with that name. If named by a Path, the local filesystem is examined directly, without referring to the classpath.
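To illustrate the name/value layout of a resource, the sketch below writes a minimal resource file and reads one value back with standard shell tools. The file name example-site.xml and the value hdfs://localhost:8020 are hypothetical placeholders; fs.default.name is the Hadoop 0.20 property for the default filesystem. The lookup is a simple line-based helper written for this example, not part of Hadoop.

```shell
# Write a minimal resource file: one <property> per name/value pair.
# example-site.xml and the value below are placeholders for illustration.
conf_file="${TMPDIR:-/tmp}/example-site.xml"
cat > "$conf_file" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:8020</value>
  </property>
</configuration>
EOF

# Print the value that follows a given <name> line.
# This is a line-based lookup for illustration, not a full XML parser.
get_property() {
  awk -v key="$2" '
    index($0, "<name>" key "</name>") { found = 1; next }
    found && /<value>/ {
      sub(/.*<value>/, ""); sub(/<\/value>.*/, ""); print; exit
    }' "$1"
}

get_property "$conf_file" fs.default.name
```

Hadoop itself reads such files through its Configuration class; the shell lookup here is only a quick way to inspect a value by hand.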
Setting environment variables for Hadoop:-
The next step is to set up the PATH environment variable. You can set it as follows:
Log in as root user
Copy the Java path as
Log in as object user
To check the Hadoop version:
Cmd: hadoop version
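A minimal sketch of the environment setup follows. The text does not state the Java or Hadoop locations, so /usr/lib/jvm/default-java and /usr/lib/hadoop-0.20 are assumed placeholders, and HADOOP_HOME is used here only as a convenient variable for building PATH.

```shell
# Placeholder locations: substitute the paths used on your machine.
export JAVA_HOME="${JAVA_HOME:-/usr/lib/jvm/default-java}"
export HADOOP_HOME="${HADOOP_HOME:-/usr/lib/hadoop-0.20}"

# Put the Java and Hadoop launchers on PATH.
export PATH="$JAVA_HOME/bin:$HADOOP_HOME/bin:$PATH"

# Confirm the hadoop launcher is reachable before relying on it.
if command -v hadoop >/dev/null 2>&1; then
  hadoop version
else
  echo "hadoop not found on PATH" >&2
fi
```

Adding these lines to the shell profile of the user that runs Hadoop keeps the settings across logins.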
To start the Hadoop services:-
Log in as object user.
Cmd: sudo /etc/init.d/hadoop-0.20-namenode start
Repeat the command for each of the other daemons: data node, secondary name node, job tracker, and task tracker.
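Repeating the command by hand can be replaced with a loop. The sketch below assumes the other init scripts follow the same hadoop-0.20-<daemon> naming as the namenode script; the function name and the --dry-run flag are made up for this example.

```shell
# Daemon suffixes are assumed to follow the hadoop-0.20-<daemon> pattern
# of the namenode init script; check /etc/init.d for the names on your system.
DAEMONS="namenode datanode secondarynamenode jobtracker tasktracker"

# Start every daemon; with --dry-run only print the commands.
start_hadoop_daemons() {
  for daemon in $DAEMONS; do
    if [ "${1:-}" = "--dry-run" ]; then
      echo "sudo /etc/init.d/hadoop-0.20-$daemon start"
    else
      sudo "/etc/init.d/hadoop-0.20-$daemon" start
    fi
  done
}

# Preview the commands first; call without --dry-run to actually start them.
start_hadoop_daemons --dry-run
```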
Adding the host name to the Hadoop configuration file:-
Log in as root user.
Replace localhost with your host name in the file.
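The replacement can also be done with sed instead of editing by hand. The text does not name the configuration file, so the sketch below works on a scratch file with a hypothetical name and a sample line; point conf_file at the real file only after checking the result.

```shell
# Work on a scratch copy; conf_file is a placeholder for the real
# configuration file, which the text does not name.
conf_file="${TMPDIR:-/tmp}/example-core-site.xml"
printf '%s\n' '<value>hdfs://localhost:8020</value>' > "$conf_file"

# Look up this machine's host name (uname -n is the fallback).
host="$(hostname 2>/dev/null || uname -n)"

# Replace every occurrence of localhost, writing to a temporary file first
# so the original is not truncated if sed fails.
sed "s/localhost/$host/g" "$conf_file" > "$conf_file.new" && mv "$conf_file.new" "$conf_file"

cat "$conf_file"
```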
Formatting the Name Node:-
Before formatting the Name Node, we have to assign users to the daemons: hdfs for the HDFS daemons and mapred for the job tracker.
Cmd: export HADOOP_NAMENODE_USER=hdfs
Cmd: export HADOOP_SECONDARYNAMENODE_USER=hdfs
Cmd: export HADOOP_DATANODE_USER=hdfs
Cmd: export HADOOP_JOBTRACKER_USER=mapred
Log in as object user.
To format the Name Node:
Cmd: hadoop namenode -format
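Formatting erases any existing HDFS metadata, so it can help to wrap the command in a guard. The sketch below is one possible safeguard, not part of Hadoop; the function name format_namenode and the "yes" argument are made up for this example.

```shell
# Format the Name Node only when the caller passes "yes" explicitly.
format_namenode() {
  if [ "${1:-}" != "yes" ]; then
    echo "refusing to format: pass 'yes' to confirm" >&2
    return 1
  fi
  hadoop namenode -format
}

# Without confirmation the function prints a warning and returns non-zero.
format_namenode || echo "format skipped"
```

Running `format_namenode yes` as the appropriate user performs the actual format.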
Start all the services for HDFS, then list the root directory to confirm that the file system responds:
Cmd: hadoop fs -ls /
The Hadoop configuration is now finished.
Cmd: jps [To check all the daemons]
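The jps check can be automated by comparing its output against the daemons that should be running. The sketch below assumes jps prints one "<pid> <main class name>" pair per line and that the daemons appear under the names listed; the function name missing_daemons and the sample input are made up for this example.

```shell
# Report which expected daemons are missing from jps-style output on stdin.
# Each jps line is assumed to have the form "<pid> <main class name>".
missing_daemons() {
  seen="$(awk '{ print $2 }')"
  for name in NameNode DataNode SecondaryNameNode JobTracker TaskTracker; do
    # -x matches whole lines, so NameNode does not match SecondaryNameNode.
    printf '%s\n' "$seen" | grep -qx "$name" || echo "$name"
  done
}

# Sample input for illustration; on a live node use: jps | missing_daemons
printf '%s\n' "101 NameNode" "102 DataNode" "103 Jps" | missing_daemons
```

An empty result means every expected daemon was found.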
Installation of library files:-
Log in as root user
Cmd: yum install hadoop-0.20*
This installs all the files and libraries related to Hadoop that were not installed previously.