hadoop-2.0.0-alpha (Standalone) and jdk7u4 on CentOS 6.2
Hardware:
Dell PowerEdge SC420, Pentium, 4 GB RAM, 80 GB HDD
OS:
CentOS 6.2
Download the LiveCD ISO from:
ftp://mirror.nandomedia.com/pub/CentOS/6.2/isos/i386/CentOS-6.2-i386-LiveCD.iso
Burn to a CD.
Boot from this CD.
This boots into the CentOS 6.2 live desktop.
Click 'Install onto Hard Disk Drive' on the desktop.
Reboot.
Login as 'root'.
Software:
Download Java from:
http://download.oracle.com/otn-pub/java/jdk/7u4-b20/jdk-7u4-linux-i586.rpm
# rpm -ivh {download folder}/jdk-7u4-linux-i586.rpm
This installs Java at /usr/java/jdk1.7.0_04.
# java -version
java version "1.7.0_04"
Java(TM) SE Runtime Environment (build 1.7.0_04-b20)
Java HotSpot(TM) Client VM (build 23.0-b21, mixed mode, sharing)
# cd
# pwd
/root
# vi .bashrc
Append the following two lines at the end:
export JAVA_HOME=/usr/java/jdk1.7.0_04
export PATH=${JAVA_HOME}/bin:${PATH}
Press Esc, then type :wq and press Enter to save and quit.
# . ~/.bashrc
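As a quick sanity check, the same two exports can be run directly in the current shell and verified. The JDK path below is the rpm's default install location noted above; adjust it if your JDK landed elsewhere.

```shell
# Re-export the variables and confirm they took effect.
# /usr/java/jdk1.7.0_04 is the rpm's default install path (an assumption
# if you installed the JDK somewhere else).
export JAVA_HOME=/usr/java/jdk1.7.0_04
export PATH=${JAVA_HOME}/bin:${PATH}
echo "JAVA_HOME=${JAVA_HOME}"
```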
Download Hadoop from:
http://apache.mirrorcatalogs.com/hadoop/common/hadoop-2.0.0-alpha/hadoop-2.0.0-alpha.tar.gz
# cd /usr
# tar -xvzf {download folder}/hadoop-2.0.0-alpha.tar.gz
# ls /usr/hadoop-2.0.0-alpha
bin include lib LICENSE.txt output sbin
etc input libexec NOTICE.txt README.txt share
# service sshd status
openssh-daemon (pid xxxx) is running...
If it is not running, start it with: # service sshd restart
# groupadd hadoop
# useradd -g hadoop -m hadoop
# cd /usr/hadoop-2.0.0-alpha
# chown -R hadoop:hadoop /usr/hadoop-2.0.0-alpha
# su - hadoop
# ssh-keygen -t rsa -P ''
Press Enter when prompted for the file in which to save the key, accepting the default.
# cat .ssh/id_rsa.pub >> .ssh/authorized_keys
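If the passwordless login below still prompts for a password, sshd commonly rejects keys when ~/.ssh or authorized_keys is group- or world-accessible; tightening the permissions is a standard fix (general sshd behavior, not specific to this Hadoop release):

```shell
# sshd typically refuses public-key auth if these are too permissive.
mkdir -p ~/.ssh
touch ~/.ssh/authorized_keys
chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys
```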
# ssh localhost
This should log you in without prompting for a password.
# exit
# whoami
hadoop
# cd /usr/hadoop-2.0.0-alpha
# bin/hadoop version
Hadoop 2.0.0-alpha
Subversion http://svn.apache.org/repos/asf/hadoop/common/branches/branch-2.0.0-alpha/hadoop-common-project/hadoop-common -r 1338348
Compiled by hortonmu on Wed May 16 01:28:50 UTC 2012
From source with checksum 954e3f6c91d058b06b1e81a02813303f
# mkdir input
# cp share/hadoop/common/templates/conf/*.xml input
# bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.0.0-alpha.jar grep input output 'dfs[a-z.]+'
...
INFO input.FileInputFormat: Total input paths to process : 6
...
...
12/05/27 03:38:30 INFO mapreduce.Job: Counters: 27
File System Counters
FILE: Number of bytes read=91368
FILE: Number of bytes written=505456
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
Map-Reduce Framework
Map input records=32
Map output records=32
Map output bytes=1006
Map output materialized bytes=1076
Input split bytes=133
Combine input records=0
Combine output records=0
Reduce input groups=3
Reduce shuffle bytes=0
Reduce input records=32
Reduce output records=32
Spilled Records=64
Shuffled Maps =0
Failed Shuffles=0
Merged Map outputs=0
GC time elapsed (ms)=50
CPU time spent (ms)=0
Physical memory (bytes) snapshot=0
Virtual memory (bytes) snapshot=0
Total committed heap usage (bytes)=242368512
File Input Format Counters
Bytes Read=1368
File Output Format Counters
Bytes Written=830
# ls input/
capacity-scheduler.xml core-site.xml hadoop-policy.xml hdfs-site.xml mapred-queue-acls.xml mapred-site.xml
(six .xml files, matching the job's "Total input paths to process : 6" above)
# cat output/part-r-00000
3 dfs.datanode.data.dir
2 dfs.namenode.http
2 dfs.namenode.https
1 dfsqa
1 dfsadmin
1 dfs.webhdfs.enabled
1 dfs.umaskmode
1 dfs.support.append
1 dfs.secondary.namenode.keytab.file
1 dfs.secondary.namenode.kerberos.principal
1 dfs.secondary.namenode.kerberos.https.principal
1 dfs.permissions.superusergroup
1 dfs.namenode.secondary.http
1 dfs.namenode.safemode.threshold
1 dfs.namenode.replication.min.
1 dfs.namenode.name.dir
1 dfs.namenode.keytab.file
1 dfs.namenode.kerberos.principal
1 dfs.namenode.kerberos.https.principal
1 dfs.include
1 dfs.https.port
1 dfs.hosts.exclude
1 dfs.hosts
1 dfs.exclude
1 dfs.datanode.keytab.file
1 dfs.datanode.kerberos.principal
1 dfs.datanode.http.address
1 dfs.datanode.data.dir.perm
1 dfs.datanode.address
1 dfs.cluster.administrators
1 dfs.block.access.token.enable
1 dfs
(32 entries, matching "Reduce output records=32" above)
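The job's tally can be loosely cross-checked with plain Unix tools using the same regular expression. The sketch below runs it over a throwaway sample file (a stand-in for input/*.xml); against the real config files the counts should be in the same ballpark, though the example job's matching semantics may differ slightly.

```shell
# Build a tiny stand-in for input/*.xml.
mkdir -p /tmp/grepcheck
cat > /tmp/grepcheck/sample.xml <<'EOF'
<property><name>dfs.datanode.data.dir</name></property>
<property><name>dfs.datanode.data.dir</name></property>
<property><name>dfs.namenode.name.dir</name></property>
EOF
# Print each 'dfs[a-z.]+' match on its own line (-o), without filenames
# (-h), then count occurrences, highest first.
grep -ohE 'dfs[a-z.]+' /tmp/grepcheck/sample.xml | sort | uniq -c | sort -rn
```

To check the real run, point the same pipeline at input/*.xml from /usr/hadoop-2.0.0-alpha.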
Done: a quick install and verification of Hadoop 2.0.0-alpha in standalone mode.
Refer to:
http://hadoop.apache.org/common/docs/stable/single_node_setup.html