1. Keep the customized configuration files separate from the Hadoop distribution itself, so that upgrading to a new Hadoop version does not require modifying the configuration files again;
2. Make it easy to switch frequently between cluster mode and single-node mode.
   Hardware preparation:
   
   There are four machines: one acts as the namenode and the other three as datanodes. The hostnames are assigned as follows:
   
   10.2.224.24 namenode
   
   10.2.224.25 datanode1
   
   10.2.224.26 datanode2
   
   10.2.224.27 datanode3
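   The configuration below refers to these machines by hostname, so every host must be able to resolve the names above. A minimal sketch of the corresponding /etc/hosts entries, assuming name resolution is not already handled by DNS:
   
   [quote]
   
   10.2.224.24 namenode
   10.2.224.25 datanode1
   10.2.224.26 datanode2
   10.2.224.27 datanode3
   
   [/quote]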
   Environment setup steps
   
   1. Create an admin user on every machine and set up passwordless SSH from the namenode to each datanode; this is well documented elsewhere, so only a minimal sketch is given below.
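   Run as admin on the namenode (assuming OpenSSH; if ssh-copy-id is unavailable, append ~/.ssh/id_rsa.pub to each datanode's ~/.ssh/authorized_keys manually):
   
   [quote]
   
   # generate a passphrase-less key pair once on the namenode
   ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa
   
   # install the public key on every datanode
   for host in datanode1 datanode2 datanode3; do
       ssh-copy-id admin@$host
   done
   
   [/quote]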
2. mkdir /home/admin/hadoop-installed
3. Unpack the Hadoop tarball under /home/admin/hadoop-installed and name the extracted directory hadoop (see the sketch below)
4. mkdir /home/admin/hadoop-installed/cluster-conf
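   A minimal sketch of steps 2-4; the tarball name hadoop-0.19.2.tar.gz is only an example, use whatever version you actually downloaded:
   
   [quote]
   
   mkdir /home/admin/hadoop-installed
   cd /home/admin/hadoop-installed
   
   # unpack the distribution and rename the resulting directory to "hadoop"
   tar -xzf ~/hadoop-0.19.2.tar.gz
   mv hadoop-0.19.2 hadoop
   
   mkdir cluster-conf
   
   [/quote]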
   5. Create the following four files in the cluster-conf directory:
   
   masters
   
   [quote]
   
   namenode  # hostname of the namenode machine
   
   [/quote]
   slaves
   
   [quote]
   
   datanode1
   
   datanode2
   
   datanode3
   
   [/quote]
   hadoop-site.xml
   
   [quote]
   
   <?xml version="1.0"?>
   <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
   
   <configuration>
   
     <!-- base directory for HDFS and MapReduce working data -->
     <property>
       <name>hadoop.tmp.dir</name>
       <value>/home/admin/hadoop-installed/filesystem</value>
     </property>
   
     <!-- URI of the namenode -->
     <property>
       <name>fs.default.name</name>
       <value>hdfs://namenode:54310</value>
     </property>
   
     <!-- host:port of the jobtracker -->
     <property>
       <name>mapred.job.tracker</name>
       <value>namenode:54311</value>
     </property>
   
     <!-- each HDFS block is replicated to all three datanodes -->
     <property>
       <name>dfs.replication</name>
       <value>3</value>
     </property>
   
     <!-- heap size for each map/reduce child JVM -->
     <property>
       <name>mapred.child.java.opts</name>
       <value>-Xmx512m</value>
     </property>
   
   </configuration>
   
   [/quote]
   hadoop-env.sh
   
   [quote]
   
   export JAVA_HOME=/usr/ali/java
   
   # enable JMX monitoring for each Hadoop daemon
   export HADOOP_NAMENODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_NAMENODE_OPTS"
   export HADOOP_SECONDARYNAMENODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_SECONDARYNAMENODE_OPTS"
   export HADOOP_DATANODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_DATANODE_OPTS"
   export HADOOP_BALANCER_OPTS="-Dcom.sun.management.jmxremote $HADOOP_BALANCER_OPTS"
   export HADOOP_JOBTRACKER_OPTS="-Dcom.sun.management.jmxremote $HADOOP_JOBTRACKER_OPTS"
   
   export HADOOP_LOG_DIR=/home/admin/hadoop-installed/logs
   
   [/quote]
   6. Add the environment variable HADOOP_CONF_DIR:
   
   [quote]
   
   export HADOOP_CONF_DIR=/home/admin/hadoop-installed/conf
   
   [/quote]
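   To make this variable available in every new login session, it can be appended to the admin user's shell profile (assuming bash; the profile file may differ on your system):
   
   [quote]
   
   echo 'export HADOOP_CONF_DIR=/home/admin/hadoop-installed/conf' >> ~/.bashrc
   source ~/.bashrc
   
   [/quote]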
   7. Add a single-conf directory and create the same four files in it as well:
   
   masters
   
   [quote]
   
   localhost  # hostname of the namenode machine (localhost in single-node mode)
   
   [/quote]
   slaves
   
   [quote]
   
   localhost
   
   [/quote]
   hadoop-site.xml
   
   [quote]
   
   <?xml version="1.0"?>
   <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
   
   <!-- Put site-specific property overrides in this file. -->
   
   <configuration>
   
     <!-- base directory for HDFS and MapReduce working data -->
     <property>
       <name>hadoop.tmp.dir</name>
       <value>/home/admin/hadoop-installed/filesystem</value>
     </property>
   
     <!-- URI of the (local) namenode -->
     <property>
       <name>fs.default.name</name>
       <value>hdfs://localhost:54310</value>
     </property>
   
     <!-- host:port of the (local) jobtracker -->
     <property>
       <name>mapred.job.tracker</name>
       <value>localhost:54311</value>
     </property>
   
     <!-- only one datanode, so keep a single replica -->
     <property>
       <name>dfs.replication</name>
       <value>1</value>
     </property>
   
     <!-- heap size for each map/reduce child JVM -->
     <property>
       <name>mapred.child.java.opts</name>
       <value>-Xmx512m</value>
     </property>
   
   </configuration>
   
   [/quote]
   hadoop-env.sh
   
   [quote]
   
   export JAVA_HOME=/usr/ali/java
   
   # enable JMX monitoring for each Hadoop daemon
   export HADOOP_NAMENODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_NAMENODE_OPTS"
   export HADOOP_SECONDARYNAMENODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_SECONDARYNAMENODE_OPTS"
   export HADOOP_DATANODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_DATANODE_OPTS"
   export HADOOP_BALANCER_OPTS="-Dcom.sun.management.jmxremote $HADOOP_BALANCER_OPTS"
   export HADOOP_JOBTRACKER_OPTS="-Dcom.sun.management.jmxremote $HADOOP_JOBTRACKER_OPTS"
   
   export HADOOP_LOG_DIR=/home/admin/hadoop-installed/logs
   
   [/quote]
   8. In /home/admin/hadoop-installed, point the conf symlink at the configuration set you want to use:
   
   ln -s cluster-conf/ conf    (cluster mode)
   
   or ln -s single-conf/ conf    (single-node debug mode)
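   A minimal sketch of a helper for switching modes; the script name switch-mode.sh is hypothetical, and it simply replaces the conf symlink:
   
   [quote]
   
   #!/bin/sh
   # usage: ./switch-mode.sh cluster|single
   cd /home/admin/hadoop-installed || exit 1
   
   case "$1" in
       cluster) TARGET=cluster-conf ;;
       single)  TARGET=single-conf ;;
       *) echo "usage: $0 cluster|single"; exit 1 ;;
   esac
   
   rm -f conf          # removes only the symlink, not the config directories
   ln -s "$TARGET" conf
   echo "conf now points to $TARGET"
   
   [/quote]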
   9. Add a synchronization script that pushes the configuration and the Hadoop distribution from the namenode to every datanode:
   
   [quote]
   
   DESTSERVER='datanode1 datanode2 datanode3'
   
   for DEST in $DESTSERVER
   do
       # push the active configuration to each datanode
       rsync -v -r -l -H -p -g -t -S -e ssh --exclude ".svn" --delete /home/admin/hadoop-installed/conf/ admin@$DEST:/home/admin/hadoop-installed/conf/
   
       # push the hadoop distribution itself
       rsync -v -r -l -H -p -g -t -S -e ssh --delete /home/admin/hadoop-installed/hadoop/ admin@$DEST:/home/admin/hadoop-installed/hadoop/
   done
   
   exit 0
   
   [/quote]
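   The script (saved for example as sync-conf.sh, a hypothetical name, and made executable with chmod +x) would typically be run on the namenode after switching modes in step 8 or after editing any configuration file, so that all datanodes see the same conf/ and hadoop/ contents.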
  
 
