【hadoop】Quick Hadoop environment setup

A while ago I set up a distributed Hadoop environment. Two features are worth noting:

1. The customized configuration files are kept apart from the Hadoop distribution itself, so upgrading Hadoop does not require editing the configuration files again;

2. The setup can be switched easily between cluster mode and single-node mode.

Hardware:

There are four machines, one as namenode and three as datanodes; hostnames map to IPs as follows:

10.2.224.24 namenode

10.2.224.25 datanode1

10.2.224.26 datanode2

10.2.224.27 datanode3

Setup steps

1. Create an admin user on every machine and set up passwordless SSH from the namenode to each datanode. This is well documented elsewhere, so only a minimal sketch is given below.
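A minimal sketch of the passwordless SSH setup, assuming OpenSSH and that the admin user already exists on each machine:

[quote]

# On the namenode, as admin: generate a key pair without a passphrase
ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa

# Append the public key to each datanode's authorized_keys
for DEST in datanode1 datanode2 datanode3; do
  ssh-copy-id admin@$DEST
done

# Verify: should log in without a password prompt
ssh admin@datanode1 hostname

[/quote]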

2. mkdir /home/admin/hadoop-installed

3. Unpack the Hadoop tarball under /home/admin/hadoop-installed into a directory named hadoop (sketch below).
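A sketch of step 3; the archive name hadoop-x.y.z.tar.gz is a placeholder for whatever version was downloaded:

[quote]

cd /home/admin/hadoop-installed
tar -xzf hadoop-x.y.z.tar.gz    # placeholder archive name
mv hadoop-x.y.z hadoop          # version-independent directory name

[/quote]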

4. mkdir /home/admin/hadoop-installed/cluster-conf

5. In the cluster-conf directory, create the following four files:

masters

[quote]

namenode # hostname of the namenode machine

[/quote]

slaves

[quote]

datanode1

datanode2

datanode3

[/quote]

hadoop-site.xml

[quote]

<?xml version="1.0"?>

<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>

<property>

<name>hadoop.tmp.dir</name>

<value>/home/admin/hadoop-installed/filesystem</value>

</property>

<property>

<name>fs.default.name</name>

<value>hdfs://namenode:54310</value>

</property>

<property>

<name>mapred.job.tracker</name>

<value>namenode:54311</value>

</property>

<property>

<name>dfs.replication</name>

<value>3</value>

</property>

<property>

<name>mapred.child.java.opts</name>

<value>-Xmx512m</value>

</property>

</configuration>

[/quote]

hadoop-env.sh

[quote]

export JAVA_HOME=/usr/ali/java

export HADOOP_NAMENODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_NAMENODE_OPTS"

export HADOOP_SECONDARYNAMENODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_SECONDARYNAMENODE_OPTS"

export HADOOP_DATANODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_DATANODE_OPTS"

export HADOOP_BALANCER_OPTS="-Dcom.sun.management.jmxremote $HADOOP_BALANCER_OPTS"

export HADOOP_JOBTRACKER_OPTS="-Dcom.sun.management.jmxremote $HADOOP_JOBTRACKER_OPTS"

export HADOOP_LOG_DIR=/home/admin/hadoop-installed/logs

[/quote]
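With com.sun.management.jmxremote set, each daemon exposes a local JMX agent that the JDK tools can inspect (a sketch; remote JMX access would additionally require jmxremote.port and authentication options):

[quote]

jps              # list JVM pids, e.g. the NameNode process
jconsole <pid>   # attach jconsole to that daemon locally

[/quote]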

6. Add the HADOOP_CONF_DIR environment variable:

[quote]

export HADOOP_CONF_DIR=/home/admin/hadoop-installed/conf

[/quote]
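To make the variable survive new login shells, it can be appended to the admin user's ~/.bashrc (a sketch, assuming bash):

[quote]

echo 'export HADOOP_CONF_DIR=/home/admin/hadoop-installed/conf' >> ~/.bashrc
source ~/.bashrc    # reload so the current shell picks it up

[/quote]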

7. Create a single-conf directory and add the same four files there:

masters

[quote]

localhost # in single-node mode the namenode is the local machine

[/quote]

slaves

[quote]

localhost

[/quote]

hadoop-site.xml

[quote]

<?xml version="1.0"?>

<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>

<property>

<name>hadoop.tmp.dir</name>

<value>/home/admin/hadoop-installed/filesystem</value>

</property>

<property>

<name>fs.default.name</name>

<value>hdfs://localhost:54310</value>

</property>

<property>

<name>mapred.job.tracker</name>

<value>localhost:54311</value>

</property>

<property>

<name>dfs.replication</name>

<value>1</value>

</property>

<property>

<name>mapred.child.java.opts</name>

<value>-Xmx512m</value>

</property>

</configuration>

[/quote]

hadoop-env.sh

[quote]

export JAVA_HOME=/usr/ali/java

export HADOOP_NAMENODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_NAMENODE_OPTS"

export HADOOP_SECONDARYNAMENODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_SECONDARYNAMENODE_OPTS"

export HADOOP_DATANODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_DATANODE_OPTS"

export HADOOP_BALANCER_OPTS="-Dcom.sun.management.jmxremote $HADOOP_BALANCER_OPTS"

export HADOOP_JOBTRACKER_OPTS="-Dcom.sun.management.jmxremote $HADOOP_JOBTRACKER_OPTS"

export HADOOP_LOG_DIR=/home/admin/hadoop-installed/logs

[/quote]

8. ln -s cluster-conf/ conf for cluster mode,

or ln -s single-conf/ conf for single-node debug mode (a switching helper is sketched below).
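Because mode switching is frequent (feature 2 above), a small helper script avoids stale symlinks; switch-mode.sh is a hypothetical name, and this is only a sketch:

[quote]

#!/bin/sh
# switch-mode.sh <cluster|single> -- repoint the conf symlink (hypothetical helper)
cd /home/admin/hadoop-installed || exit 1
case "$1" in
  cluster) TARGET=cluster-conf ;;
  single)  TARGET=single-conf ;;
  *) echo "usage: $0 cluster|single"; exit 1 ;;
esac
rm -f conf          # removes only the symlink, not the directory behind it
ln -s $TARGET conf
echo "conf -> $TARGET"

[/quote]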

9. Add a script that syncs the configuration and the Hadoop distribution to every datanode:

[quote]

DESTSERVER='datanode1 datanode2 datanode3'

for DEST in $DESTSERVER
do
  # push the active conf directory to the datanode
  rsync -v -r -l -H -p -g -t -S -e ssh --exclude ".svn" --delete /home/admin/hadoop-installed/conf/ admin@$DEST:/home/admin/hadoop-installed/conf/
  # push the hadoop distribution itself
  rsync -v -r -l -H -p -g -t -S -e ssh --delete /home/admin/hadoop-installed/hadoop/ admin@$DEST:/home/admin/hadoop-installed/hadoop/
done

exit 0

[/quote]
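After switching to cluster mode and running the sync script, the cluster can be brought up from the namenode; a sketch, assuming the standard control scripts shipped with Hadoop distributions of this era:

[quote]

cd /home/admin/hadoop-installed/hadoop
bin/hadoop namenode -format   # first run only: initializes HDFS under hadoop.tmp.dir
bin/start-all.sh              # starts namenode, jobtracker, datanodes, tasktrackers

[/quote]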