This article covers only the minimal deployment procedure for open-source Hadoop 2.7.5. Hadoop HA and Kerberos setup are out of scope.

Installation environment:
CentOS 7.4

Java 1.8.0_144

Hadoop 2.7.5

Servers (NN = NameNode, RM = ResourceManager, DN = DataNode, NM = NodeManager)

hadoop01 NN RM DN NM

hadoop02 DN NM

hadoop03 DN NM

1. Basic environment setup
a) Install the JDK (not covered here, but avoid Oracle JDK 1.6.0_16/18/19)

b) Stop the firewall and keep it from starting at boot

systemctl stop firewalld.service
systemctl disable firewalld.service
c) Disable SELinux

setenforce 0

vi /etc/selinux/config
Set:
SELINUX=disabled
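
The vi edit above can also be scripted. A minimal sketch, assuming GNU sed (as shipped with CentOS); the function name set_selinux_disabled and its optional file argument are illustrative and not part of any SELinux tool:

```shell
# Hypothetical helper: force SELINUX=disabled in a SELinux config file.
# Defaults to /etc/selinux/config; pass another path to try it on a copy first.
set_selinux_disabled() {
    local cfg="${1:-/etc/selinux/config}"
    # Only the SELINUX= line is rewritten; SELINUXTYPE= does not match this pattern.
    sed -i 's/^SELINUX=.*/SELINUX=disabled/' "$cfg"
}
```

Run as root, a plain `set_selinux_disabled` edits the real file. The setting only takes full effect after a reboot, which is why setenforce 0 is still used for the running system.
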
d) Configure /etc/hosts on every node

172.27.129.170 hadoop01
172.27.129.171 hadoop02
172.27.129.172 hadoop03
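
Since every node needs the same three entries, a quick check for forgotten names can help. The following is a sketch only; missing_hosts is a hypothetical helper name, and it looks at the hosts file alone (it does not consult DNS):

```shell
# Hypothetical helper: print each given hostname that has no entry in a hosts file.
# Usage: missing_hosts <hosts-file> <hostname>...
missing_hosts() {
    local hosts_file="$1"; shift
    local h
    for h in "$@"; do
        # Skip comment lines, then compare the name against every field after the IP.
        awk -v name="$h" '
            !/^[[:space:]]*#/ { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
            END { exit !found }
        ' "$hosts_file" || echo "$h"
    done
}
```

For example, `missing_hosts /etc/hosts hadoop01 hadoop02 hadoop03` prints nothing when all three names are present.
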
e) Set up passwordless (key-based) SSH login
ssh-keygen
ssh-copy-id -i pro@hadoop01
ssh-copy-id -i pro@hadoop02
ssh-copy-id -i pro@hadoop03
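
Before moving on, it may be worth confirming that key-based login really works for every host, since the start scripts later rely on it. A rough sketch; check_passwordless and the SSH_BIN override are illustrative names introduced here, not standard tooling:

```shell
# Hypothetical helper: print every host that does NOT accept key-based login.
# Usage: check_passwordless <user> <host>...
# SSH_BIN is an illustrative override so the loop can be exercised without a cluster.
check_passwordless() {
    local user="$1"; shift
    local host
    for host in "$@"; do
        # BatchMode=yes makes ssh fail instead of prompting for a password.
        "${SSH_BIN:-ssh}" -o BatchMode=yes -o ConnectTimeout=5 \
            "${user}@${host}" true </dev/null || echo "$host"
    done
}
```

Running `check_passwordless pro hadoop01 hadoop02 hadoop03` should print nothing. Any hostname it prints still needs ssh-copy-id.
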

2. Hadoop deployment
1) Download the release package: http://www.apache.org/dyn/closer.cgi/hadoop/common/

wget http://mirror.bit.edu.cn/apache/hadoop/common/hadoop-2.7.5/hadoop-2.7.5.tar.gz
2) Extract the package
tar zxvf hadoop-2.7.5.tar.gz -C /home/pro/
mv /home/pro/hadoop-2.7.5 /home/pro/hadoop
3) Set environment variables (for example in ~/.bashrc of user pro, then re-login or source the file)

export JAVA_HOME="/home/pro/jdk"
export HADOOP_HOME="/home/pro/hadoop"
export PATH=/home/pro/jdk/bin:/home/pro/hadoop/bin:/home/pro/hadoop/sbin:$PATH
4) Edit the Hadoop configuration files under /home/pro/hadoop/etc/hadoop/
core-site.xml

<configuration>
        <property>
                <name>fs.defaultFS</name>
                <value>hdfs://hadoop01:8020</value>
        </property>
</configuration>
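
To double-check a value after editing, the usual route once Hadoop is on the PATH is `hdfs getconf -confKey fs.defaultFS`. Where that is not yet available, a small text-based lookup can serve. This is a sketch under the assumption that each <value> sits on the line directly after its <name>, as in the files shown in this article; hadoop_conf_value is a hypothetical name:

```shell
# Hypothetical helper: print the value of one property from a Hadoop XML config file.
# Assumes the <value> line immediately follows the <name> line.
# Usage: hadoop_conf_value <file> <property-name>
hadoop_conf_value() {
    local file="$1" name="$2"
    # -F treats the dots in the property name literally; -A1 also prints the next line.
    grep -F -A1 "<name>${name}</name>" "$file" \
        | sed -n 's:.*<value>\(.*\)</value>.*:\1:p' \
        | head -n 1
}
```

For example, `hadoop_conf_value /home/pro/hadoop/etc/hadoop/core-site.xml fs.defaultFS` should print the NameNode URI configured above.
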

hdfs-site.xml

<configuration>
        <property>
                <name>dfs.replication</name>
                <value>3</value>
        </property>
        <property>  
                <name>dfs.namenode.name.dir</name>  
                <value>file:///home/pro/bigdata/nn</value>
        </property>
        <property>  
                <name>dfs.datanode.data.dir</name>  
                <value>file:///home/pro/bigdata/dn</value>  
        </property>  
        <property>
                <name>dfs.hosts.exclude</name>
                <value>/home/pro/hadoop/etc/hadoop/hdfs.excludes</value>
        </property>
</configuration>
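
The storage directories and the exclude file named in hdfs-site.xml can be created up front so that their ownership is explicit and the exclude file exists when the NameNode reads it. A minimal sketch; prepare_hdfs_layout is a hypothetical helper, and the two arguments are assumed to match the paths configured above:

```shell
# Hypothetical helper: create the NameNode/DataNode directories and an empty exclude file.
# Usage: prepare_hdfs_layout <data-root> <conf-dir>
prepare_hdfs_layout() {
    local data_root="$1" conf_dir="$2"
    # nn/ and dn/ correspond to dfs.namenode.name.dir and dfs.datanode.data.dir.
    mkdir -p "$data_root/nn" "$data_root/dn" "$conf_dir"
    # touch leaves an existing exclude list untouched and creates an empty one otherwise.
    touch "$conf_dir/hdfs.excludes"
}
```

With the paths from this article that would be `prepare_hdfs_layout /home/pro/bigdata /home/pro/hadoop/etc/hadoop`, run as user pro on each node.
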

yarn-site.xml

<configuration>
        <property>
                <name>yarn.nodemanager.local-dirs</name>
                <value>file:///home/pro/bigdata/nm-local</value>
        </property>
        <property>
                <name>yarn.log.server.url</name>
                <value>http://hadoop01:19888/jobhistory/logs</value>
        </property>
        <property>
                <name>yarn.timeline-service.hostname</name>
                <value>hadoop01</value>
        </property>
        <property>
                <name>yarn.timeline-service.enabled</name>
                <value>true</value>
        </property>
        <property>
                <name>yarn.resourcemanager.system-metrics-publisher.enabled</name>
                <value>true</value>
        </property>
        <property>
                <name>yarn.timeline-service.generic-application-history.enabled</name>
                <value>true</value>
        </property>
        <property>
                <name>yarn.resourcemanager.hostname</name>
                <value>hadoop01</value>
        </property>
        <property>
                <name>yarn.nodemanager.aux-services</name>
                <value>mapreduce_shuffle</value>
        </property>
        <property>
                <name>yarn.timeline-service.leveldb-timeline-store.path</name>
                <value>/home/pro/hadoop/tmp/yarn/timeline</value>
        </property>
        <property>
                <name>yarn.resourcemanager.nodes.exclude-path</name>
                <value>/home/pro/hadoop/etc/hadoop/yarn.excludes</value>
        </property>
        <property>
                <name>yarn.scheduler.maximum-allocation-mb</name>
                <value>10000</value>
        </property>
        <property>
                <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
                <value>org.apache.hadoop.mapred.ShuffleHandler</value>
        </property>
        <property>
                <name>yarn.nodemanager.vmem-pmem-ratio</name>
                <value>2</value>
        </property>
        <property>
                <name>yarn.nodemanager.vmem-check-enabled</name>
                <value>false</value>
        </property>
        <property>
                <name>yarn.log-aggregation-enable</name>
                <value>true</value>
        </property>
        <property>
                <name>yarn.nodemanager.remote-app-log-dir</name>
                <value>/var/log/hadoop-yarn/apps</value>
        </property>
        <property>
                <name>yarn.nodemanager.resource.memory-mb</name>
                <value>10000</value>
        </property>
        <property>
                <name>yarn.nodemanager.resource.cpu-vcores</name>
                <value>6</value>
        </property>
</configuration>
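
The XML files above do not say which hosts run the DataNode and NodeManager daemons. In Hadoop 2.x the sbin start scripts read that list from etc/hadoop/slaves, which contains only localhost by default. A minimal sketch for filling it in; write_slaves is a hypothetical helper name:

```shell
# Hypothetical helper: write the worker host list used by the sbin start scripts.
# Usage: write_slaves <conf-dir> <host>...
write_slaves() {
    local conf_dir="$1"; shift
    # One hostname per line, overwriting the default single "localhost" entry.
    printf '%s\n' "$@" > "$conf_dir/slaves"
}
```

For the three-node layout in this article: `write_slaves /home/pro/hadoop/etc/hadoop hadoop01 hadoop02 hadoop03`. Doing this before the installation directory is copied to the other nodes keeps all copies identical.
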

5) Distribute the installation to the other nodes

scp -r /home/pro/hadoop pro@hadoop02:/home/pro/
scp -r /home/pro/hadoop pro@hadoop03:/home/pro/
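6) Start the services

Before the very first start, format the NameNode on hadoop01 (run this once only, because it wipes existing HDFS metadata):

hdfs namenode -format

sh /home/pro/hadoop/sbin/start-all.sh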
7) Verify the services

jps
HDFS: http://hadoop01:50070
YARN: http://hadoop01:8088
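
Reading jps output by eye works for three nodes, but the check is easy to script. A sketch, assuming the usual "<pid> <MainClass>" format of jps output; missing_daemons is a hypothetical helper name, and the list of expected daemons per host is up to the reader:

```shell
# Hypothetical helper: print each expected daemon that is absent from jps output.
# Usage: missing_daemons "<jps output>" <DaemonName>...
missing_daemons() {
    local out="$1"; shift
    local d
    for d in "$@"; do
        # jps prints "<pid> <MainClass>"; compare the class name exactly.
        printf '%s\n' "$out" \
            | awk -v d="$d" '$2 == d { found = 1 } END { exit !found }' \
            || echo "$d"
    done
}
```

On hadoop01 one might run `missing_daemons "$(jps)" NameNode DataNode ResourceManager NodeManager`. Empty output means every listed daemon is running.
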

PS: Hadoop 2.7.5 is now installed. Interested readers can go on to set up the HA and Kerberos scenarios on their own.
