Installation outline:
- JDK
- Passwordless SSH login
- NTPDATE time synchronization
- Network configuration
- CDH5 installation
- ZOOKEEPER installation
- HIVE installation
| Host IP | Hostname | Role |
| --- | --- | --- |
| 172.21.25.100 | namenode.yxnrtf.openpf | NameNode |
| 172.21.25.104 | datanode01.yxnrtf.openpf | DataNode |
| 172.21.25.105 | datanode02.yxnrtf.openpf | DataNode |
1. JDK installation
mkdir -p /usr/local/java
tar -zxvf jdk-7u80-linux-x64.gz -C /usr/local/java
Configure the environment variables:
# java
export JAVA_HOME=/usr/local/java/jdk1.7.0_80
export JRE_HOME=$JAVA_HOME/jre
export CLASSPATH=.:$JAVA_HOME/lib:$JRE_HOME/lib
export PATH=$JAVA_HOME/bin:$JRE_HOME/bin:$PATH
Verify: java -version
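Since every node needs the same JDK, a small helper loop can save some typing. This is only a sketch, assuming the tarball sits in the current directory on the NameNode and that root SSH access to the DataNodes is available (it will prompt for passwords until the keys in the next step are in place):
# Hypothetical helper: push and unpack the JDK on each DataNode (run from the NameNode)
for host in 172.21.25.104 172.21.25.105; do
  scp jdk-7u80-linux-x64.gz root@$host:/tmp/
  ssh root@$host "mkdir -p /usr/local/java && tar -zxvf /tmp/jdk-7u80-linux-x64.gz -C /usr/local/java"
done
Remember to add the same /etc/profile exports on each node afterwards.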
2. Passwordless SSH login
Run the following command on every node; it generates a .ssh directory under /root/:
ssh-keygen -t rsa -P ''
On the NameNode node, append id_rsa.pub to authorized_keys:
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
Append each DataNode node's id_rsa.pub to the master node's authorized_keys in turn:
# On each DataNode, copy its public key to the NameNode:
scp ~/.ssh/id_rsa.pub root@172.21.25.100:~/
# Then, on the NameNode, append the copied key to authorized_keys (repeat for each DataNode's key):
cat ~/id_rsa.pub >> ~/.ssh/authorized_keys
Copy the NameNode's authorized_keys to the DataNode nodes:
scp ~/.ssh/authorized_keys root@172.21.25.104:~/.ssh/
scp ~/.ssh/authorized_keys root@172.21.25.105:~/.ssh/
chmod 600 ~/.ssh/authorized_keys # fix the file permissions on each DataNode node
Edit /etc/ssh/sshd_config on all nodes:
RSAAuthentication yes # enable RSA authentication
PubkeyAuthentication yes # enable public-key authentication
AuthorizedKeysFile .ssh/authorized_keys # path to the public key file (the file generated above)
Then restart the SSH service:
service sshd restart
Verify with ssh localhost and ssh <DataNode IP>; the nodes should now be able to log in to each other without a password.
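A quick way to confirm the key setup on all three machines is a loop like the one below (a sketch; BatchMode makes ssh fail instead of falling back to a password prompt):
for host in 172.21.25.100 172.21.25.104 172.21.25.105; do
  ssh -o BatchMode=yes root@$host hostname   # should print the remote hostname with no password prompt
done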
3. NTPDate time synchronization
Time synchronization is handled by a crontab entry that calls ntpdate, rather than by the NTP daemon, because ntpd will not synchronize once the clock offset exceeds a certain threshold. Synchronize every 5 minutes:
crontab -e # edit the crond table and add:
*/5 * * * * /usr/sbin/ntpdate ntp.oss.XX && hwclock --systohc # ntp.oss.XX is your NTP server
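Before relying on the cron job, it is worth running the same command once by hand on each node to confirm the NTP server is reachable (ntp.oss.XX is the placeholder used above for your own server):
/usr/sbin/ntpdate ntp.oss.XX && hwclock --systohc
crontab -l | grep ntpdate   # confirm the cron entry was saved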
4. Network configuration
Turn off the firewall on all machines and add the hosts to /etc/hosts.
service iptables status
service iptables stop
Add to /etc/hosts:
172.21.25.100 namenode.yxnrtf.openpf
172.21.25.104 datanode01.yxnrtf.openpf
172.21.25.105 datanode02.yxnrtf.openpf
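A short loop can verify that every hostname resolves on every node after the /etc/hosts edit (a sketch using the hostnames defined above):
for h in namenode.yxnrtf.openpf datanode01.yxnrtf.openpf datanode02.yxnrtf.openpf; do
  ping -c 1 $h
done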
5. CDH5 installation
Run on the NameNode node:
wget http://archive.cloudera.com/cdh5/one-click-install/redhat/6/x86_64/cloudera-cdh-5-0.x86_64.rpm
Disable the GPG signature check and install the local package:
yum --nogpgcheck localinstall cloudera-cdh-5-0.x86_64.rpm
Import the Cloudera repository GPG key:
rpm --import http://archive.cloudera.com/cdh5/redhat/6/x86_64/cdh/RPM-GPG-KEY-cloudera
On the NameNode node, install the namenode, resourcemanager, nodemanager, datanode, mapreduce, historyserver, proxyserver and hadoop-client packages:
yum install hadoop hadoop-hdfs hadoop-client hadoop-doc hadoop-debuginfo hadoop-hdfs-namenode hadoop-yarn-resourcemanager hadoop-yarn-nodemanager hadoop-hdfs-datanode hadoop-mapreduce hadoop-mapreduce-historyserver hadoop-yarn-proxyserver -y
Install on the DataNode nodes:
yum install hadoop hadoop-hdfs hadoop-client hadoop-doc hadoop-debuginfo hadoop-yarn hadoop-hdfs-datanode hadoop-yarn-nodemanager hadoop-mapreduce -y
Install the SecondaryNameNode. Here it is installed on the NameNode node; if it is installed on a different server, some of the configuration below must be adjusted accordingly.
yum install hadoop-hdfs-secondarynamenode -y
Add the following configuration to /etc/hadoop/conf/hdfs-site.xml:
<property><name>dfs.namenode.checkpoint.check.period</name><value>60</value></property>
<property><name>dfs.namenode.checkpoint.txns</name><value>1000000</value></property>
<property><name>dfs.namenode.checkpoint.dir</name><value>file:///data/cache1/dfs/namesecondary</value></property>
<property><name>dfs.namenode.checkpoint.edits.dir</name><value>file:///data/cache1/dfs/namesecondary</value></property>
<property><name>dfs.namenode.num.checkpoints.retained</name><value>2</value></property>
<property><name>dfs.secondary.http.address</name><value>namenode.yxnrtf.openpf:50090</value></property>
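The checkpoint directory referenced above presumably has to exist and be owned by the hdfs user before the SecondaryNameNode starts; mirroring the nn/dn directory setup below, something like this should do (an assumption, not part of the original steps):
mkdir -p /data/cache1/dfs/namesecondary
chown -R hdfs:hadoop /data/cache1/dfs/namesecondary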
Create directories on the NameNode node:
mkdir -p /data/cache1/dfs/nn
chown -R hdfs:hadoop /data/cache1/dfs/nn
chmod 700 -R /data/cache1/dfs/nn
Create directories on the DataNode nodes:
mkdir -p /data/cache1/dfs/dn
mkdir -p /data/cache1/dfs/mapred/local
chown -R hdfs:hadoop /data/cache1/dfs/dn
chmod 777 -R /data/
usermod -a -G mapred hadoop
chown -R mapred:hadoop /data/cache1/dfs/mapred/local
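A quick sanity check on each node that the directories ended up with the expected owners (just a listing, nothing more; paths that do not exist on a given node are silently skipped):
ls -ld /data/cache1/dfs/nn /data/cache1/dfs/dn /data/cache1/dfs/mapred/local 2>/dev/null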
On every node, add the following to /etc/profile:
export HADOOP_HOME=/usr/lib/hadoop
export HIVE_HOME=/usr/lib/hive
export HBASE_HOME=/usr/lib/hbase
export HADOOP_HDFS_HOME=/usr/lib/hadoop-hdfs
export HADOOP_MAPRED_HOME=/usr/lib/hadoop-mapreduce
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_LIBEXEC_DIR=$HADOOP_HOME/libexec
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export HDFS_CONF_DIR=$HADOOP_HOME/etc/hadoop
export HADOOP_YARN_HOME=/usr/lib/hadoop-yarn
export YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$HIVE_HOME/bin:$HBASE_HOME/bin:$PATH
Run source /etc/profile to make the settings take effect.
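After sourcing the profile, two quick checks confirm the Hadoop client and configuration directory are picked up:
hadoop version            # should print the CDH5 Hadoop version
echo $HADOOP_CONF_DIR     # should point at the Hadoop configuration directory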
On the NameNode node, add the following configuration to /etc/hadoop/conf/core-site.xml:
<property><name>fs.defaultFS</name><value>hdfs://namenode.yxnrtf.openpf:9000</value></property>
<property><name>dfs.replication</name><value>1</value></property>
<property><name>hadoop.proxyuser.hadoop.hosts</name><value>namenode.yxnrtf.openpf</value></property>
<property><name>hadoop.proxyuser.hadoop.groups</name><value>hdfs</value></property>
<property><name>hadoop.proxyuser.mapred.groups</name><value>*</value></property>
<property><name>hadoop.proxyuser.mapred.hosts</name><value>*</value></property>
<property><name>hadoop.proxyuser.yarn.groups</name><value>*</value></property>
<property><name>hadoop.proxyuser.yarn.hosts</name><value>*</value></property>
<property><name>hadoop.proxyuser.httpfs.hosts</name><value>httpfs-host.foo.com</value></property>
<property><name>hadoop.proxyuser.httpfs.groups</name><value>*</value></property>
<property><name>hadoop.proxyuser.hive.hosts</name><value>*</value></property>
<property><name>hadoop.proxyuser.hive.groups</name><value>*</value></property>
Add the following configuration to /etc/hadoop/conf/hdfs-site.xml:
<property><name>dfs.namenode.name.dir</name><value>/data/cache1/dfs/nn/</value></property>
<property><name>dfs.datanode.data.dir</name><value>/data/cache1/dfs/dn/</value></property>
<property><name>dfs.hosts</name><value>/etc/hadoop/conf/slaves</value></property>
<property><name>dfs.permissions</name><value>false</value></property>
<property><name>dfs.permissions.superusergroup</name><value>hdfs</value></property>
Add the following configuration to /etc/hadoop/conf/mapred-site.xml:
<property><name>mapreduce.jobhistory.address</name><value>namenode.yxnrtf.openpf:10020</value></property>
<property><name>mapreduce.jobhistory.webapp.address</name><value>namenode.yxnrtf.openpf:19888</value></property>
<property><name>mapreduce.jobhistory.joblist.cache.size</name><value>50000</value></property>
<property><name>mapreduce.jobhistory.done-dir</name><value>/user/hadoop/done</value></property>
<property><name>mapreduce.jobhistory.intermediate-done-dir</name><value>/user/hadoop/tmp</value></property>
<property><name>mapreduce.framework.name</name><value>yarn</value></property>
Add the following configuration to /etc/hadoop/conf/yarn-site.xml:
<property><name>yarn.nodemanager.aux-services</name><value>mapreduce_shuffle</value></property>
<property><name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name><value>org.apache.hadoop.mapred.ShuffleHandler</value></property>
<property><name>yarn.log-aggregation-enable</name><value>true</value></property>
<!-- List of directories to store localized files in. -->
<property><name>yarn.nodemanager.local-dirs</name><value>/var/lib/hadoop-yarn/cache/${user.name}/nm-local-dir</value></property>
<!-- Where to store container logs. -->
<property><name>yarn.nodemanager.log-dirs</name><value>/var/log/hadoop-yarn/containers</value></property>
<!-- Where to aggregate logs to. -->
<property><name>yarn.nodemanager.remote-app-log-dir</name><value>hdfs://namenode.yxnrtf.openpf:9000/var/log/hadoop-yarn/apps</value></property>
<property><name>yarn.resourcemanager.address</name><value>namenode.yxnrtf.openpf:8032</value></property>
<property><name>yarn.resourcemanager.scheduler.address</name><value>namenode.yxnrtf.openpf:8030</value></property>
<property><name>yarn.resourcemanager.webapp.address</name><value>namenode.yxnrtf.openpf:8088</value></property>
<property><name>yarn.resourcemanager.resource-tracker.address</name><value>namenode.yxnrtf.openpf:8031</value></property>
<property><name>yarn.resourcemanager.admin.address</name><value>namenode.yxnrtf.openpf:8033</value></property>
<!-- Classpath for typical applications. -->
<property>
  <name>yarn.application.classpath</name>
  <value>$HADOOP_CONF_DIR,$HADOOP_COMMON_HOME/*,$HADOOP_COMMON_HOME/lib/*,$HADOOP_HDFS_HOME/*,$HADOOP_HDFS_HOME/lib/*,$HADOOP_MAPRED_HOME/*,$HADOOP_MAPRED_HOME/lib/*,$HADOOP_YARN_HOME/*,$HADOOP_YARN_HOME/lib/*</value>
</property>
<property><name>yarn.web-proxy.address</name><value>namenode.yxnrtf.openpf:54315</value></property>
Edit /etc/hadoop/conf/slaves:
datanode01.yxnrtf.openpf
datanode02.yxnrtf.openpf
Add the following below the comment block in /etc/hadoop/conf/yarn-env.sh:
export JAVA_HOME=/usr/local/java/jdk1.7.0_80
Copy the /etc/hadoop/conf directory to the DataNode nodes:
scp -r conf/ root@172.21.25.104:/etc/hadoop/
scp -r conf/ root@172.21.25.105:/etc/hadoop/
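To make sure the copies really match, the configuration files can be checksummed on each DataNode and compared with the NameNode's (a small sketch, assuming root SSH access):
md5sum /etc/hadoop/conf/*-site.xml
for host in 172.21.25.104 172.21.25.105; do
  ssh root@$host "md5sum /etc/hadoop/conf/*-site.xml"
done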
Start on the NameNode node:
hdfs namenode -format
service hadoop-hdfs-namenode init
service hadoop-hdfs-namenode start
service hadoop-yarn-resourcemanager start
service hadoop-yarn-proxyserver start
service hadoop-mapreduce-historyserver start
Start on the DataNode nodes:
service hadoop-hdfs-datanode start
service hadoop-yarn-nodemanager start
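Once both sides are up, the daemons and the DataNode registration can be checked from the command line (jps ships with the JDK; dfsadmin needs HDFS to be running):
jps                      # NameNode/ResourceManager/JobHistoryServer on the master, DataNode/NodeManager on the slaves
hdfs dfsadmin -report    # both DataNodes should be listed as live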
View in a browser:
| Component | URL | Notes |
| --- | --- | --- |
| HDFS (NameNode web UI) | http://namenode.yxnrtf.openpf:50070/ | |
| ResourceManager (YARN) | http://namenode.yxnrtf.openpf:8088/ | |
| Online nodes | http://namenode.yxnrtf.openpf:8088/cluster/nodes | nodes currently registered |
| NodeManager | http://datanode01.yxnrtf.openpf:8042/ | one per DataNode |
| JobHistory | http://namenode.yxnrtf.openpf:19888/ | |
6. ZOOKEEPER installation
Run the following command on every node:
yum install zookeeper* -y
On the NameNode node, edit the configuration file /etc/zookeeper/conf/zoo.cfg (the same file should also be placed on both DataNode nodes, since every ensemble member needs the server.N entries):
# clean logs
autopurge.snapRetainCount=3
autopurge.purgeInterval=1
server.1=namenode.yxnrtf.openpf:2888:3888
server.2=datanode01.yxnrtf.openpf:2888:3888
server.3=datanode02.yxnrtf.openpf:2888:3888
Start on the NameNode:
service zookeeper-server init --myid=1
service zookeeper-server start
Start on DataNode node 1:
service zookeeper-server init --myid=2
service zookeeper-server start
Start on DataNode node 2:
service zookeeper-server init --myid=3
service zookeeper-server start
Note that the --myid value must match the corresponding server.N entry in the configuration file.
To verify that everything is up, run the following command on the NameNode node:
zookeeper-client -server namenode.yxnrtf.openpf:2181
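As an extra check of the ensemble (assuming nc is installed), ZooKeeper's four-letter commands report each server's mode; one node should answer leader and the other two follower:
for h in namenode.yxnrtf.openpf datanode01.yxnrtf.openpf datanode02.yxnrtf.openpf; do
  echo "== $h =="
  echo stat | nc $h 2181 | grep Mode
done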