Environment: Alibaba Cloud server, CentOS 7 x86_64

Installation media: jdk-7u75-linux-i586.tar.gz, hadoop-2.4.1.tar.gz

Install the JDK


    tar -zxvf jdk-7u75-linux-i586.tar.gz

Configure the environment variables:


    # vi .bash_profile

    JAVA_HOME=/root/training/jdk1.7.0_75
    export JAVA_HOME

    PATH=$JAVA_HOME/bin:$PATH
    export PATH

    # source .bash_profile
    # which java
    # java -version

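Troubleshooting: a 64-bit operating system cannot run a 32-bit application out of the box. The JDK used here is a 32-bit (i586) build, so the 32-bit glibc library has to be installed. Without it, running java fails with: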


    -bash: /root/training/jdk1.7.0_75/bin/java: /lib/ld-linux.so.2: bad ELF interpreter: No such file or directory


    # yum install 'glibc*.i686'
    # locate /lib/ld-linux.so.2
    # rpm -qf /lib/ld-linux.so.2
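To confirm that this really is the 32-bit-on-64-bit problem before installing anything, one option is to compare the machine architecture with what `file` reports for the binary. The helper below is only a sketch: the name `needs_i686_glibc` is made up for this post, and it simply looks for the "ELF 32-bit" text that `file` prints for 32-bit executables.

```shell
# needs_i686_glibc ARCH DESCRIPTION
#   ARCH        = machine architecture, as printed by `uname -m`
#   DESCRIPTION = description of the binary, as printed by `file -b`
# Succeeds (exit 0) when a 32-bit ELF binary is about to run on x86_64.
needs_i686_glibc() {
    case "$1" in
        x86_64) ;;            # only a 64-bit host can have this mismatch
        *) return 1 ;;
    esac
    case "$2" in
        *"ELF 32-bit"*) return 0 ;;
        *) return 1 ;;
    esac
}

# On the server it could be called like this:
#   needs_i686_glibc "$(uname -m)" "$(file -b "$JAVA_HOME/bin/java")" \
#       && echo "32-bit JDK on a 64-bit OS: install the i686 glibc packages"
```

Keeping the check as a pure function of two strings means it can be tried out without the JDK being present.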

Install Hadoop


    tar -zxvf hadoop-2.4.1.tar.gz

Configure the environment variables:


    # vi .bash_profile

    HADOOP_HOME=/root/training/hadoop-2.4.1
    export HADOOP_HOME

    PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
    export PATH

    # source .bash_profile
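If you would rather script these profile edits than open vi, a small idempotent helper keeps repeated runs from stacking duplicate lines. This is a sketch under assumptions: `append_once` is a hypothetical helper name, and the demo writes to a temporary file instead of the real .bash_profile.

```shell
# append_once FILE LINE: add LINE to FILE unless an identical line is already there.
append_once() {
    grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

profile=$(mktemp)   # stand-in for ~/.bash_profile in this demo
append_once "$profile" 'HADOOP_HOME=/root/training/hadoop-2.4.1'
append_once "$profile" 'export HADOOP_HOME'
append_once "$profile" 'PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH'
append_once "$profile" 'export PATH'
append_once "$profile" 'export PATH'   # repeated call adds nothing
cat "$profile"
```

The single quotes matter: they keep `$HADOOP_HOME` and `$PATH` literal so that they are expanded when the profile is sourced, not when the helper runs.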

Local mode configuration

In hadoop-env.sh, set JAVA_HOME to /root/training/jdk1.7.0_75.


    # vi hadoop-env.sh

    export JAVA_HOME=/root/training/jdk1.7.0_75

Set the hostname. The address mapped to it in /etc/hosts must be the server's private (internal) address, not its public one.


    # vi /etc/hosts

    192.168.1.107 izwz985sjvpoji48moqz01z
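A quick way to sanity-check the address before putting it in /etc/hosts is to test it against the RFC 1918 private ranges (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16). The function below is a rough sketch with a made-up name, `is_private_ipv4`; it matches on prefixes only and does not validate that the string is a well-formed IPv4 address.

```shell
# is_private_ipv4 ADDRESS: succeed if ADDRESS falls in an RFC 1918 private range.
is_private_ipv4() {
    case "$1" in
        10.*|192.168.*) return 0 ;;
        172.1[6-9].*|172.2[0-9].*|172.3[01].*) return 0 ;;   # 172.16.0.0/12
        *) return 1 ;;
    esac
}

for addr in 192.168.1.107 120.78.89.97; do
    if is_private_ipv4 "$addr"; then
        echo "$addr: private, fine for /etc/hosts"
    else
        echo "$addr: public, do not map the hostname to it"
    fi
done
```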

Verify MapReduce


    # hadoop jar hadoop-mapreduce-examples-2.4.1.jar wordcount ~/training/data/input/data.txt ~/training/data/output/
    # more part-r-00000
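For a small input file, the job's result can be cross-checked with ordinary shell tools. The pipeline below is a sketch, not what Hadoop runs internally: `local_wordcount` is a made-up name, and it assumes the example splits words on whitespace and writes one `word<TAB>count` line per word in byte order.

```shell
# local_wordcount FILE: print "word<TAB>count" for every whitespace-separated word.
local_wordcount() {
    tr -s '[:space:]' '\n' < "$1" \
        | grep -v '^$' \
        | LC_ALL=C sort \
        | uniq -c \
        | awk '{ print $2 "\t" $1 }'
}

sample=$(mktemp)
printf 'hello world\nhello hadoop\n' > "$sample"
local_wordcount "$sample"

# Against the real job output it could be compared like this:
#   local_wordcount ~/training/data/input/data.txt | diff - ~/training/data/output/part-r-00000
```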

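Pseudo-distributed mode configuration

| File | Parameter | Value | Description |
|---|---|---|---|
| hadoop-env.sh | JAVA_HOME | /root/training/jdk1.7.0_75 | Java home directory |
| hdfs-site.xml | dfs.replication | 1 | Number of replicas kept for each block |
| core-site.xml | fs.defaultFS | hdfs://<hostname>:9000 | NameNode address and port; 9000 is the RPC port |
| core-site.xml | hadoop.tmp.dir | /root/training/hadoop-2.4.1/tmp | Defaults to a directory under /tmp if left unset; the configured path must already exist |
| mapred-site.xml | mapreduce.framework.name | yarn | Run MapReduce on YARN |
| yarn-site.xml | yarn.resourcemanager.hostname | <hostname> | Address of YARN's master (the ResourceManager) |
| yarn-site.xml | yarn.nodemanager.aux-services | mapreduce_shuffle | How reducers fetch map output |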

hdfs-site.xml


    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>

core-site.xml


    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://192.168.1.107:9000</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/root/training/hadoop-2.4.1/tmp</value>
    </property>

mapred-site.xml (create it from the template first: cp mapred-site.xml.template mapred-site.xml)


    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>

yarn-site.xml


    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>192.168.1.107</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
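All of the settings above share the same name/value shape, and in each file they sit inside the root `<configuration>` element. If typing them by hand gets tedious, a tiny generator can print the blocks for pasting. This is a sketch: `hadoop_property` is a hypothetical helper, and it does no XML escaping, so it only suits plain values like the ones used here.

```shell
# hadoop_property NAME VALUE: print one <property> block in the layout used above.
hadoop_property() {
    printf '<property>\n    <name>%s</name>\n    <value>%s</value>\n</property>\n' "$1" "$2"
}

# The two yarn-site.xml entries from this post:
hadoop_property yarn.resourcemanager.hostname 192.168.1.107
hadoop_property yarn.nodemanager.aux-services mapreduce_shuffle
```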

Verify HDFS and MapReduce


    # cd ~/training
    # ls hadoop-2.4.1/tmp/
    # hdfs namenode -format
    # start-all.sh
    # jps
    5828 NodeManager
    6284 Jps
    5438 SecondaryNameNode
    5288 DataNode
    5579 ResourceManager
    5172 NameNode
    # hdfs dfsadmin -report
    # hdfs dfs -mkdir /input
    # hdfs dfs -put data/input/data.txt /input/data.txt
    # hdfs dfs -ls -R /
    # hadoop jar hadoop-mapreduce-examples-2.4.1.jar wordcount /input/data.txt /output
    # hdfs dfs -cat /output/part-r-00000
    # stop-all.sh
    # jps
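After start-all.sh, the jps listing should show the five daemons NameNode, DataNode, SecondaryNameNode, ResourceManager and NodeManager. Rather than eyeballing it, the output can be checked with a small function. A sketch, with `missing_daemons` as a made-up name; it takes the jps text as an argument so it can be tried without a running cluster.

```shell
# missing_daemons JPS_OUTPUT: print the name of every expected daemon
# that does not appear as the second column of the given jps output.
missing_daemons() {
    for d in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
        printf '%s\n' "$1" \
            | awk -v d="$d" '$2 == d { found = 1 } END { exit !found }' \
            || printf '%s\n' "$d"
    done
}

# On the server:
#   missing_daemons "$(jps)"
# No output means all five daemons are up.
```

Matching the whole second column, instead of grepping for a substring, keeps NameNode from being satisfied by SecondaryNameNode.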

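Configure passwordless SSH login for Hadoop

1. Server A generates its private/public key pair: ssh-keygen -t rsa
2. A's public key is copied to Server B: ssh-copy-id -i
3. B now holds Server A's public key
4. B generates a random string, for example helloworld
5. B encrypts the string with A's public key, giving *
6. B sends the encrypted string * to A
7. A receives the encrypted string from B
8. A decrypts it with its private key, recovering helloworld
9. A sends the decrypted helloworld back to B
10. B receives the decrypted string helloworld from A
11. B compares the strings from step 4 and step 10; if they match, Server B lets Server A log in to Server B without a password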
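The exchange above can be acted out on a single machine with the openssl command-line tool, with files standing in for the messages passed between A and B. Treat it as an illustration of the idea only: it assumes openssl is installed, the file names are made up, and it follows the simplified model in the list rather than the exact wire protocol OpenSSH uses.

```shell
work=$(mktemp -d)

# Steps 1-3: A creates a key pair; B ends up holding only the public half.
openssl genrsa -out "$work/a_private.pem" 2048 2>/dev/null
openssl rsa -in "$work/a_private.pem" -pubout -out "$work/a_public.pem" 2>/dev/null

# Step 4: B picks a challenge string.
printf 'helloworld' > "$work/challenge.txt"

# Steps 5-6: B encrypts it with A's public key and sends the ciphertext to A.
openssl pkeyutl -encrypt -pubin -inkey "$work/a_public.pem" \
    -in "$work/challenge.txt" -out "$work/challenge.enc"

# Steps 7-9: A decrypts with its private key and returns the plaintext.
openssl pkeyutl -decrypt -inkey "$work/a_private.pem" \
    -in "$work/challenge.enc" -out "$work/response.txt"

# Steps 10-11: B compares the reply with the string it generated.
if cmp -s "$work/challenge.txt" "$work/response.txt"; then
    result="login allowed"
else
    result="login refused"
fi
echo "$result"
rm -rf "$work"
```

Only the holder of A's private key can turn the ciphertext back into the original string, which is why a matching reply is enough for B to trust A.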


    # cd ~
    # ls .ssh/
    known_hosts
    # ssh-keygen -t rsa
    # ssh-copy-id -i .ssh/id_rsa.pub root@120.78.89.97
    # more .ssh/authorized_keys

Hadoop in Practice (1): Building a Hadoop 2.x Pseudo-Distributed Environment on Alibaba Cloud
