HBase in Action (6): Operating the HBase 1.2.0 Database Directly with Spark 2.2.1
Experiments already completed successfully against this HBase system in earlier installments:
- Building the HBase distributed cluster
- Connecting to HBase and manipulating data directly with the Python API
- Connecting to HBase and manipulating data directly with the Java API
- Operating on HBase data indirectly through Hive with the spark-sql tool
- Operating on HBase data with Hive SQL
This installment of the big-data lab covers:
5. Operating on HBase 1.2.0 data directly with Spark 2.2.1.
Test code:
package HbaseTest.sparkconnectHbase;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.*;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableInputFormat;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.VoidFunction;
import scala.Tuple2;

/**
 * Connect to an HBase 1.2.0 database directly from Spark 2.2.1.
 */
public class SparkConnectHbaseTest {

    public static void main(String[] args) {
        // HBase client configuration: ZooKeeper quorum/port and HMaster address.
        Configuration confhbase = HBaseConfiguration.create();
        confhbase.set("hbase.zookeeper.property.clientPort", "2181");
        confhbase.set("hbase.zookeeper.quorum", "192.168.189.1,192.168.189.2,192.168.189.3");
        confhbase.set("hbase.master", "192.168.189.1:60000");

        // Table to read through TableInputFormat.
        confhbase.set(TableInputFormat.INPUT_TABLE, "db_res:wtb_ow_operation");

        SparkConf conf = new SparkConf().setAppName("Spark_Connect_Hbase_Test");
        JavaSparkContext sc = new JavaSparkContext(conf);

        // Each record is a (row key, Result) pair produced by TableInputFormat.
        JavaPairRDD<ImmutableBytesWritable, Result> resultRDD =
                sc.newAPIHadoopRDD(confhbase, TableInputFormat.class, ImmutableBytesWritable.class, Result.class);

        long count = resultRDD.count();
        System.out.print("************SPARK from hbase count *************** " + count + " ");

        resultRDD.foreach(new VoidFunction<Tuple2<ImmutableBytesWritable, Result>>() {
            @Override
            public void call(Tuple2<ImmutableBytesWritable, Result> v1) throws Exception {
                String key = Bytes.toString(v1._2().getRow());
                String operate_begin_time = Bytes.toString(
                        v1._2().getValue(Bytes.toBytes("info"), Bytes.toBytes("operate_begin_time")));
                System.out.print("==================spark from hbase record=========== : " + key + " " + operate_begin_time);
            }
        });

        // Keep the driver alive so the Spark UI on port 4040 remains available for inspection.
        while (true) {
        }
    }
}
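The job above scans every row and every column of db_res:wtb_ow_operation. TableInputFormat also reads scan-narrowing properties from the same Configuration, so the read can be restricted before the RDD is built. Below is a minimal sketch, assuming the confhbase and sc objects from the code above; the row-key bounds and caching value are hypothetical placeholders:

// Narrow the TableInputFormat scan before building the RDD.
// "row-0001" / "row-9999" are hypothetical row keys for illustration only.
confhbase.set(TableInputFormat.SCAN_ROW_START, "row-0001");              // start row key (inclusive)
confhbase.set(TableInputFormat.SCAN_ROW_STOP, "row-9999");               // stop row key (exclusive)
confhbase.set(TableInputFormat.SCAN_COLUMNS, "info:operate_begin_time"); // fetch only this column
confhbase.set(TableInputFormat.SCAN_CACHEDROWS, "500");                  // scanner caching per RPC

JavaPairRDD<ImmutableBytesWritable, Result> narrowedRDD =
        sc.newAPIHadoopRDD(confhbase, TableInputFormat.class,
                ImmutableBytesWritable.class, Result.class);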
pom.xml:
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>noc_hbase_test</groupId>
    <artifactId>noc_hbase_test</artifactId>
    <version>1.0-SNAPSHOT</version>

    <properties>
        <scala.version>2.11.8</scala.version>
        <spark.version>2.2.1</spark.version>
        <jedis.version>2.8.2</jedis.version>
        <fastjson.version>1.2.14</fastjson.version>
        <jetty.version>9.2.5.v20141112</jetty.version>
        <container.version>2.17</container.version>
        <java.version>1.8</java.version>
        <hbase.version>1.2.0</hbase.version>
    </properties>

    <repositories>
        <repository>
            <id>scala-tools.org</id>
            <name>Scala-Tools Maven2 Repository</name>
            <url>http://scala-tools.org/repo-releases</url>
        </repository>
    </repositories>

    <pluginRepositories>
        <pluginRepository>
            <id>scala-tools.org</id>
            <name>Scala-Tools Maven2 Repository</name>
            <url>http://scala-tools.org/repo-releases</url>
        </pluginRepository>
    </pluginRepositories>

    <dependencies>
        <dependency>
            <groupId>org.scala-lang</groupId>
            <artifactId>scala-library</artifactId>
            <version>${scala.version}</version>
        </dependency>
        <dependency>
            <groupId>org.scala-lang</groupId>
            <artifactId>scala-compiler</artifactId>
            <version>${scala.version}</version>
        </dependency>
        <dependency>
            <groupId>org.scala-lang</groupId>
            <artifactId>scala-reflect</artifactId>
            <version>${scala.version}</version>
        </dependency>
        <dependency>
            <groupId>org.scala-lang</groupId>
            <artifactId>scalap</artifactId>
            <version>${scala.version}</version>
        </dependency>

        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.10</artifactId>
            <version>${spark.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-launcher_2.10</artifactId>
            <version>${spark.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-network-shuffle_2.10</artifactId>
            <version>${spark.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-sql_2.10</artifactId>
            <version>${spark.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-hive_2.10</artifactId>
            <version>${spark.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-catalyst_2.10</artifactId>
            <version>${spark.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-repl_2.10</artifactId>
            <version>${spark.version}</version>
        </dependency>

        <dependency>
            <groupId>org.apache.hive</groupId>
            <artifactId>hive-jdbc</artifactId>
            <version>1.2.1</version>
        </dependency>

        <!-- https://mvnrepository.com/artifact/org.apache.hbase/hbase -->
        <!-- HBase dependencies -->
        <dependency>
            <groupId>org.apache.hbase</groupId>
            <artifactId>hbase-client</artifactId>
            <version>${hbase.version}</version>
            <exclusions>
                <exclusion>
                    <groupId>org.slf4j</groupId>
                    <artifactId>slf4j-log4j12</artifactId>
                </exclusion>
            </exclusions>
        </dependency>
        <dependency>
            <groupId>org.apache.hbase</groupId>
            <artifactId>hbase-common</artifactId>
            <version>${hbase.version}</version>
            <exclusions>
                <exclusion>
                    <groupId>org.slf4j</groupId>
                    <artifactId>slf4j-log4j12</artifactId>
                </exclusion>
            </exclusions>
        </dependency>
        <dependency>
            <groupId>org.apache.hbase</groupId>
            <artifactId>hbase-server</artifactId>
            <version>${hbase.version}</version>
            <exclusions>
                <exclusion>
                    <groupId>org.slf4j</groupId>
                    <artifactId>slf4j-log4j12</artifactId>
                </exclusion>
            </exclusions>
        </dependency>

        <!-- https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-common -->
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-common</artifactId>
            <version>2.6.0</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-client</artifactId>
            <version>2.6.0</version>
        </dependency>

        <!-- https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-hdfs -->
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-hdfs</artifactId>
            <version>2.6.0</version>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <artifactId>maven-assembly-plugin</artifactId>
                <configuration>
                    <classifier>dist</classifier>
                    <appendAssemblyId>true</appendAssemblyId>
                    <descriptorRefs>
                        <descriptorRef>jar-with-dependencies</descriptorRef>
                    </descriptorRefs>
                </configuration>
                <executions>
                    <execution>
                        <id>make-assembly</id>
                        <phase>package</phase>
                        <goals>
                            <goal>single</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>

            <plugin>
                <artifactId>maven-compiler-plugin</artifactId>
                <configuration>
                    <source>1.7</source>
                    <target>1.7</target>
                </configuration>
            </plugin>

            <plugin>
                <groupId>net.alchim31.maven</groupId>
                <artifactId>scala-maven-plugin</artifactId>
                <version>3.2.2</version>
                <executions>
                    <execution>
                        <id>scala-compile-first</id>
                        <phase>process-resources</phase>
                        <goals>
                            <goal>compile</goal>
                        </goals>
                    </execution>
                </executions>
                <configuration>
                    <scalaVersion>${scala.version}</scalaVersion>
                    <recompileMode>incremental</recompileMode>
                    <useZincServer>true</useZincServer>
                    <args>
                        <arg>-unchecked</arg>
                        <arg>-deprecation</arg>
                        <arg>-feature</arg>
                    </args>
                    <jvmArgs>
                        <jvmArg>-Xms1024m</jvmArg>
                        <jvmArg>-Xmx1024m</jvmArg>
                    </jvmArgs>
                    <javacArgs>
                        <javacArg>-source</javacArg>
                        <javacArg>${java.version}</javacArg>
                        <javacArg>-target</javacArg>
                        <javacArg>${java.version}</javacArg>
                        <javacArg>-Xlint:all,-serial,-path</javacArg>
                    </javacArgs>
                </configuration>
            </plugin>

            <plugin>
                <groupId>org.antlr</groupId>
                <artifactId>antlr4-maven-plugin</artifactId>
                <version>4.3</version>
                <executions>
                    <execution>
                        <id>antlr</id>
                        <goals>
                            <goal>antlr4</goal>
                        </goals>
                        <phase>none</phase>
                    </execution>
                </executions>
                <configuration>
                    <outputDirectory>src/test/java</outputDirectory>
                    <listener>true</listener>
                    <treatWarningsAsErrors>true</treatWarningsAsErrors>
                </configuration>
            </plugin>
        </plugins>
    </build>

</project>
Submit and run on the Spark cluster:
root@master:~# spark-submit --name noc_hbase_test --class HbaseTest.sparkconnectHbase.SparkConnectHbaseTest --master spark://master:7077 --jars /usr/local/apache-hive-1.2.1/lib/mysql-connector-java-5.1.13-bin.jar,/usr/local/apache-hive-1.2.1/lib/hive-hbase-handler-1.2.1.jar,/usr/local/hbase-1.2.0/lib/hbase-client-1.2.0.jar,/usr/local/hbase-1.2.0/lib/hbase-common-1.2.0.jar,/usr/local/hbase-1.2.0/lib/hbase-protocol-1.2.0.jar,/usr/local/hbase-1.2.0/lib/hbase-server-1.2.0.jar,/usr/local/hbase-1.2.0/lib/htrace-core-3.1.0-incubating.jar,/usr/local/hbase-1.2.0/lib/metrics-core-2.2.0.jar,/usr/local/hbase-1.2.0/lib/hbase-hadoop2-compat-1.2.0.jar,/usr/local/hbase-1.2.0/lib/guava-12.0.1.jar,/usr/local/hbase-1.2.0/lib/protobuf-java-2.5.0.jar --executor-memory 512m --total-executor-cores 2 /usr/local/setup_tools/noc_hbase_test.jar
Spark ran successfully; the output is shown below:
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/alluxio-1.7.0-hadoop-2.6/client/alluxio-1.7.0-client.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/spark-2.2.1-bin-hadoop2.6/jars/slf4j-log4j12-1.7.16.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
18/06/15 14:52:57 INFO spark.SparkContext: Running Spark version 2.2.1
18/06/15 14:52:57 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
18/06/15 14:52:58 INFO spark.SparkContext: Submitted application: Spark_Connect_Hbase_Test
18/06/15 14:52:58 INFO spark.SecurityManager: Changing view acls to: root
18/06/15 14:52:58 INFO spark.SecurityManager: Changing modify acls to: root
18/06/15 14:52:58 INFO spark.SecurityManager: Changing view acls groups to:
18/06/15 14:52:58 INFO spark.SecurityManager: Changing modify acls groups to:
18/06/15 14:52:58 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); groups with view permissions: Set(); users with modify permissions: Set(root); groups with modify permissions: Set()
18/06/15 14:52:58 INFO util.Utils: Successfully started service 'sparkDriver' on port 46964.
18/06/15 14:52:58 INFO spark.SparkEnv: Registering MapOutputTracker
18/06/15 14:52:58 INFO spark.SparkEnv: Registering BlockManagerMaster
18/06/15 14:52:58 INFO storage.BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
18/06/15 14:52:58 INFO storage.BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
18/06/15 14:52:58 INFO storage.DiskBlockManager: Created local directory at /tmp/blockmgr-4ac96a51-bf1d-4c35-b9d7-53e481274c63
18/06/15 14:52:58 INFO memory.MemoryStore: MemoryStore started with capacity 413.9 MB
18/06/15 14:52:59 INFO spark.SparkEnv: Registering OutputCommitCoordinator
18/06/15 14:52:59 INFO util.log: Logging initialized @2617ms
18/06/15 14:52:59 INFO server.Server: jetty-9.3.z-SNAPSHOT
18/06/15 14:52:59 INFO server.Server: Started @2799ms
18/06/15 14:52:59 INFO server.AbstractConnector: Started ServerConnector@2ca308df{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
18/06/15 14:52:59 INFO util.Utils: Successfully started service 'SparkUI' on port 4040.
18/06/15 14:52:59 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@70e0accd{/jobs,null,AVAILABLE,@Spark}
18/06/15 14:52:59 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@65f87a2c{/jobs/json,null,AVAILABLE,@Spark}
18/06/15 14:52:59 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@6ce1f601{/jobs/job,null,AVAILABLE,@Spark}
18/06/15 14:52:59 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@d816dde{/jobs/job/json,null,AVAILABLE,@Spark}
18/06/15 14:52:59 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@6c451c9c{/stages,null,AVAILABLE,@Spark}
18/06/15 14:52:59 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@372b0d86{/stages/json,null,AVAILABLE,@Spark}
18/06/15 14:52:59 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@3113a37{/stages/stage,null,AVAILABLE,@Spark}
18/06/15 14:52:59 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@20312893{/stages/stage/json,null,AVAILABLE,@Spark}
18/06/15 14:52:59 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@c41709a{/stages/pool,null,AVAILABLE,@Spark}
18/06/15 14:52:59 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@54ec8cc9{/stages/pool/json,null,AVAILABLE,@Spark}
18/06/15 14:52:59 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@5528a42c{/storage,null,AVAILABLE,@Spark}
18/06/15 14:52:59 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@1a6f5124{/storage/json,null,AVAILABLE,@Spark}
18/06/15 14:52:59 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@ec2bf82{/storage/rdd,null,AVAILABLE,@Spark}
18/06/15 14:52:59 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@6cc0bcf6{/storage/rdd/json,null,AVAILABLE,@Spark}
18/06/15 14:52:59 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@32f61a31{/environment,null,AVAILABLE,@Spark}
18/06/15 14:52:59 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@669253b7{/environment/json,null,AVAILABLE,@Spark}
18/06/15 14:52:59 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@51a06cbe{/executors,null,AVAILABLE,@Spark}
18/06/15 14:52:59 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@49a64d82{/executors/json,null,AVAILABLE,@Spark}
18/06/15 14:52:59 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@66d23e4a{/executors/threadDump,null,AVAILABLE,@Spark}
18/06/15 14:52:59 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@4d9d1b69{/executors/threadDump/json,null,AVAILABLE,@Spark}
18/06/15 14:52:59 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@251f7d26{/static,null,AVAILABLE,@Spark}
18/06/15 14:52:59 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@37d3d232{/,null,AVAILABLE,@Spark}
18/06/15 14:52:59 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@581d969c{/api,null,AVAILABLE,@Spark}
18/06/15 14:52:59 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@5851bd4f{/jobs/job/kill,null,AVAILABLE,@Spark}
18/06/15 14:52:59 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@2f40a43{/stages/stage/kill,null,AVAILABLE,@Spark}
18/06/15 14:52:59 INFO ui.SparkUI: Bound SparkUI to 0.0.0.0, and started at http://master:4040
18/06/15 14:52:59 INFO spark.SparkContext: Added JAR file:/usr/local/apache-hive-1.2.1/lib/mysql-connector-java-5.1.13-bin.jar at spark://master:46964/jars/mysql-connector-java-5.1.13-bin.jar with timestamp 1529045579564
18/06/15 14:52:59 INFO spark.SparkContext: Added JAR file:/usr/local/apache-hive-1.2.1/lib/hive-hbase-handler-1.2.1.jar at spark://master:46964/jars/hive-hbase-handler-1.2.1.jar with timestamp 1529045579571
18/06/15 14:52:59 INFO spark.SparkContext: Added JAR file:/usr/local/hbase-1.2.0/lib/hbase-client-1.2.0.jar at spark://master:46964/jars/hbase-client-1.2.0.jar with timestamp 1529045579572
18/06/15 14:52:59 INFO spark.SparkContext: Added JAR file:/usr/local/hbase-1.2.0/lib/hbase-common-1.2.0.jar at spark://master:46964/jars/hbase-common-1.2.0.jar with timestamp 1529045579574
18/06/15 14:52:59 INFO spark.SparkContext: Added JAR file:/usr/local/hbase-1.2.0/lib/hbase-protocol-1.2.0.jar at spark://master:46964/jars/hbase-protocol-1.2.0.jar with timestamp 1529045579575
18/06/15 14:52:59 INFO spark.SparkContext: Added JAR file:/usr/local/hbase-1.2.0/lib/hbase-server-1.2.0.jar at spark://master:46964/jars/hbase-server-1.2.0.jar with timestamp 1529045579577
18/06/15 14:52:59 INFO spark.SparkContext: Added JAR file:/usr/local/hbase-1.2.0/lib/htrace-core-3.1.0-incubating.jar at spark://master:46964/jars/htrace-core-3.1.0-incubating.jar with timestamp 1529045579578
18/06/15 14:52:59 INFO spark.SparkContext: Added JAR file:/usr/local/hbase-1.2.0/lib/metrics-core-2.2.0.jar at spark://master:46964/jars/metrics-core-2.2.0.jar with timestamp 1529045579579
18/06/15 14:52:59 INFO spark.SparkContext: Added JAR file:/usr/local/hbase-1.2.0/lib/hbase-hadoop2-compat-1.2.0.jar at spark://master:46964/jars/hbase-hadoop2-compat-1.2.0.jar with timestamp 1529045579581
18/06/15 14:52:59 INFO spark.SparkContext: Added JAR file:/usr/local/hbase-1.2.0/lib/guava-12.0.1.jar at spark://master:46964/jars/guava-12.0.1.jar with timestamp 1529045579583
18/06/15 14:52:59 INFO spark.SparkContext: Added JAR file:/usr/local/hbase-1.2.0/lib/protobuf-java-2.5.0.jar at spark://master:46964/jars/protobuf-java-2.5.0.jar with timestamp 1529045579584
18/06/15 14:52:59 INFO spark.SparkContext: Added JAR file:/usr/local/setup_tools/noc_hbase_test.jar at spark://master:46964/jars/noc_hbase_test.jar with timestamp 1529045579585
18/06/15 14:52:59 INFO client.StandaloneAppClient$ClientEndpoint: Connecting to master spark://master:7077...
18/06/15 14:52:59 INFO client.TransportClientFactory: Successfully created connection to master/192.168.189.1:7077 after 40 ms (0 ms spent in bootstraps)
18/06/15 14:53:00 INFO cluster.StandaloneSchedulerBackend: Connected to Spark cluster with app ID app-20180615145300-0004
18/06/15 14:53:00 INFO client.StandaloneAppClient$ClientEndpoint: Executor added: app-20180615145300-0004/0 on worker-20180615140035-worker1-39457 (worker1:39457) with 1 cores
18/06/15 14:53:00 INFO cluster.StandaloneSchedulerBackend: Granted executor ID app-20180615145300-0004/0 on hostPort worker1:39457 with 1 cores, 512.0 MB RAM
18/06/15 14:53:00 INFO client.StandaloneAppClient$ClientEndpoint: Executor added: app-20180615145300-0004/1 on worker-20180615140043-worker3-56574 (worker3:56574) with 1 cores
18/06/15 14:53:00 INFO cluster.StandaloneSchedulerBackend: Granted executor ID app-20180615145300-0004/1 on hostPort worker3:56574 with 1 cores, 512.0 MB RAM
18/06/15 14:53:00 INFO util.Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 43464.
18/06/15 14:53:00 INFO netty.NettyBlockTransferService: Server created on master:43464
18/06/15 14:53:00 INFO storage.BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
18/06/15 14:53:00 INFO storage.BlockManagerMaster: Registering BlockManager BlockManagerId(driver, master, 43464, None)
18/06/15 14:53:00 INFO storage.BlockManagerMasterEndpoint: Registering block manager master:43464 with 413.9 MB RAM, BlockManagerId(driver, master, 43464, None)
18/06/15 14:53:00 INFO storage.BlockManagerMaster: Registered BlockManager BlockManagerId(driver, master, 43464, None)
18/06/15 14:53:00 INFO storage.BlockManager: Initialized BlockManager: BlockManagerId(driver, master, 43464, None)
18/06/15 14:53:00 INFO client.StandaloneAppClient$ClientEndpoint: Executor updated: app-20180615145300-0004/0 is now RUNNING
18/06/15 14:53:00 INFO client.StandaloneAppClient$ClientEndpoint: Executor updated: app-20180615145300-0004/1 is now RUNNING
18/06/15 14:53:00 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@d02f8d{/metrics/json,null,AVAILABLE,@Spark}
18/06/15 14:53:00 INFO cluster.StandaloneSchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.0
18/06/15 14:53:01 INFO memory.MemoryStore: Block broadcast_0 stored as values in memory (estimated size 300.0 KB, free 413.6 MB)
18/06/15 14:53:01 INFO memory.MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 26.5 KB, free 413.6 MB)
18/06/15 14:53:01 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on master:43464 (size: 26.5 KB, free: 413.9 MB)
18/06/15 14:53:01 INFO spark.SparkContext: Created broadcast 0 from newAPIHadoopRDD at SparkConnectHbaseTest.java:35
18/06/15 14:53:01 INFO zookeeper.RecoverableZooKeeper: Process identifier=hconnection-0x895416d connecting to ZooKeeper ensemble=192.168.189.1:2181,192.168.189.2:2181,192.168.189.3:2181
18/06/15 14:53:02 INFO zookeeper.ZooKeeper: Client environment:zookeeper.version=3.4.6-1569965, built on 02/20/2014 09:09 GMT
18/06/15 14:53:02 INFO zookeeper.ZooKeeper: Client environment:host.name=master
18/06/15 14:53:02 INFO zookeeper.ZooKeeper: Client environment:java.version=1.8.0_60
18/06/15 14:53:02 INFO zookeeper.ZooKeeper: Client environment:java.vendor=Oracle Corporation
18/06/15 14:53:02 INFO zookeeper.ZooKeeper: Client environment:java.home=/usr/local/jdk1.8.0_60/jre
18/06/15 14:53:02 INFO zookeeper.ZooKeeper: Client environment:java.class.path=/usr/local/alluxio-1.7.0-hadoop-2.6/client/alluxio-1.7.0-client.jar:/usr/local/spark-2.2.1-bin-hadoop2.6/conf/:...
18/06/15 14:53:02 INFO zookeeper.ZooKeeper: Client environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
18/06/15 14:53:02 INFO zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/tmp
18/06/15 14:53:02 INFO zookeeper.ZooKeeper: Client environment:java.compiler=<NA>
18/06/15 14:53:02 INFO zookeeper.ZooKeeper: Client environment:os.name=Linux
18/06/15 14:53:02 INFO zookeeper.ZooKeeper: Client environment:os.arch=amd64
18/06/15 14:53:02 INFO zookeeper.ZooKeeper: Client environment:os.version=3.16.0-30-generic
18/06/15 14:53:02 INFO zookeeper.ZooKeeper: Client environment:user.name=root
18/06/15 14:53:02 INFO zookeeper.ZooKeeper: Client environment:user.home=/root
18/06/15 14:53:02 INFO zookeeper.ZooKeeper: Client environment:user.dir=/root
18/06/15 14:53:02 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=192.168.189.1:2181,192.168.189.2:2181,192.168.189.3:2181 sessionTimeout=90000 watcher=hconnection-0x895416d0x0, quorum=192.168.189.1:2181,192.168.189.2:2181,192.168.189.3:2181, baseZNode=/hbase
18/06/15 14:53:02 INFO zookeeper.ClientCnxn: Opening socket connection to server 192.168.189.3/192.168.189.3:2181. Will not attempt to authenticate using SASL (unknown error)
18/06/15 14:53:02 INFO zookeeper.ClientCnxn: Socket connection established to 192.168.189.3/192.168.189.3:2181, initiating session
18/06/15 14:53:02 INFO zookeeper.ClientCnxn: Session establishment complete on server 192.168.189.3/192.168.189.3:2181, sessionid = 0x3640207247f0009, negotiated timeout = 40000
18/06/15 14:53:04 INFO util.RegionSizeCalculator: Calculating region sizes for table "db_res:wtb_ow_operation".
18/06/15 14:53:05 INFO client.ConnectionManager$HConnectionImplementation: Closing master protocol: MasterService
18/06/15 14:53:05 INFO client.ConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0x3640207247f0009
18/06/15 14:53:05 INFO zookeeper.ClientCnxn: EventThread shut down
18/06/15 14:53:05 INFO zookeeper.ZooKeeper: Session: 0x3640207247f0009 closed
18/06/15 14:53:05 INFO spark.SparkContext: Starting job: count at SparkConnectHbaseTest.java:37
18/06/15 14:53:05 INFO scheduler.DAGScheduler: Got job 0 (count at SparkConnectHbaseTest.java:37) with 1 output partitions
18/06/15 14:53:05 INFO scheduler.DAGScheduler: Final stage: ResultStage 0 (count at SparkConnectHbaseTest.java:37)
18/06/15 14:53:05 INFO scheduler.DAGScheduler: Parents of final stage: List()
18/06/15 14:53:05 INFO scheduler.DAGScheduler: Missing parents: List()
18/06/15 14:53:05 INFO scheduler.DAGScheduler: Submitting ResultStage 0 (NewHadoopRDD[0] at newAPIHadoopRDD at SparkConnectHbaseTest.java:35), which has no missing parents
18/06/15 14:53:05 INFO memory.MemoryStore: Block broadcast_1 stored as values in memory (estimated size 2040.0 B, free 413.6 MB)
18/06/15 14:53:05 INFO memory.MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 1278.0 B, free 413.6 MB)
18/06/15 14:53:05 INFO storage.BlockManagerInfo: Added broadcast_1_piece0 in memory on master:43464 (size: 1278.0 B, free: 413.9 MB)
18/06/15 14:53:05 INFO spark.SparkContext: Created broadcast 1 from broadcast at DAGScheduler.scala:1006
18/06/15 14:53:05 INFO scheduler.DAGScheduler: Submitting 1 missing tasks from ResultStage 0 (NewHadoopRDD[0] at newAPIHadoopRDD at SparkConnectHbaseTest.java:35) (first 15 tasks are for partitions Vector(0))
18/06/15 14:53:05 INFO scheduler.TaskSchedulerImpl: Adding task set 0.0 with 1 tasks
18/06/15 14:53:20 WARN scheduler.TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
18/06/15 14:53:33 INFO cluster.CoarseGrainedSchedulerBackend$DriverEndpoint: Registered executor NettyRpcEndpointRef(spark-client://Executor) (192.168.189.2:36455) with ID 0
18/06/15 14:53:33 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, worker1, executor 0, partition 0, NODE_LOCAL, 4879 bytes)
18/06/15 14:53:34 INFO storage.BlockManagerMasterEndpoint: Registering block manager worker1:56820 with 117.0 MB RAM, BlockManagerId(0, worker1, 56820, None)
18/06/15 14:53:34 INFO cluster.CoarseGrainedSchedulerBackend$DriverEndpoint: Registered executor NettyRpcEndpointRef(spark-client://Executor) (192.168.189.4:45624) with ID 1
18/06/15 14:53:35 INFO storage.BlockManagerMasterEndpoint: Registering block manager worker3:38924 with 117.0 MB RAM, BlockManagerId(1, worker3, 38924, None)
18/06/15 14:53:42 INFO storage.BlockManagerInfo: Added broadcast_1_piece0 in memory on worker1:56820 (size: 1278.0 B, free: 117.0 MB)
18/06/15 14:53:43 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on worker1:56820 (size: 26.5 KB, free: 116.9 MB)
18/06/15 14:53:57 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 23730 ms on worker1 (executor 0) (1/1)
18/06/15 14:53:57 INFO scheduler.DAGScheduler: ResultStage 0 (count at SparkConnectHbaseTest.java:37) finished in 51.886 s
18/06/15 14:53:57 INFO scheduler.TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
18/06/15 14:53:58 INFO scheduler.DAGScheduler: Job 0 finished: count at SparkConnectHbaseTest.java:37, took 52.827846 s
************SPARK from hbase count *************** 1 18/06/15 14:53:58 INFO spark.SparkContext: Starting job: foreach at SparkConnectHbaseTest.java:40
18/06/15 14:53:58 INFO scheduler.DAGScheduler: Got job 1 (foreach at SparkConnectHbaseTest.java:40) with 1 output partitions
18/06/15 14:53:58 INFO scheduler.DAGScheduler: Final stage: ResultStage 1 (foreach at SparkConnectHbaseTest.java:40)
18/06/15 14:53:58 INFO scheduler.DAGScheduler: Parents of final stage: List()
18/06/15 14:53:58 INFO scheduler.DAGScheduler: Missing parents: List()
18/06/15 14:53:58 INFO scheduler.DAGScheduler: Submitting ResultStage 1 (NewHadoopRDD[0] at newAPIHadoopRDD at SparkConnectHbaseTest.java:35), which has no missing parents
18/06/15 14:53:58 INFO memory.MemoryStore: Block broadcast_2 stored as values in memory (estimated size 2.2 KB, free 413.6 MB)
18/06/15 14:53:58 INFO memory.MemoryStore: Block broadcast_2_piece0 stored as bytes in memory (estimated size 1430.0 B, free 413.6 MB)
18/06/15 14:53:58 INFO storage.BlockManagerInfo: Added broadcast_2_piece0 in memory on master:43464 (size: 1430.0 B, free: 413.9 MB)
18/06/15 14:53:58 INFO spark.SparkContext: Created broadcast 2 from broadcast at DAGScheduler.scala:1006
18/06/15 14:53:58 INFO scheduler.DAGScheduler: Submitting 1 missing tasks from ResultStage 1 (NewHadoopRDD[0] at newAPIHadoopRDD at SparkConnectHbaseTest.java:35) (first 15 tasks are for partitions Vector(0))
18/06/15 14:53:58 INFO scheduler.TaskSchedulerImpl: Adding task set 1.0 with 1 tasks
18/06/15 14:53:58 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 1.0 (TID 1, worker1, executor 0, partition 0, NODE_LOCAL, 4879 bytes)
18/06/15 14:53:58 INFO storage.BlockManagerInfo: Added broadcast_2_piece0 in memory on worker1:56820 (size: 1430.0 B, free: 116.9 MB)
18/06/15 14:53:59 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 1.0 (TID 1) in 507 ms on worker1 (executor 0) (1/1)
18/06/15 14:53:59 INFO scheduler.TaskSchedulerImpl: Removed TaskSet 1.0, whose tasks have all completed, from pool
18/06/15 14:53:59 INFO scheduler.DAGScheduler: ResultStage 1 (foreach at SparkConnectHbaseTest.java:40) finished in 0.508 s
18/06/15 14:53:59 INFO scheduler.DAGScheduler: Job 1 finished: foreach at SparkConnectHbaseTest.java:40, took 0.533378 s
18/06/15 14:58:02 INFO storage.BlockManagerInfo: Removed broadcast_2_piece0 on master:43464 in memory (size: 1430.0 B, free: 413.9 MB)
18/06/15 14:58:02 INFO storage.BlockManagerInfo: Removed broadcast_2_piece0 on worker1:56820 in memory (size: 1430.0 B, free: 116.9 MB)
Console screenshot:
Spark Web UI screenshot:
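A natural follow-up experiment is writing data back to HBase from the same Spark job. The sketch below is untested and only illustrative: it assumes a pre-created target table db_res:wtb_ow_operation_copy (a hypothetical name) with an info column family, reuses resultRDD from the test code above, and writes Put objects through TableOutputFormat with saveAsNewAPIHadoopDataset. It would need additional imports for Put, org.apache.hadoop.hbase.mapreduce.TableOutputFormat, org.apache.hadoop.mapreduce.Job and org.apache.spark.api.java.function.PairFunction.

// Untested sketch: copy the info:operate_begin_time cell of every row into a
// hypothetical target table. Job.getInstance throws IOException, so main would
// need to declare or catch it.
Configuration confwrite = HBaseConfiguration.create();
confwrite.set("hbase.zookeeper.property.clientPort", "2181");
confwrite.set("hbase.zookeeper.quorum", "192.168.189.1,192.168.189.2,192.168.189.3");
confwrite.set(TableOutputFormat.OUTPUT_TABLE, "db_res:wtb_ow_operation_copy");

Job job = Job.getInstance(confwrite);
job.setOutputFormatClass(TableOutputFormat.class);

JavaPairRDD<ImmutableBytesWritable, Put> putRDD = resultRDD.mapToPair(
        new PairFunction<Tuple2<ImmutableBytesWritable, Result>, ImmutableBytesWritable, Put>() {
            @Override
            public Tuple2<ImmutableBytesWritable, Put> call(Tuple2<ImmutableBytesWritable, Result> v1) throws Exception {
                byte[] value = v1._2().getValue(Bytes.toBytes("info"), Bytes.toBytes("operate_begin_time"));
                Put put = new Put(v1._2().getRow());
                // Guard against rows that do not have the column.
                put.addColumn(Bytes.toBytes("info"), Bytes.toBytes("operate_begin_time"),
                        value == null ? Bytes.toBytes("") : value);
                return new Tuple2<ImmutableBytesWritable, Put>(v1._1(), put);
            }
        });

putRDD.saveAsNewAPIHadoopDataset(job.getConfiguration());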