Hadoop components - the column-oriented open-source database (9) - Python - connecting to HBase from Python with Thrift


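Using Thrift from Python to work with HBase

Thrift supports clients in many languages, but I could not find a command-line (CLI) client for it on Linux. If the server has a Python environment, connecting from Python is a quick way to test.

Confirm that HBase and the Thrift service are installed and running

For installing and starting HBase and Thrift, see the two articles listed below.

Note: I am using the HBase service from the CDH distribution here. If you installed HBase standalone, see the appendix at the end of this article.

Hadoop basics - Hadoop in practice (7) - Hadoop management tools - installing Hadoop with Cloudera Manager - offline installation of Cloudera Manager and CDH 5.8

Hadoop components - the column-oriented open-source database (3) - introduction to and installation of Thrift, the HBase interface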

Run the following command as root (under a personal account you may not see the services that were installed and started by root):

    jps

Output:

    root@master:/# jps
    3332 Jps
    3254 ThriftServer
    2685 HMaster

HMaster in the list means the HBase service is running; ThriftServer means the Thrift service is running.
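
The same check can be scripted. The following is a minimal sketch, assuming the jps output has the "<pid> <name>" shape shown above; the function name running_services is made up for this illustration, and the sample text is copied from the output above rather than read from a live system.

```python
def running_services(jps_output):
    """Return the set of JVM main-class names listed in `jps` output."""
    names = set()
    for line in jps_output.splitlines():
        parts = line.split()
        # A jps line looks like "<pid> <name>"; skip prompts and blank lines.
        if len(parts) == 2 and parts[0].isdigit():
            names.add(parts[1])
    return names


sample = """root@master:/# jps
3332 Jps
3254 ThriftServer
2685 HMaster
"""

services = running_services(sample)
missing = {"HMaster", "ThriftServer"} - services
print("missing services:", sorted(missing))
```

On a real host you would feed the function the text captured from running jps instead of the sample string.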

Connecting to HBase from Python 2

Check the environment

Confirm the Python version and whether pip is installed

    [zzq@host252 ~]$ python --version
    Python 2.7.11

pip

    [zzq@host252 ~]$ pip

    Usage:
      pip <command> [options]

    Commands:
      install                     Install packages.
      download                    Download packages.
      uninstall                   Uninstall packages.
      freeze                      Output installed packages in requirements format.
      list                        List installed packages.
      show                        Show information about installed packages.
      check                       Verify installed packages have compatible dependencies.
      config                      Manage local and global configuration.
      search                      Search PyPI for packages.
      wheel                       Build wheels from your requirements.
      hash                        Compute hashes of package archives.
      completion                  A helper command used for command completion.
      help                        Show help for commands.

    General Options:
      -h, --help                  Show help.
      --isolated                  Run pip in an isolated mode, ignoring environment variables and user configuration.
      -v, --verbose               Give more output. Option is additive, and can be used up to 3 times.
      -V, --version               Show version and exit.
      -q, --quiet                 Give less output. Option is additive, and can be used up to 3 times (corresponding to WARNING, ERROR, and CRITICAL logging levels).
      --log <path>                Path to a verbose appending log.
      --proxy <proxy>             Specify a proxy in the form [user:passwd@]proxy.server:port.
      --retries <retries>         Maximum number of retries each connection should attempt (default 5 times).
      --timeout <sec>             Set the socket timeout (default 15 seconds).
      --exists-action <action>    Default action when a path already exists: (s)witch, (i)gnore, (w)ipe, (b)ackup, (a)bort).
      --trusted-host <hostname>   Mark this host as trusted, even though it does not have valid or any HTTPS.
      --cert <path>               Path to alternate CA bundle.
      --client-cert <path>        Path to SSL client certificate, a single file containing the private key and the certificate in PEM format.
      --cache-dir <dir>           Store the cache data in <dir>.
      --no-cache-dir              Disable the cache.
      --disable-pip-version-check
                                  Don't periodically check PyPI to determine whether a new version of pip is available for download. Implied with --no-index.
      --no-color                  Suppress colored output
    [zzq@host252 ~]$
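
The two checks above can also be done from inside the interpreter, which confirms that pip belongs to the Python you are actually running. This is only a sketch; the helper name describe_environment is made up for the example.

```python
import sys


def describe_environment():
    """Report the interpreter version and whether pip is importable from it."""
    try:
        import pip
        pip_version = getattr(pip, "__version__", "unknown")
    except ImportError:
        pip_version = None
    return {
        "python": "%d.%d.%d" % tuple(sys.version_info[:3]),
        "pip": pip_version,
    }


info = describe_environment()
print("Python", info["python"])
print("pip:", info["pip"] or "not importable from this interpreter")
```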

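Possible problem: bash: pip: command not found

Fix: symlink the pip that belongs to your Python installation into a system-wide path.

First find where pip is installed:

    find / -name pip

Then create the symlink (use the source path that find reported):

    ln -sv /usr/local/python/bin/pip /usr/bin/pip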
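
As an alternative to the symlink, pip can usually be started as a module of a specific interpreter (python -m pip), which sidesteps PATH problems. The sketch below only builds that command line; the helper name pip_command is hypothetical, and whether the pip module is present in your interpreter is an assumption you need to verify.

```python
import sys


def pip_command(*args):
    """Build a pip command line bound to the running interpreter."""
    return [sys.executable, "-m", "pip"] + list(args)


cmd = pip_command("--version")
print(" ".join(cmd))
```

The resulting list can be passed to subprocess.call(cmd) if you want to run it from a script.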

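Create a virtual environment

To avoid disturbing the system Python environment, it is best to run inside a new virtual environment (you can also skip this step and work directly in the system Python).

The virtualenv script only runs on Python 2.7 or later.

Install it with:

    pip install virtualenv
    # or, through the Douban PyPI mirror
    pip2 install virtualenv -i https://pypi.douban.com/simple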

After installation, run virtualenv without arguments to verify it:

    [zzq@host252 ~]$ virtualenv
    You must provide a DEST_DIR
    Usage: virtualenv [OPTIONS] DEST_DIR

    Options:
      --version             show program's version number and exit
      -h, --help            show this help message and exit
      -v, --verbose         Increase verbosity.
      -q, --quiet           Decrease verbosity.
      -p PYTHON_EXE, --python=PYTHON_EXE
                            The Python interpreter to use, e.g.,
                            --python=python3.5 will use the python3.5 interpreter
                            to create the new environment.  The default is the
                            interpreter that virtualenv was installed with
                            (/usr/bin/python3.6)
      --clear               Clear out the non-root install and start from scratch.
      --no-site-packages    DEPRECATED. Retained only for backward compatibility.
                            Not having access to global site-packages is now the
                            default behavior.
      --system-site-packages
                            Give the virtual environment access to the global
                            site-packages.
      --always-copy         Always copy files rather than symlinking.
      --relocatable         Make an EXISTING virtualenv environment relocatable.
                            This fixes up scripts and makes all .pth files
                            relative.
      --no-setuptools       Do not install setuptools in the new virtualenv.
      --no-pip              Do not install pip in the new virtualenv.
      --no-wheel            Do not install wheel in the new virtualenv.
      --extra-search-dir=DIR
                            Directory to look for setuptools/pip distributions in.
                            This option can be used multiple times.
      --download            Download preinstalled packages from PyPI.
      --no-download, --never-download
                            Do not download preinstalled packages from PyPI.
      --prompt=PROMPT       Provides an alternative prompt prefix for this
                            environment.
      --setuptools          DEPRECATED. Retained only for backward compatibility.
                            This option has no effect.
      --distribute          DEPRECATED. Retained only for backward compatibility.
                            This option has no effect.
      --unzip-setuptools    DEPRECATED.  Retained only for backward compatibility.
                            This option has no effect.

Create the virtual environment with:

    mkdir my-python2hbase-env
    cd my-python2hbase-env

    # create the environment
    virtualenv project-env

Check the current directory:

    pwd

Output:

    /home/zzq/my-python2hbase-env

Activate the virtual environment:

    source /home/zzq/my-python2hbase-env/project-env/bin/activate
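
To confirm that activation took effect, you can ask the interpreter itself. A small sketch follows; the function name in_virtualenv is made up here, and the check relies on the attributes that virtualenv and venv are known to set.

```python
import sys


def in_virtualenv():
    """Return True when the interpreter runs inside a virtual environment."""
    # Older virtualenv releases set sys.real_prefix; venv and newer
    # virtualenv releases make sys.base_prefix differ from sys.prefix.
    if hasattr(sys, "real_prefix"):
        return True
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


print("prefix:", sys.prefix)
print("inside a virtual environment:", in_virtualenv())
```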

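Install the dependencies

Two packages are needed, thrift and hbase-thrift; the install commands follow.

There are several Python packages for connecting to HBase:
HBase-Thrift
happybase
hbase-python (PyPI repository)
hbase-python (GitHub)

Here we use HBase-Thrift.

Install the thrift package

    pip install thrift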

A successful install prints:

    (project-env) [root@host3 my-python2hbase-env]# pip install thrift
    DEPRECATION: Python 2.7 will reach the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 won't be maintained after that date. A future version of pip will drop support for Python 2.7. More details about Python 2 support in pip, can be found at https://pip.pypa.io/en/latest/development/release-process/#python-2-support
    Collecting thrift
      Downloading https://files.pythonhosted.org/packages/c6/b4/510617906f8e0c5660e7d96fbc5585113f83ad547a3989b80297ac72a74c/thrift-0.11.0.tar.gz (52kB)
         |████████████████████████████████| 61kB 46kB/s
    Collecting six>=1.7.2
      Downloading https://files.pythonhosted.org/packages/73/fb/00a976f728d0d1fecfe898238ce23f502a721c0ac0ecfedb80e0d88c64e9/six-1.12.0-py2.py3-none-any.whl
    Building wheels for collected packages: thrift
      Building wheel for thrift (setup.py) ... done
      Created wheel for thrift: filename=thrift-0.11.0-cp27-cp27mu-linux_x86_64.whl size=264173 sha256=8392860fa66ddd575b004c4d1ef13f1a462c01a779ddfa1929db42bcebe26a34
      Stored in directory: /root/.cache/pip/wheels/be/36/81/0f93ba89a1cb7887c91937948519840a72c0ffdd57cac0ae8f
    Successfully built thrift
    Installing collected packages: six, thrift
    Successfully installed six-1.12.0 thrift-0.11.0
    (project-env) [root@host3 my-python2hbase-env]#

Install the hbase-thrift package

    pip install hbase-thrift

A successful install prints:

    (project-env) [root@host3 my-python2hbase-env]# pip install hbase-thrift
    DEPRECATION: Python 2.7 will reach the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 won't be maintained after that date. A future version of pip will drop support for Python 2.7. More details about Python 2 support in pip, can be found at https://pip.pypa.io/en/latest/development/release-process/#python-2-support
    Collecting hbase-thrift
      Downloading https://files.pythonhosted.org/packages/89/f7/dbb6c764bb909ed361c255828701228d8c9867d541cfef84127e6f3704cc/hbase-thrift-0.20.4.tar.gz
    Requirement already satisfied: Thrift in ./project-env/lib/python2.7/site-packages (from hbase-thrift) (0.11.0)
    Requirement already satisfied: six>=1.7.2 in ./project-env/lib/python2.7/site-packages (from Thrift->hbase-thrift) (1.12.0)
    Building wheels for collected packages: hbase-thrift
      Building wheel for hbase-thrift (setup.py) ... done
      Created wheel for hbase-thrift: filename=hbase_thrift-0.20.4-cp27-none-any.whl size=19705 sha256=c3334f4d28c385ec7b29fda6db64c128c76e08e4bc2cfe9e1d20ff8dbd813629
      Stored in directory: /root/.cache/pip/wheels/fe/51/f2/afb7b010cd97910aa0b651d492735a38ed69a93a817444904e
    Successfully built hbase-thrift
    Installing collected packages: hbase-thrift
    Successfully installed hbase-thrift-0.20.4
    You have mail in /var/spool/mail/root
    (project-env) [root@host3 my-python2hbase-env]#
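
Before writing any connection code, it can help to confirm that both packages are importable from the active environment. The sketch below assumes the top-level module names are thrift (from the thrift package) and hbase (from hbase-thrift), which matches the imports used later in this article; missing_modules is a made-up helper name.

```python
import importlib


def missing_modules(names):
    """Return the module names from `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


# "thrift" comes from the thrift package, "hbase" from hbase-thrift.
print("missing:", missing_modules(["thrift", "hbase"]))
```

An empty list means both installs are visible to the interpreter you are running.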

Python code for connecting over thrift

HBase currently has two Thrift interfaces (call them thrift and thrift2), and they are not compatible with each other.

First, the code for connecting over thrift (thrift1).

    vi query.py

Note: replace localhost and port 9090 (the default Thrift port) with the values for your own server.

Enter the following:

    from thrift import Thrift
    from thrift.transport import TSocket
    from thrift.transport import TTransport
    from thrift.protocol import TBinaryProtocol

    from hbase import Hbase
    from hbase.ttypes import *

    # Open a buffered, binary-protocol connection to the Thrift server.
    transport = TSocket.TSocket('localhost', 9090)

    transport = TTransport.TBufferedTransport(transport)
    protocol = TBinaryProtocol.TBinaryProtocol(transport)

    client = Hbase.Client(protocol)
    transport.open()
    # List all table names (thrift1 API), then release the connection.
    print client.getTableNames()
    transport.close()
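
If a call fails between open() and close(), the connection stays open. One way to guard against that is a small context manager. This is a sketch, not part of the thrift library: the name opened is made up, it only assumes the transport object has open() and close() methods as the Thrift transports above do, and a fake transport is used so the example runs without a server.

```python
from contextlib import contextmanager


@contextmanager
def opened(transport):
    """Open a Thrift-style transport and always close it afterwards."""
    transport.open()
    try:
        yield transport
    finally:
        transport.close()


class FakeTransport(object):
    """Stand-in for a Thrift transport so the sketch runs without a server."""

    def __init__(self):
        self.events = []

    def open(self):
        self.events.append("open")

    def close(self):
        self.events.append("close")


fake = FakeTransport()
try:
    with opened(fake):
        raise RuntimeError("simulated failure while talking to HBase")
except RuntimeError:
    pass
print(fake.events)
```

With a real connection you would write "with opened(transport):" around the client calls instead of calling transport.open() and transport.close() by hand.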

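Possible problem: thrift.Thrift.TApplicationException: Invalid method name: 'getTableNames'

Cause

The Thrift interface used by the client does not match the one served by the HBase Thrift server.

The Thrift server was started as thrift2, while the client is speaking thrift (thrift1).

Fix
Since the root cause is the mismatch between client and server, there are two ways to fix it:

1. Start the HBase Thrift server in thrift1 mode.

    hbase-daemon.sh stop thrift2
    # start command
    hbase-daemon.sh start thrift

2. Keep the thrift2 server and connect with a thrift2 client, as described in the next section.

If you want to use happybase, a convenient module for connecting to HBase, you must use thrift, because happybase does not yet support thrift2.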
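
When this error shows up in logs, a short hint can save time. The helper below is a heuristic sketch only: explain_thrift_error is a made-up name, and it matches on the message text quoted above rather than on any Thrift API.

```python
def explain_thrift_error(message):
    """Map a Thrift error message to a likely cause (heuristic only)."""
    if "Invalid method name" in message:
        return ("The server does not know this method: the client and the "
                "Thrift server probably use different interfaces "
                "(thrift1 vs thrift2).")
    return "No specific hint for this error."


msg = "Invalid method name: 'getTableNames'"
print(explain_thrift_error(msg))
```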

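Python code for connecting over thrift2

Connecting to thrift2 from Python takes a little more work.

Generate the client bindings - mind the thrift and thrift2 versions

You need the Thrift compiler installed to generate HBase's cross-language API.
The IDL file (Hbase.thrift) that the bindings are generated from is at one of the following paths.

For a stock HBase installation:
$HBASE_HOME/src/main/resources/org/apache/hadoop/hbase/thrift/Hbase.thrift

For HBase installed through CDH:
/opt/cloudera/parcels/CDH/lib/hue/apps/hbase/thrift/Hbase.thrift

If you cannot find it, search the whole filesystem:

    sudo find / -name "Hbase.thrift"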


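Generate the Python bindings with:

    thrift --gen py /opt/cloudera/parcels/CDH/lib/hue/apps/hbase/thrift/Hbase.thrift

If this fails with -bash: thrift: command not found

you need to install thrift; see Hadoop components - the column-oriented open-source database (3) - introduction to and installation of Thrift, the HBase interface

The command creates a gen-py folder in the current directory.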
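
If you generate bindings from a script, it is worth checking for the compiler first so the "command not found" case is reported clearly. A minimal sketch for Python 3 (it uses shutil.which); thrift_gen_command is a hypothetical helper, and the IDL path is the CDH path from above, which may differ on your machine.

```python
import shutil


def thrift_gen_command(idl_path, language="py"):
    """Build the argument list for generating bindings from a .thrift file."""
    return ["thrift", "--gen", language, idl_path]


idl = "/opt/cloudera/parcels/CDH/lib/hue/apps/hbase/thrift/Hbase.thrift"
cmd = thrift_gen_command(idl)

if shutil.which("thrift") is None:
    print("thrift compiler not found on PATH; would run:", " ".join(cmd))
else:
    print("ready to run:", " ".join(cmd))
```

Passing cmd to subprocess.check_call would then run the compiler and write gen-py into the current directory.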

Because CDH's HBase ships only the thrift1 IDL file, we have to get the thrift2 Hbase.thrift from somewhere else.

HDP's HBase ships both versions; the path and commands may look like this:

    # path to the .thrift files in HDP
    cd /usr/hdp/3.0.0.0-1634/hbase/include/thrift/
    # generate the Python bindings
    # both thrift1 and thrift2 files exist in this directory; pick the one you need
    thrift -gen py hbase1.thrift
    # or
    thrift -gen py hbase2.thrift

If you are not on HDP, look in the HBase source project on GitHub, which contains the IDL files for both thrift1 and thrift2:

    thrift --gen py ../../../../../hbase-thrift/src/main/resources/org/apache/hadoop/hbase/thrift1/hbase.thrift
    # or
    thrift --gen py ../../../../../hbase-thrift/src/main/resources/org/apache/hadoop/hbase/thrift2/hbase.thrift

You can even download pre-generated files and use them directly:
https://github.com/apache/hbase/tree/master/hbase-examples/src/main/python

Here we simply download the pre-generated thrift2 files:

Link: https://pan.baidu.com/s/1s3iysNJHW7s8lW6ni4qxrw
Extraction code: is1j

The thrift1 IDL produces this set of Python files:

    [zzq@host252 thrift-0.10.0]$ ll gen-py/hbased/
    total 440
    -rw-rw-r--. 1 zzq zzq    326 Nov 28 19:38 constants.py
    -rw-rw-r--. 1 zzq zzq 384499 Nov 28 19:38 Hbase.py
    -rwxr-xr-x. 1 zzq zzq  14386 Nov 28 19:38 Hbase-remote
    -rw-rw-r--. 1 zzq zzq     43 Nov 28 19:38 __init__.py
    -rw-rw-r--. 1 zzq zzq  38776 Nov 28 19:38 ttypes.py
    [zzq@host252 thrift-0.10.0]$

The thrift2 IDL produces these files:

    $ ls gen-py
    gen-py/hbase/__init__.py
    gen-py/hbase/constants.py
    gen-py/hbase/THBaseService.py
    gen-py/hbase/ttypes.py
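
The two file sets differ in the service module (Hbase.py for thrift1, THBaseService.py for thrift2), so a generated folder can be identified by looking for those names. The sketch below assumes exactly that naming, as shown in the listings above; detect_interface is a made-up helper, and the demo runs against a throwaway directory rather than a real gen-py folder.

```python
import os
import tempfile


def detect_interface(package_dir):
    """Guess which HBase Thrift interface a generated package contains."""
    files = set(os.listdir(package_dir))
    if "THBaseService.py" in files:
        return "thrift2"
    if "Hbase.py" in files:
        return "thrift1"
    return "unknown"


# Demo on a throwaway directory that mimics a thrift2 gen-py/hbase folder.
demo_dir = tempfile.mkdtemp()
for name in ("__init__.py", "constants.py", "THBaseService.py", "ttypes.py"):
    open(os.path.join(demo_dir, name), "w").close()

print(detect_interface(demo_dir))
```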

Because thrift2 has no getTableNames() method, we first create a test table by hand.

    hbase shell

    hbase(main):001:0> create "example", NAME => "family"
    0 row(s) in 1.6480 seconds

    => Hbase::Table - example
    hbase(main):002:0>

Suppose the gen-py path is:
/home/zzq/thrift2/gen-py

Then create the test script test.py:

    vim test.py

Enter the following:

    import sys
    import os
    import time

    from thrift.transport import TTransport
    from thrift.transport import TSocket
    from thrift.transport import THttpClient
    from thrift.protocol import TBinaryProtocol

    # Add path for local "gen-py/hbase" for the pre-generated module
    sys.path.append("/home/zzq/thrift2/gen-py")
    from hbase import THBaseService
    from hbase.ttypes import *

    print "Thrift2 Demo"
    print "This demo assumes you have a table called \"example\" with a column family called \"family\""

    # Address of the thrift2 server; change these to match your own setup.
    host = "192.168.30.250"
    port = 9090
    framed = False

    socket = TSocket.TSocket(host, port)
    if framed:
        transport = TTransport.TFramedTransport(socket)
    else:
        transport = TTransport.TBufferedTransport(socket)
    protocol = TBinaryProtocol.TBinaryProtocol(transport)
    client = THBaseService.Client(protocol)

    transport.open()

    table = "example"

    # Write one cell, then read the row back.
    put = TPut(row="row1", columnValues=[TColumnValue(family="family", qualifier="qualifier1", value="value1")])
    print "Putting:", put
    client.put(table, put)

    get = TGet(row="row1")
    print "Getting:", get
    result = client.get(table, get)

    print "Result:", result

    transport.close()

Run it with:

    python test.py

Output:

    [zzq@host252 ~]$ vi test.py
    [zzq@host252 ~]$ python test.py
    Thrift2 Demo
    This demo assumes you have a table called "example" with a column family called "family"
    Putting: TPut(durability=None, timestamp=None, cellVisibility=None, attributes=None, columnValues=[TColumnValue(qualifier='qualifier1', family='family', tags=None, timestamp=None, value='value1', type=None)], row='row1')
    Getting: TGet(storeOffset=None, existence_only=None, authorizations=None, filterString=None, timestamp=None, maxVersions=None, timeRange=None, filterBytes=None, targetReplicaId=None, consistency=None, attributes=None, storeLimit=None, cacheBlocks=None, columns=None, row='row1')
    Result: TResult(partial=False, stale=False, columnValues=[TColumnValue(qualifier='qualifier1', family='family', tags=None, timestamp=1575022330934, value='value1', type=None)], row='row1')
    [zzq@host252 ~]$
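
The printed TResult is verbose. A small helper can flatten it into a dictionary keyed by "family:qualifier". This is a sketch that relies only on the attribute names visible in the output above (columnValues, family, qualifier, value); result_to_dict is a made-up name, and namedtuples stand in for the generated Thrift classes so the example runs without a server.

```python
from collections import namedtuple


def result_to_dict(result):
    """Flatten a TResult-like object into {"family:qualifier": value}."""
    cells = {}
    for cell in result.columnValues or []:
        cells["%s:%s" % (cell.family, cell.qualifier)] = cell.value
    return cells


# Stand-ins that mimic the attribute names shown in the output above.
Cell = namedtuple("Cell", "family qualifier value timestamp")
Result = namedtuple("Result", "row columnValues")

demo = Result(row="row1", columnValues=[
    Cell("family", "qualifier1", "value1", 1575022330934),
])
print(result_to_dict(demo))
```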

Connecting to HBase from Python 3

Check the environment

Confirm the Python version and whether pip is installed

    (project-env) [zzq@host252 ~]$ python --version
    Python 3.6.5
    (project-env) [zzq@host252 ~]$ pip3

    Usage:
      pip3 <command> [options]

    Commands:
      install                     Install packages.
      download                    Download packages.
      uninstall                   Uninstall packages.
      freeze                      Output installed packages in requirements format.
      list                        List installed packages.
      show                        Show information about installed packages.
      check                       Verify installed packages have compatible dependencies.
      config                      Manage local and global configuration.
      search                      Search PyPI for packages.
      wheel                       Build wheels from your requirements.
      hash                        Compute hashes of package archives.
      completion                  A helper command used for command completion.
      debug                       Show information useful for debugging.
      help                        Show help for commands.

    General Options:
      -h, --help                  Show help.
      --isolated                  Run pip in an isolated mode, ignoring environment variables and user configuration.
      -v, --verbose               Give more output. Option is additive, and can be used up to 3 times.
      -V, --version               Show version and exit.
      -q, --quiet                 Give less output. Option is additive, and can be used up to 3 times (corresponding to WARNING, ERROR, and CRITICAL logging levels).
      --log <path>                Path to a verbose appending log.
      --proxy <proxy>             Specify a proxy in the form [user:passwd@]proxy.server:port.
      --retries <retries>         Maximum number of retries each connection should attempt (default 5 times).
      --timeout <sec>             Set the socket timeout (default 15 seconds).
      --exists-action <action>    Default action when a path already exists: (s)witch, (i)gnore, (w)ipe, (b)ackup, (a)bort.
      --trusted-host <hostname>   Mark this host or host:port pair as trusted, even though it does not have valid or any HTTPS.
      --cert <path>               Path to alternate CA bundle.
      --client-cert <path>        Path to SSL client certificate, a single file containing the private key and the certificate in PEM format.
      --cache-dir <dir>           Store the cache data in <dir>.
      --no-cache-dir              Disable the cache.
      --disable-pip-version-check
                                  Don't periodically check PyPI to determine whether a new version of pip is available for download. Implied with --no-index.
      --no-color                  Suppress colored output
    (project-env) [zzq@host252 ~]$

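Install the dependencies

Two packages are needed, thrift and hbase-thrift; the install commands follow.

There are several Python packages for connecting to HBase:
HBase-Thrift
happybase
hbase-python (PyPI repository)
hbase-python (GitHub)

Here we use HBase-Thrift.

Install the thrift package

    pip3 install thrift

Install the hbase-thrift package

    pip3 install hbase-thrift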

Python 3 code for connecting over thrift1

HBase currently has two Thrift interfaces (call them thrift1 and thrift2), and they are not compatible with each other.

First, the code for connecting over thrift1.

    vi query.py

Note: replace the host and port 9090 (the default Thrift port) with the values for your own server.

Enter the following:

    from thrift import Thrift
    from thrift.transport import TSocket
    from thrift.transport import TTransport
    from thrift.protocol import TBinaryProtocol

    from hbase import Hbase
    from hbase.ttypes import *

    # Open a buffered, binary-protocol connection to the Thrift server.
    transport = TSocket.TSocket('192.168.30.250', 9090)

    transport = TTransport.TBufferedTransport(transport)
    protocol = TBinaryProtocol.TBinaryProtocol(transport)

    client = Hbase.Client(protocol)
    transport.open()
    # List all table names (thrift1 API), then release the connection.
    print(client.getTableNames())
    transport.close()

Run it:

    python query.py

It fails with:

    (project-env) [zzq@host252 ~]$ python query.py
    Traceback (most recent call last):
      File "query.py", line 6, in <module>
        from hbase import Hbase
      File "/home/zzq/my-python2hbase-env/project-env/lib/python3.6/site-packages/hbase/Hbase.py", line 2066
        except IOError, io:
                      ^
    SyntaxError: invalid syntax

The Thrift connection code imports an hbase package, which actually comes from the separately installed third-party package hbase-thrift. That package was written for Python 2, so loading it under Python 3 runs into compatibility problems such as the syntax error above.
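
Errors of this kind are raised when the file is compiled, before any of its code runs, so a source file can be checked without importing it. The sketch below is written for Python 3 and uses the built-in compile(); compiles_on_this_python is a made-up helper name, and the sample strings use the Python 2 print statement as a simple stand-in for Python 2 only syntax.

```python
def compiles_on_this_python(source):
    """Return True when `source` is valid syntax for the running interpreter."""
    try:
        compile(source, "<snippet>", "exec")
        return True
    except SyntaxError:
        return False


py2_style = 'print "Thrift2 Demo"\n'
py3_style = 'print("Thrift2 Demo")\n'

print("Python 2 form compiles:", compiles_on_this_python(py2_style))
print("Python 3 form compiles:", compiles_on_this_python(py3_style))
```

To check an installed file, read its text with open(path).read() and pass it to the same function.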

Python 3 compatible versions of these files, already modified by others, are available online. Download the Python 3 Hbase files and use them to replace Hbase.py and ttypes.py under /usr/local/lib/python3.6/site-packages/hbase/.

In a virtual environment, locate the directory as follows:

    (project-env) [zzq@host252 ~]$ which python
    ~/my-python2hbase-env/project-env/bin/python
    (project-env) [zzq@host252 ~]$ ls ~/my-python2hbase-env/project-env/
    bin  include  lib  lib64
    (project-env) [zzq@host252 ~]$ ls ~/my-python2hbase-env/project-env/lib/
    python3.6
    (project-env) [zzq@host252 ~]$ ls ~/my-python2hbase-env/project-env/lib/python3.6/
    abc.py          codecs.py                     copy.py           encodings     __future__.py   hmac.py    keyword.py    no-global-site-packages.txt  os.py         reprlib.py      site-packages     sre_parse.py  tempfile.py  warnings.py
    base64.py       collections                   copyreg.py        enum.py       genericpath.py  importlib  lib-dynload   ntpath.py                    posixpath.py  re.py           site.py           stat.py       tokenize.py  weakref.py
    bisect.py       _collections_abc.py           distutils         fnmatch.py    hashlib.py      imp.py     linecache.py  operator.py                  __pycache__   rlcompleter.py  sre_compile.py    struct.py     token.py     _weakrefset.py
    _bootlocale.py  config-3.6m-x86_64-linux-gnu  _dummy_thread.py  functools.py  heapq.py        io.py      locale.py     orig-prefix.txt              random.py     shutil.py       sre_constants.py  tarfile.py    types.py
    (project-env) [zzq@host252 ~]$ ls ~/my-python2hbase-env/project-env/lib/python3.6/site-packages/hbase
    hbase/                         hbase_thrift-0.20.4.dist-info/
    (project-env) [zzq@host252 ~]$ ls ~/my-python2hbase-env/project-env/lib/python3.6/site-packages/hbase/
    constants.py  Hbase.py  __init__.py  __pycache__  ttypes.py

Download from:

Link: https://pan.baidu.com/s/1-yKP1ghu2IAswnXzWpGNbw
Extraction code: d132

Replace the files with the following commands:

    (project-env) [zzq@host252 ~]$ unzip hbase3.6.zip
    Archive:  hbase3.6.zip
      inflating: hbase3.6/Hbase.py
      inflating: hbase3.6/readme
      inflating: hbase3.6/ttypes.py

    (project-env) [zzq@host252 ~]$ cp hbase3.6/Hbase.py  ~/my-python2hbase-env/project-env/lib/python3.6/site-packages/hbase/
    (project-env) [zzq@host252 ~]$ cp hbase3.6/ttypes.py   ~/my-python2hbase-env/project-env/lib/python3.6/site-packages/hbase/

    (project-env) [zzq@host252 ~]$ ll ~/my-python2hbase-env/project-env/lib/python3.6/site-packages/hbase/
    total 276
    -rw-rw-r--. 1 zzq zzq    150 Nov 28 12:13 constants.py
    -rw-rw-r--. 1 zzq zzq 240677 Dec  2 16:44 Hbase.py
    -rw-rw-r--. 1 zzq zzq     43 Nov 28 12:13 __init__.py
    drwxrwxr-x. 2 zzq zzq   4096 Nov 28 12:13 __pycache__
    -rw-rw-r--. 1 zzq zzq  25228 Dec  2 16:44 ttypes.py

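Possible problem: thrift.Thrift.TApplicationException: Invalid method name: 'getTableNames'

Cause

The Thrift interface used by the client does not match the one served by the HBase Thrift server.

The Thrift server was started as thrift2, while the client is speaking thrift (thrift1).

Fix
Since the root cause is the mismatch between client and server, there are two ways to fix it:

1. Start the HBase Thrift server in thrift1 mode.

    hbase-daemon.sh stop thrift2
    # start command
    hbase-daemon.sh start thrift

2. To connect to the server's thrift2 interface instead, see the next section.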

Python 3 code for connecting over thrift2

Generate the client bindings - mind the thrift and thrift2 versions

The process is much the same as for Python 2. Note that the bindings must be generated with thrift 0.10.0 or later for them to support Python 3.5 and above.

    thrift --gen py ../../../../../hbase-thrift/src/main/resources/org/apache/hadoop/hbase/thrift1/hbase.thrift
    # or
    thrift --gen py ../../../../../hbase-thrift/src/main/resources/org/apache/hadoop/hbase/thrift2/hbase.thrift
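
The version requirement can be checked before generating anything. The sketch below assumes the compiler reports its version as text like "Thrift version 0.10.0" (for example from thrift --version); that format is an assumption, as are the helper names parse_thrift_version and supports_python3. The 0.10.0 threshold is the one stated above.

```python
import re


def parse_thrift_version(text):
    """Extract (major, minor, patch) from text such as 'Thrift version 0.10.0'."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError("no version number found in %r" % text)
    return tuple(int(part) for part in match.groups())


def supports_python3(version):
    """Apply the article's threshold: thrift 0.10.0 or later."""
    return version >= (0, 10, 0)


for line in ("Thrift version 0.9.3", "Thrift version 0.10.0"):
    version = parse_thrift_version(line)
    print(line, "->", supports_python3(version))
```

Comparing tuples of integers avoids the trap of comparing "0.9.3" and "0.10.0" as strings.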

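As before, you can download pre-generated files and use them directly:
https://github.com/apache/hbase/tree/master/hbase-examples/src/main/python

Here we simply download the pre-generated thrift2 files:

Link: https://pan.baidu.com/s/1s3iysNJHW7s8lW6ni4qxrw
Extraction code: is1j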

The thrift1 IDL produces this set of Python files:

    [zzq@host252 thrift-0.10.0]$ ll gen-py/hbased/
    total 440
    -rw-rw-r--. 1 zzq zzq    326 Nov 28 19:38 constants.py
    -rw-rw-r--. 1 zzq zzq 384499 Nov 28 19:38 Hbase.py
    -rwxr-xr-x. 1 zzq zzq  14386 Nov 28 19:38 Hbase-remote
    -rw-rw-r--. 1 zzq zzq     43 Nov 28 19:38 __init__.py
    -rw-rw-r--. 1 zzq zzq  38776 Nov 28 19:38 ttypes.py
    [zzq@host252 thrift-0.10.0]$

The thrift2 IDL produces these files:

    $ ls gen-py
    gen-py/hbase/__init__.py
    gen-py/hbase/constants.py
    gen-py/hbase/THBaseService.py
    gen-py/hbase/ttypes.py

Because thrift2 has no getTableNames() method, we first create a test table by hand.

    hbase shell

    hbase(main):001:0> create "example", NAME => "family"
    0 row(s) in 1.6480 seconds

    => Hbase::Table - example
    hbase(main):002:0>

Suppose the gen-py path is:
/home/zzq/thrift2/gen-py

Then create the test script test.py:

    vim test.py

Enter the following:

    import sys
    import os
    import time

    from thrift.transport import TTransport
    from thrift.transport import TSocket
    from thrift.transport import THttpClient
    from thrift.protocol import TBinaryProtocol

    # Add path for local "gen-py/hbase" for the pre-generated module
    sys.path.append("/home/zzq/thrift2/gen-py")
    from hbase import THBaseService
    from hbase.ttypes import *

    print("Thrift2 Demo")
    print("This demo assumes you have a table called \"example\" with a column family called \"family\"")

    # Address of the thrift2 server; change these to match your own setup.
    host = "192.168.30.250"
    port = 9090
    framed = False

    socket = TSocket.TSocket(host, port)
    if framed:
        transport = TTransport.TFramedTransport(socket)
    else:
        transport = TTransport.TBufferedTransport(socket)
    protocol = TBinaryProtocol.TBinaryProtocol(transport)
    client = THBaseService.Client(protocol)

    transport.open()

    table = "example"

    # Under Python 3 the table name, row key and cell fields are passed as bytes.
    tableName = str.encode(table)
    rowKey = str.encode('row2')
    put = TPut()
    put.row = rowKey
    columnValues = [TColumnValue(family=str.encode("family"), qualifier=str.encode("qualifier2"), value=str.encode("value2"))]
    put.columnValues = columnValues
    result = client.put(tableName, put)

    # Read the row back and print each cell value.
    rowKey = str.encode('row2')
    get = TGet()
    get.row = rowKey
    result = client.get(tableName, get)
    print(result.row)
    print(result.columnValues)
    for i in result.columnValues:
        print(i.value)

    transport.close()

Run it with:

    python test.py

Output:

    (project-env) [zzq@host252 ~]$ python test.py
    Thrift2 Demo
    This demo assumes you have a table called "example" with a column family called "family"
    b'row2'
    [TColumnValue(family=b'family', qualifier=b'qualifier2', value=b'value2', timestamp=1575285482023, tags=None, type=None)]
    b'value2'
    (project-env) [zzq@host252 ~]$
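
The values come back as bytes (b'row2', b'value2'), so Python 3 code ends up converting in both directions. A pair of small helpers keeps that in one place. This is a sketch for Python 3; to_bytes and to_text are made-up names, and UTF-8 is an assumed encoding that should match whatever your data was written with.

```python
def to_bytes(value, encoding="utf-8"):
    """Encode str to bytes; pass bytes through unchanged."""
    return value.encode(encoding) if isinstance(value, str) else value


def to_text(value, encoding="utf-8"):
    """Decode bytes to str; pass str through unchanged."""
    return value.decode(encoding) if isinstance(value, bytes) else value


row_key = to_bytes("row2")       # what is sent to the Thrift client
cell_text = to_text(b"value2")   # what you usually want to display
print(row_key)
print(cell_text)
```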

Possible error: ImportError: cannot import name 'THBaseService'

    (project-env) [zzq@host252 ~]$ python3.6  test.py
    Traceback (most recent call last):
      File "test.py", line 12, in <module>
        from hbase import THBaseService
    ImportError: cannot import name 'THBaseService'

Cause: Python loaded the hbase package from project-env/lib/python3.6/site-packages/hbase/ first, because sys.path.append puts gen-py at the end of the search path, after site-packages.

As a result the gen-py directory was never consulted.

Fix 1

Rename the directory

Rename the generated gen-py directory to genpy (and update the path in test.py to match); otherwise imports run into problems under Python 3.

Fix 2: overwrite the files in the installed package with the new ones

    (project-env) [zzq@host252 ~]$ cp thrift2/gen-py/hbase/*    ~/my-python2hbase-env/project-env/lib/python3.6/site-packages/hbase/
    (project-env) [zzq@host252 ~]$ ll ~/my-python2hbase-env/project-env/lib/python3.6/site-packages/hbase/
    total 1200
    -rw-rw-r--. 1 zzq zzq    366 Dec  2 16:59 constants.py
    -rw-rw-r--. 1 zzq zzq 240677 Dec  2 16:44 Hbase.py
    -rw-rw-r--. 1 zzq zzq     51 Dec  2 16:59 __init__.py
    -rw-rw-r--. 1 zzq zzq    199 Dec  2 16:59 __init__.pyc
    drwxrwxr-x. 2 zzq zzq   4096 Dec  2 16:49 __pycache__
    -rw-rw-r--. 1 zzq zzq 369677 Dec  2 16:59 THBaseService.py
    -rw-rw-r--. 1 zzq zzq 359818 Dec  2 16:59 THBaseService.pyc
    -rw-rw-r--. 1 zzq zzq  14357 Dec  2 16:59 THBaseService-remote
    -rw-rw-r--. 1 zzq zzq 119702 Dec  2 16:59 ttypes.py
    -rw-rw-r--. 1 zzq zzq  97317 Dec  2 16:59 ttypes.pyc
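
The underlying effect is the import search order: when two directories on sys.path contain a package with the same name, the directory that appears first wins. The sketch below reproduces this with two throwaway packages; the package name shadow_demo_pkg and the helper make_package are made up for the demonstration, and nothing here touches a real hbase package. It also shows that inserting a directory at the front of sys.path makes its copy the one that gets imported, which is an option the article itself does not use.

```python
import importlib
import os
import sys
import tempfile


def make_package(root, name, marker):
    """Create root/name/__init__.py that defines ORIGIN = marker."""
    pkg_dir = os.path.join(root, name)
    os.makedirs(pkg_dir)
    with open(os.path.join(pkg_dir, "__init__.py"), "w") as handle:
        handle.write("ORIGIN = %r\n" % marker)


# Two directories that both contain a package with the same (made-up) name.
installed = tempfile.mkdtemp()
generated = tempfile.mkdtemp()
make_package(installed, "shadow_demo_pkg", "site-packages copy")
make_package(generated, "shadow_demo_pkg", "gen-py copy")

sys.path.append(installed)      # stands in for site-packages
sys.path.insert(0, generated)   # searched first, so this copy wins

module = importlib.import_module("shadow_demo_pkg")
print(module.ORIGIN)
```

Printing the __file__ attribute of an imported package is a quick way to see which copy Python actually picked up.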

Further usage references

https://blog.csdn.net/qq_21153619/article/details/86502624

https://blog.csdn.net/m0_37634723/article/details/79191420

https://blog.csdn.net/zjerryj/article/details/80045657

https://blog.csdn.net/luanpeng825485697/article/details/81048468

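Appendix - installing standalone HBase and starting Thrift

Install the JDK

Configure JAVA_HOME, which HBase depends on

Reference article

Linux software (1) - installing the JDK on CentOS

Download HBase

Download page: http://hbase.apache.org/downloads.html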

Install HBase locally

    root@master:/usr/local/setup_tools# tar -zxvf hbase-2.0.0-bin.tar.gz
    root@master:/usr/local/setup_tools# mv hbase-2.0.0 /usr/local/
    root@master:/usr/local/setup_tools# cd /usr/local
    root@master:/usr/local# ls | grep hbase
    hbase-2.0.0


    root@master:/usr/local/hbase-2.0.0# vi /etc/profile

    export HBASE_HOME=/usr/local/hbase-2.0.0
    export PATH=.:$PATH:$JAVA_HOME/bin:$SCALA_HOME/bin:$HADOOP_HOME/bin:$SPARK_HOME/bin:$HIVE_HOME/bin:$FLUME_HOME/bin:$ZOOKEEPER_HOME/bin:$KAFKA_HOME/bin:$IDEA_HOME/bin:$eclipse_HOME:$MAVEN_HOME/bin:$ALLUXIO_HOME/bin:$HBASE_HOME/bin

    root@master:/usr/local/hbase-2.0.0# source /etc/profile

Configuration

Edit hbase-site.xml and set the root directory where data is stored.

    root@master:/usr/local/hbase-2.0.0/conf# vi hbase-site.xml
    <configuration>
        <property>
            <name>hbase.rootdir</name>
            <value>file:///usr/local/hbase-2.0.0/data</value>
        </property>

    </configuration>
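
To double-check what a configuration file actually sets, the property list can be read back with the standard library. This is a sketch: read_properties is a made-up helper, and the XML text is the sample from above rather than a file read from disk.

```python
import xml.etree.ElementTree as ET


def read_properties(xml_text):
    """Return the <property> entries of a Hadoop-style config as a dict."""
    root = ET.fromstring(xml_text)
    return {
        prop.findtext("name"): prop.findtext("value")
        for prop in root.findall("property")
    }


sample = """<configuration>
    <property>
        <name>hbase.rootdir</name>
        <value>file:///usr/local/hbase-2.0.0/data</value>
    </property>
</configuration>"""

print(read_properties(sample)["hbase.rootdir"])
```

For a real file, pass the result of open("hbase-site.xml").read() to the same function.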

Start HBase

    root@master:/usr/local/hbase-2.0.0# cd bin
    root@master:/usr/local/hbase-2.0.0/bin# ls
    considerAsDead.sh     hbase             hbase-config.cmd  hbase-jruby             master-backup.sh  replication               start-hbase.sh  zookeepers.sh
    draining_servers.rb   hbase-cleanup.sh  hbase-config.sh   hirb.rb                 region_mover.rb   rolling-restart.sh        stop-hbase.cmd
    get-active-master.rb  hbase.cmd         hbase-daemon.sh   local-master-backup.sh  regionservers.sh  shutdown_regionserver.rb  stop-hbase.sh
    graceful_stop.sh      hbase-common.sh   hbase-daemons.sh  local-regionservers.sh  region_status.rb  start-hbase.cmd           test


    root@master:/usr/local/hbase-2.0.0/bin# start-hbase.sh
    SLF4J: Class path contains multiple SLF4J bindings.
    SLF4J: Found binding in [jar:file:/usr/local/hbase-2.0.0/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/usr/local/hadoop-2.6.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/usr/local/alluxio-1.7.0-hadoop-2.6/client/alluxio-1.7.0-client.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
    SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
    running master, logging to /usr/local/hbase-2.0.0/logs/hbase-root-master-master.out


    root@master:/usr/local/hbase-2.0.0/bin# jps
    2757 Jps
    2685 HMaster

Use the HBase shell

    root@master:/usr/local/hbase-2.0.0/bin#  hbase shell
    SLF4J: Class path contains multiple SLF4J bindings.
    SLF4J: Found binding in [jar:file:/usr/local/hbase-2.0.0/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/usr/local/hadoop-2.6.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/usr/local/alluxio-1.7.0-hadoop-2.6/client/alluxio-1.7.0-client.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
    SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
    HBase Shell
    Use "help" to get list of supported commands.
    Use "exit" to quit this interactive shell.
    Version 2.0.0, r7483b111e4da77adbfc8062b3b22cbe7c2cb91c1, Sun Apr 22 20:26:55 PDT 2018
    Took 0.0044 seconds
    hbase(main):001:0>

    hbase(main):003:0> version
    2.0.0, r7483b111e4da77adbfc8062b3b22cbe7c2cb91c1, Sun Apr 22 20:26:55 PDT 2018
    Took 0.0054 seconds
    hbase(main):004:0>

Start the HBase Thrift service

    root@master:/usr/local/hbase-2.0.0/bin# hbase-daemon.sh start thrift
    running thrift, logging to /usr/local/hbase-2.0.0/logs/hbase-root-thrift-master.out
    SLF4J: Class path contains multiple SLF4J bindings.
    SLF4J: Found binding in [jar:file:/usr/local/hbase-2.0.0/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/usr/local/hadoop-2.6.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/usr/local/alluxio-1.7.0-hadoop-2.6/client/alluxio-1.7.0-client.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.

    root@master:/usr/local/hbase-2.0.0/bin# jps
    3332 Jps
    3254 ThriftServer
    2685 HMaster
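
jps shows that the process exists, but not that its port accepts connections. A plain TCP probe covers that. This is a sketch using only the standard socket module: port_is_open is a made-up helper, and the self-check runs against a throwaway local listener so that it works without an HBase server.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True when a TCP connection to host:port succeeds."""
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except (socket.error, socket.timeout):
        return False
    conn.close()
    return True


# Self-check against a throwaway local listener instead of a real server.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))   # port 0 lets the OS pick a free port
listener.listen(1)
free_port = listener.getsockname()[1]

reachable_before = port_is_open("127.0.0.1", free_port)
listener.close()
reachable_after = port_is_open("127.0.0.1", free_port)

print("while listening:", reachable_before)
print("after close:", reachable_after)
```

Against a real deployment you would call port_is_open with your Thrift host and port, for example the default port 9090 mentioned earlier.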
