
How do I connect to Hive using Python pyhs2?

Updated: 2022-10-31 15:53:31

I am trying to access Hive using pyhs2. I tried the following code:

example.py

import pyhs2
conn = pyhs2.connect(host='localhost', port=10000, authMechanism=None, user=None, password=None, database='default')
with conn.cursor() as cur:
    cur.execute("select * from table")
    for i in cur.fetch():
        print(i)

I am getting the following error:

    Traceback (most recent call last):
      File "example.py", line 2, in <module>
        conn = pyhs2.connect(host='localhost', port=10000,authMechanism=None, user=None, password=None,database='default')
      File "build/bdist.linux-x86_64/egg/pyhs2/__init__.py", line 7, in connect
      File "build/bdist.linux-x86_64/egg/pyhs2/connections.py", line 46, in __init__
      File "build/bdist.linux-x86_64/egg/pyhs2/cloudera/thrift_sasl.py", line 55, in open
      File "build/bdist.linux-x86_64/egg/thrift/transport/TSocket.py", line 101, in open
    thrift.transport.TTransport.TTransportException: Could not connect to localhost:10000

I get exactly the same error when I try with hive utils. I have checked that sasl is installed. Do I need to make any changes to hive-site.xml in Hive? If so, where do I need to create it? Am I missing something?

1- Find the IP address of your machine using (on Linux):

hostname -I

2- Replace localhost in the connect call with that IP address

I would also suggest that you double-check which host Hive is running on. If you are using Hortonworks, open Ambari, go to Hive, then Configs, and check the host listed there.
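The "Could not connect to localhost:10000" error is raised when the socket is opened, before any authentication happens, so it can help to test whether anything is listening on the host and port at all. Here is a minimal sketch using only the standard library; the helper name can_connect is my own, not part of pyhs2 or Thrift:

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a plain TCP connection to host:port can be opened.

    Hypothetical helper for illustration. It only checks that the port
    accepts connections; it does not speak the HiveServer2 protocol.
    """
    try:
        sock = socket.create_connection((host, port), timeout)
    except (socket.error, socket.timeout):
        # Connection refused, host unreachable, name lookup failure or timeout.
        return False
    sock.close()
    return True
```

For example, can_connect('localhost', 10000) and can_connect('<ip from hostname -I>', 10000) can be compared. If both return False, HiveServer2 is probably not running on that machine, or it listens on a different port.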

Edit (adding another suggestion):

Your username and password most likely aren't None. To get your username and password, check hive-site.xml and look at the values in javax.jdo.option.ConnectionUserName and javax.jdo.option.ConnectionPassword. If you can't find anything, try an empty string as the password (as opposed to None), and hive or empty string as the username i.e. try these one by one:

conn = pyhs2.connect(host='localhost', port=10000, authMechanism='PLAIN', user='hive', password='', database='default')

conn = pyhs2.connect(host='localhost', port=10000, authMechanism='PLAIN', user='', password='', database='default')

Note that I also changed authMechanism from None to "PLAIN".
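If you would rather not edit the script for every attempt, the "try these one by one" step could be wrapped in a small loop. This is only a sketch: connect_with_fallbacks is a hypothetical helper, and it takes the connect function as an argument so it does not depend on pyhs2 being importable.

```python
def connect_with_fallbacks(connect, candidates, **common):
    """Try each (user, password) pair in order and return the first
    connection that opens.

    connect    -- a callable such as pyhs2.connect
    candidates -- list of (user, password) tuples to try
    common     -- keyword arguments shared by every attempt
    """
    if not candidates:
        raise ValueError("no credentials to try")
    last_error = None
    for user, password in candidates:
        try:
            return connect(user=user, password=password, **common)
        except Exception as exc:
            # Remember the failure and move on to the next pair.
            last_error = exc
    # Every pair failed: re-raise the most recent error.
    raise last_error


# Usage with pyhs2 (same arguments as the calls above):
# import pyhs2
# conn = connect_with_fallbacks(
#     pyhs2.connect,
#     [('hive', ''), ('', '')],
#     host='localhost', port=10000,
#     authMechanism='PLAIN', database='default')
```

If every attempt fails with the same TTransportException, the problem is more likely the host or port than the credentials.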