Reposted

[Original] Installing, deploying, configuring, and testing Hive in a Hadoop 2.3, HBase 0.98, Hive 0.13 stack

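Introduction:

Hive is a data warehouse tool built on Hadoop. It maps structured data files to database tables and provides a simple SQL query capability, translating SQL statements into MapReduce jobs for execution. Its main advantage is a low learning curve: simple MapReduce statistics can be produced quickly with SQL-like statements, without developing dedicated MapReduce applications, which makes Hive well suited to statistical analysis in a data warehouse.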

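1, Use cases

Hive is built on top of Hadoop, a static batch-processing system. Hadoop typically has high latency and incurs considerable overhead when jobs are submitted and scheduled. As a result, Hive cannot deliver low-latency queries on large data sets; for example, a Hive query over a data set of a few hundred MB typically takes on the order of minutes. Hive is therefore not suited to applications that need low latency, such as online transaction processing (OLTP). Hive query execution strictly follows the Hadoop MapReduce job execution model: Hive's interpreter translates the user's HiveQL statements into MapReduce jobs and submits them to the Hadoop cluster, Hadoop monitors the job execution, and the results are then returned to the user. Hive was not designed for online transaction processing and provides neither real-time queries nor row-level updates. Its best use is batch jobs over large data sets, such as web log analysis.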

2, Download and install

For the prerequisite Hadoop installation, see: http://blog.itpub.net/26230597/viewspace-1257609/

Download:

wget  http://mirror.bit.edu.cn/apache/hive/hive-0.13.1/apache-hive-0.13.1-bin.tar.gz

Unpack and install:

tar zxvf apache-hive-0.13.1-bin.tar.gz  -C /home/hadoop/src/

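Note: the archive unpacks to a directory named apache-hive-0.13.1-bin, while every later step uses /home/hadoop/src/hive-0.13.1, so the directory was presumably renamed (for example with mv) before continuing.

PS: Hive only needs to be installed on one node. In this example it is installed on the virtual machine that hosts the name node, sharing that machine with the Hadoop name node.

3, Configure the Hive environment variables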

vim hive-env.sh

export HIVE_HOME=/home/hadoop/src/hive-0.13.1

export PATH=$PATH:$HIVE_HOME/bin
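
For the hive command to be found from an ordinary shell, these two variables also have to be visible to the login shell. The sketch below assumes they are placed in the hadoop user's login profile (for example ~/.bash_profile); that file choice is an assumption, since the original only shows hive-env.sh. It sets the variables and then checks that the Hive bin directory is on PATH.

```shell
# Assumption: these lines live in the hadoop user's login profile
# (for example ~/.bash_profile); the original only shows hive-env.sh.
export HIVE_HOME=/home/hadoop/src/hive-0.13.1
export PATH="$PATH:$HIVE_HOME/bin"

# Confirm that the Hive bin directory is now part of PATH.
case ":$PATH:" in
  *":$HIVE_HOME/bin:"*) echo "hive bin is on PATH" ;;
  *) echo "hive bin is missing from PATH" ;;
esac
```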

4, Configure the Hadoop and HBase parameters

vim hive-env.sh

# Set HADOOP_HOME to point to a specific hadoop install directory

HADOOP_HOME=/home/hadoop/src/hadoop-2.3.0/

# Hive Configuration Directory can be controlled by:

export HIVE_CONF_DIR=/home/hadoop/src/hive-0.13.1/conf

# Folder containing extra libraries required for hive compilation/execution can be controlled by:

export HIVE_AUX_JARS_PATH=/home/hadoop/src/hive-0.13.1/lib
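
A wrong path in hive-env.sh tends to surface only later as a confusing startup error, so it can help to confirm that the configured directories exist first. The following is a minimal sketch; check_dirs is a hypothetical helper, not part of Hive.

```shell
# Report every directory passed in and return non-zero if any is missing.
# check_dirs is a hypothetical helper written for this example.
check_dirs() {
  missing=0
  for dir in "$@"; do
    if [ -d "$dir" ]; then
      echo "ok:      $dir"
    else
      echo "missing: $dir"
      missing=1
    fi
  done
  return "$missing"
}

# The three directories referenced in hive-env.sh above.
check_dirs /home/hadoop/src/hadoop-2.3.0 \
           /home/hadoop/src/hive-0.13.1/conf \
           /home/hadoop/src/hive-0.13.1/lib \
  || echo "fix the paths in hive-env.sh before starting hive"
```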

5, Verify the installation:

Start Hive in command-line mode. If the hive> prompt appears, the installation succeeded.

[hadoop@name01 lib]$ hive --service cli

15/01/09 00:20:32 WARN conf.HiveConf: DEPRECATED: hive.metastore.ds.retry.* no longer has any effect.  Use hive.hmshandler.retry.* instead

Logging initialized using configuration in jar:file:/home/hadoop/src/hive-0.13.1/lib/hive-common-0.13.1.jar!/hive-log4j.properties

Create a table with a create statement. An OK response means the statement ran successfully, which also confirms that Hive is installed correctly.

hive> create table test(key string);

OK

Time taken: 8.749 seconds

hive>

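6, Verify that Hive is usable

Start the Hive metastore service in the background:

[hadoop@name01 root]$ hive --service metastore &

Check the Hive process running in the background: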

[hadoop@name01 root]$ ps -eaf|grep hive

hadoop    4025  2460  1 22:52 pts/0    00:00:19 /usr/lib/jvm/jdk1.7.0_60/bin/java -Xmx256m -Djava.net.preferIPv4Stack=true -Dhadoop.log.dir=/home/hadoop/src/hadoop-2.3.0/logs -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/home/hadoop/src/hadoop-2.3.0 -Dhadoop.id.str=hadoop -Dhadoop.root.logger=INFO,console -Djava.library.path=/home/hadoop/src/hadoop-2.3.0/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -Xmx512m -Dhadoop.security.logger=INFO,NullAppender org.apache.hadoop.util.RunJar /home/hadoop/src/hive-0.13.1/lib/hive-service-0.13.1.jar org.apache.hadoop.hive.metastore.HiveMetaStore

hadoop    4575  4547  0 23:14 pts/1    00:00:00 grep hive

[hadoop@name01 root]$

6.1 In the Hive shell, create a table with two fields, separated by ',':

hive> create table test(key string);

OK

Time taken: 8.749 seconds

hive> create table tim_test(id int,name string) row format delimited fields terminated by ',';

OK

Time taken: 0.145 seconds

hive>

6.2 Prepare the txt file to be loaded into the table and fill in the values:

[hadoop@name01 hive-0.13.1]$ more tim_hive_test.txt

123,xinhua

456,dingxilu

789,fanyulu

903,fahuazhengroad

[hadoop@name01 hive-0.13.1]$
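
Because tim_test is declared with fields terminated by ',', every line of the file must hold exactly an integer id and a name separated by one comma. The sketch below writes the same four rows and checks that shape before loading. The file location under a temporary directory is an assumption for the example; the original keeps the file in the Hive home directory.

```shell
# Hypothetical working copy; the original keeps the file in the Hive home directory.
data_file="${TMPDIR:-/tmp}/tim_hive_test.txt"

# Write the four sample rows, one "id,name" record per line.
cat > "$data_file" <<'EOF'
123,xinhua
456,dingxilu
789,fanyulu
903,fahuazhengroad
EOF

# Count rows that do not have exactly two comma-separated fields
# with a purely numeric id in the first field.
bad_rows=$(awk -F',' 'NF != 2 || $1 !~ /^[0-9]+$/ { n++ } END { print n + 0 }' "$data_file")
echo "rows: $(grep -c '' "$data_file"), malformed rows: $bad_rows"
```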

6.4 Open another Xshell session, log in to the server, and start the Hive metastore:

[hadoop@name01 root]$ hive --service metastore

Starting Hive Metastore Server

6.5 Open another Xshell session and load the data from the Hive client:

[hadoop@name01 hive-0.13.1]$ hive

Logging initialized using configuration in jar:file:/home/hadoop/src/hive-0.13.1/lib/hive-common-0.13.1.jar!/hive-log4j.properties

hive> load data local inpath  '/home/hadoop/src/hive-0.13.1/tim_hive_test.txt'   into table tim_test;

Copying data from file:/home/hadoop/src/hive-0.13.1/tim_hive_test.txt

Copying file: file:/home/hadoop/src/hive-0.13.1/tim_hive_test.txt

Loading data to table default.tim_test

[Warning] could not update stats.

OK

Time taken: 7.208 seconds

hive>

6.6 Verify that the load succeeded: the dfs listing should include tim_test

hive> dfs -ls /home/hadoop/hive/warehouse;

Found 2 items

drwxr-xr-x   - hadoop supergroup          0 2015-01-12 01:47 /home/hadoop/hive/warehouse/hive_hbase_mapping_table_1

drwxr-xr-x   - hadoop supergroup          0 2015-01-12 02:11 /home/hadoop/hive/warehouse/tim_test

hive>

7, Errors encountered during installation and deployment:

Error 1

[hadoop@name01 conf]$ hive --service metastore

Starting Hive Metastore Server

javax.jdo.JDOFatalInternalException: Error creating transactional connection factory

Caused by: org.datanucleus.exceptions.NucleusException: Attempt to invoke the "BONECP" plugin to create a ConnectionPool gave an error : The specified datastore driver ("com.mysql.jdbc.Driver") was not found in the CLASSPATH. Please check your CLASSPATH specification, and the name of the driver.

The MySQL JDBC driver jar is missing. Copying it into Hive's lib directory resolves the error.

Error 2
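
The copy step can be sketched as a small function. install_jdbc_driver is a hypothetical helper, and the jar path in the example call is a placeholder, since the original does not name the driver file that was used.

```shell
# Copy a JDBC driver jar into Hive's lib directory unless it is already there.
# install_jdbc_driver is a hypothetical helper; nothing is downloaded.
install_jdbc_driver() {
  jar="$1"
  hive_lib="$2"
  if [ ! -f "$jar" ]; then
    echo "driver jar not found: $jar" >&2
    return 1
  fi
  if [ -f "$hive_lib/$(basename "$jar")" ]; then
    echo "already installed: $(basename "$jar")"
    return 0
  fi
  cp "$jar" "$hive_lib/" && echo "installed: $(basename "$jar")"
}

# Example call (placeholder jar path, adjust to the driver you downloaded):
# install_jdbc_driver /path/to/mysql-connector-java.jar /home/hadoop/src/hive-0.13.1/lib
```

After copying the jar, restart the metastore so that the driver is picked up from the classpath.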

[hadoop@name01 conf]$ hive --service metastore

Starting Hive Metastore Server

javax.jdo.JDOFatalDataStoreException: Unable to open a test connection to the given database. JDBC url = jdbc:mysql://192.168.52.130:3306/hive_remote?createDatabaseIfNotExist=true, username = root. Terminating connection pool (set lazyInit to true if you expect to start your database after your app). Original Exception: ------

java.sql.SQLException: null,  message from server: "Host '192.168.52.128' is not allowed to connect to this MySQL server"

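First attempt: add the hadoop OS user to the mysql group. As the telnet test that follows shows, this does not resolve the error, because the message concerns MySQL's host-based account permissions rather than operating-system groups: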

[root@data02 mysql]# gpasswd -a hadoop mysql

Adding user hadoop to group mysql

[root@data02 mysql]#

[hadoop@name01 conf]$ telnet 192.168.52.130 3306

Trying 192.168.52.130...

Connected to 192.168.52.130.

Escape character is '^]'.

G

Host '192.168.52.128' is not allowed to connect to this MySQL serverConnection closed by foreign host.

[hadoop@name01 conf]$

Solution: modify the MySQL account:

mysql> update user set user = 'hadoop' where user = 'root' and host='%';

Query OK, 1 row affected (0.04 sec)

Rows matched: 1  Changed: 1  Warnings: 0

mysql> flush privileges;

Query OK, 0 rows affected (0.09 sec)

mysql>
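
An alternative to renaming an existing account, and not what the author did, is to create a dedicated metastore account that is allowed to connect only from the Hive host. The sketch below only prints the SQL so that it can be reviewed first. The account name hive and the password are placeholders, and metastore_grant_sql is a hypothetical helper.

```shell
# Print SQL that creates a metastore account restricted to one client host.
# metastore_grant_sql is a hypothetical helper; user and password are placeholders.
metastore_grant_sql() {
  user="$1"; host="$2"; db="$3"; password="$4"
  cat <<EOF
CREATE USER '${user}'@'${host}' IDENTIFIED BY '${password}';
GRANT ALL PRIVILEGES ON \`${db}\`.* TO '${user}'@'${host}';
EOF
}

# Host and database are taken from the error message above.
# To apply, pipe the output to the mysql client on the database server,
# for example: metastore_grant_sql ... | mysql -u root -p
metastore_grant_sql hive 192.168.52.128 hive_remote 'change-me'
```

If a different account is used this way, javax.jdo.option.ConnectionUserName and javax.jdo.option.ConnectionPassword in hive-site.xml have to match it.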

Error 3

[hadoop@name01 conf]$ hive --service metastore

Starting Hive Metastore Server

javax.jdo.JDOException: Exception thrown calling table.exists() for hive_remote.`SEQUENCE_TABLE`

at org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:596)

at org.datanucleus.api.jdo.JDOPersistenceManager.jdoMakePersistent(JDOPersistenceManager.java:732)

at org.datanucleus.api.jdo.JDOPersistenceManager.makePersistent(JDOPersistenceManager.java:752)

……

NestedThrowablesStackTrace:

com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Specified key was too long; max key length is 767 bytes

at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)

at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)

at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)

Solution: on the remote MySQL database, change the character set from utf8mb4 to utf8:

mysql> alter database hive_remote /*!40100 DEFAULT CHARACTER SET utf8 */;

Query OK, 1 row affected (0.03 sec)

mysql>
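
The 767-byte limit in the error explains why the character set matters: utf8mb4 reserves up to 4 bytes per character and MySQL's utf8 up to 3, so the same indexed column needs a different number of bytes. The arithmetic can be sketched as follows. The 255-character column length is an assumption for illustration, and index_bytes is a hypothetical helper.

```shell
# Worst-case byte length of an indexed VARCHAR column:
# number of characters times the maximum bytes per character.
index_bytes() {
  chars="$1"; bytes_per_char="$2"
  echo $(( chars * bytes_per_char ))
}

limit=767   # limit quoted in the MySQL error message
for entry in "utf8mb4 4" "utf8 3"; do
  set -- $entry
  size=$(index_bytes 255 "$2")   # 255 characters is an assumed column length
  if [ "$size" -le "$limit" ]; then verdict="fits"; else verdict="too long"; fi
  echo "$1: $size bytes ($verdict)"
done
```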

Then set up the Hive client on data01:

scp -r hive-0.13.1/ data01:/home/hadoop/src/

Error 3 (continued)

Start the metastore again and check the log:

[hadoop@name01 conf]$ hive --service metastore

Starting Hive Metastore Server

The command stops here without returning, so check the log:

[hadoop@name01 hadoop]$ tail -f hive.log

2015-01-09 03:46:27,692 INFO  [main]: metastore.ObjectStore (ObjectStore.java:setConf(229)) - Initialized ObjectStore

2015-01-09 03:46:27,892 WARN  [main]: metastore.ObjectStore (ObjectStore.java:checkSchema(6295)) - Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 0.13.0

2015-01-09 03:46:30,574 INFO  [main]: metastore.HiveMetaStore (HiveMetaStore.java:createDefaultRoles(551)) - Added admin role in metastore

2015-01-09 03:46:30,582 INFO  [main]: metastore.HiveMetaStore (HiveMetaStore.java:createDefaultRoles(560)) - Added public role in metastore

2015-01-09 03:46:31,168 INFO  [main]: metastore.HiveMetaStore (HiveMetaStore.java:addAdminUsers(588)) - No user is added in admin role, since config is empty

2015-01-09 03:46:31,473 INFO  [main]: metastore.HiveMetaStore (HiveMetaStore.java:startMetaStore(5178)) - Starting DB backed MetaStore Server

2015-01-09 03:46:31,481 INFO  [main]: metastore.HiveMetaStore (HiveMetaStore.java:startMetaStore(5190)) - Started the new metaserver on port [9083]...

2015-01-09 03:46:31,481 INFO  [main]: metastore.HiveMetaStore (HiveMetaStore.java:startMetaStore(5192)) - Options.minWorkerThreads = 200

2015-01-09 03:46:31,482 INFO  [main]: metastore.HiveMetaStore (HiveMetaStore.java:startMetaStore(5194)) - Options.maxWorkerThreads = 100000

2015-01-09 03:46:31,482 INFO  [main]: metastore.HiveMetaStore (HiveMetaStore.java:startMetaStore(5196)) - TCP keepalive = true

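The log shows that the metastore has started and is listening on port 9083. The command stays in the foreground because the metastore is a long-running service, so this is not actually a hang. For clients to reach it, add the following to hive-site.xml: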

<property>

<name>hive.metastore.uris</name>

<value>thrift://192.168.52.128:9083</value>

</property>
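
Before starting a client, it can be useful to confirm that the host and port in hive.metastore.uris are the ones the metastore log reports. The sketch below only splits the URI into its parts. parse_thrift_uri is a hypothetical helper, and the reachability check is left as a commented telnet call like the one used for MySQL earlier.

```shell
# Split a thrift://host:port URI into "host port".
# parse_thrift_uri is a hypothetical helper written for this example.
parse_thrift_uri() {
  rest="${1#thrift://}"
  echo "${rest%%:*} ${rest##*:}"
}

# Read the two parts into separate variables.
read -r host port <<EOF
$(parse_thrift_uri "thrift://192.168.52.128:9083")
EOF
echo "metastore host: $host, port: $port"

# To test reachability from the client machine:
# telnet "$host" "$port"
```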

Error 4

2015-01-09 04:01:43,053 INFO  [main]: metastore.ObjectStore (ObjectStore.java:setConf(229)) - Initialized ObjectStore

2015-01-09 04:01:43,540 INFO  [main]: metastore.HiveMetaStore (HiveMetaStore.java:createDefaultRoles(551)) - Added admin role in metastore

2015-01-09 04:01:43,546 INFO  [main]: metastore.HiveMetaStore (HiveMetaStore.java:createDefaultRoles(560)) - Added public role in metastore

2015-01-09 04:01:43,684 INFO  [main]: metastore.HiveMetaStore (HiveMetaStore.java:addAdminUsers(588)) - No user is added in admin role, since config is empty

2015-01-09 04:01:44,041 INFO  [main]: metastore.HiveMetaStore (HiveMetaStore.java:startMetaStore(5178)) - Starting DB backed MetaStore Server

2015-01-09 04:01:44,054 INFO  [main]: metastore.HiveMetaStore (HiveMetaStore.java:startMetaStore(5190)) - Started the new metaserver on port [9083]...

2015-01-09 04:01:44,054 INFO  [main]: metastore.HiveMetaStore (HiveMetaStore.java:startMetaStore(5192)) - Options.minWorkerThreads = 200

2015-01-09 04:01:44,054 INFO  [main]: metastore.HiveMetaStore (HiveMetaStore.java:startMetaStore(5194)) - Options.maxWorkerThreads = 100000

2015-01-09 04:01:44,054 INFO  [main]: metastore.HiveMetaStore (HiveMetaStore.java:startMetaStore(5196)) - TCP keepalive = true

2015-01-09 04:24:13,917 INFO  [Thread-3]: metastore.HiveMetaStore (HiveMetaStore.java:run(5073)) - Shutting down hive metastore.

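Solution:

After a long search I could not find the cause of "No user is added in admin role, since config is empty". Note that the message is logged at INFO level rather than as an error. If you have run into the same situation, let's compare notes; comments are welcome.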

----------------------------------------------------------------
<All rights reserved. Reposting is permitted, but the source address must be credited with a link; otherwise legal liability will be pursued!>
Original blog address: http://blog.itpub.net/26230597/viewspace-1400379/
Original author: 黄杉 (mchdba)
----------------------------------------------------------------
