hive创建外部表过程

侯上校

浏览: 216931 次
性别:
来自: 上海

最近访客更多访客>>

nothingisall

ecjtubaowp

killbrokenwolf

wulugeng-2018

博主相关

博客

微博

相册

留言

关于我

文章分类

社区版块

存档分类

博客分类：

hive

hadoop@hadoopSlave2:/opt/hadoop/testData$ touch hiveOuter.log 
hadoop@hadoopSlave2:/opt/hadoop/testData$ vim hiveOuter.log 
20140413        jim0    2014-06-06
20140414        jim1    2014-06-06
...
hadoop@hadoopSlave2:/opt/hadoop/testData$ scp -rv hiveOuter.log  hadoopMaster:/opt/hadoop/hive/
hadoop@hadoopSlave2:/opt/hadoop/testData$ ssh hadoopMaster 
hadoop@hadoopMaster:~$ cd /opt/hadoop/hive/
hadoop@hadoopMaster:/opt/hadoop/hive$ hdfs dfs -moveFromLocal /opt/hadoop/hive/hiveOuter.log /user/hive/
hadoop@hadoopMaster:/opt/hadoop/hive$ hdfs dfs -ls -R /
-rw-r--r--   2 hadoop supergroup       2591 2014-06-06 10:00 /user/hive/hiveOuter.log
drwxr-xr-x   - hadoop supergroup          0 2014-06-06 09:51 /user/hive/warehouse
drwxr-xr-x   - hadoop supergroup          0 2014-06-06 09:51 /user/hive/warehouse/test1
...
hive> create EXTERNAL table IF NOT EXISTS userOuter (id int,name string, ed string)  ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' location 'hdfs://hadoopMaster:9000/user/hive/warehouse/test1';
OK
Time taken: 0.309 seconds
hive> show tables;
OK
userouter
Time taken: 0.045 seconds, Fetched: 12 row(s)
hive> LOAD DATA INPATH 'hdfs://hadoopMaster:9000/user/hive/warehouse/hiveOuter.log' INTO TABLE userOuter  partition(ptDate='20140606');
Loading data to table default.hiveOuter partition (ptdate=20140606)
Partition default.testouter{ptdate=20140606} stats: [numFiles=3, numRows=0, totalSize=4312, rawDataSize=0]
OK
Time taken: 1.153 seconds
hive> SELECT * FROM userouter;
OK
20140413	jim0	2014-06-06
20140414	jim1	2014-06-06
20140415	jim2	2014-06-06
....
此时hive将该表的数据文件信息保存到metadata数据库中。 
hadoop@hadoopMaster:~$ mysql -uroot -proot
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 47
Server version: 5.5.37-0ubuntu0.12.04.1 (Ubuntu)

Copyright (c) 2000, 2014, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> 
mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| hiveDB             |
| mysql              |
| performance_schema |
| schools            |
+--------------------+
5 rows in set (0.00 sec)
mysql> use hiveDB;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
可以看到该表的类型为EXTERNAL_TABLE。 
mysql> SELECT * FROM TBLS WHERE TBL_NAME = 'userouter';
+--------+-------------+-------+------------------+--------+-----------+-------+-----------+----------------+--------------------+--------------------+
| TBL_ID | CREATE_TIME | DB_ID | LAST_ACCESS_TIME | OWNER  | RETENTION | SD_ID | TBL_NAME  | TBL_TYPE       | VIEW_EXPANDED_TEXT | VIEW_ORIGINAL_TEXT |
+--------+-------------+-------+------------------+--------+-----------+-------+-----------+----------------+--------------------+--------------------+
|     25 |  1402020143 |     1 |                0 | hadoop |         0 |    26 | userouter | EXTERNAL_TABLE | NULL               | NULL               |
+--------+-------------+-------+------------------+--------+-----------+-------+-----------+----------------+--------------------+--------------------+
1 row in set (0.00 sec)
在表SDS中记录了表userouter的数据文件路径为hdfs://hadoopMaster:9000/user/hive/warehouse/test1。 
mysql> SELECT * FROM SDS WHERE SD_ID=25;
+-------+-------+------------------------------------------+---------------+---------------------------+----------------------------------------------------+-------------+------------------------------------------------------------+----------+
| SD_ID | CD_ID | INPUT_FORMAT                             | IS_COMPRESSED | IS_STOREDASSUBDIRECTORIES | LOCATION                                           | NUM_BUCKETS | OUTPUT_FORMAT                                              | SERDE_ID |
+-------+-------+------------------------------------------+---------------+---------------------------+----------------------------------------------------+-------------+------------------------------------------------------------+----------+
|    25 |    24 | org.apache.hadoop.mapred.TextInputFormat |               |                           | hdfs://hadoopMaster:9000/user/hive/warehouse/test1 |          -1 | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat |       25 |
+-------+-------+------------------------------------------+---------------+---------------------------+----------------------------------------------------+-------------+------------------------------------------------------------+----------+
1 row in set (0.00 sec)

分享到：

【转】ubuntu 12.04安装redis2.6.16 | ubuntu crontab创建定时任务

2014-06-06 10:12
浏览 865
评论(0)
分类:开源软件
查看更多

发表评论

您还没有登录,请您登录后再发表评论

最近访客更多访客>>

博主相关

文章分类

社区版块

存档分类

最新评论

hive创建外部表过程

评论

发表评论

相关推荐

最近访客 更多访客>>

博主相关

文章分类

社区版块

存档分类

最新评论

hive创建外部表过程

评论

发表评论

相关推荐

Hive的Base64使用

编译spark-project的hive

编译Hive1.2.1

【转】Hive的UDF的自定义

hive创建hbase外部表

hive安装及使用

最近访客更多访客>>