`
侯上校
  • 浏览: 216931 次
  • 性别: Icon_minigender_1
  • 来自: 上海
社区版块
存档分类
最新评论

hive创建外部表过程

    博客分类:
  • hive
 
阅读更多
hadoop@hadoopSlave2:/opt/hadoop/testData$ touch hiveOuter.log 
hadoop@hadoopSlave2:/opt/hadoop/testData$ vim hiveOuter.log 
20140413        jim0    2014-06-06
20140414        jim1    2014-06-06
...
hadoop@hadoopSlave2:/opt/hadoop/testData$ scp -rv hiveOuter.log  hadoopMaster:/opt/hadoop/hive/
hadoop@hadoopSlave2:/opt/hadoop/testData$ ssh hadoopMaster 
hadoop@hadoopMaster:~$ cd /opt/hadoop/hive/
hadoop@hadoopMaster:/opt/hadoop/hive$ hdfs dfs -moveFromLocal /opt/hadoop/hive/hiveOuter.log /user/hive/
hadoop@hadoopMaster:/opt/hadoop/hive$ hdfs dfs -ls -R /
-rw-r--r--   2 hadoop supergroup       2591 2014-06-06 10:00 /user/hive/hiveOuter.log
drwxr-xr-x   - hadoop supergroup          0 2014-06-06 09:51 /user/hive/warehouse
drwxr-xr-x   - hadoop supergroup          0 2014-06-06 09:51 /user/hive/warehouse/test1
...
hive> create EXTERNAL table IF NOT EXISTS userOuter (id int,name string, ed string)  ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' location 'hdfs://hadoopMaster:9000/user/hive/warehouse/test1';
OK
Time taken: 0.309 seconds
hive> show tables;
OK
userouter
Time taken: 0.045 seconds, Fetched: 12 row(s)
hive> LOAD DATA INPATH 'hdfs://hadoopMaster:9000/user/hive/warehouse/hiveOuter.log' INTO TABLE userOuter  partition(ptDate='20140606');
Loading data to table default.hiveOuter partition (ptdate=20140606)
Partition default.testouter{ptdate=20140606} stats: [numFiles=3, numRows=0, totalSize=4312, rawDataSize=0]
OK
Time taken: 1.153 seconds
hive> SELECT * FROM userouter;
OK
20140413	jim0	2014-06-06
20140414	jim1	2014-06-06
20140415	jim2	2014-06-06
....
此时hive将该表的数据文件信息保存到metadata数据库中。 
hadoop@hadoopMaster:~$ mysql -uroot -proot
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 47
Server version: 5.5.37-0ubuntu0.12.04.1 (Ubuntu)

Copyright (c) 2000, 2014, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> 
mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| hiveDB             |
| mysql              |
| performance_schema |
| schools            |
+--------------------+
5 rows in set (0.00 sec)
mysql> use hiveDB;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
可以看到该表的类型为EXTERNAL_TABLE。 
mysql> SELECT * FROM TBLS WHERE TBL_NAME = 'userouter';
+--------+-------------+-------+------------------+--------+-----------+-------+-----------+----------------+--------------------+--------------------+
| TBL_ID | CREATE_TIME | DB_ID | LAST_ACCESS_TIME | OWNER  | RETENTION | SD_ID | TBL_NAME  | TBL_TYPE       | VIEW_EXPANDED_TEXT | VIEW_ORIGINAL_TEXT |
+--------+-------------+-------+------------------+--------+-----------+-------+-----------+----------------+--------------------+--------------------+
|     25 |  1402020143 |     1 |                0 | hadoop |         0 |    26 | userouter | EXTERNAL_TABLE | NULL               | NULL               |
+--------+-------------+-------+------------------+--------+-----------+-------+-----------+----------------+--------------------+--------------------+
1 row in set (0.00 sec)
在表SDS中记录了表userouter的数据文件路径为hdfs://hadoopMaster:9000/user/hive/warehouse/test1。 
mysql> SELECT * FROM SDS WHERE SD_ID=25;
+-------+-------+------------------------------------------+---------------+---------------------------+----------------------------------------------------+-------------+------------------------------------------------------------+----------+
| SD_ID | CD_ID | INPUT_FORMAT                             | IS_COMPRESSED | IS_STOREDASSUBDIRECTORIES | LOCATION                                           | NUM_BUCKETS | OUTPUT_FORMAT                                              | SERDE_ID |
+-------+-------+------------------------------------------+---------------+---------------------------+----------------------------------------------------+-------------+------------------------------------------------------------+----------+
|    25 |    24 | org.apache.hadoop.mapred.TextInputFormat |               |                           | hdfs://hadoopMaster:9000/user/hive/warehouse/test1 |          -1 | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat |       25 |
+-------+-------+------------------------------------------+---------------+---------------------------+----------------------------------------------------+-------------+------------------------------------------------------------+----------+
1 row in set (0.00 sec)

 

分享到:
评论

相关推荐

Global site tag (gtag.js) - Google Analytics