Data Query between BDA and Exadata (Part 1): Query Hive Table from Oracle on Exadata

mouse_cheese
Last year I wrote several posts showing below to demonstrate load data between Hadoop Hive tables and Oracle tables using Sqoop.
Import Data to Hive from Oracle Database
Export data from Hive table to Oracle Database
Use incremental import in sqoop to load data from Oracle (Part I)
Use incremental import in sqoop to load data from Oracle (Part II)

To accessing data between Big Data Appliance (BDA) and Exadata, there are a few other approaches that can be used in addition to Sqoop approach. For the next few blogs, I am going to discuss the ways to query hive table from an Oracle table, query Oracle table from Hive on BDA, and load data from BDA to Exadata.
Last post, I show the way how to install Big Data SQL on Exadata. In this post, I am going to show how to use Big Data SQL to query a Hive table on BDA from Exadata.

Step 1: Load test dataset to HDFS on BDA
I use a small CSV file, stock_price.csv, which contains the stock price for one stock. The columns in the file include date, close, volume, open, high, and low. For simplicity, I am going to define the date column as STRING (Hive)/VARCHAR2(Oracle) instead of DATE type. The part of file content is shown below:

[oracle@enkbda1node01 test]$ cat stock_price.csv
date,close,volume,open,high,low
9/23/16,24.05,56837,24.13,24.22,23.88
9/22/16,24.1,56675,23.49,24.18,23.49
9/21/16,23.38,70925,23.21,23.58,23.025
9/20/16,23.07,35429,23.17,23.264,22.98
9/19/16,23.12,34257,23.22,23.27,22.96
9/16/16,23.16,83309,22.96,23.21,22.96
9/15/16,23.01,43258,22.7,23.25,22.53
9/14/16,22.69,33891,22.81,22.88,22.66
9/13/16,22.81,59871,22.75,22.89,22.53
9/12/16,22.85,109145,22.9,22.95,22.74
9/9/16,23.03,115901,23.53,23.53,23.02
9/8/16,23.6,32717,23.8,23.83,23.55
9/7/16,23.85,143635,23.69,23.89,23.69
9/6/16,23.68,43577,23.78,23.79,23.43
....
[oracle@enkbda1node01 test]$ vi stock_price.csv
[oracle@enkbda1node01 test]$ hdfs dfs -mkdir /user/oracle/mytest
[oracle@enkbda1node01 test]$ hdfs dfs -put stock_price.csv /user/oracle/mytest
[oracle@enkbda1node01 test]$ hdfs dfs -ls /user/oracle/mytest
Found 1 items
-rw-r--r--   3 oracle hadoop       2495 2016-10-30 15:36 /user/oracle/mytest/stock_price.csv

Step 2: Create a hive table using the datafile
vi hive_create_stock_price1.hql
Input the following content in the hive_create_stock_price1.hql file.

CREATE EXTERNAL TABLE hive_stock_price1 (
stock_date string,
close_price float,
volume int,
open_price float,
high_price float,
low_price float
)
ROW FORMAT delimited fields terminated by ‘,’
LOCATION ‘/user/oracle/mytest’
tblproperties (“skip.header.line.count”=”1”);

Create the table and verify the result.

[oracle@enkbda1node01 test]$ hive -f hive_create_stock_price1.hql
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=512M; support was removed in 8.0
Java HotSpot(TM) 64-Bit Server VM warning: Using incremental CMS is deprecated and will likely be removed in a future release
16/11/01 15:47:47 WARN mapreduce.TableMapReduceUtil: The hbase-prefix-tree module jar containing PrefixTreeCodec is not present.  Continuing without it.
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=512M; support was removed in 8.0

Logging initialized using configuration in jar:file:/opt/cloudera/parcels/CDH-5.7.0-1.cdh5.7.0.p1464.1349/jars/hive-common-1.1.0-cdh5.7.0.jar!/hive-log4j.properties
OK
Time taken: 0.839 seconds

[oracle@enkbda1node01 test]$ hive
hive> select * from hive_stock_price1 limit 5;
OK
9/23/16	24.05	56837	24.13	24.22	23.88
9/22/16	24.1	56675	23.49	24.18	23.49
9/21/16	23.38	70925	23.21	23.58	23.025
9/20/16	23.07	35429	23.17	23.264	22.98
9/19/16	23.12	34257	23.22	23.27	22.96
Time taken: 1.93 seconds, Fetched: 5 row(s)

Step 3: Create an Oracle external table on Exadata
Goto another session and logon to one db node on Exadata. Create an external table using type ORACLE_HIVE. Parameter com.oracle.bigdata.cluster refers to the Hadoop cluster name.

sqlplus wzhou/wzhou
CREATE TABLE EXT_HDFS_STOCK_PRICE1
(    
    STOCK_DATE  VARCHAR2(20),
    CLOSE_PRICE NUMBER(8,3),
    VOLUME NUMBER(8),
    OPEN_PRICE NUMBER(8,3),
    HIGH_PRICE NUMBER(8,3),
    LOW_PRICE NUMBER(8,3)               
)
ORGANIZATION EXTERNAL
( TYPE ORACLE_HIVE
      DEFAULT DIRECTORY "DEFAULT_DIR"
      ACCESS PARAMETERS
      ( com.oracle.bigdata.cluster=enkbda
        com.oracle.bigdata.tablename: hive_stock_price1
      )
)
REJECT LIMIT UNLIMITED ;

Step 3: Query the result

WZHOU:SQL> select * from EXT_HDFS_STOCK_PRICE1 where rownum < 6;
STOCK_DATE           CLOSE_PRICE      VOLUME  OPEN_PRICE  HIGH_PRICE   LOW_PRICE
-------------------- ----------- ----------- ----------- ----------- -----------
date
9/23/16                    24.05       56837       24.13       24.22       23.88
9/22/16                     24.1       56675       23.49       24.18       23.49
9/21/16                    23.38       70925       23.21       23.58      23.025
9/20/16                    23.07       35429       23.17      23.264       22.98

You might notice the first row is actually the header of the CSV file on HDFS. On HDFS, when I created the hive table, I used a property tblproperties (“skip.header.line.count”=”1”) to skip the header row when showing the result from the hive table. Unfortunately I don’t find a way to allow me to skip header row from Big Data SQL. Anyway, it is not a big deal for me. I just need to remember to remove the header row when loading the data to HDFS.

Advertisements

3 thoughts on “Data Query between BDA and Exadata (Part 1): Query Hive Table from Oracle on Exadata

  1. Pingback: Data Query between BDA and Exadata (Part 2): Query Oracle Table from Hive on BDA | My Big Data World

  2. Pingback: Data Query between BDA and Exadata (Part 3): Load data from BDA to Oracle Table on Exadata Using OLH | My Big Data World

  3. Pingback: Data Query between BDA and Exadata (Part 4): Query Oracle Dump File on BDA Using Copy2BDA | My Big Data World

Leave a Reply

Please log in using one of these methods to post your comment:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s