Use Toad to Access Hadoop Ecosystem

Some time back I wrote a blog post, Use SQL Developer to Access Hive Table on Hadoop. Recently I noticed a similar product, Toad for Hadoop, so I decided to give it a try.
Like many people, I like Toad products in general and use Toad in many of my projects. Toad for Hadoop is a new product in the Toad family. The current version is Toad for Hadoop 1.3.1 Beta, available on the Windows platform only. The software supports Cloudera CDH 5.x and Hortonworks Data Platform 2.3. It is free for now, but you need to create an account with Dell before you can download the zip file. The entire installation and configuration process is pretty simple and straightforward. Here are the steps:

Download the zip file
Go to the Toad for Hadoop page and click the Download button. The zip file is 555 MB in size.

Installation
I installed the software on my Windows VM. Just double-click the ToadHadoop_1.3.1_Beta_x64.exe file and accept the default values on all of the installation screens. At the end of the installation, the software opens automatically.

Configuration
Unlike the regular Toad software with its many buttons, this one looks quite simple.
Toad_config_1
Click the dropdown on the right of the Ecosystem box, then click Add New Ecosystem. The Select your Hadoop Cluster setup screen shows up as follows.
Toad_config_2
Enter the name you want for this connection. For this one, I configured the connection to our X3 Big Data Appliance (BDA) full rack cluster with 18 nodes, so I entered Enk-X3-DBA as the Name. For Detection Method, you can see it supports Cloudera CDH via Cloudera Manager or Hortonworks HDP via Ambari. I chose CDH managed by Cloudera Manager.

The next screen is Enter your Cloudera Manager credentials. For Server Address, use the same URL and port number that you use to access Cloudera Manager. The user name is the one you use to log in to Cloudera Manager. Make sure you create your user directory on HDFS before you install and configure the software: for example, create the folder /user/zhouw and give the zhouw user read/write access to it. Otherwise you will see a permission exception later on.
Toad_config_3
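For reference, the HDFS home directory step can be scripted. The following is a minimal sketch, not the procedure from the original setup: it assumes the hdfs client is on the PATH, that the HDFS superuser account is named hdfs, and that you can sudo to it. It uses ownership (chown) as one way to give the user read/write access. The helper names home_dir_commands and run_all are made up for illustration. By default it only prints the commands.

```python
import shlex
import subprocess


def home_dir_commands(user, superuser="hdfs"):
    """Build the commands that create /user/<user> and make <user> its owner.

    Assumption: the HDFS superuser is called "hdfs" and is reachable via sudo.
    """
    home = f"/user/{user}"
    prefix = ["sudo", "-u", superuser, "hdfs", "dfs"]
    return [
        prefix + ["-mkdir", "-p", home],   # create the directory (and parents)
        prefix + ["-chown", user, home],   # hand ownership to the user
    ]


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Dry run: shows the two commands without touching the cluster.
    run_all(home_dir_commands("zhouw"))
```

Call run_all(..., dry_run=False) on a node with the Hadoop client installed once the printed commands look right for your cluster.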
The next screen shows Autodetection. It runs many checks and validations, and you should see a successful status for all of them.
Toad_config_4
The next one shows Ecosystem Configuration. On this screen, I just entered zhouw for User Name and then clicked the Activate button. There is a bug in this version: sometimes both the Activate and Cancel buttons disappear. The workaround is to close and restart the software.
Toad_config_5

SQL Screen
The most frequently used screen is the SQL Screen. You can run SQL statements against either the Hive or the Impala engine.
Toad_SQL_1
The screen is very similar to the traditional Toad screen I am used to. The left panel shows the schemas and table names, and the bottom panel shows the result. Although there is an Explain Plan tab in the result panel, I consider Explain Plan on Hadoop a joke at the time of writing. You can take a look, but I would not spend time checking the plan: you will see more issues from other parts of the Hadoop world than from suboptimal query plans. The History panel on the right is an interesting one, and I found it very useful later on. It not only shows the timing of my queries (or jobs), but also caches the results of previous runs. It is a smart feature, and I don’t have to rerun my queries to get the results back.
Toad_SQL_2
Sometimes you might want to check the DDL for certain tables. Just right-click the table and select Generate Create Table statement as follows:
Toad_Generate_table_1
Here is an example of the generated DDL.
Toad_Generate_table_2
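Outside of Toad, Hive's SHOW CREATE TABLE statement returns similar DDL. Below is a minimal sketch that builds a beeline command line for it. Everything specific here is an assumption rather than something from this setup: the host name hiveserver2.example.com, the table default.sample_table, the default HiveServer2 port 10000, and an unsecured cluster that needs only a user name. The function name show_create_table_command is hypothetical.

```python
import re

# Only plain identifiers are accepted, so nothing unexpected ends up in the SQL.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def show_create_table_command(host, database, table, user, port=10000):
    """Return a beeline argument list that prints the DDL of database.table."""
    for name in (database, table):
        if not _IDENT.match(name):
            raise ValueError(f"unexpected identifier: {name!r}")
    url = f"jdbc:hive2://{host}:{port}/{database}"
    query = f"SHOW CREATE TABLE {database}.{table}"
    # -u: JDBC URL, -n: user name, -e: statement to execute
    return ["beeline", "-u", url, "-n", user, "-e", query]


if __name__ == "__main__":
    # Hypothetical host and table names, for illustration only.
    cmd = show_create_table_command(
        "hiveserver2.example.com", "default", "sample_table", "zhouw"
    )
    print(" ".join(cmd))
```

The returned list can be passed to subprocess.run on a machine where beeline is installed. A Kerberos-secured cluster would need extra connection parameters that this sketch leaves out.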

HDFS Screen
The HDFS Screen is another feature I really like. It works just like Windows Explorer and shows the HDFS directories and the files under them in a tree structure. It also shows the size of each directory and file. With just a few clicks, you can quickly find out which directories and files are taking up a lot of space. The right panel can show you part of a file's content; by default, it shows the first 4K of data. This is very convenient and saves me the time of typing multiple commands to find the same information. If you want to download files from or upload files to HDFS, just click the Download and Upload buttons at the top.
Toad_HDFS_1
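For comparison, here is a minimal sketch of the command-line route to the same "what is taking the space" answer: it parses the text printed by hdfs dfs -du (run without -h, so sizes are plain byte counts) and ranks the entries. It assumes the first column is the size and the last column is the path, which holds whether your Hadoop version prints two or three columns, and it assumes paths contain no spaces. The sample text is made up for illustration and is not output from the cluster described here.

```python
def parse_du(output):
    """Parse `hdfs dfs -du <dir>` text into (size_in_bytes, path) pairs.

    Lines whose first field is not a plain integer are skipped.
    """
    entries = []
    for line in output.splitlines():
        fields = line.split()
        if len(fields) < 2 or not fields[0].isdigit():
            continue
        entries.append((int(fields[0]), fields[-1]))
    return entries


def largest(entries, n=5):
    """Return the n biggest entries, largest first."""
    return sorted(entries, key=lambda entry: entry[0], reverse=True)[:n]


# Made-up sample in the three-column layout (size, space consumed, path).
SAMPLE = """\
1024  3072  /user/zhouw/small.txt
5368709120  16106127360  /user/zhouw/big_table
2048  6144  /user/zhouw/logs
"""

if __name__ == "__main__":
    for size, path in largest(parse_du(SAMPLE), n=2):
        print(size, path)
```

In practice the text would come from something like subprocess.run(["hdfs", "dfs", "-du", "/user/zhouw"], capture_output=True, text=True).stdout. A rough equivalent of the 4K preview is piping hdfs dfs -cat on a file into head -c 4096.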
Sometimes I am interested in the replication factor and physical block locations of certain files on HDFS. Just right-click the file in HDFS, then select Properties.
Toad_File_Properties
It shows everything about this file. For sizing information, it shows both Summary and individual block information.
Toad_File_Property_1
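The same file details are available from the Hadoop command line. As a minimal sketch, the helper below (the name file_detail_commands and the sample path are hypothetical) builds the two commands: hdfs dfs -stat with the %r format prints the replication factor, and hdfs fsck with -files -blocks -locations lists each block and the DataNodes holding it. It assumes the hdfs client is on the PATH and only builds the argument lists, it does not run them.

```python
def file_detail_commands(path):
    """Return the commands that report replication and block locations of path."""
    return {
        # %r asks -stat to print only the replication factor
        "replication": ["hdfs", "dfs", "-stat", "%r", path],
        # fsck reports every block of the file and where its replicas live
        "blocks": ["hdfs", "fsck", path, "-files", "-blocks", "-locations"],
    }


if __name__ == "__main__":
    # Hypothetical file path, for illustration only.
    for label, cmd in file_detail_commands("/user/zhouw/sample.txt").items():
        print(label + ":", " ".join(cmd))
```

Each list can be handed to subprocess.run on a node with the Hadoop client installed.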

Chart Screen
The Chart Screen also looks nice. It does not have as many charts as Cloudera Manager, but it does have the key information I usually want to know. Here are a few of them:
Toad_Chart_1

Toad_Chart_3

Log Screen
The Log Screen is okay. Here are some of the logs:
Toad_Log_1

Toad_Log_2

Transfer Screen
The Transfer Screen is supposed to support transferring data to and from an RDBMS such as Oracle, MySQL, or SQL Server. I haven’t really tried this one out.
Toad_Transfer_1

Service Screen
The Service Screen is useful when you want to know where your services are deployed on Hadoop, such as the hostname and port number for a given service. It does not have everything, but it is good enough.
Toad_Service_1

Toad_Service_2

In general, Toad for Hadoop is a nice tool that can help you quickly find certain information on Hadoop without going through many screens and commands. I would say this tool is for Hadoop administrators rather than regular Hadoop users, because you probably don’t want to give every user access to Cloudera Manager.
