This document describes how to install a Hive service on a Jethro machine via Cloudera Manager, which allows direct access from Jethro to Hive or Impala tables on Hadoop. Once direct access is enabled, you can extract data from Hive directly into Jethro.
Important: Implementing the procedure described below requires that:
1. Log in to Cloudera Manager, where you manage your Hadoop services (Hive, Impala, and so on).
2. Go to Hosts > All Hosts.
3. Click Add New Hosts to Cluster.
4. Click the Classic Wizard link.
5. In the Add Hosts Wizard that appears, click Continue.
6. Use the next screen to specify the internal IP of the host (Jethro Node) that you would like to add. When done, click Search.
7. Once the host is found, verify that the check box next to it is selected and click Continue.
8. Ensure that the default option Matched release for this Cloudera Manager Server is selected and click Continue to proceed.
Note: This option requires the Cloudera cluster and the Jethro Node to run the same Linux OS release; for example, a Jethro Node running CentOS cannot be added to a Cloudera cluster running Ubuntu.
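The OS requirement in the note above can be checked before running the wizard. The following is a minimal sketch, assuming both hosts provide a standard /etc/os-release file and that "same release" means the same distribution ID and major version. That interpretation, the function names, and the sample file contents are illustrative assumptions, not part of the Cloudera or Jethro tooling.

```python
def parse_os_release(text):
    """Parse the KEY=value lines of an /etc/os-release file into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not KEY=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        fields[key] = value.strip().strip('"').strip("'")
    return fields


def major_version(fields):
    """Return the part of VERSION_ID before the first dot, or '' if absent."""
    return fields.get("VERSION_ID", "").split(".")[0]


def same_os_release(cluster_text, jethro_text):
    """True when both hosts report the same distribution ID and major version.

    Assumption: comparing ID and major version is a sufficient stand-in for
    the "same Linux OS release" requirement.
    """
    cluster = parse_os_release(cluster_text)
    jethro = parse_os_release(jethro_text)
    if cluster.get("ID") is None or jethro.get("ID") is None:
        return False
    return (cluster["ID"] == jethro["ID"]
            and major_version(cluster) == major_version(jethro))


# Illustrative file contents (hypothetical hosts).
CENTOS_7 = 'NAME="CentOS Linux"\nVERSION_ID="7"\nID="centos"\n'
UBUNTU_16 = 'NAME="Ubuntu"\nVERSION_ID="16.04"\nID=ubuntu\n'
```

In practice, the two text arguments would be the contents of /etc/os-release copied from a cluster host and from the Jethro Node.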
9. Select the option Install Oracle Java SE Development Kit (JDK) and click Continue.
The next wizard screen requires you to provide SSH login credentials.
10. In the Login To All Hosts As section, select the option Another user and specify the requested user name; for example, ec2-user.
11. In the Authentication Method section, select the option All hosts accept same private key.
The section Private Key File will appear.
12. Browse for the necessary key (for example: GUI.pem), and click Continue.
Cloudera Manager starts installing all installation packages necessary for adding the Jethro Node to the cluster.
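The SSH credentials entered in steps 10 to 12 can be sanity-checked from a shell on the Cloudera Manager host before running the wizard. This is a rough sketch that assumes the OpenSSH client is installed; the host address 10.0.0.12 is a hypothetical placeholder, and ec2-user and GUI.pem are the example values used above.

```python
import subprocess


def build_ssh_check(host, user, key_path):
    """Build an ssh command that logs in with the key and runs `true`."""
    return [
        "ssh",
        "-i", key_path,              # private key file, as in step 12
        "-o", "BatchMode=yes",       # fail instead of prompting for input
        "-o", "ConnectTimeout=10",   # give up after 10 seconds
        f"{user}@{host}",
        "true",                      # no-op remote command
    ]


def ssh_login_works(host, user, key_path):
    """Return True when the key-based login succeeds (exit status 0)."""
    result = subprocess.run(build_ssh_check(host, user, key_path),
                            capture_output=True)
    return result.returncode == 0


if __name__ == "__main__":
    # Hypothetical address of the Jethro Node.
    print(ssh_login_works("10.0.0.12", "ec2-user", "GUI.pem"))
```

Because BatchMode disables interactive prompts, the check also fails when the Jethro Node's host key is not yet in known_hosts, so a False result does not necessarily mean the key itself is wrong.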
13. Once the installation completes successfully, activate all installed packages by clicking Continue in all of the following screens.
14. Click Create to open the wizard screen for defining a new host template.
15. Fill in this screen as follows:
a. Specify the requested Template Name; for example: Jethro Node.
b. Under the Hive section, select the check box next to the Gateway option.
c. Select Gateway Default Group from the drop-down list and click Create.
The newly created template now appears on the list.
16. Select the template and click Continue.
The next screen displays the progress of starting roles on the selected hosts.
17. Wait until all necessary roles are started and click Continue.
Next, the deployment of the Client Configuration is displayed.
18. Wait until the Client Configuration is deployed successfully and click Finish.
19. Verify that the new host appears in the host list with green status and Hive Gateway Role.
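The final verification can also be scripted against a host listing in JSON form. The sketch below assumes a payload shaped like the response of the Cloudera Manager REST API hosts endpoint with the full view, with the fields items, ipAddress, healthSummary, and roleRefs. The field names, the "GOOD" health value, and the convention that a Hive Gateway role name contains "GATEWAY" are assumptions to verify against your Cloudera Manager API version; the sample payload is invented for illustration.

```python
def find_host(hosts_payload, ip_address):
    """Return the host entry with the given IP address, or None."""
    for host in hosts_payload.get("items", []):
        if host.get("ipAddress") == ip_address:
            return host
    return None


def host_is_ready(host):
    """True when the host is healthy and carries a Hive Gateway role.

    Assumption: a Hive Gateway role is recognizable by a service name
    containing "hive" and a role name containing "GATEWAY".
    """
    healthy = host.get("healthSummary") == "GOOD"
    has_gateway = any(
        "hive" in ref.get("serviceName", "").lower()
        and "GATEWAY" in ref.get("roleName", "").upper()
        for ref in host.get("roleRefs", [])
    )
    return healthy and has_gateway


# Invented sample payload; all values are hypothetical.
SAMPLE = {
    "items": [
        {
            "hostname": "jethro-node",
            "ipAddress": "10.0.0.12",
            "healthSummary": "GOOD",
            "roleRefs": [
                {
                    "clusterName": "cluster",
                    "serviceName": "hive",
                    "roleName": "hive-GATEWAY-0123abcd",
                }
            ],
        }
    ]
}

if __name__ == "__main__":
    print(host_is_ready(find_host(SAMPLE, "10.0.0.12")))
```

A host that is found but not yet ready (for example, while roles are still starting) returns False, which corresponds to a status other than green in the host list.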