The purpose of this document is to describe the Jethro management package for Apache Ambari.
An overview of Ambari and of Jethro's management package for Ambari is followed by installation, management and monitoring, and uninstallation instructions.
In a nutshell, Ambari is an open-source management platform for provisioning, managing, monitoring and securing Apache Hadoop clusters. The goal of Jethro's Ambari service is to provide a simple integration with Ambari, so that HDP users can deploy and manage Jethro across their Hadoop clusters. Once integrated, users have a central location for monitoring their Jethro clusters in terms of security, cluster health and resource utilization.
The Jethro components exposed in Ambari through the Jethro Ambari service are:
- Jethro Server - The query engine of Jethro. It can be deployed on any host (minimum 1 host). The more hosts this component is deployed on, the better the availability for BI users.
- Jethro Maint - The maintenance Linux service of Jethro. It can be deployed on any host (minimum 1 host). One component per Jethro instance should be active at any time.
- Jethro Scheduler - The scheduler Linux service of Jethro. It can be deployed on any host (minimum 1 host). At least one component per Jethro instance should be active at any time.
- Jethro Manager - A web interface, served by a Linux service, that gives Jethro's administrator wider management abilities. The web interface is also accessible outside Ambari's web interface. Only a single manager component can be deployed. Any host can be used for it, although it is recommended to install it on the Name Node.
During the installation of the Jethro service for Ambari, the Ambari service advisor automatically recommends a basic deployment of all four components on the available hosts.
- Number of running instances - Will be presented as a gadget by default.
- Jethro instances storage size - Will be presented as a gadget by default.
- Running maint service (per instance) - Will be used as an alert, raised when any Jethro Instance does not have one active maint component on any of the hosts this Instance is active on.
- Running load scheduler service (per instance) - Will be used as an alert, raised when any Jethro Instance does not have at least one active scheduler component on any of the hosts this Instance is active on.
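The two alert conditions above can be illustrated with a minimal sketch. It is not the code Jethro's Ambari service runs; the tuple layout and the function name `check_instance_services` are hypothetical, and it assumes "one active maint component" means exactly one.

```python
from collections import defaultdict

def check_instance_services(components):
    """Return alerts for instances that violate the maint/scheduler rules.

    `components` is an iterable of (instance, host, service, is_active)
    tuples. This layout is an assumption made for the example.
    """
    active = defaultdict(lambda: {"maint": 0, "scheduler": 0})
    for instance, _host, service, is_active in components:
        counts = active[instance]  # registers the instance even if nothing is active
        if is_active and service in counts:
            counts[service] += 1

    alerts = []
    for instance in sorted(active):
        counts = active[instance]
        # Assumed rule: exactly one active maint component per instance.
        if counts["maint"] != 1:
            alerts.append((instance, "maint", counts["maint"]))
        # At least one active scheduler component per instance.
        if counts["scheduler"] < 1:
            alerts.append((instance, "scheduler", counts["scheduler"]))
    return alerts
```

Each returned tuple names the instance, the service that triggered the alert, and how many active components of that service were found.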
The parameters exposed through Ambari are the global parameters of Jethro's Instances.
- Enable cubes - The default value is '1' (enabled). It represents the Jethro configuration parameter: dynamic.aggregation.auto.generate.enable.
- Cube queries server - The default value is 'auto' (localhost). With this value, each host functions as a query engine for cube creation and maintenance. To assign a specific host instead, replace this value with the desired host's details in the format <IP>:<PORT> (without the '<>'). The IP should be the IP of the host that will be used as the cube queries server (it must have an active Jethro Server component), and the PORT should be the port of the Instance on that host.
- Editing Jethro's cube parameters through Ambari for more than one Instance is not recommended, since the query engine defined in 'Cube queries server' belongs to a single Instance and therefore cannot serve the needs of a different Instance. The only value that fits all Instances at the same time is 'auto'.
- Only Instances that were deployed by Ambari can have their parameters updated through the Jethro Ambari service.
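As an illustration of the 'Cube queries server' value format described above, here is a small validation sketch. The function name is hypothetical and the port used in the usage note is only an example value; Jethro itself may accept or reject values differently.

```python
import ipaddress

def parse_cube_queries_server(value):
    """Parse a 'Cube queries server' value.

    Returns None for 'auto', or an (ip, port) tuple for '<IP>:<PORT>'.
    Raises ValueError when the value matches neither form.
    """
    value = value.strip()
    if value == "auto":
        return None
    host, sep, port = value.rpartition(":")
    if not sep:
        raise ValueError("expected 'auto' or <IP>:<PORT>")
    ipaddress.ip_address(host)  # raises ValueError for a malformed IP
    if not port.isdigit() or not 1 <= int(port) <= 65535:
        raise ValueError("port must be a number between 1 and 65535")
    return host, int(port)
```

For example, `parse_cube_queries_server("10.0.0.5:9111")` yields the IP and port as a pair, while a value that still contains the angle brackets is rejected.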
Working with multiple Jethro instances under Ambari
Jethro's Ambari service works well with a single Jethro Instance per host. However, having two or more Jethro Instances on the same host runs into a limitation:
You cannot deploy more than one Jethro Instance to the same host.
The reason is that Ambari represents fewer levels of objects than Jethro. While Jethro uses a 4-level object model (Application, Instance, Host, Service), Ambari uses a 3-level object model (Service, Host, Component). One level of the Jethro object model therefore has to be dropped when it is represented in Ambari, which removes the ability to deploy more than one Jethro Instance per host.
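The limitation can be made concrete with a toy model. The sketch below assumes that the Instance level is the one with no Ambari counterpart and that Application, Host and Service map to Ambari's Service, Host and Component; the document does not spell out this mapping, so treat it as an illustration only. All names are hypothetical.

```python
def to_ambari_key(application, instance, host, service):
    """Collapse a 4-level Jethro object into a 3-level Ambari-style key.

    Assumption: the instance level is dropped, so it does not appear
    in the resulting (service, host, component) key.
    """
    return (application, host, service)

def find_collisions(jethro_objects):
    """Return pairs of distinct Jethro objects that share one Ambari key."""
    seen = {}
    collisions = []
    for obj in jethro_objects:
        key = to_ambari_key(*obj)
        if key in seen and seen[key] != obj:
            collisions.append((seen[key], obj))
        else:
            seen[key] = obj
    return collisions
```

Two Instances running the same service on one host collapse to the same key, which is why only one Instance per host can be represented.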
The integration is supported for:
- Ambari 2.4 and above
- CentOS/Red Hat 6 and above
Please note - If you are using a CentOS/Red Hat version earlier than 7.0, make sure that, while adding the Jethro service, you replace the default links to the Jethro and Jethro Manager RPM files with RPM links that support these OS versions. The default RPM links are for versions of Jethro that support CentOS/Red Hat 7.0 and up.
Management Package Installation
Log in to your ambari-server as user root, or as a sudoer (in which case add sudo before every command).
Copy or download the Jethro management pack (mpack) to your ambari-server (jethro-mpack.tar.gz - do not decompress it).
Install the Jethro management pack on your ambari-server:
Restart the ambari-server:
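The install and restart steps above correspond to the standard `ambari-server install-mpack` and `ambari-server restart` commands. The sketch below only builds and prints the command lines; the mpack location under /tmp is an assumption, and the helper name is hypothetical.

```python
import shlex

def mpack_install_commands(mpack_path, use_sudo=False):
    """Build the command lines for installing an mpack and restarting Ambari."""
    prefix = ["sudo"] if use_sudo else []
    return [
        prefix + ["ambari-server", "install-mpack", f"--mpack={mpack_path}", "--verbose"],
        prefix + ["ambari-server", "restart"],
    ]

# Assumed location of the downloaded archive; adjust to where you copied it.
for cmd in mpack_install_commands("/tmp/jethro-mpack.tar.gz"):
    print(shlex.join(cmd))
```

Pass `use_sudo=True` when you are logged in as a sudoer rather than as root.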
Create an Extension Link to link the installed extension to Ambari, according to your HDP stack version:
For example, if you are running HDP 2.6.0:
*To find the HDP stack version - Open Ambari, click on the Admin tab and then select the Versions tab.
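Ambari creates extension links through a POST to its `/api/v1/links` REST endpoint. The following sketch builds such a request without sending it. The extension name and version, the host name and the credentials are placeholders, since the values for Jethro's extension are not given here; the stack version is passed in the form Ambari lists for the stack.

```python
import base64
import json
import urllib.request

def build_extension_link_request(ambari_url, user, password, stack_version,
                                 extension_name, extension_version,
                                 stack_name="HDP"):
    """Build (but do not send) the request that links an extension to a stack."""
    payload = {
        "ExtensionLink": {
            "stack_name": stack_name,
            "stack_version": stack_version,
            "extension_name": extension_name,        # placeholder value
            "extension_version": extension_version,  # placeholder value
        }
    }
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        url=f"{ambari_url.rstrip('/')}/api/v1/links",
        data=json.dumps(payload).encode(),
        method="POST",
        headers={
            "Authorization": f"Basic {token}",
            "X-Requested-By": "ambari",  # Ambari rejects modifying calls without it
            "Content-Type": "application/json",
        },
    )
```

Sending the request is then a matter of passing it to `urllib.request.urlopen`, or issuing the equivalent `curl` call with the same URL, headers and JSON body.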
Adding the Jethro Service to Ambari
After the installation of the package completes, you can add Jethro's service through Ambari's web interface:
- Navigate to Ambari's home page. Click 'Actions' (bottom-left on the services bar), and click 'Add Service':
- Select the Jethro checkbox from the available services, and click next:
- Ambari will display all the components contained in the selected service. You can select a target host for each component.
It is also possible to add more components (for multiple hosts deployment):
Jethro's Ambari Service Configs
Before every deployment, Jethro's Ambari configuration properties must be set.
While adding Jethro's Ambari service for the first time, you will be required to define at least two mandatory parameters:
- A name for the Jethro Instance.
- A storage path on HDFS, to be used as the Instance's storage.
It is assumed that the given HDFS path was already created before every deployment of an instance, and is owned by user 'jethro'. The recommended path is /user/jethro/instances.
If the instance name provided already exists on the provided storage path, the Instance will be 'attached' to the host. Otherwise, the Instance will be created.
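The attach-or-create rule above can be summarised in a few lines. This is only a sketch of the decision; how the service actually discovers the existing instances on the storage path is not described here, so the `existing_instances` argument and the function name are hypothetical.

```python
def plan_instance_action(instance_name, existing_instances):
    """Decide whether an Instance would be attached or created.

    `existing_instances` stands for the instance names already present
    on the chosen storage path (assumed to be known by the caller).
    """
    name = instance_name.strip()
    if not name:
        raise ValueError("an instance name is mandatory")
    return "attach" if name in set(existing_instances) else "create"
```

For example, with `["sales"]` already on the storage path, the name `sales` results in an attach, while `finance` results in a new instance being created.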
The configs screen remains accessible afterwards through the 'configs' tab of the Jethro Ambari service.
The rest of the config values are set to defaults, and most of them can be edited if desired. Please note that Jethro's management package does not bundle the Jethro software RPM, so that the latest version of Jethro can always be supplied at deployment time.
Full list of the config parameters:
|Parameter||Config type||Default value||Description|
|jethro RPM||jethro-config||latest Jethro RPM||RPM path to download Jethro software|
|jethro manager RPM||jethro-config||latest Jethro Manager RPM||RPM path to download Jethro Manager software|
|jethro user name||jethro-env||jethro||The OS user to be used for the installation of Jethro software on each host|
|Jethro Instance Cache Path||jethro-config||/home/jethro/instances_cache||The path on the host, which the Jethro instance will use for caching|
|Jethro Instance Storage||jethro-config||10||The maximum size of storage space on the host, which the Jethro instance will use for caching|
|jethro instance name||jethro-config||-||If the given instance name already exists, Jethro will attach it. Otherwise, it will be created.|
|jethro instance storage||jethro-config||-||The path to set up the instance storage on. It is assumed the given HDFS path was already defined prior to installing the Jethro service. The recommended path is /user/jethro/instances.|
|jethro server PID directory||jethro-env||/var/run/jethro||Auto-configured upon installation; after that it becomes read-only.|
|jethro manager PID file||jethro-env||/opt/jethro/jethromng/pm2/pids/JethroManager-0.pid||Auto-configured upon installation; after that it becomes read-only.|
|jethro manager port||jethro-config||9100||Required for the quick link functionality. If the port is changed in Jethro Manager, the same value needs to be entered here as well.|
|Enable cubes||jethro-global||1||The actual Instance param name affected: dynamic.aggregation.auto.generate.enable|
|Cube queries server||jethro-global||"auto"||The actual Instance param name affected: dynamic.aggregation.auto.generate.execution.hosts|
Completing the Installation
After the 'Review' step and clicking 'Deploy', the installation process starts for the selected components and hosts, and runs in a fully automated manner:
Once the installation process completes, the Jethro service appears on the left side of the screen, on the services bar:
Most of Jethro's monitoring in Ambari is done through the 'Summary' pane and the services bar.
The services bar indicates when there is an alert, prompting you to view it.
The summary pane shows the state of all the Jethro components that were deployed to the hosts, and lets you view alerts and metrics.
The default metrics presented under the summary pane are:
- The number of active instances
- The utilization of Jethro's storage
- The total storage being used by Jethro's instances
All of these indicators help in monitoring the health of the system, its current availability, and its future storage needs.
Alerts, as described in the Overview section of this document, can be presented in detail if accessed from the summary pane:
The Jethro Ambari service can be used to control the number of Jethro components deployed and in use at any time.
It also allows the user to perform maintenance actions on the Jethro Linux services, such as restart/start/stop, straight from the GUI.
By clicking on 'Hosts' from the header menu, and choosing a specific host, you can deploy Jethro components to that host:
Once a component is deployed to a host, you can click on the component name and control its state within the host (Started/Stopped/Restarted):
- Select 'Jethro' from the side bar, to reach the summary screen.
- For each 'Jethro Maint' component shown on the summary pane:
- Click on it to reach the screen that shows all the components on that host.
- Find the 'Jethro Maint' component, click on the dropdown next to it, and choose 'Stop Jethro Metrics'.
- Go back to Jethro's summary pane, and do the same for the next 'Jethro Maint' component, until all of them have been handled.
- Stop all of Jethro's components on all hosts. You can do that by selecting 'STOP' from the Jethro service summary page:
- Choose 'Delete Service' from the same menu.
To remove the extension link, log in to your Ambari server machine, and run the following command with the proper parameters:
To find the <link_id>, you can run the command 'curl' on the final http link.
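As a sketch of how the <link_id> might be picked out of the JSON that a GET on the links URL returns, the helper below assumes the response lists each link under `items` with an `ExtensionLink` object carrying `link_id` and `extension_name`. That response shape and the extension name are assumptions; check them against the actual output of your Ambari server.

```python
import json

def find_link_ids(links_json, extension_name):
    """Return the link_id values of all links that belong to one extension."""
    body = json.loads(links_json)
    ids = []
    for item in body.get("items", []):
        link = item.get("ExtensionLink", {})
        if link.get("extension_name") == extension_name:
            ids.append(link["link_id"])
    return ids

def link_url(ambari_url, link_id):
    """Build the URL of a single extension link, e.g. as a DELETE target."""
    return f"{ambari_url.rstrip('/')}/api/v1/links/{link_id}"
```

The URL returned by `link_url` is the one a DELETE request would be addressed to when removing the link.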
To uninstall the package, run:
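Ambari removes management packs with `ambari-server uninstall-mpack --mpack-name=<name>`. The sketch below only builds that command line; the name under which the Jethro mpack is registered is not stated here, so it is left as a parameter, and the helper name is hypothetical.

```python
def mpack_uninstall_command(mpack_name, use_sudo=False):
    """Build the command line for uninstalling a management pack by name."""
    prefix = ["sudo"] if use_sudo else []
    return prefix + ["ambari-server", "uninstall-mpack", f"--mpack-name={mpack_name}"]
```

The mpack name to pass is the one Ambari recorded when the pack was installed.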
Known Issues & Limitations
Service Configuration Is Unavailable
After uninstalling the Jethro Ambari service and reinstalling it, no configurations (besides custom configurations) are available under the service → configuration section:
The Ambari server fails to clean up the custom service data after the service is uninstalled, so the custom service data is still stored in the Ambari DB.
Delete Jethro Ambari service data manually from the DB, by running the following commands:
First, uninstall the Jethro Ambari service from the Ambari server (services → Jethro → stop service → delete service)
Log in to Ambari DB
(The default password is 'bigdata').
Delete the Jethro Ambari service records:
Exit Ambari CLI (by \q or CTRL+D).
Log in as root, then restart the Ambari server:
- Make sure you see the following output at the end of the restart procedure:
Jethro services are still running on hosts after Jethro Ambari service is removed
After successfully deleting the Jethro Ambari service from the Ambari UI, Jethro and Jethro Manager are still up and running on the deployed hosts:
Ambari doesn't support uninstallation hooks for custom services.
Uninstall Jethro and Jethro Manager manually: