Managing the Hive Connector
The Zoomdata Hive connector lets you access the data available in Hive storage for visualization and exploration using the Zoomdata client. It can connect to both Hive on Tez and Hive on Tez with LLAP, depending on the JDBC URL you provide (see Connecting to Hive below). The Zoomdata Hive connector supports Hive versions 2.1 through 3.1.
Before you can establish a connection from Zoomdata to Hive storage, a connector server needs to be installed and configured. See Managing Connectors and Connector Servers for general instructions and Connecting to Hive for details specific to the Hive connector.
After the connector has been set up, you can create data source configurations that specify the necessary connection information and identify the data you want to use. See Managing Data Source Configurations for more information. After data sources are configured, they can be used to create dashboards and charts from your data. See Creating Dashboards and Creating Charts.
Zoomdata Feature Support
The Hive connector supports all Zoomdata features.
This connector supports pushdown joins for Fusion data sources.
Connecting to Hive
To establish a connection to Hive, specify a JDBC URL on the Connection page of the Zoomdata data source definition for the Hive connection.
- Specify the JDBC URL for Hive.
- If authentication has been set up, provide the user name and password.
- If required, specify the Hive/YARN queue name in the Queue Name box.
- Specify the server timezone. If your Hive server uses UTC, leave the Server Timezone box blank. Otherwise, specify the timezone abbreviation in all caps so that time data is handled correctly (for example, EST, EDT, or CST).
- Click Validate to test the connection.
To connect to Hive LLAP, you must specify a different JDBC URL. If you use Hortonworks Data Platform (HDP), you can copy the URL from Ambari. See https://docs.cloudera.com/HDPDocuments/HDP3/HDP-3.1.4/performance-tuning/content/hive_connect_clients_to_llap.html.
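As a rough illustration of how the two kinds of JDBC URL can differ, the following sketch builds a direct HiveServer2 URL and a ZooKeeper service-discovery URL of the kind HDP clusters commonly use for LLAP. The host names are hypothetical, and the port (10000) and ZooKeeper namespace (hiveserver2-interactive) are assumed defaults that may not match your cluster. Always prefer the URL shown in Ambari or supplied by your Hive administrator.

```python
def hive_jdbc_url(host, port=10000, database="default"):
    """Build a direct HiveServer2 JDBC URL.

    Assumption: 10000 is the usual HiveServer2 binary-transport port;
    your cluster may use a different one.
    """
    return f"jdbc:hive2://{host}:{port}/{database}"


def hive_llap_zk_url(zk_hosts, namespace="hiveserver2-interactive"):
    """Build a ZooKeeper service-discovery URL for Hive LLAP.

    Assumption: the namespace shown is the one HDP typically registers
    for HiveServer2 Interactive (LLAP); confirm the real value in Ambari.
    """
    quorum = ",".join(zk_hosts)
    return (
        f"jdbc:hive2://{quorum}/"
        f";serviceDiscoveryMode=zooKeeper;zooKeeperNamespace={namespace}"
    )


# Hypothetical host names, for illustration only.
print(hive_jdbc_url("hive.example.com"))
print(hive_llap_zk_url(["zk1.example.com:2181", "zk2.example.com:2181"]))
```

Paste the resulting string into the JDBC URL box on the Connection page, then click Validate to confirm that it works.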
Migrating Your Hive Connectors for Zoomdata 3.7
In 3.7, the Zoomdata Hive on Tez connector was renamed Hive. If your installation used the Hive on Tez connector in previous releases, you will have two connectors after the upgrade to 3.7: Hive on Tez and Hive. The Hive on Tez connector is outdated and should no longer be used.
To migrate your existing data source configurations and connections to use the new Hive connector:
- Copy any configuration properties you had customized in the Hive on Tez connector's edc-tez.properties configuration file to the new Hive connector's edc-hive.properties configuration file. See Connector Properties.
- Verify that the new Hive connector, with the zoomdata-edc-hive package name, is running and enabled. See Managing Connectors and Connector Servers.
- Log into Zoomdata as the supervisor.
- Click to access the Supervisor toolbar and then select Connectors. The Manage Connector Services page appears.
- At the bottom of the Manage Connector Services page, in the Connectors table (not the Connector Servers table), locate and select the Hive on Tez connector. The Edit Hive on Tez Connector page appears.
- Select the Hive connector from the drop-down list in the Connector Server field.
The User Attribute checkboxes in the Connector Parameters list on the Edit Hive on Tez Connector page are cleared when you change the Connector Server field. Before you make this change, note which connector parameters are marked as User Attributes.
- If any connector parameters listed on the Edit Hive on Tez Connector page had the User Attribute checkbox selected, select the checkbox again.
- Click Save.
- Disable the old Hive on Tez connector with the package name zoomdata-edc-tez. See Managing Connectors and Connector Servers.
Your existing data source configurations and connections will now work with the new Hive connector. The old Hive on Tez connector server can be deleted. See Managing Connectors and Connector Servers.
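For the first migration step (copying customized properties), a small helper can list which settings in the old file still need to be reviewed. The sketch below is not part of Zoomdata. It assumes both files use plain key=value lines with # or ! comments, and the function names are hypothetical. It only reports differences and leaves the decision about what to copy to you.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and comments."""
    props = {}
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped[0] in "#!":
            continue
        key, sep, value = stripped.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def properties_to_review(old_text, new_text):
    """List (key, old_value, new_value) for settings that differ.

    new_value is None when the key is absent from the new file.
    """
    old = parse_properties(old_text)
    new = parse_properties(new_text)
    return [
        (key, value, new.get(key))
        for key, value in old.items()
        if new.get(key) != value
    ]


# Hypothetical property names, for illustration only.
old_file = "# old connector settings\nexample.timeout=120\nexample.pool.size=8\n"
new_file = "example.timeout=60\n"
for key, old_value, new_value in properties_to_review(old_file, new_file):
    print(f"{key}: old={old_value} new={new_value}")
```

In practice you would read the contents of edc-tez.properties and edc-hive.properties from wherever they are stored in your installation and pass them to properties_to_review, then edit edc-hive.properties by hand.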
If a warning message appears when you try to open a dashboard based on a Hive data source, see Resolving the Hive Timeout Warning Message.