Managing the HDFS Connector
Zoomdata can connect to Cloudera’s open source Hadoop platform, Cloudera Distributed Hadoop (CDH). CDH provides unified batch processing, interactive SQL, interactive search, and role-based access controls, along with enterprise-grade continuous availability. Specifically, Zoomdata connects to the Hadoop Distributed File System (HDFS), the fault-tolerant storage system in CDH.
The Zoomdata HDFS connector embeds its own Apache Spark functionality, which runs separately from the Spark embedded in Zoomdata itself. The connector's implementation supports Apache Spark 2.2.
By default, the HDFS connector is not included with Zoomdata. You or your administrator must download and enable it before you can configure it.
After the connector has been set up, you can create data source configurations that specify the necessary connection information and identify the data you want to use. See Managing Data Source Configurations for more information. After data sources are configured, they can be used to create dashboards and charts from your data. See Creating Dashboards and Creating Charts.
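As an illustration of the kind of connection information an HDFS data source needs, the following Python sketch splits an hdfs:// location into the NameNode host, port, and file path. It is not part of Zoomdata and does not reflect the fields of the Zoomdata configuration screen. The function name split_hdfs_uri, the example host and path, and the fallback port 8020 (a commonly used NameNode port) are all assumptions made for this example.

```python
from urllib.parse import urlparse

# Hypothetical example location; the host, port, and path are placeholders.
EXAMPLE_URI = "hdfs://namenode.example.com:8020/data/sales/2019"


def split_hdfs_uri(uri, default_port=8020):
    """Split an hdfs:// URI into NameNode host, port, and file path.

    default_port is an assumption: 8020 is a commonly used NameNode port,
    but your cluster may be configured differently.
    """
    parts = urlparse(uri)
    if parts.scheme != "hdfs":
        raise ValueError(f"expected an hdfs:// URI, got scheme {parts.scheme!r}")
    if not parts.hostname:
        raise ValueError("URI is missing a NameNode host")
    return {
        "host": parts.hostname,
        # Fall back to the assumed default when the URI omits the port.
        "port": parts.port or default_port,
        # Treat a missing path as the file system root.
        "path": parts.path or "/",
    }


if __name__ == "__main__":
    print(split_hdfs_uri(EXAMPLE_URI))
```

Checking a location this way before entering it in a data source configuration can catch a wrong scheme or a missing host early. Confirm the actual host and port with your Hadoop administrator.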
Zoomdata Feature Support
The HDFS connector supports all Zoomdata features except the following:
- Custom SQL queries
- Live mode/playback
- User delegation