Manage the HDFS Connector

Composer can connect to Cloudera’s open source Hadoop platform, Cloudera Distributed Hadoop (CDH). CDH provides unified batch processing, interactive SQL, interactive search, role-based access controls, and enterprise-grade continuous availability. Specifically, Composer connects to CDH’s fault‐tolerant storage system, the Hadoop Distributed File System (HDFS).

The Composer HDFS connector uses its own embedded Apache Spark functionality; its implementation supports Apache Spark 2.2.

By default, the HDFS connector is not included with Composer. You or your administrator must download and enable the connector before you can configure it.

After the connector has been set up, you can create data source configurations that specify the necessary connection information and identify the data you want to use. See Manage Data Source Configurations for more information. After data sources are configured, they can be used to create dashboards and visuals from your data. See Create Dashboards.

Composer Feature Support

HDFS connector support for specific Composer features is shown in the following table.

Key: Yes - supported; No - not supported; N/A - not applicable

Feature                                   Supported?
Admin-Defined Functions                   Yes
Box Plots                                 Yes
Custom SQL Queries                        No
Derived Fields (Row-Level Expressions)    Yes
Distinct Counts                           Yes
Fast Distinct Values                      N/A
Group By Multiple Fields                  Yes
Group By Time                             Yes
Group By UNIX Time                        Yes
Histogram Floating Point Values           Yes
Histograms                                Yes
Kerberos Authentication                   Yes
Last Value                                Yes
Live Mode and Playback                    Yes
Multivalued Fields                        N/A
Nested Fields                             N/A
Partitions                                N/A
Pushdown Joins for Fusion Data Sources    No
Schemas                                   Yes
Text Search                               N/A
TLS                                       No
User Delegation                           No
Wild Card Filters                         Yes
Wild Card Filters, Case-Insensitive Mode  Yes
Wild Card Filters, Case-Sensitive Mode    Yes
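
If you need to check feature support from a script (for example, before generating a data source configuration that relies on custom SQL), the table above can be transcribed into a small lookup. The following is a minimal sketch in Python, not part of Composer: the names Support, HDFS_FEATURE_SUPPORT, and hdfs_supports are hypothetical and were chosen only for this illustration. The feature names and support levels are copied from the table.

```python
from enum import Enum


class Support(Enum):
    """Support levels used in the feature table."""
    SUPPORTED = "supported"
    NOT_SUPPORTED = "not supported"
    NOT_APPLICABLE = "not applicable"


# Feature names transcribed from the table above. Grouping them into three
# lists is only a compact way to write the data; it is not a Composer API.
_SUPPORTED = [
    "Admin-Defined Functions",
    "Box Plots",
    "Derived Fields (Row-Level Expressions)",
    "Distinct Counts",
    "Group By Multiple Fields",
    "Group By Time",
    "Group By UNIX Time",
    "Histogram Floating Point Values",
    "Histograms",
    "Kerberos Authentication",
    "Last Value",
    "Live Mode and Playback",
    "Schemas",
    "Wild Card Filters",
    "Wild Card Filters, Case-Insensitive Mode",
    "Wild Card Filters, Case-Sensitive Mode",
]
_NOT_SUPPORTED = [
    "Custom SQL Queries",
    "Pushdown Joins for Fusion Data Sources",
    "TLS",
    "User Delegation",
]
_NOT_APPLICABLE = [
    "Fast Distinct Values",
    "Multivalued Fields",
    "Nested Fields",
    "Partitions",
    "Text Search",
]

# Hypothetical lookup table: feature name -> support level for the HDFS connector.
HDFS_FEATURE_SUPPORT = {
    **{name: Support.SUPPORTED for name in _SUPPORTED},
    **{name: Support.NOT_SUPPORTED for name in _NOT_SUPPORTED},
    **{name: Support.NOT_APPLICABLE for name in _NOT_APPLICABLE},
}


def hdfs_supports(feature: str) -> bool:
    """Return True only if the table lists the feature as supported.

    Raises KeyError for a name that is not in the table, so a typo is not
    silently treated as "not supported".
    """
    return HDFS_FEATURE_SUPPORT[feature] is Support.SUPPORTED
```

For example, hdfs_supports("Kerberos Authentication") returns True, while hdfs_supports("Custom SQL Queries") and hdfs_supports("Text Search") both return False, because the function treats "not applicable" the same as "not supported".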