Zoomdata 5.0 Release Notes

The sections below describe the enhancements and updates made to Zoomdata in version 5.0.

Email salesteam@logianalytics.com to purchase Composer.

Connector Changes

The following connector changes were made in this release:

  • Connector version information is now included in the connector logs. This will help when debugging connector issues with the Zoomdata support team.

  • In this release, the Cassandra over Presto connector has been officially renamed the Presto connector, and the supported Presto version changed from 0.132 to 319. In addition, the following changes were made:

    • The property files changed to edc-presto.properties and edc-presto.jvm.
    • The microservice name changed to zoomdata-edc-presto.
    • The lib directory for the JDBC driver is now /opt/zoomdata/lib/edc-presto.
    • The docs folder is now /opt/zoomdata/docs/edc-presto.
    • The connector server name and the data source name were changed to Presto.

    In past releases, many of these names included the supported release number. See Managing the Presto Connector.

  • This version introduces Kerberos support for Spark SQL connectors. See Connecting to Spark SQL Sources on a Kerberized HDP Cluster.

  • Two new properties have been added for the Apache Solr and the Cloudera Search connectors:

    • solr.query.limit.grouped sets the maximum number of groups that should be returned by an Apache Solr or Cloudera Search query when the data is being grouped by an attribute. Any positive numeric value will limit the number of groups returned. Any non-positive numeric value indicates that there is no limit and an unlimited number of groups can be returned. The default is -1 (unlimited).

      Ordinarily, most queries include some limits, so this default does not need to be changed. However, if your group fields include many unique values (as might occur in a fused data source created using Apache Solr or Cloudera Search data sources), you may risk a memory overflow. In these cases, you should consider changing the default setting of this property.

    • solr.query.limit.ungrouped sets the maximum number of records that should be returned by an ungrouped (raw) Apache Solr or Cloudera Search query. Any positive numeric value will limit the number of raw data records returned. Only positive numeric values are allowed. The default is 10000.

      Ordinarily, the query engine does not exceed the default limit, so this value does not need to be changed. If a query requests a page of raw data with a size that exceeds this property setting, the results returned on the page are limited. However, if you encounter a memory overflow for an ungrouped query that uses an Apache Solr or Cloudera Search data source, you might consider changing the default setting of this property.

    These properties are specified (and documented) in the edc-apache-solr.properties and the edc-cloudera-search.properties files. See Connector Properties and Property Files.
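
    As a rough illustration, the two properties might appear as follows in edc-apache-solr.properties (the same keys apply in edc-cloudera-search.properties). The grouped limit of 5000 is an arbitrary example value chosen for this sketch, not a recommendation; the ungrouped value shown is the documented default.

    ```properties
    # Cap the number of groups returned by grouped queries.
    # 5000 is an illustrative value only; the default is -1 (unlimited).
    solr.query.limit.grouped=5000

    # Cap the number of raw records returned by ungrouped queries.
    # Only positive values are allowed; 10000 is the default.
    solr.query.limit.ungrouped=10000
    ```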

Query Engine Changes

The following query engine changes were made in this release:

  • The legacy query engine is no longer provided in this release. Only the new query engine (z-Engine) is available. Customers who require data sharpening should not upgrade to this release. Data sharpening will be reintroduced in a future version 5 release.

  • A new query engine property, calculations.detect.array.fields, is provided. This property allows you to disable the query engine's validation of multivalue fields in a custom metric. When validation is disabled, custom metrics created in earlier versions of Zoomdata that aggregate multivalue fields will produce valid values. To disable the validation, set calculations.detect.array.fields to false in the query-engine.properties file. This property is set to true by default. See Managing the Composer Query Engine.
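
    For example, the entry in query-engine.properties that disables the validation would look something like this (a minimal sketch; the comment is explanatory only):

    ```properties
    # Disable query engine validation of multivalue fields in custom metrics.
    # The default is true (validation enabled).
    calculations.detect.array.fields=false
    ```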

Fusion Changes

The following fusion changes were made in this release:

  • In a fused data source, Zoomdata can now use information about column uniqueness when building a query plan. This information can be critical for a connector's resource usage and query time. To specify column uniqueness, select the Unique checkbox on the Editor tab of the fused data source definition.

Data Changes

The following data manipulation changes were made in this release:

  • Row-level functions and expressions can now be used in WHERE clauses in custom metrics.

    In a custom metric, WHERE clauses allow you to specify a formula without first creating a derived field. The WHERE clause must be in the leftmost part of the custom metric expression, but it can be expressed with a row-level function or any of the aggregate functions available for custom metrics. In the following example, the total planned sales is calculated for men.

    SUM(plannedsales) WHERE UPPER(gender) = 'MALE'

Visual Changes

  • Line trend attribute value and multiple metric charts now have a new visual style setting called Area Chart. The alternative option, which is the default, is Line Chart. The Area Chart setting turns on the fill option for the line chart (the areas between the lines are filled with the appropriate colors). The Line Chart setting turns off the fill option (so only lines appear). The visual style is set on the Chart Style sidebar.

  • If you disable the Volume property in a data source configuration, it no longer appears in visual tooltips. The only exception is histograms, which plot the volume (the number of records).

What's Fixed?

The following problems were fixed in this release:

  • Hive connectors now read tables from schemas in Hortonworks Data Platform (HDP) 3.1.

  • Time fields are now processed as time fields and are not automatically converted to attributes.

  • Zoomdata back-end dependencies and libraries have been upgraded.

What's Deprecated?

The legacy query engine is no longer provided. Only the new query engine (z-Engine) is provided. Customers who require data sharpening should not upgrade to this version.