Managing the Elasticsearch Connector
The Zoomdata Elasticsearch connector lets you access data stored in Elasticsearch for visualization and exploration in the Zoomdata client. The connector supports the following Elasticsearch versions:
- Elasticsearch 6.0 - 6.7
- Elasticsearch 7.0 - 7.1
You cannot import or export Elasticsearch data sources (or the visuals and dashboards that use those Elasticsearch data sources) if the version of the Elasticsearch connector in the Zoomdata environment is different from the version used by the data sources. For example, you cannot import an Elasticsearch 6 data source you have exported if your Zoomdata environment only has an Elasticsearch 7 connector defined. When you change connector versions in your Zoomdata environment, we recommend that you also create new data source configurations (and associated visuals and dashboards) for the newer version.
Before you can establish a connection from Zoomdata to Elasticsearch storage, a connector server needs to be installed and configured. See Managing Connectors and Connector Servers for general instructions and Connecting to Elasticsearch for details specific to the Elasticsearch connector.
After the connector has been set up, you can create data source configurations that specify the necessary connection information and identify the data you want to use. See Manage Data Source Configurations for more information. After data sources are configured, they can be used to create dashboards and charts from your data. See Creating Dashboards and Creating Charts.
Zoomdata Feature Support
The Elasticsearch connector supports all Zoomdata features except the following:
- Admin-defined functions
- Group by UNIX time
- Kerberos authentication
- User delegation
- Wild card filters, case-insensitive mode
- Wild card filters, case-sensitive mode
When establishing a connection to Elasticsearch, make sure you:
Specify the Connection String. You can use the HTTP/HTTPS or Transport (TCP)/Transports protocols to connect to your data source.
For the HTTP/HTTPS protocols, specify the base URL. For Transport/Transports, specify the list of nodes. Keep in mind that all the nodes you specify must belong to the same cluster.
Provide the connection string in the corresponding format:
Protocol Connection String Format Example HTTP/HTTPS
<schema> - stands for the protocol that you want to use:
- HTTP or HTTPS (with SSL support)
- Transport/Transports (with SSL support)
<node>- an address of a node within a cluster in the following format: host:port
If required, specify your Elasticsearch User Name and Password.
Select Validate to confirm your connection.
To connect to your Elasticsearch cluster and data set secured by X-Pack, see Support of X-Pack for Elasticsearch.
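Before entering an HTTP/HTTPS base URL and credentials in Zoomdata, it can help to confirm them against the cluster directly. The following is a minimal sketch that uses only the Python standard library and is independent of Zoomdata's Validate button. The function names, the URL `http://localhost:9200`, and the credentials are hypothetical placeholders, and the sketch assumes the cluster accepts HTTP basic authentication.

```python
import base64
import urllib.request


def build_cluster_request(base_url, user="", password=""):
    """Build a GET request for the cluster root endpoint, with optional basic auth."""
    request = urllib.request.Request(base_url.rstrip("/") + "/")
    if user:
        # HTTP basic authentication: base64 of "user:password".
        token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
        request.add_header("Authorization", f"Basic {token}")
    return request


def cluster_is_reachable(base_url, user="", password="", timeout=5.0):
    """Return True if the cluster root endpoint answers with HTTP 200."""
    request = build_cluster_request(base_url, user, password)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Covers connection failures, timeouts, and HTTP error responses.
        return False
```

For example, `cluster_is_reachable("http://localhost:9200", "elastic", "secret")` returns `False` when the host is unreachable or the credentials are rejected.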
Data Source Configuration Notes
When setting up an Elasticsearch data source configuration, consider the following notes for the Indices tab.
You select the indices and types to be queried, and select the fields to be handled. You can do this in three steps:
Select indices and aliases to be queried.
You can select indices Manually or Automatically.
If you want to get the data only from specific indices, select the Manually option and choose the corresponding indices from the list below.
The Automatically option is more flexible. It lets you set the pattern by which the indices will be selected automatically.
For this option, you can select one of the pattern types. Note that when no indices match the pattern while querying, your charts are returned empty.
Native - specify the pattern for index names. Use an asterisk (*) to replace one character or a set of characters.
For example, to get all the indices whose names start with log and end with 16, specify a pattern such as log*16.
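To preview which index names a Native pattern such as log*16 would select, the matching can be approximated locally. This is a rough sketch, not Zoomdata's or Elasticsearch's own matcher: it assumes Python's `fnmatchcase`, which also treats `?` and `[...]` as special characters, and the index names are made up for illustration.

```python
from fnmatch import fnmatchcase


def matching_indices(pattern, index_names):
    """Return the index names that match a wildcard pattern (case-sensitive)."""
    return [name for name in index_names if fnmatchcase(name, pattern)]


# Hypothetical index names used only for this illustration.
indices = ["log-2016", "logstash-2016", "metrics-2016", "log-2017"]
print(matching_indices("log*16", indices))  # ['log-2016', 'logstash-2016']
```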
Time-Based - set the time pattern to get the matching indices. Check the supported date and time patterns.
For example, the time pattern YYYY-MM returns all the indices whose names match that pattern, such as 2016-01. The following table shows more examples. Note that if the index name includes text in addition to the date and time pattern, you need to enclose the text portion in brackets [ ]:
|Index name||Pattern|
|2016-01||YYYY-MM|
|2016-3||YYYY-Q|
|10:23:11||HH:MM:SS|
|logstash-2016-06-14||[logstash-]YYYY-MM-DD|
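The relationship between a time pattern and an index name in the examples above can be sketched as follows. This is an illustration only: it handles just the date tokens YYYY, MM, DD, and Q plus bracketed literal text, and the function name and token table are hypothetical rather than part of Zoomdata.

```python
import re
from datetime import date

# Hypothetical token table covering only the date tokens from the examples.
_TOKENS = {
    "YYYY": lambda d: f"{d.year:04d}",
    "MM": lambda d: f"{d.month:02d}",
    "DD": lambda d: f"{d.day:02d}",
    "Q": lambda d: str((d.month - 1) // 3 + 1),
}

# Bracketed text is tried first so that literals are never read as tokens.
_PATTERN = re.compile(r"\[([^\]]*)\]|YYYY|MM|DD|Q")


def render_index_name(pattern, d):
    """Render the index name that a time pattern produces for a given date."""
    def replace(match):
        if match.group(1) is not None:
            return match.group(1)  # literal text inside [ ]
        return _TOKENS[match.group(0)](d)
    return _PATTERN.sub(replace, pattern)


print(render_index_name("[logstash-]YYYY-MM-DD", date(2016, 6, 14)))  # logstash-2016-06-14
print(render_index_name("YYYY-Q", date(2016, 8, 1)))  # 2016-3
```

Without the brackets, the letters in a prefix could be misread as pattern tokens, which is why the text portion has to be enclosed.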
Optionally, configure filtering by type. If you need to filter by type, select the Enable Filter By Type checkbox. The type currently used for filtering is shown. To change it, click Edit and select one of the types available for the selected index.
If the Enable Filter By Type checkbox is cleared, all the types that belong to the selected indices are selected.
If a field has different data types in different types, you cannot use it for grouping, filters, and so on. However, the field is still available for raw export.
Keep in mind that the fields for indices are not refreshed automatically. If new fields are added to your data source, they are added to Zoomdata only after you click the Refresh Fields button on the Fields tab of the data source configuration. Changes to existing fields (for example, a removed field) are not applied.
When you connect to your Elasticsearch data source, the additional service field _type is added. The _type field contains all the selected Elasticsearch types you can visualize as attributes on your charts.
Working with Elasticsearch
Distinct Counts and Percentiles
The distinct count and percentile metrics return approximate values in Elasticsearch. The precision of the result returned by the distinct count metric depends on the precision threshold setting (the default value is 1000).
You can change the value of the precision threshold by setting the elasticsearch.query.cardinality.precision.threshold property in the properties file for the Elasticsearch connector.
See Elasticsearch's documentation on the following for more information:
- For Elasticsearch version 6, see the following for percentiles and distinct count.
- For Elasticsearch version 7, see the following for percentiles and distinct count.
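In Elasticsearch itself, approximate distinct counts come from the cardinality aggregation, which accepts a `precision_threshold` option. The sketch below builds such a request body to show where the threshold sits. It is an assumption that Zoomdata's connector issues a request of this shape, and the field name `user_id` and the aggregation name `distinct_values` are hypothetical.

```python
import json

MAX_PRECISION_THRESHOLD = 40000  # maximum value supported by Elasticsearch


def distinct_count_query(field, precision_threshold=1000):
    """Build a search body that asks for an approximate distinct count of a field."""
    if not 0 <= precision_threshold <= MAX_PRECISION_THRESHOLD:
        raise ValueError("precision_threshold must be between 0 and 40000")
    return {
        "size": 0,  # return only the aggregation, no documents
        "aggs": {
            "distinct_values": {
                "cardinality": {
                    "field": field,
                    "precision_threshold": precision_threshold,
                }
            }
        },
    }


print(json.dumps(distinct_count_query("user_id"), indent=2))
```

Counts below the threshold are expected to be close to exact, while a higher threshold trades memory for accuracy.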
The table below lists all available properties that you can modify to work with Elasticsearch.
|elasticsearch.query.cardinality.precision.threshold||1000||Controls the level of accuracy of distinct counts.||The maximum supported value is 40000. However, Zoomdata does not recommend setting a value this high, because it may cause performance issues and the data source itself may return errors. For more information, refer to the Precision Control section of the Elasticsearch documentation.|
|elasticsearch.query.limit.nongrouped||10000||Sets the limit on the number of non-grouped records (per shard) that a query runs on.|
|elasticsearch.query.limit.grouped||10000||Sets the limit on the number of grouped records (per shard) that a query runs on.|
If you need to change the default settings, you can add the corresponding properties (listed above) to the properties file for the Elasticsearch connector and assign the required values. For more details about working with this file, refer to the topic Managing Configurations in Zoomdata.
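As a rough sketch of what such overrides could look like, the snippet below checks a set of values against the properties listed above and renders them as `name=value` lines. The `name=value` layout is an assumption about the file format, and the helper name and the chosen values are hypothetical.

```python
# Property names and default values as listed in the table above.
DEFAULTS = {
    "elasticsearch.query.cardinality.precision.threshold": 1000,
    "elasticsearch.query.limit.nongrouped": 10000,
    "elasticsearch.query.limit.grouped": 10000,
}
THRESHOLD = "elasticsearch.query.cardinality.precision.threshold"
MAX_THRESHOLD = 40000  # maximum supported value for the precision threshold


def render_overrides(overrides):
    """Validate override values and render them as name=value lines."""
    lines = []
    for name, value in overrides.items():
        if name not in DEFAULTS:
            raise KeyError(f"unknown property: {name}")
        if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
            raise ValueError(f"{name} must be a positive integer")
        if name == THRESHOLD and value > MAX_THRESHOLD:
            raise ValueError(f"{name} must not exceed {MAX_THRESHOLD}")
        lines.append(f"{name}={value}")
    return "\n".join(lines)


print(render_overrides({THRESHOLD: 3000, "elasticsearch.query.limit.grouped": 20000}))
```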
Keep in mind that Elasticsearch, by default, tokenizes (analyzes) fields of type text. As a result, strings consisting of two or more words may be split into separate terms when connected to Zoomdata (for example, city names like Las Vegas). To disable this process and ensure that a string field is not analyzed, specify its type as keyword.
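In Elasticsearch 6 and 7, the string type that is not analyzed is `keyword`. A minimal sketch of an index mapping that declares such fields is shown below. It uses the typeless layout of Elasticsearch 7; in Elasticsearch 6 the `properties` block is normally nested under a type name. The field name `city` and the helper name are hypothetical.

```python
import json


def keyword_mapping(field_names):
    """Build an index body that maps the given string fields as non-analyzed keywords."""
    return {
        "mappings": {
            "properties": {name: {"type": "keyword"} for name in field_names}
        }
    }


# With this mapping, a value such as "Las Vegas" is indexed as one term.
print(json.dumps(keyword_mapping(["city"]), indent=2))
```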
The IP Address data type is supported for Elasticsearch data connectors. Fields of this type are treated as ATTRIBUTEs and can be used in:
- An Elasticsearch text search box. When searching via the text search, Zoomdata also supports the CIDR notation for IP addresses as described in the Elasticsearch documentation (https://www.elastic.co/guide/en/elasticsearch/reference/current/ip.html).
- The Group By selection box.
- Filters, although Zoomdata does not support CIDR notation in filters for an IP address field. An exact match is required.
- Row-level expressions. In row-level expressions, Zoomdata treats IP addresses as strings and expects an exact match.
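The difference between CIDR matching (available in the text search) and exact matching (required in filters and row-level expressions) can be illustrated with Python's standard `ipaddress` module. This is only a sketch of the two comparison styles, not of how Zoomdata evaluates them. The function names and addresses are hypothetical.

```python
import ipaddress


def matches_cidr(ip, cidr):
    """Return True if the address falls inside the CIDR block."""
    return ipaddress.ip_address(ip) in ipaddress.ip_network(cidr)


def matches_exact(ip, expected):
    """Return True only if the two address strings are identical."""
    return ip == expected


print(matches_cidr("192.168.1.17", "192.168.1.0/24"))   # True
print(matches_exact("192.168.1.17", "192.168.1.0/24"))  # False
print(matches_exact("192.168.1.17", "192.168.1.17"))    # True
```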