Logi Data Services technology, found in recent Discovery Module versions, uses Dataviews to define datasource
connections, queries, and data enrichment. This topic discusses
Dataviews and the tools for creating and managing them.
- About Dataviews
- Launching the Dataview Authoring Tool
- Dataview Authoring Tool Menu
- Managing Your Profile
- Connecting to Databases
- Connecting to Data Files and Applications
- Managing Data Sources
- Creating a Dataview
- Dataview Loading
- Filtering Data
- Data Enrichment
- Creating Data Relationships
- Managing Dataviews
Advanced features discussed here work with Logi Info v12.5. Earlier and later Info versions may not support them; consult the Release Notes for specific details.
Logi Data Services technology provides you with an advanced means of data retrieval based on the "Dataview", a definition that specifies the connection information, query details, and data enrichment details.
Dataviews are stored in a system database and can be used with the SuperWidget Authoring too. For more information, see SuperWidgets. Dataviews can also be used in Logi Info reports using a special DataLayer element, and the data can be altered at runtime using special Data Service ("Ds") elements. When executed at the server, the Dataview retrieves the data and makes it available in the standard datalayer format to other elements for analyses and visualizations. Some of the benefits of this include:
- Decoupling of developer and data architect duties
- Shared access to stored Dataviews by multiple users
- Better support for Self-Service analytics
- Excellent performance for very large datasets (250M+ rows)
Prior to working with Dataviews, a connection to Logi services must be configured. See Get Started with the Discovery Module v3.1 for more information. Dataviews are created using tools like the Dataview Authoring tool accessible from Logi Studio. The Dataviews used with Logi Info can retrieve data from a number of databases.
If our DataHub 3.0+ product is installed, then Dataviews can also retrieve and join file-, cloud-, and application-based data, such as Facebook, Google Analytics, and Marketo, and cache it in a separate data store.
The following sections describe how you can create and manage Dataviews using the browser-based tool in Logi Studio.
You launch the Dataview Authoring tool from within Logi Studio:
On Studio's Tools tab, you'll find the Dataview Authoring menu item, circled above. This menu item will only be visible if DM v3.1 has been installed. The tool will open in your default browser:
You'll be presented with the Login page, shown above. Enter either the "admin" password that was established when DM v3.1 was installed, or similar credentials, and click Login to access the tool.
You can also reach the tool outside of Studio, by entering the URL http://localhost:3000/Datahub/Account/Login in your browser (assuming a DM v3.1 installation with default parameters).
After you log into the Dataview Authoring tool, you'll see the main menu, above the Dataviews page:
The main menu options include:
- Dataviews - This option lets you manage Dataviews, by creating them or selecting one from lists of Recent or All Dataviews.
- Sources - This option lets you manage your data sources, by creating connections to databases.
- Help - This option displays the "About" panel, which displays the version number, and includes a link to the included REST API documentation.
- UserName - This option allows you to manage your user profile and to logout. It also allows Administrator users to manage other users and groups, and to generate a Client Secret security value.
The "Home" icon at the left end of the menu always returns you to the Dataviews page.
The minimum device width and height for working with this tool is 1300px by ~600px. At lower resolutions, you may not be able to navigate around the interface properly.
The details of your profile are stored in the system database. If you click the main menu option with your user name (#4 above), you'll see the Profile Information page:
In the related page, you can manage the data associated with your user account and change your password. Be sure to provide an email address here if you're going to be using DataHub 3.0 to cache data and you want to be notified when data loading completes.
You can also choose a specific avatar, as shown above.
In DM v3.1, a connection to data is called a "Source" and can be an application, database, or file connection. In this section, we'll create a new Source based on a SQL database. There are multiple paths to the Add Source panel, including:
- Sources menu optionCreate New Source.
- Dataviews menu optionCreate New DataviewFrom Source tabAdd New Source
The Add Source panel will appear. Note that there will be different fields presented depending on the Data Provider selection.
Select or provide the required information, as follows:
- Database - Select the Database radio button, making the fields shown above visible.
- Data Provider - Select the desired database or provider type. There are several MS SQL Server options, allowing for its different security schemes.
- New Source Name - Give the source an arbitrary name for easy recognition later in the list of sources.
- Server Name - Enter the database server name or IP address.
- User Name - Enter the user name required to access the database.
- Password - Enter the password required to access the database.
- Database Name - Enter the target database name (or, for Oracle, the Service Name).
For Microsoft SQL Server, MySQL, and PostgreSQL providers, you can use the Get List button to select the database name from a list.
For some providers, an Advanced Options link will be shown, allowing you to set special configuration options, such as Schema Name or Warehouse Name, specific to that provider.
- Port Number - Enter the Port number for the connection. The default port number for the provider will be displayed.
- Test Source - Click the button to attempt to make the connection specified and provide a status message. In addition to indicating either success or failure, any existing Source with the exact same specifications will be identified so you can decide whether to use it instead or proceed to save your new Source.
Working with a SQL Server Named Instance
If you're trying to connect to a named instance of Microsoft SQL Server, such as yourDBServer\SQLEXPRESS,8484 then the example connection shown above will not work. You may have noticed that the Data Providers selection list includes several Microsoft SQL Server variants. One of them is Microsoft SQL Server Named Connection and here's an example configuration for it:
The example above shows how a complex SQL Server connection ID is parsed and placed appropriately in the Add Source panel. You have to expand the Advanced Options area to access the Port Number control.
If our DataHub 3.0+ product is installed, then Dataviews can also retrieve and join file-, cloud-, and application-based data, such as Facebook, Google Analytics, and Marketo, and cache it in a separate data store.
As you create Sources, they're shown on the Sources page as graphic "pills":
The collection of Sources can be searched and filtered using the provided controls.
Each pill displays the Source name and database type, as shown above,
and includes a Gear icon. Hovering your mouse cursor over the icon
displays a menu of management actions.
The Info gear menu option allows you to see, among other details, who created the Source, as shown above. This can be helpful if the Source has been shared with you by another user.
The Delete gear menu option will only be included if you're the user who created the Source.
Source pills are also visible in the Create a New Dataview page, under the From Source tab, as shown above. The gear icon, however, only displays an information option from this page.
Once you've created one or more Sources, you can create a Dataview. There are multiple paths to the Create a New Dataview page, including:
- Dataviews menu optionCreate a New Dataview
- Sources menu optionGear icon in Source pillCreate Dataview
Once you arrive at the Create a New Dataview page, click the From Source tab:
Your data Source pills will be displayed, as shown above. Click
a pill to create a Dataview that uses it, and the Dataview Configuration
tab will appear:
Here are the important features of this tab, keyed to the image shown above:
- Sources in Use - This panel displays the data Source in use by this Dataview. When creating a Dataview for use with DataHub, the "+" icon allows you to add multiple Sources, and create relationships between them.
- Data Objects - This panel displays a list of the data
objects (tables and views) available in the selected Source. The list
can be searched and filtered using the included controls.
Click an object to select it. When you do, its columns appear in the Columns panel. If you select any of an object's columns, the object will be placed in the "Objects in Use" list. Database objects can be filtered by clicking their Filter icons; filtering is discussed in the next section.
- Available Objects - This is a list of the data objects that are available for use but haven't been used yet.
- Action Icons - Click an icon to Save your definition, Reset all selections to their defaults, or Cancel Dataview creation.
- Columns - This panel displays a list of the columns available in the selected data object. The list can be searched and filtered using the included controls, and the All and None links can used for bulk selection. Click a column to add or remove it from the Dataview.
- Column Pills - When you select a column in the Columns panel, it will be represented by a "pill" at the bottom of the page. These provide a representation of the data, in tabular form, included in the Dataview.
When you have selected your Sources, Objects, and Columns, click the Save icon...
...and you'll be prompted to provide a name for your Dataview. Click Create to save the Dataview.
Creating Custom SQL Objects
When creating a Dataview that uses a SQL database Source, you can created a data object by defining a custom SQL query. This query can only select data from the available data objects.
If the data source is a SQL database, at the top of the middle panel in the Dataview Configuration tab, you'll see the Create Custom SQL Query link, shown circled above.
When you click the link, the Customer Query Editor panel, shown above, will be displayed, with these controls:
- Query Name - Enter a name for your custom query.
- Query - Enter your SQL query, using the correct SQL syntax for the Source database.
- Actions - Click the action buttons to Test your query or Cancel the operation. You must successfully test your query before the Save button will be enabled. A Delete button (not shown) will only be visible if you're editing an existing custom query and you can use it to delete the custom query.
When you click Test, the query will be validated and the results shown in a message adjacent to the buttons.
Once you test and save your custom query, the new object will appear in the list of objects in the middle panel:
Its columns will appear in the right-hand panel. Like other columns, they can be used to create relationships with other data.
To edit or delete a custom query, click the Edit icon, shown circled above. Note that you can only modify or delete a custom query if no Dataviews are dependent upon it.
When a Dataview is saved you'll see its details in the Dataview Status tab:
The (In Place) label on the data objects means that the data will be accessed directly from the data source at runtime - it's not cached and no data loading occurs.
However, if our DataHub 3.0+ product is installed, then Dataviews can also retrieve and join data from Excel and CSV files, and from cloud services and applications, such as Facebook, Google Analytics and Marketo, and cache it in a separate data store. If data is being loaded and cached, then additional details and controls are available in this tab.
Dataviews can include instructions for a filtering operation, if the Source is a SQL database or an application that uses a SQL-like query language. This is done internally by adding a "WHERE" clause to the query used to retrieve the data.
The filter criteria is set on a Dataview's page, in its Dataview Configuration tab:
Click the Filter icon, shown circled above, to specify the filter criteria for that Object. If a filter already exists for an object, the filter icon will appear "filled in", as shown above for the Orders object. If there is no icon, then filtering is not supported.
Clicking the icon will display the Filter Criteria panel:
Enter your filter expression. If the filter already exists, its filter expression will be loaded into the panel, as shown above. Here are some guidelines for filter expressions:
- Column names must be enclosed in square brackets, e.g. [OrderDate].
- Column names are case-sensitive.
- Column names can only reference a column in the current data object.
- Column names can be either the "friendly" name found in the Columns panel or the actual column name in the data source.
- Conditions may be nested.
- String and date values must be enclosed in single quotes.
- Numeric values must not be enclosed in quotes.
The complete syntax for filter expressions is described in Dsexpression Reference.
Do not surround your entire expression with double-quotes, as suggested by the example in the panel.
If you try to modify the filter expression for an object that's already
been used in a Dataview, a warning message will be displayed.
"Data Enrichment" refers to instructions in a Dataview definition that enhance the data it retrieves. Enrichment consists of the following possible actions:
- Set a column's "physical" name
- Categorize a column as a Dimension or a Metric value
- Set a column's data type Sub-Category
- Set a column's display Format
- Create a new Calculated column
Additional enrichment options are available when working with DataHub; these are discussed in a separate topic that's included with the DataHub 3.0 documentation. All of these actions are undertaken in a Dataview's Data Enrichment tab:
and are discussed individually in the following sections. The image above shows the selected column properties, identifies the column selection indicator (simply click a column to select it), and identifies the column Gear icon used to display a drop-down menu of options.
Click the Preview icon to preview the effects of formatting changes. Note that this requires the Physical Name property be the same as the original name of the column in the data (the Source Name). Use Reset icon to undo any previewed changes.
Click the Reset icon to restore all the properties for a column to their original values.
Click the Save icon to save Dataview enrichment changes.
Setting Column Properties
To set column properties, select a column by clicking its name and its properties will be displayed.
The column properties that can be set are shown above and include:
- Physical Name - This property is the column name that will be displayed in the column header and elsewhere, sometimes called the "friendly" name. The Source Name (the name of the column in the data source) is shown below it for reference.
- Category - This property lets you specify if the column should be treated as a Dimension (an independent variable - usually descriptive or categorical) or a Metric (a dependent variable - usually numeric). An initial assignment is made when the data is retrieved, based on an analysis of the data, but you may wish to change it.
- Sub-Category - This property further categorizes the column data type. If the Category is set to Dimension, then options here include Boolean, Identity, Numeric, Temporal, and Text. If set to Metric, then options include Currency, Decimal, and Integer.
- Format - This property determines how the data values in this column will be formatted for display. Options displayed here are related to the Sub-Category property setting and include a wide variety of formats.
Remember to click the Save icon to save your changes.
Hide / Show / Change Column Order
You can choose to hide or show Dataview columns by clicking the button at the top of the column header:
You can rearrange columns by clicking a "drag handle", shown above, and dragging the column to a new location. Drag handles are color-coded by data type.
Creating a Calculated Column
A "calculated column" adds a new column in the data and is usually the result of operations on one or more existing columns. Once defined, a calculated column is created automatically when the data is retrieved.
In the following example, we'll create a new column that contains the
results of multiplying the Quantity column by the UnitPrice column.
To get started, we'll hover the mouse cursor over the gear icon of one of the existing columns, Quantity, as shown above, and select the Calculation drop-down menu item.
The Calculation panel, shown above, will appear, with these keyed elements:
- Built-in Functions - Hover your mouse cursor over one of these built-in functions to see its description, click to add it to the expression area. See the section below about using special SQL functions.
- Expression Area - This is where you build your expression for generating the calculation column values. You can type directly into this field, if desired. Field names must be enclosed in [square brackets] and, as soon as you type an opening square bracket, a list of available columns will be shown for you to choose from.
- Operators - Click to add one of these standard operators to the expression.
- Test Area - Click the Test button to run the expression and see the results in this area.
- New Column - The new column will be inserted into the columns at the bottom of the page, to the right of the column whose Gear icon menu you used, and marked as the selected column.
To exit the Calculation panel, select a different column by clicking on it. You can then re-select the new calculated column and set its properties (for example, if you want to change its name).
Click the Save icon to save the new calculated column.
Click the Reset icon to clear the expression and results before previewing the new column.
Click the Preview icon to preview the calculated column, which will be populated with values based on the expression.
Click the Delete icon to delete the calculated column altogether and start over. To exit the
Calculation panel, select a different column by clicking on it.
Null values in a column pose special challenges. In an application consuming a Dataview, they may be considered when grouping on the column but may not be included in aggregates and charts. While this may be the desired behavior, there are situations where a Null value should be counted. In those cases, the Null must be converted to a real value (e.g. "N/A", 0, -1, "Unknown"). This calculated column expression can be used to convert the Null values to a real value:
- IFNULL ([column], 'N/A')
For Dataviews that use database Sources, you can use SQL functions if the standard expression functions aren't sufficient.
As shown in the example above, you can wrap an ANSI SQL-92-compliant function in a "SQL_FUNCTION( )" structure in your expression and it will be understood and executed as part of the query used in the data retrieval process. Column names must still be enclosed in square brackets. Complex SQL statements, like this, are supported:
- SQL_FUNCTION("CASE WHEN [Country] = 'Argentina' THEN 1 ELSE 0 END")
You can define data relationships between data objects in a Dataview. These are the support relationships:
|A Left Join will return all of the data from the left object and the data from the right object where the data in the related columns match. Where there isn't matching data, return Null information for the right object's columns.|
|An Inner Join will return all of the data from the left and right objects where the data in the related columns match. Exclude the data from both objects if the data in the related columns doesn't match.|
|A Right Join will return all of the data from the right object and the data from the left object where the data in the related columns match. Where there isn't matching data, return Null information for the left object's columns.|
|An Outer Join will return all of the data from both the left and right objects. Where there are matches in the related columns, return the data from both left and right objects. Where a match can't be found, return Null information. All records from both data objects will be represented in the final dataview.|
|A Union All will return all of the data from both the left and right objects where there are matches in column names and data types. The non-matching columns will be appended to the rows. Note that no column specification is required for this type of relationship. DataHub will analyze the objects to determine the resulting record set.|
In the following examples, we'll use the Northwind database as our primary data source and go through the process of creating a Dataview that includes a relationship. Once you understand the user interface basics, the techniques used to select the objects and columns, and establish relationships is relatively simple.
We'll being by navigating to DataHub's Create a New Dataview, selecting or creating a Northwind data source (not shown here), and displaying the Dataview Configuration tab.
We're going to need Customer information and Order information, so let's start by selecting the
Customers data object and identifying the customer data that we need.
We clicked on the Customers data object and clicked on the CompanyName, Region, and Country columns, as shown above. Notice at the bottom-left of the tab, the current definition of the columns in the Dataview is taking shape.
Our next step is to identify the Order information we need in the Dataview. If we click on the Orders object, the Dataview Configuration tab will look like:
Notice that the Customers columns are still in the Dataview, the Orders columns are available for selection, but no relationship between them has been defined yet.
If we click Create Relation in the Columns panel, the Create Relationship panel is displayed:
In order to create a relationship, do the following:
- Relationship Type - Click the type of relationship you want to use; in our example, it's a Left Join, which is selected by default.
- Relate To Object - Select the object we want to relate the Orders object to; we want to relate to Customers.
- Select "Left" Key Column - Select the column in Customers that we want to relate to one in Orders; we'll use CustomerID.
- Select "Right" Key Column - Select the column in Orders that we want to relate to the one we selected in Customers; we'll use CustomerID.
It's not necessary for this example, but we can create multiple key column pairs by clicking the "+" icon. Click the Save button to save the relationship.
The relationship description has been added to the top of the Columns panel, as shown above. To modify it, click the Join icon in the middle.
It's possible to create additional relationships between these and other objects by repeating the process described above.
Click the Save icon in the upper right-hand corner of the tab to save the Dataview with its relationship.
As you create Dataviews, they're represented in the Dataviews page with graphic "pills":
The collection of pills can be searched and filtered using the provided controls.
Each pill displays the Dataview name, its status ("In-Place" for a regular Dataview), its loaded record count ("### Records" if cached in DataHub) as shown above,
and includes a gear icon. Hovering your mouse cursor over the icon
displays a menu of management actions. The Delete option will only be included if you're the user who created the Dataview.