Using the Zoomdata Scheduler
The Zoomdata Scheduler is a server component that asynchronously runs jobs to update the metadata for a data source. The Scheduler is integrated with the data connectors and supports the following types of jobs:
- Refreshing the data sources that are connected to Zoomdata (in other words, refreshing metadata and clearing the cache)
- Refreshing specific fields of a data source
The Zoomdata administrator and users with admin privileges can access the Scheduler on the Refresh Tab in each data source configuration.
Administrators can view the status of scheduled jobs in the Zoomdata Console, which is available on the Settings menu.
This topic covers the following:
- How the Scheduler Works
- Setting Up Data Source Refresh Job
- Refreshing the Data Source
- The Zoomdata Console
When you first connect Zoomdata to your data source, the following activities run automatically:
- A sampling of the data set is executed to determine distinct values for all field types set to Attribute and to determine the minimum and maximum values for all field types set to Number, Integer, and Money (used by a chart's Filter controls).
To refresh the lookup values and the minimum and maximum values for your fields, you can set the fields as refreshable. You can also refresh them manually by clicking refresh in the Statistics column for the corresponding field on the data source configuration's Fields Tab. This action starts an asynchronous job that determines the distinct values and the minimum and maximum range from the entire data set rather than from the sample.
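The difference between statistics taken from a sample and statistics taken from the entire data set can be illustrated with a minimal Python sketch. The function name `field_statistics` and the sample rows are hypothetical and only illustrate the idea; this is not how Zoomdata itself computes the values.

```python
from typing import Any


def field_statistics(
    rows: list[dict[str, Any]],
    attribute_fields: list[str],
    numeric_fields: list[str],
) -> dict[str, dict[str, Any]]:
    """Collect distinct values for attribute fields and min/max for numeric fields."""
    stats: dict[str, dict[str, Any]] = {}
    for name in attribute_fields:
        # Distinct values feed the lookup list used by filter controls.
        distinct = {row[name] for row in rows if row.get(name) is not None}
        stats[name] = {"distinct": sorted(distinct)}
    for name in numeric_fields:
        values = [row[name] for row in rows if row.get(name) is not None]
        if values:
            stats[name] = {"min": min(values), "max": max(values)}
        else:
            stats[name] = {"min": None, "max": None}
    return stats


# Hypothetical data: the first two rows stand in for the initial sample.
rows = [
    {"region": "East", "price": 12.5},
    {"region": "West", "price": 3.0},
    {"region": "East", "price": 40.0},
    {"region": "North", "price": 7.25},
]
sample_stats = field_statistics(rows[:2], ["region"], ["price"])
full_stats = field_statistics(rows, ["region"], ["price"])
```

In this sketch the sample misses the value North and the maximum price of 40.0, which is the kind of gap a refresh over the entire data set closes.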
Besides these initial activities, administrators can set the Scheduler to perform jobs related to the data sources. The following table identifies the jobs that are currently supported, the triggers for these jobs, and the activities that occur when each job is run.
|Data Source Refresh||
|Data Source Fields Refresh||By selecting the refresh options on the Fields Tab||
By default, when you define a data source configuration in Zoomdata, the No Schedule option is selected in the Schedule section of the Refresh Tab. With this option, the Scheduler runs only an initial data source refresh job after the source has been successfully created and saved.
To enable the Scheduler to run at predetermined points in time, perform the following steps:
Select the Periodically option.
Set the Start on date and time.
Select the time interval for the job to be run from the Runs list (which includes monthly, weekly, daily, or hourly). Depending on the option that you select in this list, corresponding options will be available in the Run every section:
Monthly - specify the interval (in months) at which the job runs. The job runs every M months counting from January (inclusive), where M is the value specified in the Run every field. For example, suppose your job starts on March 10, 2016 and is scheduled to run every 3 months. Counting every third month from January gives January, April, July, and October, so after the start date the job runs at the specified time in April, July, and October, then the following January, and so on.
Weekly - select the days of the week for the job to be run.
Daily - specify the time interval (in days from 1 to 31) for the job to be run.
The job runs every D days from the first day of the month (inclusive), where D is the value specified in the Run every field. The first job runs on the date and time you specified in the Start on field. For example, you set the job to start on March 10, 2016 at 5:00 AM and to run every five days. Counting every fifth day from the 1st gives the 1st, 6th, 11th, 16th, 21st, 26th, and 31st, so the next job runs on March 11 at 5:00 AM, and subsequent jobs run every fifth day at the specified time until the end of the month.
Hourly - specify the time interval in hours (1-23) and minutes (1-59) for the job to be run.
You can set the specific hour and minute for the initial job to run (in the Start on field). Then set the time interval for jobs to be run down to the hourly and minutes granularity (in the Run every field). For example, you can set your job to start on March 10, 2016 at 5:00 AM and to run every 3 hours and 20 minutes. The next job run will be at 8:20 AM and so on.
Your configuration summary displays in the Summary section.
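The Monthly, Daily, and Hourly rules described above can be sketched in Python to check which run times a given setting produces. The function names are hypothetical, and the sketch only mirrors the rules as stated in this topic; it is not the Scheduler's actual implementation.

```python
from datetime import datetime, timedelta


def monthly_run_months(every_m: int) -> list[int]:
    """Months (1-12) in which a job runs when counting every M months from January."""
    return list(range(1, 13, every_m))


def daily_run_days(every_d: int, days_in_month: int) -> list[int]:
    """Days of the month on which a job runs when counting every D days from the 1st."""
    return list(range(1, days_in_month + 1, every_d))


def hourly_runs(start: datetime, hours: int, minutes: int, count: int) -> list[datetime]:
    """First `count` run times for an hourly job, beginning at the Start on time."""
    step = timedelta(hours=hours, minutes=minutes)
    return [start + i * step for i in range(count)]


# Examples from this topic, with a start date of March 10, 2016 at 5:00 AM.
start = datetime(2016, 3, 10, 5, 0)
months = monthly_run_months(3)        # [1, 4, 7, 10]
days = daily_run_days(5, 31)          # [1, 6, 11, 16, 21, 26, 31]
next_day = next(d for d in days if d > start.day)  # 11
runs = hourly_runs(start, 3, 20, 3)   # 5:00, 8:20, and 11:40 on March 10
```

The computed values match the worked examples: a job set to every 3 months runs in April after a March start, a job set to every five days next runs on March 11, and a job set to every 3 hours and 20 minutes next runs at 8:20 AM.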
For more complicated update schedules, use the Advanced option to set cron expressions.
A cron expression sets a schedule using a string of six fields separated by spaces. The format for a cron expression is:
[seconds] [minutes] [hours] [days of the month] [months] [days of the week]
The standard values supported by each field (and by the Zoomdata Scheduler) are:
|Field||Allowed Values||Additional Characters|
|Seconds||0-59||, - * /|
|Minutes||0-59||, - * /|
|Hours||0-23||, - * /|
|Day of the month||1-31||, - * / ? L W|
|Month||1-12 or Jan-Dec||, - * /|
|Day of the week||1-7 or Sun-Sat||, - * / ? L W #|
When creating a cron expression, keep the following requirements in mind:
- Either ‘Day of the month’ or ‘Day of the week’ is needed, but not both; insert a question mark (?) as a placeholder for the one not specified.
- Names of the ‘Month’ and ‘Day of the week’ are not case sensitive; for example, ‘FRI’ and ‘fri’ are both acceptable formats.
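The six-field layout and the question mark requirement can be checked with a short Python sketch. The function `split_cron` is hypothetical and reflects only one reading of the requirements listed above (exactly one of the two day fields is ‘?’); it does not validate allowed values and is not the Scheduler's own validator.

```python
FIELD_NAMES = [
    "seconds",
    "minutes",
    "hours",
    "day of the month",
    "month",
    "day of the week",
]


def split_cron(expression: str) -> dict[str, str]:
    """Split a six-field cron expression and apply the '?' placeholder rule."""
    fields = expression.split()
    if len(fields) != len(FIELD_NAMES):
        raise ValueError(f"expected 6 fields, got {len(fields)}")
    named = dict(zip(FIELD_NAMES, fields))
    # Assumption: exactly one of the two day fields must be the '?' placeholder.
    day_fields = [named["day of the month"], named["day of the week"]]
    if day_fields.count("?") != 1:
        raise ValueError(
            "exactly one of 'day of the month' and 'day of the week' must be '?'"
        )
    return named


noon_daily = split_cron("0 0 12 * * ?")
mon_wed_fri = split_cron("0 45 20 ? * Mon,Wed,Fri")
```

Naming the fields this way makes it easier to see which position a value occupies before entering the expression in the Advanced option.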
|Special Character||What It Means|
|*||All values. Represents all the values within the specified field. For example, when used in the ‘Minutes’ field, the job runs every minute. Example: 0 * * * * ?|
|?||No specific value. Used as a placeholder when no value is needed in the field. For example, if you specify a ‘Day of the month’ value, enter ‘?’ in the ‘Day of the week’ field. Example: 0 0 0 1 6 ?|
|-||Range. Enter a range of values for the field using this symbol. For example, 3-6 in the ‘Hours’ field means the job runs at 3:00, 4:00, 5:00, and 6:00 a.m. Example: 0 0 3-6 * * ?|
|,||Comma. When a series of values is needed, use commas to list all the values for the field. For example, Wed,Thu,Fri in the ‘Day of the week’ field means the job runs on Wednesdays, Thursdays, and Fridays. Example: 0 0 0 ? * Wed,Thu,Fri|
|/||Forward slash. Specifies a starting value and an increment. For example, 0/5 in the ‘Minutes’ field means the job starts at minute 0 and runs every 5 minutes. Example: 0 0/5 * * * ?|
|L||Last. Used in two fields only: ‘Day of the month’ and ‘Day of the week’. For example, 5L in the ‘Day of the week’ field means the last Thursday of the month, because days are numbered 1-7 starting with Sunday. Example: 0 0 0 ? * 5L|
|W||Weekday. Used in two fields only: ‘Day of the month’ and ‘Day of the week’. Identifies the weekday closest to the given day. For example, 15W means the closest weekday to the 15th of the month: if the 15th is a Saturday, the job runs on Friday the 14th; if it is a Sunday, the job runs on Monday the 16th; if it is a weekday, the job runs on the 15th itself.|
|#||Number sign. Used only with the ‘Day of the week’ field; identifies a specific occurrence of a day within the month. For example, both Wed#2 and 4#2 identify the second Wednesday of the month. Example: 0 0 0 ? * Wed#2|
Examples of cron Expressions
|0 0 12 * * ?||Noon every day|
|0 30 20 ? * *||8:30 p.m. every night|
|0 0/10 17 * * ?||Every 10 minutes starting at 5 p.m. and ending at 5:50 p.m., every day|
|0 15-30 20 * * ?||Every minute starting at 8:15 p.m. and ending at 8:30 p.m., every day|
|0 45 20 ? * Mon,Wed,Fri||8:45 p.m. every Monday, Wednesday and Friday|
|0 0 20 3/3 * ?||8 p.m. every 3 days in every month, starting on the third day of the month|
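To see which values a single field in the examples above selects, the following Python sketch expands the numeric forms (`*`, ranges, comma lists, and start/increment). The function `expand_field` is hypothetical and deliberately limited: it does not handle names such as Mon, the characters ?, L, W, and #, or a range combined with an increment.

```python
def expand_field(field: str, low: int, high: int) -> list[int]:
    """Expand a numeric cron field into the sorted list of values it selects.

    `low` and `high` are the allowed bounds for the field, for example
    0 and 59 for minutes or 1 and 31 for the day of the month.
    """
    values: set[int] = set()
    for part in field.split(","):
        if "/" in part:
            # start/increment: begin at `start` and step up to the upper bound.
            start_text, step_text = part.split("/")
            start = low if start_text == "*" else int(start_text)
            values.update(range(start, high + 1, int(step_text)))
        elif part == "*":
            values.update(range(low, high + 1))
        elif "-" in part:
            first, last = (int(text) for text in part.split("-"))
            values.update(range(first, last + 1))
        else:
            values.add(int(part))
    return sorted(values)


every_ten_minutes = expand_field("0/10", 0, 59)  # [0, 10, 20, 30, 40, 50]
quarter_past_to_half = expand_field("15-30", 0, 59)
every_third_day = expand_field("3/3", 1, 31)
```

For example, the minutes field 0/10 combined with hour 17 yields runs from 5:00 p.m. through 5:50 p.m., and the day field 3/3 yields the 3rd, 6th, and so on through the 30th.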
You can select specific fields in your data source to be refreshed in the Configuration section of the Refresh Tab.
All the fields from your data source are listed in the Refresh Fields Metadata section. By default, only the fields of type Time are selected. If you want to refresh all the fields from your data source, click Select All in the heading of the Refreshable column. Otherwise, select the checkboxes for specific fields in the Refreshable column.
To update the entire list of fields in a data source configuration, access the Fields Tab and click Refresh Fields. This option differs from the functionality on the Refresh Tab because it is a manual refresh of the fields contained in the data source.
You can also refresh specific fields from your data source. Click the refresh button in the Statistics column for the field. The job begins immediately, and its status is shown in that cell.
Administrators can monitor jobs using the Console (which is located in the Settings menu). The Console automatically refreshes every 15 seconds.
The jobs (that is, the Job Names) are identified in the Console by the data source configuration name.
If you have scheduled many jobs, you can quickly filter by a specific job status: Upcoming, In Progress, or Finished. To return to the comprehensive list of all jobs, select Clear.
The Console provides the following details for jobs:
- Data Source - the name of the data source for which the job was created
- Status - the status of the job
- Last Finished - the date and time of the most recently executed job
- Next Run - the next scheduled run for the job
- Job History - opens a pop-up window showing all jobs that have been run for the data source
You can also sort the Jobs table by the following column headers:
- Data Source
- Job Type
- Last Finished
However, keep in mind that the sorting automatically resets to the default state every time the table is refreshed.
The Source Refresh window provides a historical view of the jobs that have been run for the selected data source. The information provided includes the job start time, the job finish time, and the job execution status. Use the quick filters to view the jobs with the In Progress or Finished status.
In the Status column, one of three values identifies the status of the most recently run job:
- COMPLETE: The job was successfully completed
- INCOMPLETE: The job has only been partially completed
For example, the minimum and maximum values were successfully refreshed, but the distinct values were not refreshed.
- FAILED: The job could not run or could not be completed due to some error in the system
For example, Zoomdata may be experiencing connection issues with the data source. Click the arrow to view the details on the issues that occurred while executing the job.