Any reports the business uses, whether regulatory reports for regular submission or reports run and used internally, can be defined in the ‘Reports’ page of Aperture Governance Studio. Any data fields which contribute to a report can then be allocated to the defined report. This can be found in Governance > Lineage > Reports:

You will be presented with the following screen:

Reports are defined on the left-hand side of the screen, while the report fields which make up that report are defined on the right-hand side of the screen.

To add a report, click on ‘Create Report’ in the top right-hand corner of the ‘Reports’ window.

  • Name: give the report a name
  • Description: describe the report’s function and what it is used for
  • Report Icons: either select an icon from the list (as shown above), or click on the ‘icon’ toggle, which will allow you to browse your internal files for an icon or image of your choice
  • Report Type: select a report type from the list

Remember to SAVE.

Once you have defined your report, you can add the necessary fields or attributes which make up that report.

To view, edit or delete a report you have defined, click on the ellipsis next to the report name and select an option.

To add an attribute, click on ‘Create Report Field’ in the top right-hand corner of the ‘Report Fields’ window. You will be presented with the following window:

  • Report: Select which report this data field feeds into from the list of reports already defined
  • Name: The name of the data field, e.g. Date
  • Description: give a brief description of what the data field is used for
  • Data Type: select a data type from the list (this should match the database data type)
  • Length: enter a length for the field (if applicable; this should match the database field length)
  • Format: enter a format for the field (if applicable)
Remember to SAVE.

To view, edit or delete a report field you have defined, click on the ellipsis next to the report field name and select an option.

The reports and attributes defined here can be used within the lineage functionality to create a visualisation.

Report Groups

These are groupings of records that provide the flexibility to apply Rules in specified circumstances only. After a Report Group has been created, Rules can be set to run only on the specified Report Group. This is done in the Rule settings.

For example, a Report Group could be created for ‘Commercial Properties’. A Report Group is specified via an editable SQL script, which in this case would select the records identified as commercial properties according to a certain logic. Any Rule that has ‘Commercial Properties’ set as its ‘Report Group’ will then run only on the records selected by that Report Group.
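
As a purely illustrative sketch (the Property table and the PropertyID and PropertyTypeCode columns below are assumptions for this example, not objects provided by the product), the Report Group code for ‘Commercial Properties’ might look something like:

    -- Illustrative sketch only: the table and column names are assumptions,
    -- not objects provided by the product.
    -- The script selects the record IDs that belong to the Report Group.
    SELECT p.PropertyID
    FROM   Property AS p
    WHERE  p.PropertyTypeCode = 'COMMERCIAL';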

To navigate to Report Groups, go to (Quality > Rules > Report Groups).

Clicking on a specific Report Group in the ‘Report Groups’ window will populate the ‘Rules’ window with all the Rules currently attached to it.

To create a new Report Group, simply click on the ‘Create Report Group’ button and you will be presented with the screen below.

  • Tags: Attach Tag (To be used in Tag Search)
  • Subject Area: Subject Area that contains records to be specified in the ‘Report Group’
  • Entity: Entity that contains records to be specified in the ‘Report Group’
  • Name: Name of Report Group
  • Description: Description of Report Group
  • Code: The SQL code that selects the records belonging to the Report Group

Group Hierarchy

Alongside the entity groups and extensions applied at the Entity, Rule and Rule Implementation layers, these groupings allow for a clear breakdown of errors within the Data Quality reports, each in its respective context. The best application of these features depends on the overall use case.

The figure describes the individual use cases of the rule groupings.

Batch Job (Rules execution)

A Batch Job is a scheduled task that executes all the active Rules. To set up a Batch Job, navigate to (System Admin > Task Scheduler) and you will be presented with a screen (shown below) showing all the ‘Scheduled Tasks’ that have already been set up.

Clicking on the ‘Schedule a Task’ button will prompt a window (shown below) where you can create a new Batch job.

  • Name: Name of the task.
  • Type: Type of the task being set up. Options include Data Quality Batch Job and Daily Workflow Email. When creating a new Batch Job, select the ‘Data Quality Batch Job’ item.
  • Description: Description of the task being set up. 
  • Schedule: The frequency (Daily, Weekly, Monthly) and the time, in UTC 24-hour format.

Report Setup (Quality Report Config)

To set up the Reports and Dashboards that will provide a detailed analysis of the data quality issues identified, navigate to (Quality Report Config > Report Config).

You will be presented with the window below, in which you will be required to select the ‘Entity’ whose report you would like to set up. A report is set up for each Entity.


After selecting an Entity, several sets of screens will appear (shown below). They include:

  1. Entity Settings
  2. Report Keys
  3. Report Steps
  4. Report Fields

1. Entity Settings

  • Entity: The Entity whose report is being set up.
  • Description: Description of Entity whose report is being set up.
  • Submitted: Will be set to ‘No’ in most cases. In special cases where the Entity has been specifically set up to utilize additional data submitted by the user, this would be set to ‘Yes’.

Click on the ‘Save Entity Settings’ button to save the changes made in the Entity Settings section so far.

2. Report Keys

These are key fields used to break down the data quality reports. Ideally, the ‘Keys’ added in this field are the primary keys of the tables being analyzed.

To add a new Key, simply click on the ‘Add Key’ button and fill in the details accordingly on the screen below:

  • Key Name: Name assigned to the Key.
  • Data Type: Data Type assigned to the Key.
  • Character Length: Length of the string. This setting is only available for some of the Data Types, e.g. Varchar.

3. Report Steps

This section details the various steps taken by back-end SQL scripts to produce the data quality error reports. These steps include:

  1. Cleardown: Clear previous results
  2. Preparation: Create the report tables, indexes and temp tables if necessary
  3. ReportGroups: Build group tables
  4. RecordCount: Count the number of records in the entity
  5. Rules: Execution of the Quality rules
  6. Summary: Build Report Summary Statistics
  7. Detail: Build detailed report with contextual data

Here the users with the necessary expertise can configure the steps to suit their reporting needs.

Each step that can be configured has the ‘>’ sign before the step title, for example ‘> Cleardown’.

To configure a step, click on the step title and a screen showing the code behind the step will be displayed.

In the screen below the ‘Cleardown’ step has been clicked on.

Here you can adjust the script accordingly, or restore the default script by clicking on the ‘Reset to Default’ button at the top right of the window.
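
As an indicative sketch only (the result table names below, such as DQ_Result_Detail and DQ_Result_Summary, are assumptions rather than the product’s actual generated objects), a Cleardown script typically just clears the previous run’s results before the new results are built:

    -- Sketch only: the table and column names are assumptions,
    -- not the product's actual generated objects.
    -- Clear the previous run's results for the entity being reported on.
    DELETE FROM DQ_Result_Detail  WHERE EntityName = 'Emergency Vehicle';
    DELETE FROM DQ_Result_Summary WHERE EntityName = 'Emergency Vehicle';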

The same applies to the other configurable steps shown on the Report Steps screen (Preparation, RecordCount, Summary, Detail).

4. Report Fields

This section contains the list of columns/fields that will be displayed in the data quality report. By default, the fields listed below should always be displayed.

  1. DateStamp: Date/Time at which the Rule was executed.
  2. RuleImplementation ID: Unique ID of the Rule Implementation executed.
  3. ID: Report Key field.
  4. ErrorTable: Table on which the Error is located.
  5. ErrorColumn: Column within the ErrorTable on which the Error is located.
  6. ErrorValue: The value within the ErrorColumn causing the error that has been flagged.
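
For illustration only, assuming the same hypothetical detail results table used above (DQ_Result_Detail, an assumption rather than a documented object), the default output corresponds to a query along these lines:

    -- Sketch only: the table name and exact internal column names are assumptions.
    SELECT DateStamp,             -- Date/Time the Rule was executed
           RuleImplementationID,  -- ID of the Rule Implementation that flagged the error
           ID,                    -- Report Key field
           ErrorTable,            -- table on which the error is located
           ErrorColumn,           -- column within the ErrorTable
           ErrorValue             -- the flagged value
    FROM   DQ_Result_Detail;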

To add new fields to be displayed in addition to the above fields, click on the ‘Add Field’ button located on the top right of the Report Fields screen. This will prompt the screen below.

  • Rule group: Select ‘Always Display’ to have the field always displayed in the output. If you link the field to a specific Rule group, the field will only be displayed when that Rule group is selected in the filter.
  • Internal Name: Actual field name as it appears in the Table. It should match exactly what is displayed in the Report Detail under the Report Steps section.
  • External Name: The user-friendly name that is the display title for the field.
  • Display: Select ‘Yes’ or ‘No’ to have the field displayed or not displayed in the output.
  • Order: The position among other fields in which the field is displayed in the output. For example, if it is set to ‘1’, it will be the first field displayed in the output.

Dashboard and Tabs

To access the various dashboards and gain a visual representation of the data quality issues identified, navigate to (Reports > Dashboards).

The dashboards are updated whenever a Batch Job is executed. 

The Dashboards are split into six sections/tabs:

  1. Overview Dashboard
  2. Entities
  3. Processes
  4. Ages
  5. Trajectory
  6. Counts

1. Overview Dashboard

Provides an overview analysis of the data quality. 

To filter the results based on ‘Subject Area’ and ‘Process Category’ simply enter them in the input boxes above.

  • Run Time: Duration of the Batch Job.
  • Entities: The figure next to the green circle shows the number of Entities that were successfully processed in the latest Batch Job run. The figure next to the red circle shows the number of Entities that failed to be processed.
  • Implementations: The figure next to the green circle shows the number of Rule Implementations that were successfully applied in the latest Batch Job run. The figure next to the red circle shows the number of Rule Implementations that failed to be applied.
  • % of Clean Records: Percentage of records with no errors, based on the Rules applied in the latest Batch Job.
  • Latest Record Count: Number of records analyzed in the latest Batch Job run.
  • Latest Error Count: Number of Errors identified in the latest Batch Job run.
  • % Of Checks Passed: Percentage of Rules that found no errors in the Latest Batch Job run.
  • Error Trajectory: Number of errors gained or reduced in comparison to a Batch Job run in the previous month. 
  • Highest Severity Errors: Number of errors found in the latest Batch Job run that were discovered as a result of Rules classified as ‘Critical’.
  • Last Quality Run: Date and time of the latest Batch Job run. 
  • Entities Chart: By default the chart displays the percentage of clean records found in each Entity.
  • Processes Chart: By default the chart displays the percentage of clean records linked to each Process.

The green bars indicate that the Entity or Process is in great condition with regard to data quality. Yellow bars indicate a good-to-average condition. Red bars indicate that the Entity or Process is in poor condition and may require more attention in the cleansing process.

To toggle between Percents & Counts being displayed on the charts, use the buttons (shown above) located at the top left of the charts.

To toggle between the Latest data and Trend data being displayed on the charts, use the buttons (shown above) located at the top right of the charts.

To toggle between the Top 10 and Bottom 10 Entities or Processes with regard to data quality, use the buttons (shown above) located at the bottom left of the charts.

2. Entities Dashboard

Provides analysis of the data quality broken down by Entities.

Navigate to (Reports > Dashboards > Entities Tab).

Use the ‘Subject Area’ section (shown above) to filter the results displayed on the charts to a specific Subject Area.

  • Entity Leaderboard: Displays Entities ranked by % of clean records. Clicking on a particular Entity (Bar) will adjust the displays on the right of the chart to display further information on the selected Entity. In this case (as shown above) the Entity selected is ‘Emergency Vehicle’.
  • 1,476 Records: The number of records belonging to the ‘Emergency Vehicle’ Entity: 1,359 records with no errors and 117 records with errors.
  • 94.95% Checks Passed: % of Rules applied on the ‘Emergency Vehicle’ Entity that found no errors.
  • 4.25 Day Average Age: The average period Errors remain in the data stream before they are cleansed.
  • Entities Over Time: A trend chart that displays the performance (% of clean records) of all Entities over a specified Time scale (The default time scale is a month).

To adjust the time period displayed on the chart, simply click accordingly on the settings (shown above) located above the chart key (1D = 1 Day, 7D = 7 Days, etc.).

When you hover over any data point, a tooltip (shown below) will pop up. It displays the time frame at the top and a list of scores (% of clean records) for all the Entities at that particular point in time.

3. Processes Dashboard

Provides analysis of the data quality broken down by Processes.

Navigate to Reports > Dashboards > Processes Tab

Use the ‘Process Category' section (shown above) to filter the results displayed on the charts to a specific Process category. 

  • Process Leaderboard: Displays the Processes ranked by % of clean records. Clicking on a particular Process (Bar) will adjust the displays on the right of the chart to display further information on the selected Process. In this case (as shown above) the selected Process is ‘Can We Safely Transport Patients?’.
  • £60,500.00 Process Cost: The total cost incurred as a result of errors linked to the ‘Can We Safely Transport Patients?’ process. Each Process has a cost assigned to it, so the total cost is essentially calculated as (Process cost × number of errors linked to that Process).
  • 1,476 Records: Number of records attached to the ‘Can We Safely Transport Patients?’ process.
  • Processes Over Time: A trend chart that displays the performance (% of clean records) of all Processes over a specified Time scale (The default time scale is a month).

To adjust the time period displayed on the chart, simply click accordingly on the settings (shown above) located above the chart key (1D = 1 Day, 7D = 7 Days, etc.).

When you hover over any data point, a tooltip (shown below) will pop up. It displays the time frame at the top and a list of scores (% of clean records) for all the Processes at that particular point in time.

4. Ages Dashboard

Provides analysis of the data quality broken down by Age (the amount of time the errors have been circulating in the system).

Navigate to (Reports > Dashboards > Ages Tab)

Age profile of current errors: The chart displays the error counts and their Age in days (the length of time they have circulated in the data stream before being cleansed) for each Entity. The series are split by Entities by default, but there is an option to split the series by another element using the drop-down (shown below) on the top left of the chart.

The Report Type of the chart can also be changed from displaying errors of ‘All Ages’ to only displaying the ‘Overdue errors’. This is done by adjusting the setting (shown below) on the top right of the chart. The Overdue errors are determined by the severity attached to them. Each severity type has a cleanse window set for it; for example, Critical severity errors have a cleanse window of ‘1 day’, while Medium severity errors have a cleanse window of ‘14 days’.

5. Trajectory

Provides analysis of the trajectory of the data quality over certain time periods (the time period is set to monthly by default).

Navigate to Reports > Dashboards > Trajectory Tab

  • Inflow & Outflow of Errors: The chart displays the errors that have been found and removed within a specific time frame (the default is set to a month). The red bars represent the inflow of errors (errors found) and the green bars represent the outflow of errors (errors removed).
  • 1,101,436 Inflow Errors: Total count of errors found within the last month.
  • 1,166,215 Outflow Errors: Total count of errors removed in the last month. 

Filters can be applied on the chart by clicking on the ‘Filters’ button (shown below) located on the top left of the chart.

6. Counts Dashboard

Provides analysis of the data quality in the form of raw error counts.

Navigate to (Reports > Dashboards > Counts Tab)

  • Latest Error Counts: Chart displaying the latest count of errors split by (grouped by) Entities. 
  • Error Counts Over Time: Chart displaying the error counts over a specific time period, split by (grouped by) Entities. The time period is set to Month by default and can be adjusted using the settings (shown below) located on the top right of the chart.

On both charts the series grouping is set to ‘Entity’ by default; this can be changed by clicking on the ‘Group by’ drop-down (shown below) below the tabs.

Reports

To navigate to the reports section, go to (Reports > Report).

This section provides summary and detailed reporting on data quality errors and affected records across the stored dataset. These reports can be used by data cleanse teams to target and resolve errors within the dataset.

There are several filters available in the ‘Filters’ window (shown below) located at the top of this section. These filters enable users to effectively navigate to and analyze the data quality errors they are specifically interested in.

To run the report after the filters have been set accordingly, click on the ‘Run Report’ button (shown below).

To clear all the filters and refresh the errors listed in the ‘Summary’ window, simply click on the ‘Clear Filters’ button (shown above) and then click on the ‘Run Report’ button.

Report Type (Rule Grouped and Record Grouped reports)

There is a set of buttons (shown below) located on the top right of the page that the user can use to toggle between the two report types (‘Rule Grouped’ & ‘Record Grouped’). After selecting a report type, you have to click on the ‘Run Report’ button for the change to take effect.

Rule Grouped

By default, the report type selected is ‘Rule Grouped’. This displays, in the ‘Summary’ window (shown below), a list of the Rule Implementations that picked up errors.

Essentially, the errors are grouped by Rule Implementation. When you click on a particular Rule Implementation in the ‘Summary’ window (shown above), a ‘Selected Detail’ window (shown below) will appear below the ‘Summary’ window, displaying all the errors linked to the selected Rule Implementation.

Record Grouped

This displays a list of records in the ‘Summary’ window (shown below) where errors have been found. The errors are essentially grouped by record; in this case, they are grouped by ID. This report type is ideal for identifying entities (e.g. Property, Customer, Tenancy) within the data that are problematic with regard to data quality.

When you click on a particular Record in the ‘Summary’ window (shown above), a ‘Selected Detail’ window (shown below) will appear below the ‘Summary’ window, displaying all the errors linked to the selected Record.