Data Quality user documentation

Data Catalog and Governance unlocks the ability to store and manage the metadata of your entire data estate in a single location available to your entire organization.

This Aperture Data Studio functionality is available through a licensed add-on module. It maps data quality results to business processes to determine the effectiveness of an organization’s governance program.

A well-designed catalog populated with high quality data helps create a common understanding of terms, ensure data standards, drive data literacy and ownership, govern data usage, evidence compliance with regulatory standards and achieve trust in your data assets.

As with the rest of Aperture Data Studio, the Data Catalog and Governance module is designed to be both powerful and easy to use, even for less technical business users. Each page features a primary call to action, ‘Suggest changes’, so that the best possible information can be crowdsourced from subject matter experts across your organization.

Key features

Data dictionary – connect directly to source systems and populate a dictionary with technical metadata that describes the physical data housed there.
Business glossary – capture common definitions and context relating to your data assets within a single location and map these to entries in the data dictionary to help drive greater data literacy across your organization.
Flexible data model – build out a Catalog modelled to match your industry and organization (rather than trying to fit into a fixed model) to ensure your users can immediately understand and begin contributing with little to no training.
Search and discovery – enable all users to browse metadata housed within the Catalog, search across the full system and answer their questions about the data estate, such as ‘What data is available on a topic?’, ‘Who is responsible for it?’ or ‘How do I request access?’.
Conformance monitoring – ensure data adheres to predefined rules and standards, maintaining quality and usability across systems. Choose what is important to capture (for example, that systems have owners, stewards are assigned per data domain, and risk levels are mapped for each data asset) and use this information to drive custom scoring of conformance, both at record level and in top-down progress dashboards.
Change management – use audit and approval functionality to control changes in line with business needs. Every edit in the Data Catalog is audited, with the option to trigger approvals where necessary so that changes are not automatically published to all users. All changes are recorded in an audit trail that is available for review, giving full visibility.
Collaboration and engagement – Roles can be explicitly assigned within the catalog to drive accountability and transparency (for example, over who is a data owner vs steward - using terminology that matches your organization).
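
As a rough illustration of the record-level scoring idea described under Conformance monitoring, the Python sketch below scores a catalog record by the share of required fields that are populated, then averages those scores for a top-down figure. The field names, the conformance_score and average_score functions and the equal weighting are all assumptions made for this example. They are not part of Aperture Data Studio, where scoring is configured within the Catalog itself.

```python
# Hypothetical required fields; in practice your Catalog admins decide
# what is important to capture.
REQUIRED_FIELDS = ("owner", "steward", "risk_level")


def conformance_score(record: dict, required=REQUIRED_FIELDS) -> float:
    """Return the fraction of required fields that hold a non-empty value."""
    if not required:
        return 1.0
    populated = sum(1 for name in required if record.get(name) not in (None, ""))
    return populated / len(required)


def average_score(records: list) -> float:
    """Roll record-level scores up into a single top-down figure."""
    if not records:
        return 0.0
    return sum(conformance_score(r) for r in records) / len(records)


# Illustrative records only; these are not real catalog entries.
systems = [
    {"name": "CRM", "owner": "A. Patel", "steward": "J. Kim", "risk_level": "High"},
    {"name": "Billing", "owner": "", "steward": "J. Kim", "risk_level": None},
]

for system in systems:
    print(system["name"], round(conformance_score(system), 2))
```

In this sketch the fully populated record scores 1.0 and the record missing an owner and a risk level scores about 0.33, which shows how gaps in governance metadata could surface in a dashboard.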

Prioritization and Reporting

The Catalog takes the Validation Rulesets and business rules you have already built in Aperture Data Studio and allows them to be mapped to your organization’s key processes, for example ‘Can we contact all customers (to meet regulations)?’ or ‘Are we able to deliver customer orders successfully?’.
You can map a cost against each of these processes. Together, the rule results and costs build out a Quality tab containing data quality trends over time for stakeholder reporting, and provide a consistent output that can be used to judge where your data quality efforts should be focussed to have the most impact.
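
To make the prioritization idea more concrete, the Python sketch below ranks processes by an estimated impact. It assumes, purely for illustration, that impact is the cost mapped to a process multiplied by the number of rows failing its rules. The ProcessQuality class, its field names and this formula are hypothetical and do not describe how the Quality tab is calculated.

```python
from dataclasses import dataclass


@dataclass
class ProcessQuality:
    name: str
    cost_per_failure: float  # hypothetical cost mapped to the process
    rows_checked: int
    rows_failed: int

    @property
    def pass_rate(self) -> float:
        """Share of checked rows that passed the mapped rules."""
        if self.rows_checked == 0:
            return 1.0
        return 1 - self.rows_failed / self.rows_checked

    @property
    def estimated_impact(self) -> float:
        """Assumed impact measure: cost per failure times failing rows."""
        return self.cost_per_failure * self.rows_failed


def prioritize(processes):
    """Order processes from highest to lowest estimated impact."""
    return sorted(processes, key=lambda p: p.estimated_impact, reverse=True)


# Illustrative figures only.
processes = [
    ProcessQuality("Contact all customers", 2.0, 1000, 150),
    ProcessQuality("Deliver customer orders", 25.0, 1000, 40),
]

ranked = [p.name for p in prioritize(processes)]
print(ranked)
```

Here the delivery process ranks first even though it has fewer failing rows, because each failure is assumed to cost more. This is the kind of trade-off that mapping costs to processes is intended to expose.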

User access and Permissions

Once your Aperture Data Studio license contains the Catalog and Governance add-on module (and your admin has migrated your repository database to Postgres) the Catalog will be available.

All users, even Consumer users, can:

access the Catalog using the button in the top-right corner of the application.
subscribe to items within the Catalog to register their interest and be alerted to changes that will keep them up to date with information relevant to their role and responsibilities.
use the Discussion tab to post new comments, reply to other users and ask the Owners for information they require.
suggest changes to outdated or incorrect information.
be made the Owner of an object and the Approver of any changes to an object.

Catalog administrators require both the user role permissions ‘Access Designer interface’ and ‘Manage Catalog model’ to be able to modify the Catalog to match your organization.

Using Workflows to manage the Catalog

The role permission ‘Bulk load Catalog data’ allows an API Key with ‘Create/update’ Catalog permissions to be created. This API Key allows a Workflow to both extract data from the Catalog and bulk load data into it.

Flexible data model

A governance catalog data model is a structured representation of how governance-related metadata is organized, stored, and managed within a data governance framework. It acts as the backbone for cataloging and enforcing governance policies across data assets.

Flexibility to adapt to your unique needs

If a goal of your governance program is to engage people and teams from across your entire organization, then it is vital that the application is intuitive and easy to use without training.
To reduce the amount your users need to learn, make your catalog an electronic version of your organizational structures. Users will already be familiar with the established terms, so if your organization is divided into ‘Business Units’, ‘Departments’ or ‘Divisions’, name the Object type accordingly. Similarly, if you refer to ‘Projects’, ‘Processes’ or ‘Initiatives’, use the same name in the catalog so users can easily navigate around.
Each Object type is defined by your Catalog admins to have a unique collection of Fields to capture precisely the defined metadata values you require.

Object types

Catalog admins can create as many Object types as required. Each Object type is a list of objects that map to elements of your organization, so that could be 100k ‘Columns’, 150 ‘RoPAs’, 4 ‘Priority levels’ or 250 ‘Departments’ etc. Object types are associated (linked) to each other, so ‘Car Manufacturers’ have ‘Car Models’ that are available in certain ‘Countries’, etc.

There are some default Object types in the Catalog. These are used to link Rulesets, which contain the business rules used to demonstrate improvements to the quality of your data over time, and External systems (databases), along with the metadata lineage of the data within these systems.

System Object types cannot be deleted, but they can be renamed (for example, if you want to call ‘Systems’ ‘Databases’ or ‘Rules’ ‘Business rules’) or hidden from users if they are not required.
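
The relationships described in this section can be pictured with a small Python sketch: Object types carry their own Fields, can be linked to each other, and system Object types can be renamed or hidden but not deleted. The ObjectType and CatalogModel classes and their methods are hypothetical names invented for this illustration. They are not an Aperture Data Studio API, and the real model is maintained by Catalog admins through the application.

```python
from dataclasses import dataclass, field


@dataclass
class ObjectType:
    name: str
    fields: list = field(default_factory=list)  # metadata Fields to capture
    is_system: bool = False                     # default (system) Object type
    hidden: bool = False                        # hidden from users if not required
    links: set = field(default_factory=set)     # names of associated Object types


class CatalogModel:
    """Minimal in-memory stand-in for a flexible catalog data model."""

    def __init__(self):
        self.types = {}

    def add(self, object_type: ObjectType) -> None:
        self.types[object_type.name] = object_type

    def link(self, a: str, b: str) -> None:
        """Associate two Object types with each other."""
        self.types[a].links.add(b)
        self.types[b].links.add(a)

    def rename(self, old: str, new: str) -> None:
        """Rename an Object type (allowed for system types too)."""
        obj = self.types.pop(old)
        obj.name = new
        self.types[new] = obj
        for other in self.types.values():
            if old in other.links:
                other.links.remove(old)
                other.links.add(new)

    def delete(self, name: str) -> None:
        """Delete an Object type unless it is a system type."""
        if self.types[name].is_system:
            raise ValueError(f"System Object type '{name}' cannot be deleted")
        del self.types[name]
        for other in self.types.values():
            other.links.discard(name)


model = CatalogModel()
model.add(ObjectType("Systems", is_system=True))
model.add(ObjectType("Car Manufacturers", fields=["Name", "Headquarters"]))
model.add(ObjectType("Car Models", fields=["Name", "Launch year"]))
model.link("Car Manufacturers", "Car Models")
model.rename("Systems", "Databases")  # system types can be renamed
```

Keeping the system flag separate from the name means that, in this sketch, a renamed system Object type such as ‘Databases’ is still protected from deletion.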
