Data Catalog and Governance unlocks the ability to store and manage the metadata of your entire data estate in a single location available to your entire organization.
A well-designed catalog populated with high-quality data helps create a common understanding of terms, ensure data standards, drive data literacy and ownership, govern data usage, evidence compliance with regulatory standards, and achieve trust in your data assets.
As with the rest of Aperture Data Studio, the Data Catalog and Governance module is designed to be both powerful and easy to use, even for less technical business users. Each page features a primary call to action, ‘Suggest changes’, so that the best possible information can be crowdsourced from subject matter experts across your organization.
The Catalog takes the Validation Rulesets and business rules you have already built in Aperture Data Studio and allows them to be mapped to your organization’s key processes, e.g. ‘can we contact all customers (to meet regulations)?’ or ‘are we able to deliver customer orders successfully?’.
You can map a cost against each of these processes. Combined with the mapped rules, this builds out a Quality tab that shows data quality trends over time for stakeholder reporting, and provides a consistent output for judging where your data quality efforts should be focussed to have the most impact.
Once your Aperture Data Studio license includes the Catalog and Governance add-on module (and your admin has migrated your repository database to Postgres), the Catalog will be available.
All users, even Consumer users, can:
Catalog administrators require both the user role permissions ‘Access Designer interface’ and ‘Manage Catalog model’ to be able to modify the Catalog to match your organization.
The role permission ‘Bulk load Catalog data’ allows an API Key with ‘Create/update’ Catalog permissions to be created. This API Key allows a Workflow both to extract data from the Catalog and to bulk load data into it.
A governance catalog data model is a structured representation of how governance-related metadata is organized, stored, and managed within a data governance framework. It acts as the backbone for cataloging and enforcing governance policies across data assets.
If a goal of your governance program is to engage people and teams from across your entire organization, then it is vital that the application is intuitive and easy to use without requiring training.
To reduce the amount your users need to learn, make your catalog an electronic version of your organizational structures. Users will already be familiar with the established terms, so if your organization calls them ‘Business Units’, ‘Departments’ or ‘Divisions’ etc., name the Object type accordingly. Similarly, if you call them ‘Projects’, ‘Processes’ or ‘Initiatives’ etc., use that name for the Object type in the Catalog so users can easily navigate around.
Each Object type is defined by your Catalog admins to have a unique collection of Fields to capture precisely the defined metadata values you require.
Catalog admins can create as many Object types as required. Each Object type is a list of objects that map to elements of your organization, so that could be 100k ‘Columns’, 150 ‘RoPAs’, 4 ‘Priority levels’ or 250 ‘Departments’ etc. Object types are associated (linked) to each other, so ‘Car Manufacturers’ have ‘Car Models’ that are available in certain ‘Countries’, etc.
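To make the relationship between Object types, Fields, objects and associations concrete, here is a minimal sketch in Python. It is an illustration only and not the Catalog's actual storage model or API: the `ObjectType` class, the `associate` helper and the Field names used are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ObjectType:
    """A Catalog Object type such as 'Departments' (hypothetical model)."""
    name: str
    fields: list[str]                                  # Fields defined by the Catalog admin
    objects: list[dict] = field(default_factory=list)  # the individual objects of this type
    links: set[str] = field(default_factory=set)       # names of associated Object types

    def add(self, **values) -> dict:
        """Add one object, rejecting values for Fields this type does not define."""
        unknown = set(values) - set(self.fields)
        if unknown:
            raise ValueError(f"Unknown fields for {self.name}: {sorted(unknown)}")
        self.objects.append(values)
        return values


def associate(a: ObjectType, b: ObjectType) -> None:
    """Associate (link) two Object types; the link is recorded on both sides."""
    a.links.add(b.name)
    b.links.add(a.name)


# Illustrative Object types and Field names (assumptions, not product defaults).
manufacturers = ObjectType("Car Manufacturers", ["Name", "Headquarters"])
models = ObjectType("Car Models", ["Name", "Segment"])
countries = ObjectType("Countries", ["Name", "ISO_code"])

associate(manufacturers, models)
associate(models, countries)

manufacturers.add(Name="Example Motors", Headquarters="Exampleville")
models.add(Name="Model X1", Segment="Hatchback")
```

In this sketch each Object type owns its own set of Fields, so adding a value for a Field that the type does not define raises an error, which mirrors the idea that each type captures precisely the metadata values you have defined for it.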
The Catalog includes some default Object types. These are used to link Rulesets, whose business rules demonstrate improvements to the quality of your data over time, and External systems (databases), together with the metadata lineage of the data within those systems.
System object types cannot be deleted, but you can rename them (for example, calling ‘Systems’ ‘Databases’, or ‘Rules’ ‘Business rules’) or hide them from users if they are not required.
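The rules for system object types can be summarized as: rename allowed, hide allowed, delete refused. The following Python sketch models that behaviour under stated assumptions. The `Catalog` and `CatalogObjectType` classes and their methods are hypothetical and are not part of Aperture Data Studio.

```python
from dataclasses import dataclass


@dataclass
class CatalogObjectType:
    name: str
    is_system: bool = False  # True for the default (system) object types
    hidden: bool = False     # hidden types are not shown to users


class Catalog:
    """Hypothetical in-memory model of the rename/hide/delete rules."""

    def __init__(self, types):
        self._types = {t.name: t for t in types}

    def rename(self, old: str, new: str) -> None:
        # Renaming is allowed for every type, including system types.
        obj_type = self._types.pop(old)
        obj_type.name = new
        self._types[new] = obj_type

    def hide(self, name: str) -> None:
        # Hiding removes a type from the user-facing list without deleting it.
        self._types[name].hidden = True

    def delete(self, name: str) -> None:
        # System object types cannot be deleted.
        if self._types[name].is_system:
            raise PermissionError(f"'{name}' is a system object type")
        del self._types[name]

    def visible_names(self) -> list[str]:
        return sorted(n for n, t in self._types.items() if not t.hidden)


catalog = Catalog([
    CatalogObjectType("Systems", is_system=True),
    CatalogObjectType("Rules", is_system=True),
    CatalogObjectType("Departments"),
])
catalog.rename("Systems", "Databases")  # rename a system type
catalog.hide("Rules")                   # hide a system type that is not required
print(catalog.visible_names())          # ['Databases', 'Departments']
```

Note that in this sketch a renamed system type stays protected: the `is_system` flag travels with the type, so attempting to delete ‘Databases’ still raises an error.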