A Catalog has complex inter-connected information with many different types of objects and associations.

Searching enables you to find a specific object(s) or you can browse and filter individual object lists. Advanced search takes this further, allowing you to:

  • Query across multiple associated object types to answer more complex questions.
  • Discover connections between datasets, policies and other object types.
  • Answer association-based questions that standard search cannot support.
  • Reduce time spent browsing and filtering multiple object lists.
  • Perform calculations on your data such as finding the sum, largest, or earliest, etc.

For example, you could identify all Data Assets that are linked to Policies owned by Fred Jones or calculate the number of Policies that each person owns.

Access Advanced search

To access Advanced search, click the Search magnifying glass button in the header bar, then the Advanced search button on the right-hand side of the search results page.

Write queries using Cypher

To use Advanced search in the Catalog, you need to write (or at least evaluate) Cypher queries. The Cypher query language is similar to SQL and is designed for connected data. It is highly intuitive, making it easy to read and write, even for users who are new to query languages. It is well documented with many examples and resources available online (so it is known to AI tools like ChatGPT, discussed below).

The cypher query has two main parts - the MATCH clause and the other clauses that follows it.

MATCH clause

The MATCH clause is the first line in the Cypher query. It identifies the different associated Object types that you want to search across in your data model. A good example is an Object type Columns that belongs to the Object type Tables, which are grouped into Schemas within a System. These associations or relationships can be visualized as:

system
  └── schema_system
        └── schema
              └── table_schema
                    └── table
                          └── column_table
                                └── column

The general format of the MATCH clause is (o:object_type)-[:association]-(ot:object_type).
Example:

    MATCH (s:system)-[:schema_system]-(sc:schema)-[:table_schema]-(t:table)-[:column_table]-(c:column)

Syntax key points:

  • The direction does not matter. You can start the query with column and end with system or vice versa – the result will be the same.

  • Object types are placed in round brackets () with a unique alias (could be the same value as the system label), followed by a colon, followed by the system label value of the Object type.

  • Object types are connected to associations with a hyphen character.

  • Associations are in square brackets. They don’t require an alias but do start with a colon, followed by the system label value of the association type.

  • Interest types are treated as associations that have a Party as the Object type.

  • OPTIONAL MATCH is used in the same way as a LEFT JOIN in SQL:

    To list all Policies that have an Owner:

        MATCH (p:policy)-[:owner]-(user:party) 
        RETURN p.name, user.name
    

    To list all Policies with their Owner, including those that do not yet have an Owner:

        MATCH (p:policy) 
        OPTIONAL MATCH (p)-[:owner]-(user:party) 
        RETURN p.name, user.name
    

Other clauses

The Cypher script following the MATCH clause is a flavor of SQL with the following clauses supported:

Supported clause Modifier Comment
RETURN AS Mandatory to be included in the script. Specifies the fields/column to be included in the result with any aliases.
WHERE CONTAINS, =, <, >, IS NULL, IS NOT NULL, STARTS WITH, ENDS WITH Equals, greater than, less than for numbers and dates. Can be combined using AND / OR / NOT (equivalent to <>).
LIMIT
ORDER BY ASC, DESC
COUNT, SUM, AVG, MIN, MAX
DISTINCT
CHARACTER_LENGTH (string) Size of a string.
COALESCE (value1, value2, …) Returns the first non-null value left to right from a list of values.
SUBSTRING (string, start, [length]) Returns a portion of a string given a starting position and optional length.
WITH Used to bind new variables, which can then be referenced further on in the query. Anything not explicitly referenced in the WITH clause will be unavailable further down the query.
Description Cypher query
Show all fields (columns) for the System named “Postgres DB”.
MATCH (s:system)
WHERE s.name = "Postgres DB"
RETURN s
Show the top 10 Tables by Column count along with the System each table belongs to.
MATCH (s:system)-[:schema_system]-(sc:schema)-[:table_schema]-(t:table)-[:column_table]-(c:column)
RETURN s.name AS SystemName, t.name AS TableName, COUNT(c) AS ColumnCount
ORDER BY ColumnCount DESC
LIMIT 10
Find all the Business Areas that have an Owner defined who is not "Fred Jones".
MATCH (ba:business_area)-[:owner]-(p:party)
WHERE p.name <> "Fred Jones"
RETURN ba.name AS Business_Area, p.name AS Owner
Show all Business Areas with their Owners and Followers, including Areas without an Owner yet defined.
MATCH (ba:business_area)
OPTIONAL MATCH (ba)-[:owner]-(p:party)
OPTIONAL MATCH (ba)-[:follower]-(pa:party)
RETURN ba.name AS Business_Area, p.name AS Owner, pa.name AS Follower
List the character count of Entity names that appear frequently (at least six times).
MATCH (e:entity)
WITH CHARACTER_LENGTH(e.name) AS name_length<br>COUNT(*) AS name_length_freq
WITH *
WHERE name_length_freq > 5
RETURN name_length, name_length_freq

Using generative AI as an assistant

You need to input the MATCH clause manually, since the external labels of Object types and associations are specific to your model. Once you have defined the MATCH clause, you can copy it into your preferred GenAI tool (Copilot, ChatGPT, Claude, etc.) to generate the remainder of your Cypher query. Then paste it back into the Advanced search box.

Example prompt for creating a Cypher script:

Write a Cypher query script to show the top 10 Tables by column count along with the System they belong to, using this match clause: MATCH (s:system)-[:schema_system]-(sc:schema)-[:table_schema]-(t:table)-[:column_table]-(c:column)
Another useful option is to paste the full Cypher script into your GenAI tool and ask it to explain it. This is helpful to validate that the query logic matches your requirements and will return the intended results.

Example prompt to explain your Cypher script:

Explain this Cypher query in a few sentences: MATCH (s:system)-[:schema_system]-(sc:schema)-[:table_schema]-(t:table)-[:column_table]-(c:column) RETURN s.name AS SystemName, t.name AS TableName, COUNT(c) AS ColumnCount ORDER BY ColumnCount DESC LIMIT 10

Example response:

This query finds systems, their schemas, tables within those schemas, then counts how many columns each table contains. It then sorts tables by highest column count and returns the top 10 tables along with the systems they belong to.