A Catalog has complex inter-connected information with many different types of objects and associations.

Searching enables you to find a specific object(s) or you can browse and filter individual object lists. Advanced search takes this further, allowing you to:

  • Query across multiple associated object types to answer more complex questions.
  • Discover connections between datasets, policies and other object types.
  • Answer association-based questions that standard search cannot support.
  • Reduce time spent browsing and filtering multiple object lists.
  • Perform calculations on your data such as finding the sum, largest, or earliest, etc.

For example, you could identify all Data Assets that are linked to Policies owned by Fred Jones or calculate the number of Policies that each person owns.

Access Advanced search

To access Advanced search, click the Search magnifying glass button in the header bar, then the Advanced search button on the right-hand side of the search results page.

Write queries using Cypher

To use Advanced search in the Catalog, you need to write (or at least evaluate) Cypher queries. The Cypher query language is similar to SQL and is designed for connected data. It is highly intuitive, making it easy to read and write, even for users who are new to query languages. It is well documented with many examples and resources available online (so it is known to AI tools like ChatGPT, discussed below).

The cypher query has two main parts - the MATCH clause and the other clauses that follows it.

MATCH clause

The MATCH clause is the first line in the Cypher query. It identifies the different associated Object types that you want to search across in your data model. A good example is an Object type Columns that belongs to the Object type Tables, which are grouped into Schemas within a System. These associations or relationships can be visualized as:

system
  └── schema_system
        └── schema
              └── table_schema
                    └── table
                          └── column_table
                                └── column

The general format of the MATCH clause is (o:object_type)-[:association]-(ot:object_type).
Example:

    MATCH (s:system)-[:schema_system]-(sc:schema)-[:table_schema]-(t:table)-[:column_table]-(c:column)

Syntax key points:

  • The direction does not matter. You can start the query with column and end with system or vice versa – the result will be the same.

  • Object types are placed in round brackets () with a unique alias (could be the same value as the system label), followed by a colon, followed by the system label value of the Object type.

  • Object types are connected to associations with a hyphen character.

  • Associations are in square brackets. They don’t require an alias but do start with a colon, followed by the system label value of the association type.

  • Interest types are treated as associations that have a Party as the Object type.

  • OPTIONAL MATCH is used in the same way as a LEFT JOIN in SQL:

    To list all Policies that have an Owner:

        MATCH (p:policy)-[:owner]-(user:party) 
        RETURN p.name, user.name
    

    To list all Policies with their Owner, including those that do not yet have an Owner:

        MATCH (p:policy) 
        OPTIONAL MATCH (p)-[:owner]-(user:party) 
        RETURN p.name, user.name
    

Other clauses

The Cypher script following the MATCH clause is a flavor of SQL with the following clauses supported:

Supported clause Modifier Comment
RETURN AS Mandatory to be included in the script. Specifies the fields/column to be included in the result with any aliases.
WHERE CONTAINS, =, <, >, IS NULL, IS NOT NULL, STARTS WITH, ENDS WITH Equals, greater than, less than for numbers and dates. Can be combined using AND / OR / NOT (equivalent to <>).
LIMIT
ORDER BY ASC, DESC
COUNT, SUM, AVG, MIN, MAX
DISTINCT
CHARACTER_LENGTH (string) Size of a string.
COALESCE (value1, value2, …) Returns the first non-null value left to right from a list of values.
SUBSTRING (string, start, [length]) Returns a portion of a string given a starting position and optional length.
WITH Used to bind new variables, which can then be referenced further on in the query. Anything not explicitly referenced in the WITH clause will be unavailable further down the query.
Description Cypher query
Show all fields (columns) for the System named “Postgres DB”.
MATCH (s:system)
WHERE s.name = "Postgres DB"
RETURN s
Show the top 10 Tables by Column count along with the System each table belongs to.
MATCH (s:system)-[:schema_system]-(sc:schema)-[:table_schema]-(t:table)-[:column_table]-(c:column)
RETURN s.name AS SystemName, t.name AS TableName, COUNT(c) AS ColumnCount
ORDER BY ColumnCount DESC
LIMIT 10
Find all the Business Areas that have an Owner defined who is not "Fred Jones".
MATCH (ba:business_area)-[:owner]-(p:party)
WHERE p.name <> "Fred Jones"
RETURN ba.name AS Business_Area, p.name AS Owner
Show all Business Areas with their Owners and Followers, including Areas without an Owner yet defined.
MATCH (ba:business_area)
OPTIONAL MATCH (ba)-[:owner]-(p:party)
OPTIONAL MATCH (ba)-[:follower]-(pa:party)
RETURN ba.name AS Business_Area, p.name AS Owner, pa.name AS Follower
List the character count of Entity names that appear frequently (at least six times).
MATCH (e:entity)
WITH CHARACTER_LENGTH(e.name) AS name_length<br>COUNT(*) AS name_length_freq
WITH *
WHERE name_length_freq > 5
RETURN name_length, name_length_freq

Under the useful tips panel on the Advanced search page, there is a link to download your Catalog model as a JSON file. This is extremely useful to identify the correct system label for each object type and associations or the exact name of a field being used in your query.

Using an external generative AI tool as an assistant

As Cypher scripting language is well-documented online, your preferred generative AI tool (Copilot, ChatGPT, Claude, etc.) can help you:

  • Write Cypher queries
  • Explain existing queries
  • Troubleshoot errors
  • Generate queries from your data model

Provide a MATCH clause

Since the MATCH clause contains system labels of Object types and associations specific to your model, include it in your prompt.

Example input prompts:

  • Write a Cypher query to show the top 10 tables by column count, along with the system they belong to, using this MATCH clause: MATCH (s:system)-[:schema_system]-(sc:schema)-[:table_schema]-(t:table)-[:column_table]-(c:column)

Upload your model as a JSON file

If you upload your model as a JSON file, the genAI tool can use it as a reference.

Example input prompts:

  • Use the uploaded JSON file to create a direction-agnostic Cypher MATCH clause from Systems to Columns
  • Use the uploaded JSON file to create a Cypher query that will show the top 10 Tables by column count along with the System they belong to
  • Use the uploaded JSON file to create a Cypher query to show the distinct data type of columns

Ask AI to explain a query

Example prompt:

  • Explain this Cypher query in a few sentences: MATCH (s:system)-[:schema_system]-(sc:schema)-[:table_schema]-(t:table)-[:column_table]-(c:column) RETURN s.name AS SystemName, t.name AS TableName, COUNT(c) AS ColumnCount ORDER BY ColumnCount DESC LIMIT 10

Example response:

  • This query finds systems, their schemas, tables within those schemas, then counts how many columns each table contains. It then sorts tables by highest column count and returns the top 10 tables along with the systems they belong to.

Validate AI-generated queries

Before running a generated query:

  • Review the query logic.
  • Ask the AI to explain how the query works.
  • Confirm that the object types and associations match your requirements.

Using your AI tool to explain the Cypher query will help you validate that it meets your requirements and will return the intended results.

Once you are confident the Cypher query is accurate, paste it back into the box on the Advanced search page and click Search to apply it to your data.