Use the step

Use this step to validate and enrich addresses in bulk using Experian Batch. The features available to you depend on your license.

If your data already has columns tagged as addresses, this step automatically picks them up and lists them as Selected columns.

To enrich valid addresses, choose one of the available Additional datasets. The additional datasets that are available to you will depend on your license.

Using Additional options you can specify how the validated addresses will be returned:

  • Output columns - This defines how a cleansed address is returned from the Validate addresses step: how many columns there are, which address elements go in each column, and what additional formatting (for example, casing or truncation) is applied.
    • Standard (7-column layout).
    • Component (28-column layout).
    • One custom layout can also be defined for each country.
  • Results columns - This defines what information is returned about how the address was cleansed. You can either return a simple summary of the cleansing result (Good Match, Unmatched, and so on) or a much more detailed breakdown of the match code.
    • Standard (returns the result code).
    • Detailed (returns the result code and additional metadata, including the match success and the confidence of the match).

When this step is used immediately before the Find duplicates step, the Generate find duplicates data option is selected by default. This speeds up the Find duplicates step by bypassing address standardization for addresses with a high-quality match.

Find out how to configure Experian Batch for the Validate addresses step.

Address validate cache

This setting defines the maximum number of address searches that can be stored in Data Studio's in-memory cache.

The default is 1 million. To change this, go to Configuration > Step settings > Validate addresses > Address validate cache.

The higher the value, the larger the cache: more memory is used, but more searches are saved, which improves the performance of the Validate addresses step whenever the same search is submitted again.
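
The trade-off described above can be illustrated with a minimal sketch of a bounded in-memory search cache. This is not Data Studio's implementation: the class name SearchCache, the key normalization, and the least-recently-used eviction policy are all assumptions made for illustration only.

```python
from collections import OrderedDict


class SearchCache:
    """Hypothetical bounded cache of address searches (illustration only)."""

    def __init__(self, max_entries=1_000_000):
        # max_entries plays the role of the Address validate cache setting.
        self.max_entries = max_entries
        self._entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def lookup(self, address, validate):
        # Assumed normalization: ignore case and repeated whitespace.
        key = " ".join(address.upper().split())
        if key in self._entries:
            self.hits += 1
            self._entries.move_to_end(key)  # mark as most recently used
            return self._entries[key]
        self.misses += 1
        result = validate(address)  # the expensive search
        self._entries[key] = result
        if len(self._entries) > self.max_entries:
            # Assumed policy: evict the least recently used search.
            self._entries.popitem(last=False)
        return result
```

A larger max_entries keeps more results in memory, so repeated searches are answered without calling validate again, at the cost of a bigger memory footprint.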

Result codes

An address cleansed in Data Studio returns one of the following results:

Validation result - Description

  • Verified Correct - Experian Batch verified the input address as a good-quality match to a complete address. No corrections or formatting changes were necessary.
  • Good Match - Experian Batch verified the input address as a good-quality match to a complete address, although minor corrections or formatting changes may have been applied.
  • Good Premise Partial - Experian Batch was not able to find a full match to a correct address, but found a good match to premise level by excluding organization or sub-premise details.
  • Tentative Match - Experian Batch found a match to a complete address, but the overall differences between the input and cleaned addresses are significant enough to reduce the confidence in the match.
  • Multiple Matches - Experian Batch found more than one correct address which matched the input address. This means that no single address could be matched with high confidence.
  • Poor Match - Experian Batch found a match to an address, but with low confidence. This often means that the cleaned address is not deliverable.
  • Partial Match - Experian Batch was unable to find a full correct address which matched the input address. This often occurs when the property number is missing from the input address.
  • Foreign Address - Experian Batch could not find a matching address because the input address referred to a different country.
  • Unmatched - Experian Batch was unable to match the input address to any correct address.
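
If the validated output is processed downstream, the results above could be grouped into handling categories. The sketch below is one possible approach, not an Experian recommendation: the accept/review/reject grouping and the function names are assumptions, and it assumes the result is returned as exactly the text shown above.

```python
from collections import Counter

# Assumed grouping, for illustration only; adjust to your own quality rules.
ACCEPT = {"Verified Correct", "Good Match"}
REVIEW = {
    "Good Premise Partial",
    "Tentative Match",
    "Multiple Matches",
    "Partial Match",
}
REJECT = {"Poor Match", "Foreign Address", "Unmatched"}


def triage(result):
    """Map a validation result to a hypothetical handling category."""
    if result in ACCEPT:
        return "accept"
    if result in REVIEW:
        return "review"
    if result in REJECT:
        return "reject"
    raise ValueError(f"Unknown validation result: {result!r}")


def summarize(results):
    """Count how many rows fall into each handling category."""
    return Counter(triage(r) for r in results)
```

For example, summarize(["Good Match", "Unmatched", "Partial Match"]) counts one row in each of the three categories.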

Logging progress

To get a more detailed view of how many addresses have been validated, you can customize how often an entry in the log will be created.

Go to Configuration > Step settings > Validate addresses and change the value of Address Validate Log Step Size to the required integer value. The default value (0) means that no progress entries are logged for the Validate addresses step. Changing the value to 10,000 means that a log entry is created after every ten thousand processed rows.
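
The effect of the step size can be sketched as follows. The function name progress_points is hypothetical, and the sketch assumes entries are written only at exact multiples of the step size; whether Data Studio also writes a final entry on completion is not stated here.

```python
def progress_points(total_rows, log_step_size):
    """Return the processed-row counts at which a log entry would be created."""
    if log_step_size <= 0:
        # A step size of 0 (the default) disables progress logging.
        return []
    return list(range(log_step_size, total_rows + 1, log_step_size))


# With 25,000 rows and a step size of 10,000, entries would be
# created after 10,000 and 20,000 processed rows.
points = progress_points(25_000, 10_000)
```

A smaller step size gives finer-grained progress at the cost of more log entries.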