Configure Validate addresses

The Validate addresses step validates and enriches postal addresses using Experian Batch reference data, which is updated regularly. Before you can use the step, you must install the reference data and then configure it.

Install reference data

Install the datasets you have received to a location on the server where Data Studio is deployed. We recommend using Electronic Updates to download the latest version of the reference data automatically. Alternatively, you can download and update the reference data manually.

Configure reference data

Navigate to the addressValidate\runtime folder in the Validate addresses installation directory (by default C:\ProgramData\Experian on Windows) and edit the qawserve.ini file.

This file defines where the data files are located and how each dataset is mapped to a country.

Installed data directory

Under the [QADefault] section, add a line to the InstalledData setting specifying the location of the folder where the data is installed:
InstalledData={ISO},{Data Directory}

If you have more than one dataset, each one must be on its own line preceded by a '+' sign. For example:

InstalledData=GBR,C:\DataStudio\BatchData\GBR
+USA,C:\DataStudio\BatchData\USA
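To sanity-check an edited file, the InstalledData entries can be read back with a short script. The following is a minimal Python sketch, not part of the product: the function names parse_multiline_setting and installed_data are hypothetical, and it assumes that continuation lines start with '+' and directly follow the setting, as in the example above.

```python
def parse_multiline_setting(ini_text, key):
    """Collect the values of a setting whose extra entries sit on '+' lines."""
    values = []
    collecting = False
    for raw in ini_text.splitlines():
        line = raw.strip()
        if line.startswith(key + "="):
            # First entry sits directly after the '=' sign.
            values.append(line[len(key) + 1:])
            collecting = True
        elif collecting and line.startswith("+"):
            # Each further entry is on its own line, preceded by '+'.
            values.append(line[1:])
        else:
            # Assumption: any other line ends the setting.
            collecting = False
    return values


def installed_data(ini_text):
    """Map each ISO code to its data directory."""
    result = {}
    for entry in parse_multiline_setting(ini_text, "InstalledData"):
        iso, _, directory = entry.partition(",")
        result[iso.strip()] = directory.strip()
    return result


# Sample content copied from the example above (raw string keeps backslashes).
SAMPLE = r"""[QADefault]
InstalledData=GBR,C:\DataStudio\BatchData\GBR
+USA,C:\DataStudio\BatchData\USA
"""

for iso, directory in installed_data(SAMPLE).items():
    print(iso, "->", directory)
```

A check such as os.path.isdir on each returned directory could be added to confirm that the folders exist on the server.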

Data mapping

In the same section, add at least one line to the DataMappings setting to specify the datasets you wish to use:
DataMappings={data mapping identifier},{dataset/group name},{dataset+additional datasets}

The {data mapping identifier} must be a three-character alphanumeric code.

The {dataset/group name} element should be a meaningful name, because it is what appears in the reference data selection dropdown on the Validate addresses step.

You can add a data mapping for each dataset or combination of related datasets that you want to validate against. The first mapping must be directly after the '=' sign and each subsequent mapping must be on its own line preceded by a '+' sign. For example, suppose you have data for the UK, Australia and the USA, plus the UK Names and Business additional datasets, and you want to validate against different combinations of the UK datasets. You can use this setting as follows:

DataMappings=GBR,United Kingdom,GBR
+GBB,UK with Business,GBR+GBRBUS
+GBN,UK with Names,GBR+GBRNAM
+GB1,UK with Names and Business,GBR+GBRNAM+GBRBUS
+AUS,Australia,AUS
+USA,USA,USA
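The rules above (a three-character alphanumeric identifier, a display name, and one or more datasets joined by '+') can be checked before restarting the service. This is a minimal Python sketch under stated assumptions: parse_data_mappings is a hypothetical helper, it expects only the DataMappings lines as input, and it assumes the display name contains no comma.

```python
import re

# Identifier rule from the text: exactly three alphanumeric characters.
ID_PATTERN = re.compile(r"[A-Za-z0-9]{3}")


def parse_data_mappings(lines):
    """Parse DataMappings lines into (identifier, name, [datasets]) tuples."""
    mappings = []
    for index, raw in enumerate(lines):
        line = raw.strip()
        # The first mapping follows '='; later ones start with '+'.
        prefix = "DataMappings=" if index == 0 else "+"
        if not line.startswith(prefix):
            raise ValueError(f"line {index + 1} must start with {prefix!r}")
        parts = line[len(prefix):].split(",", 2)
        if len(parts) != 3:
            raise ValueError(f"line {index + 1} needs three comma-separated parts")
        identifier, name, datasets = parts
        if not ID_PATTERN.fullmatch(identifier):
            raise ValueError(f"invalid identifier {identifier!r}")
        mappings.append((identifier, name, datasets.split("+")))
    return mappings


# Lines copied from the example above.
EXAMPLE = [
    "DataMappings=GBR,United Kingdom,GBR",
    "+GBB,UK with Business,GBR+GBRBUS",
    "+GBN,UK with Names,GBR+GBRNAM",
    "+GB1,UK with Names and Business,GBR+GBRNAM+GBRBUS",
    "+AUS,Australia,AUS",
    "+USA,USA,USA",
]

for identifier, name, datasets in parse_data_mappings(EXAMPLE):
    print(identifier, name, datasets)
```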

You must also add a valid address layout for each mapping to the qaworld.ini file.

Additional settings for USA data

If you are using USA data, you must also specify the location of the supplementary USA Batch data and libraries.

In the [QADefault] section of the qawserve.ini file, set the path to the CorrectAddress data files using the CorrectADataLocUSA setting. For example:

CorrectADataLocUSA=C:\DataStudio\BatchData\USA\CorrectAddress\Data

Additionally, update the CorrectAApiLoc setting to point to the CorrectAddress library used for USA address matching. This supplementary library is usually installed alongside the CorrectAddress data. For example:

CorrectAApiLoc=C:\DataStudio\BatchData\USA\CorrectAddress\API

Additional settings for Canada data

If you are using Canada data, you must also specify the location of the supplementary Canada Batch data and libraries. In the same section of the qawserve.ini file, set the path using the CorrectADataLocCAN setting. For example:

CorrectADataLocCAN=C:\DataStudio\BatchData\CAN\CorrectAddress\Data

Additionally, update the CorrectAApiLoc setting to point to the CorrectAddress library used for Canada address matching. This supplementary library is usually installed alongside the CorrectAddress data. For example:

CorrectAApiLoc=C:\DataStudio\BatchData\USA\CorrectAddress\API
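The extra USA and Canada settings are easy to overlook. The sketch below shows one way to flag them; it is illustrative only. The missing_settings helper and the REQUIRED table are hypothetical, the ISO code CAN for Canada is an assumption based on the folder name in the example, and the settings are assumed to have already been read into a dictionary.

```python
# Supplementary settings named in the sections above, keyed by ISO code.
# Assumption: Canada data is installed under the ISO code "CAN".
REQUIRED = {
    "USA": ["CorrectADataLocUSA", "CorrectAApiLoc"],
    "CAN": ["CorrectADataLocCAN", "CorrectAApiLoc"],
}


def missing_settings(installed_isos, settings):
    """Return the supplementary settings that are absent or empty."""
    missing = []
    for iso in installed_isos:
        for name in REQUIRED.get(iso, []):
            # Report each setting once, even if two countries need it.
            if not settings.get(name) and name not in missing:
                missing.append(name)
    return missing


# Example: USA data installed, but only the data location has been set.
current = {"CorrectADataLocUSA": r"C:\DataStudio\BatchData\USA\CorrectAddress\Data"}
print(missing_settings(["GBR", "USA"], current))
```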

License data

Reference datasets are typically licensed using separate license keys. You can apply an Experian Batch license key through the Data Studio user interface, in the same way as a regular Data Studio update code: click your user icon, select Update license and enter your license keys. You must restart the Data Studio service for the changes to take effect.

Performance

By default, Validate addresses uses 8 threads to parallelize address cleaning. To change this, adjust the Maximum concurrent Address Validate searches setting in Settings > Performance. Reducing the number of concurrent searches lowers CPU load, but may increase the time taken to process your data.