Clean Web Service v1

Version 1 of Clean Web Service is synchronous and can only be used to clean postal addresses. As such, the only available workflow is CleanEnrich.

Because Version 1 is synchronous, you only have to send one request. Once cleaning has completed, you receive a response containing your cleaned postal address records.

This section identifies the workflow which should be applied to the address records sent in a request.

SOAP snippet

<ns:workflow>CleanEnrich</ns:workflow>

Comments

This section must be used in every call. It informs Clean Web Service of the type of cleaning that should take place on each Record included in the request.

Requests and responses

<soap:envelope xmlns:soap="http://www.w3.org/2003/05/soap-envelope" xmlns:ns="http://www.qas.com/BulkWebService/2011-04" xmlns:arr="http://schemas.microsoft.com/2003/10/Serialization/Arrays">
  <soap:header xmlns:wsa="http://www.w3.org/2005/08/addressing">
    <wsa:action>http://www.qas.com/BulkWebService/2011-04/IBulkWebService/ProcessRecords</wsa:action>
    <wsa:to>https://api.experianmarketingservices.com/CleanWS/V1/BulkWebService.svc</wsa:to>
  </soap:header>
  <soap:body>
    <ns:processrecords>
      <ns:records>
        <ns:record>
          <ns:fields>
            <arr:string>REF001</arr:string>
            <arr:string>MetaFiction Ltd,14 Old St,London</arr:string>
          </ns:fields>
        </ns:record>
        <ns:record>
          <ns:fields>
            <arr:string>REF002</arr:string>
            <arr:string>10 Downing St,London</arr:string>
          </ns:fields>
        </ns:record>
      </ns:records>
      <ns:workflow>CleanEnrich</ns:workflow>
      <ns:settings>
        <ns:namedvalue>
          <ns:key>UserLabel</ns:key>
          <ns:value>CleanTest</ns:value>
        </ns:namedvalue>
        <ns:namedvalue>
          <ns:key>AddressDataSet</ns:key>
          <ns:value>GBR</ns:value>
        </ns:namedvalue>
        <ns:namedvalue>
          <ns:key>Layout</ns:key>
          <ns:value>StandardWithCoordinates</ns:value>
        </ns:namedvalue>
        <ns:namedvalue>
          <ns:key>InputFieldMappings</ns:key>
          <ns:value>Reference,Address</ns:value>
        </ns:namedvalue>
        <ns:namedvalue>
          <ns:key>OutputHeader</ns:key>
          <ns:value>True</ns:value>
        </ns:namedvalue>
        <ns:namedvalue>
          <ns:key>OutputMatchProfile</ns:key>
          <ns:value>True</ns:value>
        </ns:namedvalue>
        <ns:namedvalue>
          <ns:key>JobLocale</ns:key>
          <ns:value>en-GB</ns:value>
        </ns:namedvalue>
      </ns:settings>
    </ns:processrecords>
  </soap:body>
</soap:envelope>

Note that email address validation is only available in version 2 of Clean Web Service.

Requests to Clean Web Service use a standard SOAP envelope which declares a bulk web service namespace (the xmlns:ns attribute). SOAP headers should be empty; authorisation tokens are sent in the HTTP header.
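
As a rough illustration of sending the token in the HTTP header rather than in the SOAP header, the sketch below wraps an envelope in an HTTP POST without sending it. The endpoint and action URIs are copied from the sample request above; the header name `Auth-Token` and the function name are placeholders invented for this example, so check your account documentation for the real header name.

```python
import urllib.request

# Endpoint and action URIs as they appear in the sample request above.
ENDPOINT = "https://api.experianmarketingservices.com/CleanWS/V1/BulkWebService.svc"
ACTION = "http://www.qas.com/BulkWebService/2011-04/IBulkWebService/ProcessRecords"


def build_http_request(envelope_xml, token, token_header="Auth-Token"):
    """Wrap a SOAP 1.2 envelope in an HTTP POST request (nothing is sent).

    token_header is a hypothetical name used for illustration only.
    """
    headers = {
        # application/soap+xml is the SOAP 1.2 media type, matching the
        # soap-envelope namespace used in the sample.
        "Content-Type": 'application/soap+xml; charset=utf-8; action="%s"' % ACTION,
        token_header: token,
    }
    return urllib.request.Request(
        ENDPOINT,
        data=envelope_xml.encode("utf-8"),
        headers=headers,
        method="POST",
    )


request = build_http_request("<envelope/>", "my-secret-token")
```

Passing the returned object to `urllib.request.urlopen` would send it; that step is left out here so the sketch runs without network access.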

The contents of the request body are sent to Clean Web Service in a ns:ProcessRecords SOAP object. The following sections must be included:

ns:records

In this request, two Records are sent to Clean Web Service for processing. The Fields parameters each contain strings, identified as string objects.

In this example, the first field of each Record is a reference field. Including a reference field makes comparisons easier, and will allow us to easily insert the processed address back into our source database. To ensure that reference fields are not processed by Clean Web Service, we map them using the InputFieldMappings setting (see below).

The next field of each Record is an address field (again, we map the order of the fields using the InputFieldMappings setting).

Any blank fields are sent as empty string objects.
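
The records block described above could be generated along these lines. This is a minimal sketch using Python's standard library; the function name is invented for this example, and the element names are copied exactly as they appear in the sample request on this page. XML is case-sensitive, so confirm the casing against the service WSDL before relying on it.

```python
import xml.etree.ElementTree as ET

# Namespaces taken from the sample request.
NS = "http://www.qas.com/BulkWebService/2011-04"
ARR = "http://schemas.microsoft.com/2003/10/Serialization/Arrays"


def build_records(rows):
    """Return a records element containing one record per input row.

    Each row is a sequence of field values. A value of None is sent as
    an empty string object, as required for blank fields.
    """
    records = ET.Element("{%s}records" % NS)
    for row in rows:
        record = ET.SubElement(records, "{%s}record" % NS)
        fields = ET.SubElement(record, "{%s}fields" % NS)
        for value in row:
            field = ET.SubElement(fields, "{%s}string" % ARR)
            field.text = "" if value is None else str(value)
    return records


records = build_records([
    ("REF001", "MetaFiction Ltd,14 Old St,London"),
    ("REF002", None),  # blank address field, sent as an empty string
])
```

Keeping the reference value as the first item of every row means the same position can be declared once in the InputFieldMappings setting.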

ns:workflow

The workflow section specifies the CleanEnrich workflow, meaning that the address record should be matched against postal address records and cleaned. The results of the cleaning process can be seen in the Sample ProcessRecordsResponse.

ns:settings

The settings section contains several NamedValues, each made up of a Key / Value pair:

  • The UserLabel key assigns the label 'CleanTest' to this batch of records.
  • The AddressDataSet key tells Clean Web Service that GBR data should be used to match records.
  • The Layout key tells Clean Web Service to provide the cleaned addresses using the StandardWithCoordinates layout. This is a custom layout similar to Standard, except that it includes Eastings and Northings.
  • The InputFieldMappings key tells Clean Web Service that the first field in each row is for reference purposes only and that the second field contains address data. The order of components in this setting must correspond to the order of the fields in each Record block.
  • The OutputHeader key tells Clean to include an additional Record block at the beginning of the response (before the first cleaned record) which contains headers for each field in all subsequent Record blocks.
  • The OutputMatchProfile key tells Clean to include an additional string in each Record block which details the Match Profile for the address. For more information, see Match Profiles.
  • The JobLocale key tells Clean which language the results should be sent in. Only OutputHeader and OutputMatchProfile are affected by this setting. Supported values are en-US, en-GB and fr-FR.

Each of these keys is covered in more detail in the SOAP Method Reference section.
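
The settings listed above map naturally onto a dictionary. The following sketch (the function name is illustrative, and the element names are copied from the sample request, whose casing should be verified against the WSDL) turns such a dictionary into a settings block of Key / Value pairs.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the sample request.
NS = "http://www.qas.com/BulkWebService/2011-04"


def build_settings(settings):
    """Return a settings element with one namedvalue per dictionary entry."""
    root = ET.Element("{%s}settings" % NS)
    for key, value in settings.items():
        named = ET.SubElement(root, "{%s}namedvalue" % NS)
        ET.SubElement(named, "{%s}key" % NS).text = key
        # str(True) gives "True", the form used in the sample request.
        ET.SubElement(named, "{%s}value" % NS).text = str(value)
    return root


settings = build_settings({
    "UserLabel": "CleanTest",
    "AddressDataSet": "GBR",
    "Layout": "StandardWithCoordinates",
    # Must follow the field order of each record: reference, then address.
    "InputFieldMappings": "Reference,Address",
    "OutputHeader": True,
    "OutputMatchProfile": True,
    "JobLocale": "en-GB",
})
```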

<s:envelope xmlns:s="http://www.w3.org/2003/05/soap-envelope" xmlns:a="http://www.w3.org/2005/08/addressing" xmlns:u="http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-utility-1.0.xsd">
  <s:header>
    <a:action s:mustunderstand="1">http://www.qas.com/BulkWebService/2011-04/IBulkWebService/ProcessRecordsResponse</a:action>
  </s:header>
  <s:body>
    <processrecordsresponse xmlns="http://www.qas.com/BulkWebService/2011-04">
      <processrecordsresult xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
        <additionalinformation>
          <namedvalue>
            <key>TotalRecords</key>
            <value>2</value>
          </namedvalue>
          <namedvalue>
            <key>RecordsProcessed</key>
            <value>2</value>
          </namedvalue>
          <namedvalue>
            <key>RecordsSubmitted</key>
            <value>2</value>
          </namedvalue>
          <namedvalue>
            <key>ProcessingTime</key>
            <value>8328.3915ms</value>
          </namedvalue>
          <namedvalue>
            <key>BatchVersion</key>
            <value>7.30</value>
          </namedvalue>
          <namedvalue>
            <key>DatasetName</key>
            <value>GBR</value>
          </namedvalue>
          <namedvalue>
            <key>DataSetVintage</key>
            <value>21/02/2014</value>
          </namedvalue>
          <namedvalue>
            <key>CLEAN-GBR_RecordsCleaned</key>
            <value>2</value>
          </namedvalue>
          <namedvalue>
            <key>CLEAN-GBR_RecordsProcessed</key>
            <value>2</value>
          </namedvalue>
          <namedvalue>
            <key>CLEAN-GBR-GBRGRD_Dataplus_RecordsCleaned</key>
            <value>2</value>
          </namedvalue>
          <namedvalue>
            <key>CLEAN-GBR-GBRGRD_Dataplus_RecordsProcessed</key>
            <value>2</value>
          </namedvalue>
        </additionalinformation>
        <jobid>1225f4e3-f20f-4766-8334-9ad9b923aa41</jobid>
        <message>Records processed successfully.</message>
        <records>
          <record>
            <fields xmlns:b="http://schemas.microsoft.com/2003/10/Serialization/Arrays">
              <b:string>MatchProfile</b:string>
              <b:string>MatchCode</b:string>
              <b:string>ReferenceField</b:string>
              <b:string>CleanedAddress1</b:string>
              <b:string>CleanedAddress2</b:string>
              <b:string>CleanedAddress3</b:string>
              <b:string>CleanedAddress4</b:string>
              <b:string>CleanedAddress5</b:string>
              <b:string>CleanedAddress6</b:string>
              <b:string>CleanedAddress7</b:string>
            </fields>
          </record>
          <record>
            <fields xmlns:b="http://schemas.microsoft.com/2003/10/Serialization/Arrays">
              <b:string>Good Premise Partial</b:string>
              <b:string>P923004020220000100000000000:GBR</b:string>
              <b:string>REF001</b:string>
              <b:string>14-18 Old Street</b:string>
              <b:string></b:string>
              <b:string>LONDON</b:string>
              <b:string></b:string>
              <b:string>EC1V 9BH</b:string>
              <b:string>05320</b:string>
              <b:string>01822</b:string>
            </fields>
          </record>
          <record>
            <fields xmlns:b="http://schemas.microsoft.com/2003/10/Serialization/Arrays">
              <b:string>Good Premise Partial</b:string>
              <b:string>P923000000220000000000000000:GBR</b:string>
              <b:string>REF002</b:string>
              <b:string>10 Downing Street</b:string>
              <b:string></b:string>
              <b:string>LONDON</b:string>
              <b:string></b:string>
              <b:string>SW1A 2AA</b:string>
              <b:string>05300</b:string>
              <b:string>01799</b:string>
            </fields>
          </record>
        </records>
        <status>Completed</status>
      </processrecordsresult>
    </processrecordsresponse>
  </s:body>
</s:envelope>

Responses are sent from Clean Web Service in a ProcessRecordsResult SOAP object. The following sections are included:

AdditionalInformation

The first section of each response is AdditionalInformation, which contains information regarding the data used to process and clean the records. This information is contained within NamedValues, which are in turn made up of Key / Value pairs:

  • The DataSetVintage key details the age of the data which was used to match address records. Results of address cleaning may vary slightly depending on the vintage of data that is used, as newer vintages contain updated data. Clean Web Service always uses the latest data as soon as it is available.
  • The […]RecordsProcessed keys detail how many address records Clean Web Service was able to match against each cleaning process or postal address dataset. In the example above, we can see that our postal address records were successfully processed against the core GBR dataset (CLEAN-GBR) and the GBR grid reference dataset (CLEAN-GBR-GBRGRD). This corresponds to the AddressDataSet and Layout settings sent as part of the ProcessRecords request.
  • The […]RecordsCleaned keys detail how many records were successfully cleaned using each postal address dataset. In the example above, we can see that our postal address records were cleaned using both datasets which were used to process them: the GBR dataset, explicitly specified in the ProcessRecords request using the AddressDataSet setting, and the GBR-GBRGRD dataset and some of its DataPlus sets, implicitly included by the Layout setting which was specified in the ProcessRecords request. If any records could not be cleaned successfully against the GBR dataset, Clean Web Service would not attempt to enrich the record by appending GBR-GBRGRD DataPlus information to it. In this situation, neither the RecordsProcessed nor the RecordsCleaned values for any datasets would be incremented.
  • The TotalRecords, RecordsProcessed and RecordsSubmitted keys detail, respectively, the total number of records in the response, the number of records that were successfully processed, and the number of records that were originally sent in the corresponding request.
  • The ProcessingTime key details the time Clean took to process the request, in milliseconds.
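
The Key / Value pairs described above can be gathered into a dictionary for easier checking. The sketch below is one possible approach, not a prescribed client: the helper names are invented for this example, the sample XML is an abbreviated copy of the response shown earlier, and element names are compared case-insensitively as a defensive assumption about casing.

```python
import xml.etree.ElementTree as ET


def local(tag):
    """Strip the namespace from an element tag and lowercase it."""
    return tag.rsplit("}", 1)[-1].lower()


def additional_information(root):
    """Collect the AdditionalInformation Key / Value pairs into a dict."""
    info = {}
    for element in root.iter():
        if local(element.tag) != "additionalinformation":
            continue
        for named in element:
            pair = {local(child.tag): (child.text or "") for child in named}
            info[pair.get("key", "")] = pair.get("value", "")
    return info


# Abbreviated copy of the sample response.
SAMPLE = """
<processrecordsresult xmlns="http://www.qas.com/BulkWebService/2011-04">
  <additionalinformation>
    <namedvalue><key>RecordsSubmitted</key><value>2</value></namedvalue>
    <namedvalue><key>ProcessingTime</key><value>8328.3915ms</value></namedvalue>
    <namedvalue><key>CLEAN-GBR_RecordsCleaned</key><value>2</value></namedvalue>
  </additionalinformation>
</processrecordsresult>
"""

info = additional_information(ET.fromstring(SAMPLE))
# The sample value carries an "ms" suffix, so remove it before converting.
processing_ms = float(info["ProcessingTime"].replace("ms", ""))
# True when every submitted record was cleaned against the GBR dataset.
all_cleaned = int(info["CLEAN-GBR_RecordsCleaned"]) == int(info["RecordsSubmitted"])
```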

JobId

A unique JobId is included in every response. If you need to contact Experian regarding one of your jobs, this can be used for tracing purposes.

Message

The Message field contains general information about the completed job, including any errors that may have been encountered. In this example, we can see that all records were processed successfully.

Records

Each response contains cleaned Records – one for each Record you sent in a ProcessRecords request. Each Record's Fields are represented as string objects.

If you set the OutputHeader setting to True, the first Record will contain headers for each field in all subsequent Record blocks (for example, 'MatchProfile', 'ReferenceField', 'CleanedAddress1').
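
When OutputHeader is set to True, the header Record can be paired with each following Record to give named fields. The sketch below assumes that setting was used; the helper names are invented for this example and the sample XML is a shortened version of the response shown earlier.

```python
import xml.etree.ElementTree as ET


def local(tag):
    """Strip the namespace from an element tag and lowercase it."""
    return tag.rsplit("}", 1)[-1].lower()


def records_as_dicts(root):
    """Pair the header Record with every following Record.

    Assumes the request set OutputHeader to True, so that the first
    Record holds field names rather than data.
    """
    rows = [
        [(field.text or "") for field in record.iter() if local(field.tag) == "string"]
        for record in root.iter()
        if local(record.tag) == "record"
    ]
    if not rows:
        return []
    header, data = rows[0], rows[1:]
    return [dict(zip(header, row)) for row in data]


# Shortened version of the sample response (fewer address lines).
SAMPLE = """
<records xmlns="http://www.qas.com/BulkWebService/2011-04"
         xmlns:b="http://schemas.microsoft.com/2003/10/Serialization/Arrays">
  <record><fields>
    <b:string>MatchProfile</b:string><b:string>MatchCode</b:string>
    <b:string>ReferenceField</b:string><b:string>CleanedAddress1</b:string>
    <b:string>CleanedAddress2</b:string>
  </fields></record>
  <record><fields>
    <b:string>Good Premise Partial</b:string>
    <b:string>P923004020220000100000000000:GBR</b:string>
    <b:string>REF001</b:string><b:string>14-18 Old Street</b:string>
    <b:string></b:string>
  </fields></record>
</records>
"""

cleaned = records_as_dicts(ET.fromstring(SAMPLE))
```

Looking fields up by header name, for example `cleaned[0]["ReferenceField"]`, avoids hard-coding positions that change with the Layout and OutputMatchProfile settings.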

In all subsequent Record blocks in this example:

  • The first string is the address match profile (if this was requested using the OutputMatchProfile setting). Match profiles define Clean's level of success in terms of cleaning this specific postal address Record. Match profiles are closely related to match codes.
  • The next string is always the address match code. In the example above, each match code begins with P or R and ends with GBR, which tells us that Clean Web Service successfully matched the postal address records to addresses in the United Kingdom.
  • In this example, the next string is the reference field. If used, this is always the same as the reference field you supplied.
  • The next five strings (in this example) are separate address lines, organised according to the Layout setting specified in the request.
  • The next two strings contain two DataPlus elements, which are implicitly specified by the Layout setting of the request (in this case, these are the Easting and Northing coordinates of the addresses).
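
The positions listed above can be unpacked as follows. This sketch is tied to the example on this page: it assumes OutputMatchProfile is True, a single reference field, and the StandardWithCoordinates layout, so the last two strings are the coordinates. The class and function names are invented for illustration.

```python
from typing import List, NamedTuple


class CleanedRecord(NamedTuple):
    match_profile: str
    match_code: str
    dataset: str
    reference: str
    address_lines: List[str]
    easting: str
    northing: str


def unpack_record(fields):
    """Split one data Record into named parts, using this example's positions."""
    profile, raw_code, reference = fields[0], fields[1], fields[2]
    # The match code ends with a colon and the dataset identifier, e.g. ":GBR".
    code, _, dataset = raw_code.partition(":")
    return CleanedRecord(
        match_profile=profile,
        match_code=code,
        dataset=dataset,
        reference=reference,
        address_lines=list(fields[3:-2]),
        easting=fields[-2],
        northing=fields[-1],
    )


record = unpack_record([
    "Good Premise Partial",
    "P923004020220000100000000000:GBR",
    "REF001",
    "14-18 Old Street", "", "LONDON", "", "EC1V 9BH",
    "05320", "01822",
])
```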

Status

The Status field contains a short summary of the completed job. In this example, this tells us that our request was completed without errors or timeouts.
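
A client will usually want to check the JobId, Message and Status fields before using the records. The sketch below shows one way to do that. "Completed" is the only Status value shown on this page, so treating every other value as a failure is an assumption of this example, and the helper names are invented.

```python
import xml.etree.ElementTree as ET


def local(tag):
    """Strip the namespace from an element tag and lowercase it."""
    return tag.rsplit("}", 1)[-1].lower()


def job_summary(root):
    """Return the JobId, Message and Status fields of a response as a dict."""
    wanted = {"jobid", "message", "status"}
    return {
        local(element.tag): (element.text or "")
        for element in root.iter()
        if local(element.tag) in wanted
    }


# Abbreviated copy of the sample response.
SAMPLE = """
<processrecordsresult xmlns="http://www.qas.com/BulkWebService/2011-04">
  <jobid>1225f4e3-f20f-4766-8334-9ad9b923aa41</jobid>
  <message>Records processed successfully.</message>
  <status>Completed</status>
</processrecordsresult>
"""

summary = job_summary(ET.fromstring(SAMPLE))
if summary.get("status") != "Completed":
    # Quote the JobId so that the job can be traced if support is needed.
    raise RuntimeError(
        "Job %s did not complete: %s" % (summary.get("jobid"), summary.get("message"))
    )
```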