Read release notes

August 12th, 2021

This maintenance release includes updates to dependent libraries for security purposes, as well as the underlying APIs of the Find duplicates step.

Note that this release is for the Windows installer only (the Linux build is unchanged).

Bug fixes
  • Fixes to Lookup functionality and concurrent workflow step execution that could sometimes cause steps to close prematurely
Known issues
  • Using Firefox, IE and Edge browsers might cause various UI issues and inconsistent behavior because of different implementations of the HTML5 standard. We therefore strongly recommend using Chrome to access Data Studio as currently, this is the only fully supported browser.
  • The Swagger UI page for the Find duplicates API (used as an integration guide for searching/maintenance operations) is currently showing incorrect sample request models for the search by fields and transactional add/update endpoints. See the tutorials for examples.
  • Blank rows can be returned in the Harmonize step when the chosen score column does not exclusively contain numeric values.
  • Non-ASCII characters and numeric values as column headers in source data are standardized incorrectly on load.
  • The Download as csv option in the Transform step doesn't reflect columns being moved/re-ordered until the step is re-opened.
  • Browser zooming may cause the bottom menu bar to be cut-off.
  • The Take snapshot step doesn't disable Show data when its input becomes invalid.
  • Deleting Metro2 and JSON sources which are already in use in workflows may fail to display the mapping dialog.
  • When configuring files loaded by a custom parser, complete and apply any parser-specific configuration before proceeding to configure columns and auto-tagging options. Otherwise, the configuration will be lost when changes are made which can alter the set of available columns.
  • Auto-tagging of data doesn't work for Excel files.
  • Workflow execution schedule is reset if the server is restarted before all the scheduled jobs have been run.
  • When using a trigger to replace a workflow's source, manual configuration of the original source file is not applied.
  • "Windows Defender SmartScreen prevented an unrecognized app from starting" warning is displayed on some Windows systems when executing the installer.
  • If SSL has been configured for the Find duplicates step, using the embedded server will not work. A workaround is to install a remote server instance.
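
Until the Harmonize issue listed above is resolved, one way to avoid blank rows is to check the score column before it reaches the step. The following is a minimal sketch of such a pre-check, run outside Data Studio on exported rows. The function name, the column names and the sample rows are all hypothetical, and nothing here is part of the Data Studio API.

```python
def non_numeric_rows(rows, score_column):
    """Return (row_index, value) pairs whose score cannot be parsed as a number."""
    bad = []
    for index, row in enumerate(rows):
        value = row.get(score_column)
        try:
            float(value)
        except (TypeError, ValueError):
            # TypeError covers missing values (None); ValueError covers text and blanks.
            bad.append((index, value))
    return bad


# Hypothetical sample data: two of the four scores are not numeric.
rows = [
    {"cluster_id": "1", "score": "0.9"},
    {"cluster_id": "1", "score": "n/a"},
    {"cluster_id": "2", "score": ""},
    {"cluster_id": "2", "score": "7"},
]
print(non_numeric_rows(rows, "score"))  # [(1, 'n/a'), (2, '')]
```

Rows reported by the check can be corrected or filtered out before the score column is selected in the Harmonize step.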

September 25th, 2020

This maintenance release adds email validation to two-factor authentication, a fix to download authentication and updates to dependent libraries for security purposes.

Note that this release is for the Windows installer only (the Linux build is unchanged).

New features
  • Hosted Installations: Email as secondary authentication method
Bug fixes
  • Reference the Experian Business Privacy Policy in the UI
  • File Download Issue - Recurring Authentication
Known issues
  • Using Firefox, IE and Edge browsers might cause various UI issues and inconsistent behavior because of different implementations of the HTML5 standard. We therefore strongly recommend using Chrome to access Data Studio as currently, this is the only fully supported browser.
  • The Swagger UI page for the Find duplicates API (used as an integration guide for searching/maintenance operations) is currently showing incorrect sample request models for the search by fields and transactional add/update endpoints. See the tutorials for examples.
  • Blank rows can be returned in the Harmonize step when the chosen score column does not exclusively contain numeric values.
  • Non-ASCII characters and numeric values as column headers in source data are standardized incorrectly on load.
  • The Download as csv option in the Transform step doesn't reflect columns being moved/re-ordered until the step is re-opened.
  • Browser zooming may cause the bottom menu bar to be cut-off.
  • The Take snapshot step doesn't disable Show data when its input becomes invalid.
  • Deleting Metro2 and JSON sources which are already in use in workflows may fail to display the mapping dialog.
  • When configuring files loaded by a custom parser, complete and apply any parser-specific configuration before proceeding to configure columns and auto-tagging options. Otherwise, the configuration will be lost when changes are made which can alter the set of available columns.
  • Auto-tagging of data doesn't work for Excel files.
  • Workflow execution schedule is reset if the server is restarted before all the scheduled jobs have been run.
  • When using a trigger to replace a workflow's source, manual configuration of the original source file is not applied.
  • "Windows Defender SmartScreen prevented an unrecognized app from starting" warning is displayed on some Windows systems when executing the installer.
  • If SSL has been configured for the Find duplicates step, using the embedded server will not work. A workaround is to install a remote server instance.

May 26, 2020

This maintenance release resolves several issues with workflow execution and performance, and makes updates to many workflow step libraries for security purposes.

Note that this release is for the Windows installer only (the Linux build is unchanged).

Download Data Studio 1.6.3.

Download ODBC drivers 1.6.3.

New features
  • Find duplicates server version updated to v3.2.7
  • Standardize server version updated to v4.7.2
  • Address validation engine updated to QAS Batch v7.89
  • Updated Validate phone numbers libraries for more accurate phone validation
  • Added support for OpenJDK v8
Bug fixes
  • Workflows with several JDBC export steps no longer cause execution errors
  • Fixed width text files can be configured (with associated .ddl) to have no delimiter
  • Executing and editing large workflows no longer leads, in infrequent cases, to memory exhaustion and associated errors
  • Memory consumption improvements when indexing, profiling and exporting
  • Group step improvements to deal with complex aggregates and aggregates returning incorrect values after source changes
  • Newly defined constants no longer cause existing validation rules to disappear
  • Filters/expressions will remain visible even if they contain missing constants or columns
  • Branch/Join/Split/Validate steps will now output correct columns when input source columns are modified
  • Fixed complex workflows where a join step will occasionally produce no rows
  • Imported workflows containing Validate addresses step no longer lose their country configuration
  • Concurrent Validate addresses steps no longer conflict with one another
  • Harmonize step results now updated if cluster ID column is previously filtered
  • Harmonize step no longer unresponsive if source contains no rows
  • Snapshots referenced in workflows can now be deleted successfully
  • User display names can now include commas
Known issues
  • Using Firefox, IE and Edge browsers might cause various UI issues and inconsistent behavior because of different implementations of the HTML5 standard. We therefore strongly recommend using Chrome to access Data Studio as currently, this is the only fully supported browser.
  • The Swagger UI page for the Find duplicates API (used as an integration guide for searching/maintenance operations) is currently showing incorrect sample request models for the search by fields and transactional add/update endpoints. See the tutorials for examples.
  • Blank rows can be returned in the Harmonize step when the chosen score column does not exclusively contain numeric values.
  • Non-ASCII characters and numeric values as column headers in source data are standardized incorrectly on load.
  • The Download as csv option in the Transform step doesn't reflect columns being moved/re-ordered until the step is re-opened.
  • Browser zooming may cause the bottom menu bar to be cut-off.
  • The Take snapshot step doesn't disable Show data when its input becomes invalid.
  • Deleting Metro2 and JSON sources which are already in use in workflows may fail to display the mapping dialog.
  • When configuring files loaded by a custom parser, complete and apply any parser-specific configuration before proceeding to configure columns and auto-tagging options. Otherwise, the configuration will be lost when changes are made which can alter the set of available columns.
  • Auto-tagging of data doesn't work for Excel files.
  • Workflow execution schedule is reset if the server is restarted before all the scheduled jobs have been run.
  • When using a trigger to replace a workflow's source, manual configuration of the original source file is not applied.
  • "Windows Defender SmartScreen prevented an unrecognized app from starting" warning is displayed on some Windows systems when executing the installer.
  • If SSL has been configured for the Find duplicates step, using the embedded server will not work. A workaround is to install a remote server instance.

Nov 28, 2019

This maintenance release resolves several issues with workflow execution using the Data Studio REST API.

Note that this release is for the Windows installer only (the Linux build is unchanged).

Download Data Studio 1.6.2.

Download ODBC drivers 1.6.2.

New features
  • New server settings provide the ability to disable connectivity and permissions checks on a JDBC source when using a loaded data table.
Bug fixes
  • Executing a workflow with the Use Latest Snapshot or Use Snapshot Range as a source using the REST API no longer causes the 'workflow is incomplete' error.
  • REST API workflow execution no longer fails if a workflow contains a Chart step.
Known issues
  • Using Firefox, IE and Edge browsers might cause various UI issues and inconsistent behavior because of different implementations of the HTML5 standard. We therefore strongly recommend using Chrome to access Data Studio as currently, this is the only fully supported browser.
  • The Swagger UI page for the Find duplicates API (used as an integration guide for searching/maintenance operations) is currently showing incorrect sample request models for the search by fields and transactional add/update endpoints. See the tutorials for examples.
  • Blank rows can be returned in the Harmonize step when the chosen score column does not exclusively contain numeric values.
  • Non-ASCII characters and numeric values as column headers in source data are standardized incorrectly on load.
  • The Download as csv option in the Transform step doesn't reflect columns being moved/re-ordered until the step is re-opened.
  • Browser zooming may cause the bottom menu bar to be cut-off.
  • The Take snapshot step doesn't disable Show data when its input becomes invalid.
  • Deleting Metro2 and JSON sources which are already in use in workflows may fail to display the mapping dialog.
  • When configuring files loaded by a custom parser, complete and apply any parser-specific configuration before proceeding to configure columns and auto-tagging options. Otherwise, the configuration will be lost when changes are made which can alter the set of available columns.
  • Auto-tagging of data doesn't work for Excel files.
  • Workflow execution schedule is reset if the server is restarted before all the scheduled jobs have been run.
  • When using a trigger to replace a workflow's source, manual configuration of the original source file is not applied.
  • "Windows Defender SmartScreen prevented an unrecognized app from starting" warning is displayed on some Windows systems when executing the installer.
  • Workflows containing multiple JDBC export steps can error during execution in some scenarios.
  • If SSL has been configured for the Find duplicates step, using the embedded server will not work. A workaround is to install a remote server instance.

Nov 1, 2019

This maintenance release resolves several issues with memory usage and locking at high load as well as a range of issues in the workflow builder, workflow step and data connections.

Note that this release is for the Windows installer only (the Linux build is unchanged).

Download Data Studio 1.6.1.

Download ODBC drivers 1.6.1.

New features
  • The official SAP HANA JDBC driver is now supported.
  • Updated the Find duplicates server version to v3.2.2.
Bug fixes
  • Workflows:
    • The Branch step now always cascades the correct columns when the source is replaced in a workflow.
    • A lookup function with multiple lookup columns no longer returns "Lookup input not found" when defining one lookup value as a constant.
    • Connections directly from a Branch step to a Transform step's lookup node no longer breaks during editing.
    • Validation results from a rule defined in a Transform step no longer change when a Branch step is added between the two steps.
    • Aggregates no longer return zeros when columns are hidden or removed in a preceding Transform step.
  • Find duplicates step:
    • Find duplicates will no longer return different match statuses within the same cluster.
    • Large clusters are now processed using a lower memory footprint.
  • Profiling:
    • Re-profiling a source after overwriting a file now correctly updates profile results.
    • Profiling very large volumes of unique hash records now completes correctly.
    • Quick-filtering on a sorted values drilldown in Data Explorer's profile view no longer locks the UI.
  • Data connections:
    • Intermittent database (JDBC) connections can no longer cause tables to be temporarily removed from the data source's list of tables.
    • Amazon S3 connections can now use HTTPS.
    • Workflow executions against table sources with insufficient DB permissions are handled correctly.
  • Error handling:
    • Workflows executed via the REST API now fail as expected when the Validate addresses step's configuration is invalid.
    • If the GDQ Standardize service is not running, a workflow including the Find duplicates step will now execute successfully.
Known issues
  • Using Firefox, IE and Edge browsers might cause various UI issues and inconsistent behavior because of different implementations of the HTML5 standard. We therefore strongly recommend using Chrome to access Data Studio as currently, this is the only fully supported browser.
  • The Swagger UI page for the Find duplicates API (used as an integration guide for searching/maintenance operations) is currently showing incorrect sample request models for the search by fields and transactional add/update endpoints. See the tutorials for examples.
  • Blank rows can be returned in the Harmonize step when the chosen score column does not exclusively contain numeric values.
  • Non-ASCII characters and numeric values as column headers in source data are standardized incorrectly on load.
  • The Download as csv option in the Transform step doesn't reflect columns being moved/re-ordered until the step is re-opened.
  • Browser zooming may cause the bottom menu bar to be cut-off.
  • The Take snapshot step doesn't disable Show data when its input becomes invalid.
  • Deleting Metro2 and JSON sources which are already in use in workflows may fail to display the mapping dialog.
  • When configuring files loaded by a custom parser, complete and apply any parser-specific configuration before proceeding to configure columns and auto-tagging options. Otherwise, the configuration will be lost when changes are made which can alter the set of available columns.
  • Auto-tagging of data doesn't work for Excel files.
  • Workflow execution schedule is reset if the server is restarted before all the scheduled jobs have been run.
  • When using a trigger to replace a workflow's source, manual configuration of the original source file is not applied.
  • "Windows Defender SmartScreen prevented an unrecognized app from starting" warning is displayed on some Windows systems when executing the installer.
  • Workflows containing multiple JDBC export steps can error during execution in some scenarios.
  • Workflows with a Use Latest Snapshot or Use Snapshot Range step as a source can't be executed using the REST API (the 'workflow is incomplete' error is shown instead).

Sept 9, 2019

This release introduces new features, existing functionality enhancements, performance improvements and a number of bug fixes.

Download Data Studio 1.6.0.

Download ODBC drivers 1.6.0.

New features
  • A number of enhancements to the Validate addresses step:
    • Ability to select predefined custom layouts for output columns, giving you more control over the format in which cleaned addresses are returned;
    • Availability of new enrichment data including:
      • Australia Customer Barcode and various propensity, predictor and geographical data sets,
      • United Kingdom data for sets including Just Built, Not Yet Built and the Multiple Residence file,
      • United States Tigerline Co-ordinates, and various barcode sets,
      • A variety of data for Australia G-NAF, New Zealand, Singapore and UK Addressbase. See data guides for more information.
  • Various enhancements to the Find duplicates step and the Find duplicates real-time search and transactional API:
    • To increase security, you can now encrypt the duplicate store,
    • The default country used in the address standardization process can now be defined in the rules when a country is not explicitly supplied as input,
    • The real-time search and transactional API now returns additional information on duplicate stores and their settings, and on cluster auditing. Note that the Find duplicates server has been updated to v3.2.1.
  • Ability to manage the maximum number of snapshot versions using the Take snapshot step.
  • A new setting allowing you to automatically profile a data source on load.
  • A trigger file can now optionally be used as the new source in the triggered workflow.
  • The Download as .CSV option can now be hidden from the product interface.
  • A variety of SDK enhancements and bug fixes, giving custom step developers more of the tools needed to create powerful and feature-rich steps:
    • Introduced a new method to retrieve the datatype of a column,
    • Enhanced the SDK for custom parsers to prevent duplicate table/sub-table names when loading,
    • Improved the sample custom steps. Find out more.
Bug fixes
  • The Join step now handles empty input correctly and rebuilds caches following all configuration changes to avoid stale results being displayed.
  • Joining on the Find duplicates step's Cluster ID now returns expected results.
  • It's now possible to configure how Data Studio handles exporting of epoch datetime (01-01-1970 00:00:00) to a DBMS: users can choose whether to convert to NULL.
  • Editing a duplicated workflow no longer affects the original.
  • Improved column manipulation in Preview and Configure:
    • It's now possible to remove columns from fixed width files without disordering data,
    • Columns in preview are updated when the delimiter is changed,
    • Excluding columns in preview displays correct data for load.
  • Removing columns from a snapshot between one version to the next no longer shows unmapped columns.
  • The Dependency analysis conflicts view now correctly shows only true conflicts.
  • A newly created integer column can be used as a measure for the Validate step.
  • A function can now be used to define a lookup's value (both default and not).
  • Changes to column data types and type standardization settings are now applied consistently.
  • Data types are now picked up correctly for columns from .xlsx files.
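
The epoch datetime export option mentioned in the fixes above can be pictured with a small sketch. This only illustrates the described behavior (optionally converting 01-01-1970 00:00:00 to NULL before writing to a DBMS). It is not Data Studio's implementation, and the function name and the `epoch_as_null` flag are hypothetical.

```python
from datetime import datetime

# The epoch datetime referred to in the release note (naive, no time zone).
EPOCH = datetime(1970, 1, 1, 0, 0, 0)


def prepare_for_export(value, epoch_as_null=True):
    """Return None (SQL NULL) for the epoch datetime when the option is on."""
    if epoch_as_null and isinstance(value, datetime) and value == EPOCH:
        return None
    return value


print(prepare_for_export(EPOCH))                       # None
print(prepare_for_export(EPOCH, epoch_as_null=False))  # 1970-01-01 00:00:00
```

With the flag off, the epoch value is exported unchanged, which matches the choice described above.
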
Known issues
  • Intermittent database (JDBC) connections can cause tables to be temporarily removed from the data source's list of tables.
  • In the workflow builder, connections directly from a Branch step to a Transform step's lookup node can be deleted unexpectedly, causing the workflow to become invalid.
  • Using Firefox, IE and Edge browsers might cause various UI issues and inconsistent behavior because of different implementations of the HTML5 standard. We therefore strongly recommend using Chrome to access Data Studio as currently, this is the only fully supported browser.
  • The Swagger UI page for the Find Duplicates API (used as an integration guide for searching/maintenance operations) is currently showing incorrect sample request models for the search by fields and transactional add/update endpoints. Refer to the online tutorials for correct examples.
  • Blank rows can be returned in the Harmonize step when the chosen score column does not exclusively contain numeric values.
  • Non-ASCII characters and numeric values as column headers in source data are standardized incorrectly on load.
  • The Download as csv option in the Transform step doesn't reflect columns being moved/re-ordered until the step is re-opened.
  • Browser zooming may cause the bottom menu bar to be cut-off.
  • The Take snapshot step doesn't disable Show data when its input becomes invalid.
  • Deleting Metro2 and JSON sources which are already in use in workflows may fail to display the mapping dialog.
  • When configuring files loaded by a custom parser, complete and apply any parser-specific configuration before proceeding to configure columns and auto-tagging options. Otherwise, the configuration will be lost when changes are made which can alter the set of available columns.
  • Auto-tagging of data doesn't work for Excel files.
  • Workflow execution schedule is reset if the server is restarted before all scheduled jobs have been run.
  • Re-profiling a source after overwriting a file does not correctly update the profile results.
  • Branch step does not always cascade the correct columns when the source is replaced in a workflow.
  • Profiling very large volumes of unique hash records causes the profile process to loop.
  • When the Cache execution plans server setting is turned off, editing and executing large workflows can result in memory usage issues.
  • A Lookup function with multiple lookup columns returns the error "Lookup input not found" when defining one lookup value as a constant.
  • When using a trigger to replace a workflow's source, manual configuration of the original source file isn't applied, resulting in invalid mappings.
  • Linux build specific:
    • When using the Linux build, R scripting will require additional setup.
    • Auto-tagging fingerprint files aren't copied on server startup.

July 15, 2019

This maintenance release resolves workflow and custom step backward compatibility issues as well as bugs that caused incorrect workflow results.

Download Data Studio 1.5.1.

Download ODBC drivers 1.5.1.

Bug fixes
  • Mapping sources in a workflow no longer causes Join, Sort and Group steps to lose column configuration information.
  • Using two filtered Group steps after a Branch step no longer returns the same data.
  • Profiling the output of a Join step no longer repeats the first column's results for each profiled column. This is also fixed for Splice, Union and Multi-view steps.
  • Scheduled jobs now export data after a server restart.
  • All custom steps built using older versions of the SDK will now work as expected.
  • A workflow which includes a custom step can now be successfully imported.
Known issues
  • Using Firefox, IE and Edge browsers might cause various UI issues and inconsistent behavior because of different implementations of the HTML5 standard. We therefore strongly recommend using Chrome to access Data Studio as currently, this is the only fully supported browser.
  • The Swagger UI page for the Find Duplicates API (used as an integration guide for searching/maintenance operations) is currently showing incorrect sample request models for the search by fields and transactional add/update endpoints. Refer to the online tutorials for correct examples.
  • Connections between steps in the workflow builder are not highlighted when selected or hovered over.
  • Blank rows can be returned in Harmonize duplicates step when the chosen score column doesn't exclusively contain numeric values.
  • Non-ASCII characters and numeric values as column headers in source data are standardized incorrectly on load.
  • The Download as csv option in the Transform step doesn't reflect columns being moved/re-ordered until the step is re-opened.
  • It's currently not possible to create a user with the same name as a previously deleted user.
  • Browser zooming may cause the bottom menu bar to be cut-off.
  • The Take snapshot step doesn't disable Show data when its input becomes invalid.
  • Deleting Metro2 and JSON sources which are already in use in workflows may fail to display the mapping dialog.
  • When configuring files loaded by a custom parser, you should complete and apply any parser-specific configuration before proceeding to configure columns and auto-tagging options. Otherwise, the configuration will be lost when changes are made which can alter the set of available columns.
  • Auto-tagging of data doesn't work for Excel files.
  • Workflow Execution Schedule is reset if the server is restarted before all scheduled jobs have been run.

Jun 21, 2019

This release introduces new features, existing functionality enhancements, performance improvements and a number of bug fixes.

Download Data Studio 1.5.0.

Download ODBC drivers 1.5.0.

Download Data Studio 1.5.0 for Linux.

New features
  • Find duplicates step enhancements that provide more powerful data matching capabilities:
    • cross-field matching allows you to match across multiple fields of the same type to find potential duplicates. For example, if your data consists of three phone number fields (e.g. home, work and mobile number), you can configure the Find duplicates step to find potential phone number matches across all three of them.
    • search-specific rules and schemas allow you to find customers in the duplicate store based on specific or limited search criteria rather than having to submit data for every field mapped in the Find duplicates step. This is important when implementing search for non-technical users who may only know some of the mapped fields, or who may want a looser ruleset for returning higher confidence results.
    • an updated contact data standardization service with address classification improvements
  • Address validation engine updates, re-certified for CASS and other global certification programs, and with access to a range of new data enrichment sets. This update also includes stability improvements.
  • A variety of SDK enhancements and bug fixes, giving custom step developers more of the tools needed to create powerful and feature-rich steps. These include improvements to custom step caching: a per-cached-record TTL and step-level caching. Find out more.
  • Improvements to the Validate phone numbers step - ability to select a country column and filter on countries.
  • Built-in support for the Netezza JDBC driver
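
As an illustration of the cross-field matching idea described above, the sketch below flags two records as potential duplicates when any phone field of one equals any phone field of the other. It is a simplified stand-in rather than the Find duplicates matching engine. The field names, the digit-only normalization and the sample records are assumptions made for the example.

```python
import re
from itertools import combinations

# Hypothetical field names for the three phone columns in the example.
PHONE_FIELDS = ("home", "work", "mobile")


def normalize(phone):
    """Keep digits only so formatting differences do not block a match."""
    return re.sub(r"\D", "", phone or "")


def phone_set(record):
    """Collect the non-empty, normalized phone numbers of one record."""
    return {normalize(record.get(field)) for field in PHONE_FIELDS} - {""}


def cross_field_matches(records):
    """Return pairs of record ids that share a number in any phone field."""
    pairs = []
    for (id_a, rec_a), (id_b, rec_b) in combinations(records.items(), 2):
        if phone_set(rec_a) & phone_set(rec_b):
            pairs.append((id_a, id_b))
    return pairs


records = {
    "A": {"home": "020 7946 0001", "work": "", "mobile": "07700 900123"},
    "B": {"home": "", "work": "07700-900123", "mobile": ""},
    "C": {"home": "020 7946 0999", "work": "", "mobile": ""},
}
print(cross_field_matches(records))  # [('A', 'B')]
```

Records A and B match even though the shared number sits in the mobile field of one and the work field of the other, which is the situation cross-field matching is meant to catch.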

Dependencies updated as part of this release:

  • The Find duplicates server version has been updated to v3.1.0
  • The standardize component has been updated to Standardize v4.1.8
  • The address validation engine has been updated to QAS Batch v7.77
Bug fixes
  • Deleting the latest version of a snapshot no longer causes Use latest snapshot views for that snapshot to contain unmapped columns.
  • It's now possible to group by a transformed column in Data Explorer.
  • A workflow trigger execution (YAML) config file can now be uploaded when it references a workflow that has a snapshot as a source.
  • Workflows are no longer triggered multiple times when the YAML config file defines multiple sources.
  • Validate step views now pick up changes made in a lookup function in a preceding Transform step.
  • Aggregate column names are now displayed correctly when editing an aggregate on a hidden column.
  • When connected to Amazon S3, we no longer make an unnecessary ListBuckets call when a bucket name is supplied.
  • Custom repository paths are now correctly applied after installing on Windows Server 2012 and 2016.
  • The Default Value is now correctly returned in a lookup function when no result is found.
  • The Export step now correctly exports tab delimited files.
  • It's now possible to configure an SMTP connection without defining credentials.
  • Sorting on the dependency analysis conflict view now works as expected.
  • Multi-byte characters are now loaded correctly when present in a .csv file column name.
  • It is now possible to export with INSERT mode into Hive via JDBC.
  • The Harmonize duplicates step results cache is now updated when settings in the step (or a preceding step) are changed.
  • A column created in a Transform step can now be used as a cluster id column in the Harmonize duplicates step. It's now also possible to rename a column that is selected as a cluster ID.
  • When creating and deleting tables via a JDBC data source, duplicate table names no longer appear.
  • Underscore characters are no longer stripped from Glossary constant names.
  • In the Group step it's now possible to aggregate on a column with a name containing parentheses.
  • Match blocking keys and rules copied by a Designer & Developer user can now be edited by that user.
Known issues
  • Using Firefox, IE and Edge browsers might cause various UI issues and inconsistent behavior because of different implementations of the HTML5 standard. We therefore strongly recommend using Chrome to access Data Studio as currently, this is the only fully supported browser.
  • After scheduling a workflow containing Export step(s) with default settings to re-execute periodically, if you restart the server between runs the Export step will not run.
  • Mapping sources in the mapping dialog for an imported workflow can cause the workflow to lose step configuration information.
  • The Swagger UI page for the Find Duplicates API (used as an integration guide for searching/maintenance operations) is currently showing incorrect sample request models for the search by fields and transactional add/update endpoints. Refer to the online tutorials for correct examples.
  • Connections between steps in the workflow builder are not highlighted when selected or hovered over.
  • Blank rows can be returned in Harmonize duplicates step when the chosen score column doesn't exclusively contain numeric values.
  • Non-ASCII characters and numeric values as column headers in source data are standardized incorrectly on load.
  • The Download as csv option in the Transform step doesn't reflect columns being moved/re-ordered until the step is re-opened.
  • It's currently not possible to create a user with the same name as a previously deleted user.
  • Browser zooming may cause the bottom menu bar to be cut-off.
  • The Take snapshot step doesn't disable Show data when its input becomes invalid.
  • Deleting Metro2 and JSON sources which are already in use in workflows may fail to display the mapping dialog.
  • When configuring files loaded by a custom parser, you should complete and apply any parser-specific configuration before proceeding to configure columns and auto-tagging options. Otherwise, the configuration will be lost when changes are made which can alter the set of available columns.
  • Auto-tagging of data doesn't work for Excel files.
  • Workflow execution schedule is reset if the server is restarted before all scheduled jobs have been run.
  • Linux build specific:
    • When using the Linux build, R scripting will require additional setup
    • Auto-tagging fingerprint files are not copied on server startup

May 14, 2019

This release resolves an issue where changing the path for the database location (root or data) during Windows installation caused the Find duplicates workflow step to fail and appear to be unlicensed when the Match server is run within Data Studio. This was due to an invalid file being created when the match store location was changed by the installer.

This release is for the Windows installer only (the Linux build is unchanged). If you have v1.4.0 installed, there's no need to upgrade to this version.

If you are having issues with the Find duplicates step, please contact us.

Download Data Studio 1.4.1.

Download ODBC drivers 1.4.1.

Bug fixes
  • The embedded Match server now starts up correctly when the Find duplicates data store location is changed in the installer (either explicitly or through a changed root/data location).
Known issues
  • Custom database paths configured in the installer are not honored on Windows Server 2012 and 2016. Paths need to be modified after install.
  • Using Firefox, IE and Edge browsers might cause various UI issues and inconsistent behavior because of different implementations of the HTML5 standard. We therefore strongly recommend using Chrome to access Data Studio as currently, this is the only fully supported browser.
  • When upgrading from v1.2 or earlier, workflows that include a Transform step and source data files that can't be found may fail to display the mapping dialog.
  • Non-ASCII characters and numeric values as column headers in source data are standardized incorrectly on load.
  • The Download as csv option in the Transform step doesn't reflect columns being moved/re-ordered until the step is re-opened.
  • It's currently not possible to create a user with the same name as a previously deleted user.
  • Browser zooming may cause the bottom menu bar to be cut-off.
  • The Take snapshot step doesn't disable Show data when its input becomes invalid.
  • Deleting Metro2 and JSON sources which are already in use in workflows may fail to display the mapping dialog.
  • When configuring files loaded by a custom parser, complete and apply any parser-specific configuration before proceeding to configure columns and auto-tagging options. Otherwise, the configuration will be lost when changes are made which can alter the set of available columns.
  • When using the Linux build, R scripting will require additional setup.
  • Auto-tagging of data doesn't work for Excel files.
  • When the tab delimiter is chosen on export, the selection is ignored.
  • The Default Value in a lookup function always returns null.
  • Workflow Execution Schedule is reset if the server is restarted before all scheduled jobs have been run.
  • When configuring SMTP settings for email notifications, mail server credentials are required. This prevents administrators from connecting to servers that do not require credentials.

Apr 30, 2019

This release introduces new features, existing functionality enhancements, performance improvements and a number of bug fixes.

Download Data Studio 1.4.0.

Download ODBC drivers 1.4.0.

Download Data Studio 1.4.0 for Linux.

New features
  • Find duplicates step enhancements:
    • Significant increase in the number of records that can be matched using the step including a huge improvement in throughput speed.
    • New transaction matching and searching operations using a persistent match store created in the step.
    • Support for transpose names matching in default individual rules.
    • Improvements to the Standardize knowledge bases for AUS, GBR, and USA for better address standardization.
    • The remote match server connection can now support HTTPS.
  • SDK enhancements for quicker and easier custom step building:
    • A test framework for the custom step SDK to allow developers to add unit/component tests in their code. This makes it easier and quicker to develop and maintain custom steps.
    • Support for personalized step icons, using tags in custom steps, and a Select all checkbox in the Multi-Column chooser property.
  • A new Validate phone numbers step has been added allowing you to validate the phone number format for a selected country.
  • When profiling, users can now drill down to multiple values for more streamlined data discovery.
  • Added the ability to upload a new version of an existing file through the UI with the option to overwrite it.
  • Files exported to the user's export directory are now available from the Jobs view for easier access to results from your previously executed jobs.
  • The Salesforce credential management (including a security token for more granular assignment of login credentials) is now supported.
  • The row limit for the Standard and Professional base license editions has been increased. The Standard license now has a row limit of 1 million (up from 10k) and the Professional has 10 million (up from 100k). If you're an existing user with one of these licenses, the increased limit will be applied on upgrade.
  • A new tutorial (quick UI tour) has been added.
  • You can now configure when the user's login session expires.
  • The username of the initiator of a task is now shown in the Tasks view.
  • Improved the server startup time, reducing the downtime during updates or maintenance. This will be particularly noticeable for heavy use deployments with infrequent restarts.
  • Improved the performance of the Validate step.
  • Improved the handling of milliseconds in date and time functions.
  • Improved the accuracy of job progress information for workflows that create snapshots.
  • Updated the Linux version of Data Studio.
Breaking changes

Custom blocking keys and rules created for the previous version of Match or Data Studio will not be compatible with this version without making the following change: remove the maxBlockSize entry from all blocking key definitions.

Note that the default rules and blocking keys supplied with Data Studio have already been updated and will work out-of-the-box.
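
As a rough illustration of the change described above, the sketch below strips every maxBlockSize entry from a nested structure. It assumes the custom blocking key definitions are stored as JSON; that format, the function name, and the field names in the sample are hypothetical and are not taken from the product documentation, so check your own definitions before adapting it.

```python
import json
from typing import Any


def remove_entry(node: Any, key: str = "maxBlockSize") -> Any:
    """Return a copy of a JSON-like structure with every `key` entry removed."""
    if isinstance(node, dict):
        # Drop the unwanted key at this level and recurse into the remaining values.
        return {k: remove_entry(v, key) for k, v in node.items() if k != key}
    if isinstance(node, list):
        return [remove_entry(item, key) for item in node]
    return node  # strings, numbers, booleans and None are returned unchanged


# Hypothetical blocking key definition, used only to demonstrate the function.
definition = json.loads(
    '{"blockingKeys": [{"name": "k1", "maxBlockSize": 500}, {"name": "k2"}]}'
)
print(json.dumps(remove_entry(definition), indent=2))
```

The function builds a new structure rather than editing in place, so the original definition remains available for comparison.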

Bug fixes
  • The custom field delimiters (e.g. pipe characters) and quote characters are now available when configuring delimited export settings in the Export step.
  • Job status now reports correctly for workflows that produce large snapshots or execute the Find duplicates step.
  • Job progress now updates correctly for scheduled workflows.
  • Refresh on execution enabled on a JDBC source will no longer cause the job to hang.
  • Changing the Database root of the application from the UI behaves correctly on restart.
  • Clearing a filter when exploring in the Data Explorer no longer throws an exception.
  • A newly created column can now be used in a column specific rule or as a score column in the Harmonize duplicates step.
  • A mix of data types for a Cluster ID column in the Harmonize duplicates step is now supported.
  • A filter change in a step now propagates through all branches of a workflow.
  • Updating the source or adding a Branch step no longer breaks lookup references in the following steps.
  • The Validate step executes on correctly refreshed data after a filter is updated in a previous step.
  • Files uploaded in Workflow Designer now appear as data sources immediately.
  • It's now possible to show data on a custom parser sub-file when the file name matches the username.
  • Show data in the Harmonize duplicates step is no longer selectable when no cluster ID column is defined.
  • Exporting from a workflow using the Latest snapshot step as a source now reflects the updated data when a new snapshot is taken.
  • Removed JSON parsing errors from the log files.
Known issues
  • When the Find duplicates datastore location is changed in the installer (either explicitly or through a changed root or data location), the Match server fails to start up in an embedded local mode and the license page shows Match as being unlicensed.
    The workaround is to stop the Data Studio service and locate the match-rest-api-3.0.3.war file (by default in C:\Program Files\Experian\Aperture Data Studio 1.4.0\matchDefaults). Change the file extension from .war to .zip, then unzip the file. Zip it again and rename the file back to .war.
  • The custom database paths configured in the installer are not honored on Windows Server 2016, so paths need to be modified after the install.
  • Using Firefox, IE and Edge browsers might cause various UI issues and inconsistent behavior because of different implementations of the HTML5 standard. We therefore strongly recommend using Chrome to access Data Studio as currently, this is the only fully supported browser.
  • When upgrading from v1.2 or earlier, workflows that include a Transform step and source data files that can't be found may fail to display the file mapping dialog.
  • Non-ASCII characters and numeric values as column headers in source data are standardized incorrectly on load.
  • The Download as csv option in the Transform step doesn't reflect columns being moved/re-ordered until the step is re-opened.
  • It's currently not possible to create a user with the same name as a previously deleted user.
  • Browser zooming may cause the bottom menu bar to be cut-off.
  • The Take snapshot step doesn't disable Show data when its input becomes invalid.
  • Deleting Metro2 and JSON sources which are already in use in workflows may fail to display the mapping dialog.
  • When configuring files loaded by a custom parser, complete and apply any parser-specific configuration before proceeding to configure columns and auto-tagging options. Otherwise, the configuration will be lost when changes are made which can alter the set of available columns.
  • When using the Linux build, R scripting will require additional setup.
  • Auto-tagging of data doesn't work for Excel files.
  • The Default Value in a lookup function always returns null.
  • Workflow Execution Schedule is reset if the server is restarted before all scheduled jobs have been run.
  • When configuring SMTP settings for email notifications, mail server credentials are required. This prevents administrators from connecting to servers that do not require credentials.
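
The unzip-and-rezip workaround for the Match server issue listed above could also be scripted. The following is a minimal sketch, not an official tool: the function name repack_war is hypothetical, and it relies only on the fact that a .war file is a ZIP archive. It keeps a .bak copy of the original file and should only be run while the Data Studio service is stopped.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path


def repack_war(war_path: Path) -> Path:
    """Unzip a .war (a ZIP archive) and zip it again in place.

    Returns the path of the backup copy made before repacking.
    """
    war_path = Path(war_path)
    backup = war_path.with_name(war_path.name + ".bak")
    shutil.copy2(war_path, backup)  # keep the original in case something goes wrong

    with tempfile.TemporaryDirectory() as tmp:
        extract_dir = Path(tmp) / "extracted"
        with zipfile.ZipFile(war_path) as src:
            src.extractall(extract_dir)

        rebuilt = Path(tmp) / "rebuilt.zip"
        with zipfile.ZipFile(rebuilt, "w", zipfile.ZIP_DEFLATED) as dst:
            # Only files are written back; empty directories are not preserved.
            for item in sorted(extract_dir.rglob("*")):
                if item.is_file():
                    dst.write(item, item.relative_to(extract_dir).as_posix())

        # Overwrite the original archive with the rebuilt one.
        shutil.copyfile(rebuilt, war_path)
    return backup
```

With the default install location from the note above, the call would be repack_war(Path(r"C:\Program Files\Experian\Aperture Data Studio 1.4.0\matchDefaults\match-rest-api-3.0.3.war")), run from an account with write access to that folder.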

Feb 22, 2019

This maintenance release fixes a number of bugs, including significant performance improvements to the Harmonize duplicates and Use snapshot steps.

We've also made a range of improvements to information available in the audit log as well as the Salesforce and Google BigQuery connections.

Download Data Studio 1.3.2.

Download ODBC drivers 1.3.2.

Download Data Studio 1.3.2 for Linux.

New features
  • The time between service status checks for JDBC connections is now configurable using a new JDBC connection test interval server setting (the default is 60 seconds between pings).
  • Updated the Linux version of Data Studio.
Bug fixes
  • The Harmonize duplicates step no longer throws an OutOfMemoryError for a large (more than 10 million rows) input.
  • Workflows that use snapshots as a source now execute with the same performance as those from a loaded table source.
  • You can now take a new version of a snapshot when it's used as the source in the same workflow.
  • Data connectors:
    • When creating a rule on a table in a data source from a Salesforce JDBC connection, functions now consistently list all the source column names.
    • Retrieval of a large table list from a Salesforce JDBC connection no longer causes a maximum web service call limit to be breached.
    • Resetting the configuration on a table in a Salesforce JDBC connection no longer removes the table from the list in Data Explorer.
    • It's now possible to export with INSERT and UPDATE into a Google BigQuery JDBC data source.
  • Auditing:
    • Performing Preview and Configure on a file now shows the file name in the audit entry.
    • The correct user name is now recorded when uploading a file.
    • The Data Viewed audit entry now returns the correct username when a user creates a filter on a source.
Known issues
  • Using Firefox, IE and Edge browsers might cause various UI issues and inconsistent behavior because of different implementations of the HTML5 standard. We therefore strongly recommend using Chrome to access Data Studio as currently, this is the only fully supported browser.
  • When upgrading from v1.2 or earlier, workflows that include a Transform step and source data files that can't be found may fail to display the mapping dialog.
  • Non-ASCII characters and numeric values as column headers in source data are standardized incorrectly on load.
  • The Download as csv option in the Transform step doesn't reflect columns being moved/re-ordered until the step is re-opened.
  • Attempting to edit a column that's already being used in a lookup causes the edit to fail without showing an error.
  • It's currently not possible to create a user with the same name as a previously deleted user.
  • Setting the browser zoom to 110% may cause the bottom menu bar to be cut-off.
  • Connecting to multiple SQL server instances on the same server may cause the instance name to be ignored.
  • The Take snapshot step doesn't disable Show data when its input becomes invalid.
  • Changing the delimiter of a file in Configuration may cause headings to be lost. To work around this issue, you can manually change the column headings or reset the configuration (and lose all changes).
  • Deleting Metro2 and JSON sources which are already in use in workflows may fail to display the mapping dialog.
  • When configuring files loaded by a custom parser, complete and apply any parser-specific configuration before proceeding to configure columns and auto-tagging options. Otherwise, the configuration will be lost when changes are made which can alter the set of available columns.
  • When using the Linux build, R scripting will require additional setup.
  • The Harmonize duplicates step fails when the cluster ID column contains a mix of data types.
  • An error is thrown when using a column added by a Transform step as a cluster ID column in the Harmonize duplicates step.
  • Custom field delimiters (e.g. pipe characters) and quote characters are ignored when configuring delimited export settings in the Export step.
  • Auto-tagging of data doesn't work for Excel files.

Jan 28, 2019

This maintenance release fixes a number of bugs, including significantly speeding up the workflow builder UI, as well as introducing features to support hosting Data Studio as a single-tenant managed service. Download Data Studio 1.3.1. Download ODBC drivers 1.3.1. Download Data Studio 1.3.1 for Linux.

New features
  • The custom parser has now been added to the public SDK, allowing anyone to create a custom parser.
  • The default password complexity rules have been tightened to meet Experian standards.
  • Files can now be written directly to the Azure Blob Storage hosted file system using the Export step.
  • Audit events can optionally be logged centrally.
  • The 'end of line precedence' load option can now be configured on each file individually (this was previously a global setting).
  • The 'Lookup Aggregate' function has been renamed to 'Lookup'.
  • Updated the Linux version of Data Studio.
Bug fixes
  • UI responsiveness of the workflow builder has been significantly improved when creating and modifying workflows that contain many steps.
  • When adding rules using the Validate step, the UI is now much more responsive when the input includes many columns.
  • Adding multiple aggregates to a Group step is also now much more responsive.
  • The Validate step now evaluates rules much faster than was possible in previous releases.
  • When columns are reordered or sources changed, the downstream steps no longer modify the ordering.
  • Workflow steps no longer retain cached results when the data source is changed. This principally affects the Profile, Validate, Split and Validate addresses steps.
  • Profiling no longer loops indefinitely when very large numbers, converted from scientific notation, are encountered.
  • The Validate emails step's configuration is no longer reset if Show Data is not selected.
  • Additional data is now returned when selected if the Validate addresses step is configured with the component layout.
  • It's now possible to connect to a Vertica 8.1 database via JDBC.
  • When configuring an Oracle connection via JDBC, the SID parameter is no longer a mandatory field.
  • A JDBC connection to SQL Server with Windows Authentication (NTLM) can be established without the application having to be installed on the C:\ drive.
  • JDBC password fields no longer need to be populated in all cases.
  • It's now possible to configure the Export step for a Google BigQuery JDBC source.
  • Tables can now be loaded successfully when a Google BigQuery DataSet includes a table with a column of DATE type.
  • Tables from a Salesforce JDBC connection are no longer duplicated after a manual refresh.
  • Files which yield multiple source tables (such as JSON, Metro2) are now correctly handled when the name contains a dot.
  • When filtering on Profile view, values no longer shift to an incorrect column.
Known issues
  • Using Firefox, IE and Edge browsers might cause various UI issues and inconsistent behavior because of different implementations of the HTML5 standard. We therefore strongly recommend using Chrome to access Data Studio as currently, this is the only fully supported browser.
  • When upgrading from v1.2 or earlier, workflows that include a Transform step and source data files that can't be found may fail to display the mapping dialog.
  • Non-ASCII characters and numeric values as column headers in source data are standardized incorrectly on load.
  • The 'Download as csv' option in the Transform step doesn't reflect columns being moved/re-ordered until the step is re-opened.
  • Attempting to edit a column that's already being used in a lookup causes the edit to fail without showing an error.
  • It's currently not possible to create a user with the same name as a previously deleted user.
  • Setting the browser zoom to 110% may cause the bottom menu bar to be cut-off.
  • Connecting to multiple SQL server instances on the same server may cause the instance name to be ignored.
  • The Take snapshot step doesn't disable 'Show data' when its input becomes invalid.
  • Changing the delimiter of a file in Configuration may cause headings to be lost. To work around this issue, you can manually change the column headings or reset the configuration (and lose all changes).
  • Deleting Metro2 and JSON sources which are already in use in workflows may fail to display the mapping dialog.
  • When configuring files loaded by a custom parser, complete and apply any parser-specific configuration before proceeding to configure columns and auto-tagging options. Otherwise, the configuration will be lost when changes are made which can alter the set of available columns.
  • When using the Linux build, R scripting will require additional set-up.
  • When performing 'Preview and Configure' on a file, its audit entry doesn't return the name of the file.
  • The audit entry for uploading files is always recorded as being done by the 'Administrator' regardless of the user who initiated the upload.
  • The audit entry 'Data Viewed' is recorded as being initiated by 'Administrator' when any user creates a filter on a source.
  • The Harmonize duplicates step fails when the Cluster ID column contains a mix of data types.
  • The Harmonize duplicates step can fail on large record sets when the available memory is exceeded.
  • Custom field delimiters (e.g. pipe characters) and quote characters are ignored when configuring delimited export settings in the Export step.

Nov 16, 2018

This release introduces several new features and a number of bug fixes. Download Data Studio 1.3. Download ODBC drivers 1.3. Download Data Studio 1.3 for Linux.

New features
Bug fixes
  • Loading a file with an incorrect quote no longer fails.
  • Address validate step no longer re-runs and re-caches when used after a Branch step.
  • Clearing the configuration of loaded data now behaves as expected.
  • Fixed an issue with the Neo4j custom JDBC connection.
  • Newly created rule columns in the Transform step don't disappear after being connected to a Validate step.
  • Improved memory management for the Find duplicates step when submitting data to Experian Match.
  • The workflow list is now updated when logging in as a different user.
  • The most recent change is now not lost in the workflow expression editor when re-opened.
  • Audit events don't display blank fields anymore.
  • Fixed various issues with Excel files:
    • dates are now parsed correctly
    • improved memory management when loading Excel files
  • In a complex workflow the grid's data values now behave as expected.
  • Trying to remove a Branch step dropped between steps now works as expected.
  • Searching on special characters now works as expected.
  • Custom steps' value chooser is now more responsive.
  • Fixed several issues with YAML files:
    • workflows executed by a trigger file now produce correct audit logs
    • fixed an action type in audit log for the YAML file processing
    • uploading a new YAML file and changing 'location:' now behaves as expected
    • removing a source from one of the referenced workflows now doesn't cause the remaining valid workflow to execute twice
    • changing a workflow trigger source to a different source of the same name now doesn't cause it to be re-executed on import
    • removed repeated YAML messages
  • Fixed a loading issue with Amazon Redshift.
  • Fixed several issues related to data tagging:
    • auto-tagging for the email addresses now behaves as expected
    • in a Transform step, duplicating a column now copies the tag
    • the Union step doesn't lose tags
    • the tag is now saved when you click Apply
    • changing column header name now applies the correct tag
    • the Harmonize duplicates step now doesn't discard tags
  • Fixed an issue with inconsistent minimum/maximum profile results in mixed data type columns.
  • Fixed license page issues for Address validate step.
  • Fixed an issue with all the snapshots being available to all the users via ODBC.
  • A null export date is now saved correctly to Oracle.
  • Nullable columns are now recognized in Oracle.
  • Fixed an exception when trying to create a table in MongoDB.
  • Correct data is now displayed in a step after a Multi-view step.
  • The security token is now not a required field for the Salesforce JDBC connection.
  • Folders are now moved when resetting a directory-related server property to the default.
  • Fixed an issue with the local Standardize service not running on a machine with a non-English OS.
  • The truncate statement is now supported for MongoDB.
  • Fixed an issue with default column widths when previewing data.
  • A method to delete cache in the SDK now works.
  • Added a missing 'MatchGBR' file for Validate address step.
  • The Profile step now correctly counts hidden columns in the UI.
Known issues
  • When upgrading from v1.2 or earlier, workflows that include a Transform step and source data files that can't be found may fail to display the mapping dialog.
  • Using Firefox, IE and Edge browsers might cause various UI issues and inconsistent behavior because of different implementations of the HTML5 standard. We therefore strongly recommend using Chrome to access Data Studio as currently, this is the only fully supported browser.
  • Non-ASCII characters and numeric values as column headers in source data are standardized incorrectly on load.
  • The 'Download as csv' option in the Transform step doesn't reflect columns being moved/re-ordered until the step is re-opened.
  • Attempting to edit a column that's already being used in a lookup causes the edit to fail without showing an error.
  • It's currently not possible to create a user with the same name as a previously deleted user.
  • Setting the browser zoom to 110% may cause the bottom menu bar to be cut-off.
  • Connecting to multiple SQL server instances on the same server may cause the instance name to be ignored.
  • The Take snapshot step doesn't disable 'Show data' when its input becomes invalid.
  • Changing the delimiter of a file in Configuration may cause headings to be lost. To work around this issue, you can manually change the column headings or reset the configuration (and lose all changes).
  • The Harmonize duplicates step fails when the Cluster ID column contains a mix of data types.
  • When using the Validate address step, additional enrichment data is not returned when using the 'Component' layout.
  • If two users simultaneously upload files with the same name, the file may not appear for each user.
  • The Email validate step configuration is lost if you don't show data on the step.
  • The responsiveness of the workflow builder UI is degraded when modifying workflows that contain a large number of steps.
  • The responsiveness of the Validate step is degraded when the step's source data contains a large number of columns.
  • Loaded files which yield multiple source tables (Metro2, JSON) aren't correctly handled when the file name contains a dot.
  • Deleting Metro2 and JSON sources which are already in use in workflows may fail to display the mapping dialog.
  • A JDBC connection to SQL Server with Windows Authentication fails if the application is installed on any drive other than C:.
  • When using the Linux build, R scripting will require additional set-up.
  • When creating an Oracle connection for JDBC, the SID parameter is mandatory.
  • It's not possible to create a JDBC connection to a Vertica 8.1 database because table row counts can't be retrieved.
  • Custom field delimiters (e.g. pipe characters) and quote characters are ignored when configuring delimited export settings in the Export step.

Oct 02, 2018

This release introduces multiple brand new features, existing functionality enhancements, performance improvements and a large number of bug fixes.
Download Data Studio 1.2.
Download ODBC drivers 1.2.

New features
  • A new Harmonize duplicates step that allows you to form a single resulting record by applying business rules to a set of duplicate records.
  • The ability to automatically tag data by allowing the system to recognize data from its trained knowledge base. This includes the ability to further train the knowledge base using your own data.
  • Significant performance improvements for workflow execution.
  • Support for Red Hat Linux.
  • Ability to monitor source files such that their associated workflow(s) are automatically triggered when they change.
  • New lookup functions:
    • Lookup aggregate
    • Remove matches
    • Replace matches
    • Extract matches as list
    • Contains match
    • Expand list
  • Enhanced security allowing the credentials used to access external data sources to be dependent on the current Data Studio user.
  • SDK enhancements:
    • Ability to create custom workflow steps that do not require data. These can exist in a workflow prior to a source step.
    • Access to data sources.
  • New workflow steps:
  • Enhancements to the following workflow steps:
    • Join step columns are now selected in the step
    • Validate step rules are now defined more intuitively
    • Validate addresses step:
      • you can now find duplicates based on cleansed addresses
      • you can now enrich addresses with multiple datasets
  • Ability to rename data sources.
  • When viewing data, the column headings now display all assigned data tags.
  • Ability to add timestamps to exported files.
  • An interactive tutorial for first-time users.
Bug fixes
  • The Validate step now picks up the rules from previous steps.
  • Filtering and format sorting in profiling now works as expected.
  • The SDK backward compatibility has been fixed.
  • Renaming a JDBC source no longer invalidates the workflow.
  • Performance improvements when drilling down in Data Explorer.
  • Fixed a bug when downloading as a .csv from a snapshot.
  • Data from a large table in Redshift cluster now loads as expected.
  • Address Validate step with a transformed column as input now works as expected.
  • Fixed an issue with sorting when using 'Save as workflow'.
  • The 'Documented type' column is now removed from the Profile step view.
  • Grouping columns which contain large numbers of unique values no longer fails.
  • Fixed an issue where re-arranging columns in the side menu caused unexpected results.
  • A Find duplicates step immediately after a Split step will now execute correctly.
  • Fixed an issue where using a Validation step immediately after a Join, Splice or Union step may only show the columns originating from the first of the two inputs.
  • Validate and Join steps will now save rules or join columns when used immediately after the scripting (JS/R) step.
  • Validation rules are no longer lost when the following steps precede the Validation step: Union, Multi-View, Splice, Validate emails, Chart, Use snapshot and custom steps.
  • Deleting a workflow step immediately preceding a Validation step will no longer cause the validation rules to be lost.
  • A Join step applied immediately after any of the following steps will no longer cause the key columns to be lost: Find duplicates, Script, Use Snapshot, Union, Multi-View, Splice, Validate emails and Chart.
  • Using a Join step after a step with multiple outputs will no longer only allow you to store one set of key columns.
  • When performing a lookup transformation in a workflow on data which has been joined, the lookup values returned now refer to the joined rows rather than only to the original source file.
  • The 'Download as .CSV' option can now export multibyte characters.
  • The Export step now fully supports multibyte characters.
  • Snapshots are no longer impacted by workflow name changes.
  • Multiple R script steps are now supported in workflows.
  • Long source file names are now handled in the workflow mapping dialog.
  • Fixed an issue where reordering columns after using Fix First Columns results in column data appearing in incorrect locations.
  • You will no longer be automatically logged out when configuring files.
  • You will no longer see an error when attempting to load large (over 10 million row) tables from the SQL Server using the default driver.
  • The Extract Date/Time functions will now apply the correct time zone offset.
  • Fixed an issue where changing the locale settings on a file wouldn't take effect on how dates were interpreted.
  • JDBC column mappings for exports can now be edited after saving.
  • Changes to source file configuration in the Data Explorer are now reflected in the Workflow Designer.
  • Only licensed additional data sets will appear in the Validate addresses step.
Known issues
  • Using Firefox, IE and Edge browsers might cause various UI issues and inconsistent behavior because of different implementations of the HTML5 standard. We therefore strongly recommend using Chrome to access Data Studio as currently, this is the only fully supported browser.
  • Determining the min/max for a mixed data type column can produce inconsistent results.
  • Chinese characters or numeric values as column headers in source data are standardized incorrectly on load.
  • The 'Download as csv' option in the Transform step doesn't reflect columns being moved/re-ordered until the step is re-opened.
  • Attempting to edit a column that is already being used in a lookup causes the edit to fail without showing an error.
  • It's currently not possible to create a user with the same name as a previously deleted user.
  • Setting the browser zoom to 110% may cause the bottom menu bar to be cut-off.
  • Connecting to multiple SQL server instances on the same server may cause the instance name to be ignored.
  • The Take snapshot step doesn't disable 'Show data' when its input becomes invalid.
  • If the Validate Address step is used after a Branch step, previous results are not used when viewing the data in the Workflow Designer (slower performance).
  • Changing the delimiter of a file may cause headings to be lost. To work around this issue, you can manually change the column headings or reset the configuration (and lose all changes).
  • Harmonize step may cause column tags to be lost.
  • Exporting a workflow may cause its data tags to be lost.

Jul 26, 2018

This release fixes a high priority bug in the join engine that occurred when using key columns containing more than 100,000 unique values of mixed data types. Download Data Studio 1.1.5. Download ODBC drivers 1.1.5.

Known issues
  • Using Firefox, IE and Edge browsers might cause various UI issues and inconsistent behavior because of different implementations of the HTML5 standard. We therefore strongly recommend using Chrome to access Data Studio as currently, this is the only fully supported browser.
  • Profiling
    • Determining the min/max for a mixed data type column can produce inconsistent results.
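
As an illustration of why min/max on a mixed data type column is ambiguous, the Python sketch below compares the same values first as text and then as numbers. It is not Data Studio code: the sample column and both helper functions are invented for this example, and the rules Data Studio itself applies to mixed columns are not described in these notes.

```python
# Hypothetical column mixing numeric-looking values with plain text.
column = ["10", "9", "2", "abc"]


def min_max_as_text(values):
    """Compare every value as a string (lexicographic order)."""
    return min(values), max(values)


def min_max_as_numbers(values):
    """Compare only the values that parse as non-negative numbers."""
    numbers = [float(v) for v in values if v.replace(".", "", 1).isdigit()]
    return min(numbers), max(numbers)


print(min_max_as_text(column))     # ('10', 'abc')
print(min_max_as_numbers(column))  # (2.0, 10.0)
```

The two interpretations disagree on both the minimum and the maximum, which is the kind of inconsistency to expect whenever a column's values are not all of one data type.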

Jul 2, 2018

This release includes a number of high priority bug fixes and a few minor new features. Download Data Studio 1.1.4. Download ODBC drivers 1.1.4.

New features
  • Added a no delimiter option to the configuration of files.
  • Added a new option allowing administrator users to download log files via the browser. To do this, click on your username in the top menu and select 'Download log files'.
  • Updated the versions of Experian Match (2.9.0) and Standardize, and included the latest default USA match rules and blocking keys.
Bug fixes
  • Users will no longer see an error when attempting to load large (over 10 million row) tables from SQL Server using the default driver.
  • 'GC overhead limit reached' error messages no longer appear when joining two files with a very large number of unique values.
  • Selecting files with similar names will now always select the correct file.
  • Filters containing invert logic in the multi-compare function no longer filter out all rows.
  • It's now possible to delete additional administrator users.
  • Datatype-specific functions won't output an error message when a null value is supplied as the input.
  • The mapping dialog will appear correctly when a workflow is imported where the source file is not present.
  • Improved the way that the source mapping dialog handles long names.
  • Certain functions (e.g. compare) will no longer standardize the supplied input strings.
  • You can now duplicate more than one column from the explorer menu of a Transform step.
  • Snapshots taken after renaming a workflow will now have the correct workflow name and be accessible (snapshots taken before the name change will retain their original workflow name).
  • 'DatabaseException' stack traces are no longer thrown when removing files via the UI.
  • Improved the performance of using snapshots as lookup tables.
  • The edit column dialog will no longer replace the tagging dialog in preview.
  • Batch all datasets license keys will now be accepted in the Data Studio 'Update License' dialog.
  • The correct error message from data-type specific functions will now be displayed for each row (rather than duplicating the first message every time).
  • Retrieving the base license on server start-up will no longer throw an OutOfMemory error when there is less than 2 GB available.
  • Dragging one function on top of another in the expression editor no longer throws a NullPointer exception.
  • Fixed the 'Extract Date/Time' function to apply the correct time zone offset.
  • Viewing the filter on a Split step of an imported workflow before viewing the data no longer throws a NullPointer exception.
  • Changing the source file in a workflow and not viewing the data will now correctly update the created snapshots on execution.
  • Reporting on a workflow with no source file defined will now show a meaningful error message.
  • The Validate emails step no longer returns a false negative for certain domains.
  • The Find duplicates step will now correctly trigger a matching job to be run when placed after a Split step.
  • Fixed a refresh issue with the selected columns validation behavior on the Find duplicates step.
  • Removed the unused scale and precision options from columns in preview and configure.
Known issues
  • Database
    • Chinese characters or numeric values as column headers in source data are standardized incorrectly on load.
  • Profiling
    • The 'Documented type' column is empty in Profile view.
  • Grouping
    • Grouping columns which contain large numbers of unique values may incorrectly return 0 rows (this occurred in testing between 500k and 1M unique values with default memory settings).
    • Re-arranging columns in the side menu may result in unexpected results.
  • Transform step
    • The 'Download as csv' option doesn't reflect columns being moved/re-ordered unless the grid is re-opened.
  • Validate and Join steps
    • These known issues stem from the fact that both steps store their rules/key columns in the previous workflow step.
      • Using a Validation step immediately after a Join, Splice or Union step may only show the columns originating from the first of the two inputs in choosers.
      • Validate and Join steps will not save rules or join columns when used immediately after the scripting (JS / R / Python) step.
      • Validation rules can be lost when the following steps precede the Validation step: Union, Multi-View, Splice, Validate emails, Chart, Use snapshot and custom steps.
      • Deleting a workflow step immediately preceding a Validation step will cause the validation rules to be lost.
      • A Join step applied immediately after any of the following steps will cause the key columns to be lost, resulting in an invalid join: Find duplicates, Script, Use Snapshot, Union, Multi-View, Splice, Validate emails and Chart.
      • Using a Join step after a step with multiple outputs will only allow you to store one set of key columns. If you wish to perform two different joins on the two outputs, you will have to insert a Transform step in between this step and the Join step(s).
  • Lookups
    • Attempting to edit a column that is already being used in a lookup causes the edit to fail without showing an error (the only indication of this is an exception in the log).
    • When performing a lookup transformation in a workflow on data which has been joined, the lookup values returned only refer to the original source file rather than the joined rows. This will show as incorrect or missing results for Lookup list, Lookup min/max and Lookup first/last transformation functions.
  • Workflow Designer
    • Undo and redo actions exhibit unexpected behavior in some cases.
    • JDBC column mappings for exports can't be deleted after saving. To make changes, you will have to re-save the Export step and start again.
    • The configuration changes (e.g. data tags) to source files in workflows aren't updated when changed in Data Explorer. To pick up the new configuration, they will have to be deleted and then replaced.
    • The Take Snapshot step doesn't ghost out 'Show data' when the input becomes invalid. Clicking on it will display a blank view.
    • Installed dataplus items that aren't licensed will still appear in the list of 'Additional data' on the Validate addresses step. Selecting unlicensed dataplus items will return empty columns.
  • Character encoding
    • The 'Download as .CSV' option doesn't export multi-byte characters.
    • The Export step will always default to the Windows-1252 character set. Trying to export other character sets will show an unmappable character exception. You can fix this by changing the character set in the advanced settings of the Export step.
  • Snapshots
    • Renaming a workflow after having taken snapshots won't assign the existing snapshots to the new workflow name.
  • Scripting
    • Multiple R script steps in a workflow will result in unexpected results.
  • User management
    • It's currently not possible to create a user with the same name as a previously deleted user.
  • UI
    • Setting the browser zoom to 110% may cause the bottom bar to be cut-off.
    • Reordering columns after using Fix First Columns results in column data appearing in incorrect locations.
    • Users will occasionally be automatically logged out when configuring files.
  • SQL Server
    • Connecting to multiple SQL Server instances on the same server causes the instance name to be ignored.
  • Date/time/time zone
    • In some cases, changing the locale settings on a file won't change how dates are interpreted.
  • Data Explorer
    • When changing the delimiter in Preview and Configure, the column counts and data types are not updated automatically.
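
The Windows-1252 limitation listed under Character encoding can be reproduced outside Data Studio. The Python sketch below is an illustration only: the sample strings and the `can_encode` helper are invented, and Python reports the failure as a `UnicodeEncodeError` rather than the unmappable character exception that Data Studio shows. Python's `cp1252` codec is its name for Windows-1252.

```python
def can_encode(value, charset):
    """Return True if every character in value exists in the charset."""
    try:
        value.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


western = "Zoë paid 10€"          # all characters exist in Windows-1252
multibyte = "Zoë paid 10€ in 東京"  # the last two characters do not

print(can_encode(western, "cp1252"))    # True
print(can_encode(multibyte, "cp1252"))  # False
print(can_encode(multibyte, "utf-8"))   # True
```

This is why changing the character set in the advanced settings of the Export step resolves the error: a Unicode encoding such as UTF-8 can represent characters that Windows-1252 has no mapping for.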

May 18, 2018

This release includes a number of high priority bug fixes. Download Data Studio 1.1.3. Download ODBC drivers 1.1.3.

Bug fixes
  • The ODBC drivers installer now creates correct registry settings for the 32-bit driver.
  • We've made several fixes related to SQL Server:
    • You can now preview and load data from other schemas, not just the connected user's default schema.
    • You can now use Microsoft's latest SQL Server driver (mssql-jdbc-6.2.2.jre8.jar) to create custom connections.
    • You can now load views from SQL Server.
  • The JDBC exports will now log a single representative SQL statement for exports when debugging is enabled.
  • Applying an invalid Data Studio license is now handled correctly and doesn't prevent further use of the product.
  • Defining a filter on a Split step after the Validate Addresses step will no longer throw a NullPointer exception.
  • Clicking on column choosers for files with over 100 columns will no longer throw an IndexOutOfBounds exception.
  • Importing a workflow from a previous version will no longer throw a NullPointer exception.
Known issues
  • Using Firefox, IE and Edge browsers might cause various UI issues and inconsistent behavior because of different implementations of the HTML5 standard. We therefore strongly recommend using Chrome to access Data Studio as currently, this is the only fully supported browser.
  • The Swagger UI page for the Find duplicates API (used as an integration guide for searching/maintenance operations) is currently showing incorrect sample request models for the search by fields and transactional add/update endpoints. See the tutorials for examples.
  • Blank rows can be returned in the Harmonize step when the chosen score column does not exclusively contain numeric values.
  • Non-ASCII characters and numeric values as column headers in source data are standardized incorrectly on load.
  • The Download as csv option in the Transform step doesn't reflect columns being moved/re-ordered until the step is re-opened.
  • Browser zooming may cause the bottom menu bar to be cut-off.
  • The Take snapshot step doesn't disable Show data when its input becomes invalid.
  • Deleting Metro2 and JSON sources which are already in use in workflows may fail to display the mapping dialog.
  • When configuring files loaded by a custom parser, complete and apply any parser-specific configuration before proceeding to configure columns and auto-tagging options. Otherwise, the configuration will be lost when changes are made which can alter the set of available columns.
  • Auto-tagging of data doesn't work for Excel files.
  • Workflow execution schedule is reset if the server is restarted before all the scheduled jobs have been run.
  • When using a trigger to replace a workflow's source, manual configuration of the original source file is not applied.
  • "Windows Defender SmartScreen prevented an unrecognized app from starting" warning is displayed on some Windows systems when executing the installer.
  • Workflows containing multiple JDBC export steps can error during execution in some scenarios.
  • If SSL has been configured for the Find duplicates step, using the embedded server will not work. A workaround is to install a remote server instance.

May 4, 2018

This release includes a number of high priority bug fixes as well as several new features. Download Data Studio 1.1.2. Download ODBC drivers 1.1.2.

New features
  • A new 'Current limits' column has been added to the My Aperture Data Studio licenses page showing the server's current licensed limits.
  • A new option in the Analyze trends step now allows you to display categories evenly along the X-axis when displaying results in a chart.
  • A new 'Clear saved results' option in the Find duplicates step now allows you to clear results of previous step execution.
  • JDBC data source connections can now be debugged using the 'Debug connection' option in the Create/edit data source dialogs.
  • SDK example steps have been updated to showcase the new functionality introduced in recent releases.
  • SDK now supports an option to allow the selection of multiple columns.
  • Improved the license request process to cover regional business operations.
  • Updated the Standardize knowledge bases to the latest version (v4.0.1) for the following countries and territories:
    • AUS (Australia)
    • BRA (Brazil)
    • CAN (Canada)
    • FLK (Falkland Islands)
    • GBR (Great Britain)
    • GGY (Guernsey)
    • GIB (Gibraltar)
    • IDN (Indonesia)
    • IMN (Isle of Man)
    • JEY (Jersey)
    • MYS (Malaysia)
    • NZL (New Zealand)
    • PRI (Puerto Rico)
    • SGP (Singapore)
    • THA (Thailand)
    • USA (United States of America)
Bug fixes
  • The Group step menu is now updated when an aggregate column is added.
  • Changes are now saved when editing hidden column details on a Join step.
  • Updating a MongoDB data source no longer throws a null pointer error when pre-check is turned on.
  • MongoDB now reconnects after a server restart.
  • Drilling down from an integer value in a unique values list in Profile now returns results.
  • Cached data is now deleted when the source file is removed via the UI.
  • Hiding columns or quick filtering in any step connected to the 'Failing rows' output of a Split step no longer causes incorrect data to be returned.
  • Using the 'Replace' action to replace a source file in a workflow will now update the file to the intended one.
  • 'Distinct' button on an aggregate column is now not reset in the UI when the aggregate is edited.
  • Duplicate files now don't appear in 'My files' when uploading a file on new database creation, prior to a restart.
  • LDAP users are no longer asked to change their password on first login.
  • Amazon S3 files can now be loaded from a sub-directory.
  • Removing a file from a multi-sheet Excel spreadsheet and viewing other sheets now doesn't cause an error.
  • The 'Hash code' function now returns results.
Known issues
  • Database

    • Chinese characters or numeric values as column headers in source data are standardized incorrectly on load.
  • Profiling

    • The 'Documented type' column is empty in Profile view.
  • Grouping

    • Grouping columns which contain large numbers of unique values may incorrectly return 0 rows (this occurred in testing between 500k and 1M unique values with default memory settings).
    • Re-arranging columns in the side menu may result in unexpected results.
  • Transform step

    • The 'Download as csv' option doesn't reflect columns being moved/re-ordered unless the grid is re-opened.
  • Find duplicates step

    • The Find duplicates step immediately after a Split step will not execute when the workflow is executed.
  • Validate and Join steps

    • These known issues stem from the fact that both steps store their rules/key columns in the previous workflow step.
    • Using a Validation step immediately after a Join, Splice or Union step may only show the columns originating from the first of the two inputs in choosers.
    • Validate and Join steps will not save rules or join columns when used immediately after the scripting (JS / R / Python) step.
    • Validation rules can be lost when the following steps precede the Validation step: Union, Multi-View, Splice, Validate emails, Chart, Use snapshot and custom steps.
    • Deleting a workflow step immediately preceding a Validation step will cause the validation rules to be lost.
    • A Join step applied immediately after any of the following steps will cause the key columns to be lost, resulting in an invalid join: Find duplicates, Script, Use Snapshot, Union, Multi-View, Splice, Validate emails and Chart.
    • Using a Join step after a step with multiple outputs will only allow you to store one set of key columns. If you wish to perform two different joins on the two outputs, you will have to insert a Transform step in between this step and the Join step(s).
  • Lookups

    • Attempting to edit a column that is already being used in a lookup causes the edit to fail without showing an error (the only indication of this is an exception in the log).
    • When performing a lookup transformation in a workflow on data which has been joined, the lookup values returned only refer to the original source file rather than the joined rows. This will show as incorrect or missing results for Lookup list, Lookup min/max and Lookup first/last transformation functions.
  • Workflow Designer

    • Undo and redo actions exhibit unexpected behavior in some cases.
  • Character encoding

    • The 'Download as .CSV' option doesn't export multi-byte characters.
    • The Export step will always default to the Windows-1252 character set. Trying to export other character sets will show an unmappable character exception. You can fix this by changing the character set in the advanced settings of the Export step.
  • Snapshots

    • Renaming a workflow after having taken snapshots in the workflow causes the snapshots to be created in the wrong location and to be inaccessible.
  • Scripting

    • Multiple R script steps in a workflow will result in unexpected results.
  • User management

    • Creating a new user from the 'Create a new user' button under teams (Configuration > Teams > [Team Name] > Create a new user) will not bring up the 'Create a password' dialog after creating the user. This means that users created here can never log in. Note that this does not affect users created under 'Users' in Configuration.
    • It's currently not possible to create a user with the same name as a previously deleted user.
  • UI

    • Setting the browser zoom to 110% may cause the bottom bar to be cut-off.
    • When mapping sources after having imported a workflow without the source file present, the mapping dialog does not handle long source file names or data source names well.
    • Reordering columns after using Fix First Columns results in column data appearing in incorrect locations.
    • Users will occasionally be automatically logged out when configuring files.
  • SQL Server

    • It's currently not possible to create a custom connection using the SQL driver from Microsoft (mssql-jdbc-6.2.2.jre8.jar).
    • Users will see an error when attempting to load large (over 10 million row) tables from SQL Server using the default driver.
    • It's possible to preview 'views' from SQL Server but they can't be loaded.
    • It's currently not possible to preview or load data from SQL Server that's in any schema other than the user's default schema.
  • Date/time/time zone

    • In some cases, the Extract Date/Time functions won't apply the correct time zone offset.
    • In some cases, changing the locale settings on a file won't change how dates are interpreted.
  • ODBC

    • When installing on 64-bit machines, the 32-bit ODBC driver's registry settings are not updating correctly. This means that the 32-bit driver can't be used and you'll see the 'Driver's SQLAllocHandle on SQL_HANDLE_ENV failed' error. The workaround is to use the drivers from a previous release. Download ODBC drivers 1.1.1.

Apr 12, 2018

This release includes a number of high priority bug fixes. Download Data Studio 1.1.1. Download ODBC drivers 1.1.1.

Bug fixes
  • Overall Datatype and Dominant Datatype in profile are now correct for decimal columns
  • Find duplicates step no longer throws a Null Pointer error when choosing columns to analyze after manually connecting to source
  • Exception no longer returned when using non-alphanumeric values in email validation input
  • Previews from Redshift and PostgreSQL databases now significantly faster
  • Grouping after a join is no longer slowed down by incorrect compression cache block for grouping index, giving much improved performance
  • Preview of multi-sheet .xlsx files no longer displaying blank data
  • Validation and splice column choosers now correctly show second input's columns
  • Deleting a user now deletes all their uploaded files (in My Files)
  • A Snapshot of a Profile now displays datatype names correctly, rather than as integers
  • Distribution column is now populated when drilling down to formats for a column in Profile
  • Multiple license keys/codes can now be added at the same time and we no longer log out the user after update
  • ComponentSGF Batch layout template has been fixed
  • The installer has been updated to install .NET Framework 4.6.2, which the latest version of Standardize now targets
Known issues
  • Database

    • Cached data is not deleted when the source file is removed via the UI
    • Chinese characters or numeric values as column headers in source data are standardized incorrectly on load
  • Profiling

    • Drilling down from an integer value in a unique values list in Profile returns no rows
  • Grouping

    • Grouping columns which contain large numbers of unique values may incorrectly return 0 rows (this occurred in testing between 500k and 1M unique values with default memory settings).
  • Validate and Join steps

    • These known issues stem from the fact that both steps store their rules/key columns in the previous workflow step.
    • Using a Validation step immediately after a Join, Splice or Union step may only show the columns originating from the first of the two inputs in choosers.
    • Validate and Join steps will not save rules or join columns when used immediately after the scripting (JS / R / Python) step.
    • Validation rules can be lost when the following steps precede the Validation step: Union, Multi-View, Splice, Validate emails, Chart, Use snapshot and custom steps.
    • Deleting a workflow step immediately preceding a Validation step will cause the validation rules to be lost.
    • A Join step applied immediately after any of the following steps will cause the key columns to be lost, resulting in an invalid join: Find duplicates, Script, Use Snapshot, Union, Multi-View, Splice, Validate emails and Chart.
    • Using a Join step after a step with multiple outputs will only allow you to store one set of key columns. If you wish to perform two different joins on the two outputs, you will have to insert a Transform step in between this step and the Join step(s).
  • Lookups

    • Attempting to edit a column that is already being used in a lookup causes the edit to fail without showing an error (the only indication of this is an exception in the log).
    • When performing a lookup transformation in a workflow on data which has been joined, the lookup values returned only refer to the original source file rather than the joined rows. This will show as incorrect or missing results for Lookup list, Lookup min/max and Lookup first/last transformation functions.
  • Split step

    • Hiding columns or quick filtering in any step connected to the 'Failing rows' output of a Split step causes incorrect data to be returned.
  • Workflow Designer

    • Undo and redo actions exhibit unexpected behavior in some cases.
  • Snapshots

    • Renaming a workflow after having taken snapshots in the workflow causes the snapshots to be created in the wrong location and to be inaccessible.
  • User management

    • Creating a new user from the 'Create a new user' button under teams (Configuration > Teams > [Team Name] > Create a new user) will not bring up the 'Create a password' dialog after creating the user. This means that users created here can never log in. Note that this does not affect users created under 'Users' in Configuration.
  • UI

    • Setting the browser zoom to 110% may cause the bottom bar to be cut-off.
    • When mapping sources after having imported a workflow without the source file present, the mapping dialog does not handle long source file names or data source names well.
    • Using the 'Replace' action to replace a source file in a workflow may not update the file to the intended one.
    • 'Distinct' button on an aggregate column is reset in the UI when the aggregate is edited
    • Duplicate files appear in My files when uploading a file on new database creation, prior to a restart
    • Reordering columns after using Fix First Columns results in column data appearing in incorrect locations
  • JDBC

    • Update for a MongoDB data source throws a null pointer error when pre-check is turned on

Mar 29, 2018

This release includes new CDM steps, bringing data matching as well as address and email validation into Aperture Data Studio.

We've also fixed a large number of bugs, added the ability to track data quality over time, publish to ODBC clients, extended the functionality in the SDK and much more.

New features
  • Loading and profiling performance improvements
  • Separate loading and profiling stages
  • A new Profile step added: allows you to profile data in workflows
  • Workflow execution now automatically re-loads the data sources used in the workflow
  • Multiple workflow executions can now run in parallel
  • A new Find duplicates step added: allows you to find potential duplicates in data (powered by Experian Match). This release also includes the latest sample rules and blocking keys for Great Britain and Australia.
  • A new Validate addresses step added: allows you to clean and enrich postal address data (powered by Experian Batch v7.50). This release supports the following data sets: APR, AUG, AUS, CAN, DEU, DNK, FRA, FRP, IRL, GBR, GBR DataPlus, LPG, LUX, NLD, NZL, SGF, SGP, USA, USA DPV.
  • A new Validate emails step added: allows you to validate email address formats or domains
  • A new Analyze trends step added: allows you to view data changes over time
  • Three new steps added for snapshots:
    • Take snapshot - saves a versioned copy of your data
    • Use latest snapshot - get the latest version of the snapshot
    • Use snapshot range - get the combined range of several snapshot versions
  • Option to publish snapshots to ODBC clients
  • Various improvements to the custom step SDK
  • A new Python script step added
  • Improved password management
  • New licensing model
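
The three snapshot steps above describe a versioned store. The Python sketch below is a conceptual model only, not how Data Studio stores snapshots: the `SnapshotStore` class, its method names and the in-memory list are all assumptions made for illustration.

```python
class SnapshotStore:
    """Toy model of versioned snapshots (version numbers start at 1)."""

    def __init__(self):
        self._versions = []  # position i holds the rows of version i + 1

    def take_snapshot(self, rows):
        """Save a copy of the rows and return the new version number."""
        self._versions.append(list(rows))
        return len(self._versions)

    def latest_snapshot(self):
        """Return the rows of the most recent version."""
        if not self._versions:
            raise LookupError("no snapshots have been taken")
        return list(self._versions[-1])

    def snapshot_range(self, first, last):
        """Combine versions first..last, tagging each row with its version."""
        combined = []
        for version in range(first, last + 1):
            for row in self._versions[version - 1]:
                combined.append((version, row))
        return combined


store = SnapshotStore()
store.take_snapshot(["a", "b"])
store.take_snapshot(["a", "b", "c"])
print(store.latest_snapshot())      # ['a', 'b', 'c']
print(store.snapshot_range(1, 2))
```

Tagging each combined row with its version number is one plausible way to make changes over time visible, which is the kind of comparison the Analyze trends step is described as providing.
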
Bug fixes
  • Profile now reports uniqueness of values after standardization has been applied
  • Min and max values are now correct for date columns
  • Count and Grouping count now give the option to include/exclude nulls
  • Aggregate Count is now correct for Null values
  • The Replace First function now correctly escapes input
  • You can now update via JDBC when a key column is defined in the mapping
  • Auditing now tracks the REST API actions
  • Workflow Modified auditing setting now works as expected
  • You can now use the concatenate function to add a plus sign at the start of a value which is a numerical digit
  • Lookup results are now returned when lookup column has been transformed in a previous workflow step
  • The chart in Chart step is created correctly when a null value appears in the data label column
  • The validation thresholds no longer change to 50% when opening the validation dialog after making a rule and viewing results
Known issues
  • Profiling

    • Format statistics for columns containing dates may be incorrect (as they may be standardized / parsed before profiling)
    • Distribution column is blank when you drill down to formats for a column in profile
    • Overall Datatype and Dominant Datatype will never display decimal
    • A snapshot of a profile displays datatypes as integers instead of translating to the string values
  • Grouping

    • Grouping columns which contain large numbers of unique values may incorrectly return 0 rows (this occurred in testing between 500k and 1M unique values with default memory settings).
  • Find Duplicates step

    • A NullPointerException occurs when choosing columns to analyze after manually connecting the step to a data source.
  • Validate Emails step

    • A class cast exception occurs if you have any non-alphanumeric value in your email validation column.
  • Validate and Join steps

    • These known issues stem from the fact that both steps store their rules/key columns in the previous workflow step.
    • Using a Validation step immediately after a Join, Splice or Union step may only show the columns originating from the first of the two inputs in choosers.
    • Validate and Join steps will not save rules or join columns when used immediately after the scripting (JS / R / Python) step.
    • Validation rules can be lost when the following steps precede the Validation step: Union, Multi-View, Splice, Validate emails, Chart, Use snapshot and custom steps.
    • Deleting a workflow step immediately preceding a Validation step will cause the validation rules to be lost.
    • A Join step applied immediately after any of the following steps will cause the key columns to be lost, resulting in an invalid join: Find duplicates, Script, Use Snapshot, Union, Multi-View, Splice, Validate emails and Chart.
    • Using a Join step after a step with multiple outputs will only allow you to store one set of key columns. If you wish to perform two different joins on the two outputs, you will have to insert a Transform step in between this step and the Join step(s).
  • Lookups

    • Attempting to edit a column that is already being used in a lookup causes the edit to fail without showing an error (the only indication of this is an exception in the log).
  • Snapshots

    • Renaming a workflow after having taken snapshots in the workflow causes the snapshots to be created in the wrong location and to be inaccessible.
  • User management

    • Creating a new user from the 'Create a new user' button under teams (Configuration > Teams > [Team Name] > Create a new user) will not bring up the 'Create a password' dialog after creating the user. This means that users created here can never log in. This does not affect users created under 'Users' in Configuration.
    • Deleting a user does not delete their uploaded files from the server's file system.
  • Licensing

    • Cannot apply multiple license keys simultaneously.
    • User is logged out after adding license keys/update codes in the pre-release phase.
  • UI

    • Setting the browser zoom to 110% may cause the bottom bar to be cut-off.
    • When mapping sources after having imported a workflow without the source file present, the mapping dialog does not handle long source file names or data source names well.
    • Using the 'Replace' action to replace a source file in a workflow may not update the file to the intended one.
  • JDBC

    • When previewing large tables (above ~5 million rows), preview rows are returned very slowly for some DBMSs (PostgreSQL, Redshift) due to an unnecessary row count query.
    • JDBC Preview Row Count has a maximum value of 1000, but this isn't clear.

Feb 26, 2018

This release includes various improvements and bug fixes.

New features
  • PSV data files (.psv) are now supported
  • The default administrator password has been reset to administrator
  • Password management has been improved (password policy, reset at first login, lockout)
Bug fixes
  • Validation results when a filter is applied now show correct results
  • Moving or copying the database doesn't cause a license error
  • Transforming the result of a join doesn't cause columns to be unmapped
  • Group step doesn't cause columns to be hidden
  • Correct results are now returned after transforming the join column
  • Join output name is now updated after the refresh
  • You will now be prevented from uploading unsupported XML and JSON files
  • Adding more than one column from the transformation menu now works as expected
  • You will now be prompted to set a password on first login, if enabled

Feb 6, 2018

This release includes various improvements and bug fixes.

New features
  • SAS data files (.sas7bdat) are now supported
  • Charts (excluding pie charts) can now display up to 1,000 separate data points (previously 100)
  • Made several SDK enhancements:
    • The SDK now has its own exception handler 'SDKException'
    • The SDK GitHub project is now also included in the Aperture installation
  • Made several security enhancements:
    • The default administrator login credentials have been made more secure. The default password is now Pg994_8FQ2U%VM++
    • An account lockout policy has been implemented to guard against brute-force password guessing attacks
Bug fixes
  • Further improvements to the responsiveness when configuring/editing validation rules
  • You are now able to create transformation columns that use the UK National Insurance Number or ISBN business constants
  • Circular references can no longer be created when transforming columns
  • Files are now not duplicated in the 'My files' folder
  • A lookup table can now be selected when transforming from the side menu
  • The Group step no longer causes columns to be hidden
  • The default settings for Redshift JDBC connections have been improved
  • Extracted integers can now be parsed using the 'convert to integer' function
  • Improved Swagger documentation for REST API
  • The Hive JDBC driver has been updated to version 6.0.0.000057 (F000095.U000043)

Jan 8, 2018

This release includes various improvements and bug fixes.

New features
  • You can now load Excel and character delimited files directly from HDFS (using Hadoop API version 3.0.0)
  • Added several enhancements to the SDK/custom step creation. You can now:
    • drag and drop new custom steps into Aperture Data Studio without restarting the server
    • hide custom steps from view
    • get a row of values from an input in one call
    • use a sample step that illustrates multi-threading
  • A 'Row count' function has been added allowing you to return the row count for the current view
  • You can now group by, filter and create expressions from aggregate columns
  • The installer package has been upgraded to Java JDK 8 update 151
Bug fixes
  • You can now rename columns in preview
  • Changing a country in 'Format phone number' function now works as expected
  • Executing workflows that contain Validation steps now takes less time
  • Correct values are now returned for 'Matches expression' when renaming/duplicating columns
  • Validation rules can now be turned off
  • Increased responsiveness when configuring/editing validation rules
  • You can now group an aggregate column by using another Group step
  • Auditing now tracks REST API actions

Dec 5, 2017

This release includes a product name change as well as various improvements and bug fixes.

New features
  • Changed the product name from DataX to Aperture Data Studio
  • Introduced a new 'Duplicate' option allowing you to create copies of saved workflows
  • Made several improvements to the workflow execution behaviour:
    • each Export/Script (R) step now displays an individual progress report
    • the execution of Script (R) steps now runs in parallel to other Export/Script (R) steps
    • all information on Export/Script (R) steps is now displayed in one dialog
    • more detailed information is now provided if a failure occurs
    • you can now download the exported files from the job completion dialog
  • Implemented REST API endpoints to delete, import and export workflows
  • Minified the distributed js/html/css files as part of the build
  • Improved the 'UK Postcode' business constant to cover more UK postcode areas
Bug fixes
  • Aggregate functions no longer return blank values when used on some groupings
  • Fixed repository corruption that could occur when the server is shut down unexpectedly
  • The profile drilldown now handles Unicode values (e.g. accented characters) better
  • The 'Remove noise' function no longer converts results to upper case
  • The 'Replace' function with a null search value now behaves as expected
  • The default value of the 'End of line precedence' server setting is now 'Off'
  • Editing a script used in a workflow will now behave as expected
  • The Split step will no longer lose its connections when an upstream connection is removed
  • The grid and configure menu will now always show the same number of visible columns
  • Workflow description tooltips will no longer get 'stuck' on screen
  • Other application stability improvements and minor bug fixes

Oct 31, 2017

This is the first release of Aperture Data Studio.

Main features
  • Browser-based application
  • Intuitive user interface
  • Support for local files, Amazon S3, Azure and JDBC data connections
  • Fast discovery and profiling of data
  • Interactive and re-usable workflows
  • Flexible data validation
  • Powerful data transformations
  • Out of the box data visualisation via graphs and charts
  • Custom workflow step creation using an SDK
  • REST API with Swagger documentation