Import and refresh data

Importing data into Data Studio is straightforward: add a new Dataset or replace an existing one.

Source types that support Auto-refresh sources

Auto-refresh sources works with External Systems where the source table is still available. File uploads and dropzone files are excluded because the origin file is not retained once a Dataset has been created.

Valid data source types for Auto-refresh sources are JDBC, Azure BLOB, HDFS, Amazon S3, and SFTP.

Views inherit the refresh setting from their base Dataset. A Workflow that uses a View as its source can therefore also be configured to refresh that View's underlying Dataset on execution, even if the View is shared to another Space.
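The eligibility rule and the View inheritance described above can be sketched as a small model. This is an illustration only: the class names, field names, and source-type labels are hypothetical and are not part of any Data Studio API.

```python
from dataclasses import dataclass

# Source types the text lists as valid for Auto-refresh sources.
# The string labels are illustrative, not Data Studio identifiers.
REFRESHABLE_SOURCE_TYPES = {"JDBC", "Azure BLOB", "HDFS", "Amazon S3", "SFTP"}


@dataclass
class Dataset:
    name: str
    source_type: str                  # e.g. "JDBC" or "File upload"
    allow_auto_refresh: bool = False  # the Dataset-level setting

    def can_auto_refresh(self) -> bool:
        # Only sources that keep an origin to reload from are eligible.
        return (
            self.source_type in REFRESHABLE_SOURCE_TYPES
            and self.allow_auto_refresh
        )


@dataclass
class View:
    name: str
    base: Dataset

    def can_auto_refresh(self) -> bool:
        # A View has no setting of its own; it inherits from its base Dataset.
        return self.base.can_auto_refresh()


customers = Dataset("Customers", "JDBC", allow_auto_refresh=True)
upload = Dataset("Uploaded list", "File upload", allow_auto_refresh=True)
shared_view = View("Customers (shared)", base=customers)
```

In this sketch, `shared_view.can_auto_refresh()` follows `customers`, while `upload` is never eligible because its source type is not in the supported set.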

Enable Auto-refresh sources

Three settings are required to enable Auto-refresh sources: one on the Dataset, one on the Workflow, and one at execution. These settings have a cumulative effect, so all three must be enabled for data to be refreshed. When all of them are enabled and the Workflow is executed, it runs with the latest data from the origin.

When creating or editing a Dataset

Provided your Dataset is from a supported data source type:

  1. Go to Datasets.
  2. Find the existing Dataset, or select Add Dataset to create a new one.
  3. On the Edit details page, tick Allow auto-refresh.

When creating or editing a Workflow

On the Source step, tick Allow auto-refresh. On this step you can also control what happens if the source refresh fails during Workflow execution: ticking Stop execution when refresh failed causes the Workflow execution to fail if the Dataset could not be refreshed from the External System.
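The failure option can be pictured with the minimal sketch below. The function and exception names are hypothetical, and the fallback branch is an assumption: the text only states what happens when Stop execution when refresh failed is ticked, so continuing with the previously loaded data when it is unticked is inferred, not documented.

```python
class RefreshError(Exception):
    """Hypothetical error: the Dataset could not be reloaded from the External System."""


def run_source_step(refresh, stop_on_refresh_failure: bool) -> str:
    """Attempt a refresh and report which data the Workflow would run with."""
    try:
        refresh()
        return "latest data"
    except RefreshError:
        if stop_on_refresh_failure:
            # Option ticked: the Workflow execution fails.
            raise
        # Assumption: option unticked, so execution continues with existing data.
        return "previously loaded data"


def failing_refresh():
    # Stand-in for an External System that cannot be reached.
    raise RefreshError("External System unavailable")


print(run_source_step(lambda: None, stop_on_refresh_failure=True))
print(run_source_step(failing_refresh, stop_on_refresh_failure=False))
```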

When executing the Workflow

Tick Refresh sources in either the Run Workflow dialog or the Schedule:

  1. From the Workflow, press Run and tick Refresh sources.
  2. On the edit/create page for a Schedule, tick Refresh sources.

Having this level of control allows for scenarios such as:

  • Ensuring the latest data is always used during scheduled or manual executions of a Workflow.
  • The creation of a Dataset that can never be automatically reloaded (e.g. definitive reference data).
  • The creation of a View that can never be automatically reloaded.
  • A one-off test execution of a Workflow that does not reload Datasets.
  • A Schedule that, regardless of the Workflow configuration, will never reload Datasets.
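
The cumulative effect of the three settings, and a few of the scenarios above, can be expressed as a simple logical AND. This is a sketch for reasoning about the settings only; the class and field names are hypothetical and do not correspond to a Data Studio API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RefreshSettings:
    dataset_allows: bool      # Allow auto-refresh on the Dataset
    source_step_allows: bool  # Allow auto-refresh on the Workflow's Source step
    run_requests: bool        # Refresh sources in the Run dialog or Schedule

    def will_refresh(self) -> bool:
        # Cumulative effect: any one setting left off prevents the reload.
        return (
            self.dataset_allows
            and self.source_step_allows
            and self.run_requests
        )


# Latest data is always used: every setting is on.
always_latest = RefreshSettings(True, True, True)
# Definitive reference data: the Dataset itself never allows a reload.
reference_data = RefreshSettings(False, True, True)
# One-off test execution: Refresh sources is left unticked for this run.
one_off_test = RefreshSettings(True, True, False)

print(always_latest.will_refresh())   # True
print(reference_data.will_refresh())  # False
print(one_off_test.will_refresh())    # False
```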