Configure External systems

An External system is a remote database system or file system which Data Studio can use, either to extract (fetch) data from, or to export data back to.

Data can be loaded from an External system into a Dataset, or the External system can be defined as a target for a workflow's Export step.

The supported systems are:

  • Java Database Connectivity (JDBC)
  • Apache Hadoop HDFS
  • Amazon S3
  • Microsoft Azure Blob Storage (for Azure Data Lake Storage Gen2)
  • SFTP

Create an External system

  1. Navigate to System > External systems and select Add new external system.
  2. Give the system a name and optional description.
  3. Define the connection type. If JDBC is selected, additionally select the DBMS.
  4. Complete the fields necessary to configure the connection.
  5. You will usually need to enter user credentials (such as a username and password) to connect to your system. Use Test settings to confirm that you can connect successfully. These test credentials are not saved.
  6. Either Finish to save the system configuration, or Finish and add credentials to both save the current system and create one or more sets of Connection credentials, which can be associated with Data Studio users and used to access data in the External system.

When accessing any External system, a set of valid credentials is normally required to ensure that the operation is permitted for the user.

A credential is the access or authentication information for an external system, such as a username/password combination or an API access token. The credentials used to access an external system may affect what data is available, either by limiting access to specific tables or by filtering out rows from tables. This means that when different credentials are used to retrieve data, different results may be obtained.

Use credentials

When interacting with an External system (to either fetch data from it or push data to it), credentials are pinned to a particular usage of the External system, so that the same set of credentials is used consistently.

Load data from an External system

When adding a Dataset from an External system (loading data), you will need to select a credential to be associated with the Dataset. The credentials available for selection are only those that your user has permission to use.

Once set, this credential will be used whenever the Dataset is refreshed by any user, even for those with no direct permission to access the credentials.

Find out how to create a Dataset from an external system.

Space-level restrictions will still be enforced, so ensure that the Space the Dataset is added to is suitably secured. Another user with access to the Space can edit the Dataset to change the credential used, for example to one that they have been given permission to use.

Export to an External system from a workflow

When exporting records to an External system using the Export step in the workflow designer, you will need to specify the credential to be used. The credential must have the necessary permissions to write to the external database.

Another user with access to the Space can modify the credentials used in the workflow's Export step. Once the credential has been set, this credential will be used whenever the workflow is executed (including scheduled execution), even by those with no direct permissions to access the credentials.

Find out how to export data to an external system.

Manage credentials

Create an External system with credentials

To create an external system with credentials:

  1. Go to System > External systems.
  2. Click Add new external system.
  3. Populate the connection details for the external system.
  4. Click Finish and add credentials.
  5. Enter credential details (for example, a username and password that allow access to the external system).
  6. Select users who are granted access to the credential.
  7. Specify the permission type for the user.
  8. Click Apply.

Credential permission types

There are three credential permission types in Data Studio, as described in the table below.

All permission types allow a user to use the credential to extract (fetch) data from, or to export data back to an associated external system.

Permission type Description
Use The user cannot edit/delete the credential and cannot execute SQL queries when importing a Dataset.
Use with query The user cannot edit/delete the credential but can execute SQL queries when importing a Dataset.
Manage The user can edit/delete the credential and can execute SQL queries when importing a Dataset.

Import and export External system metadata

External system metadata (connection details and credential names) can be exported and imported from a Space.

Configure password and user permissions

After importing an External system's metadata into another environment, you will need to edit the credential to enter the correct password and assign the required user access permissions.

  1. Go to System > External systems.
  2. Click on the imported external system.
  3. Select the imported credential.
  4. Enter credential details (for example the username and password).
  5. Select users who are granted access to the credential.
  6. Specify the permission type for the user.
  7. Click Apply.

Connect to a system using JDBC

Data Studio provides support for connecting to External systems which can be used either as data sources or as the target of an export operation. One of these external system types is a DBMS system accessible via JDBC.

There are two types of DBMS JDBC driver recognized by Data Studio:

  1. Progress DataDirect JDBC drivers that ship with the product. These are supported directly by Experian and are configured in the file datadirectJdbc.json. These jar files all have the prefix "DS", e.g. DSsforce.jar.
  2. Custom JDBC drivers that can be used by Data Studio. These drivers may be downloaded from a database vendor's website, provided by a third party, or developed by an in-house team. Common native drivers (i.e. those supplied by the DBMS vendor) may be configured in the file customJdbc.json. These drivers are not supported directly by Experian.

The two JDBC configuration files, datadirectJdbc.json and customJdbc.json, can be copied into the same directory as server.properties (i.e. the server's working directory) and modified in order to customize a driver's connection parameters. These configuration files provide the Data Studio UI with information about the JDBC drivers and the format (and properties) required to build a valid connection string for each DBMS. Drivers that have not been configured in either of these files will require the user to enter a custom connection string when they create an external system that uses the driver.

Changes made to customJdbc.json are retained when Data Studio is upgraded.

Data Studio includes JDBC drivers that allow you to connect to a wide range of external systems.

You can also download and deploy your own JDBC (Java Database Connectivity) driver, which must be a type 4 driver compatible with Java 8. Once a custom driver is deployed, you will be able to create and manage External system connections just as you do with the native drivers. To deploy a custom driver:

  1. Download the JDBC driver you want to use. This will typically be a single file with a .jar extension.
  2. Stop the Data Studio Database service.
  3. Copy the driver to the \drivers\jdbc folder in the database root directory. By default this will be C:\ApertureDataStudio\drivers\jdbc. If a driver consists of multiple jar files, the auxiliary jars should be placed in the lib subdirectory.
  4. Restart the Data Studio Database service.
  5. Log in to Data Studio as a user that has the "Manage Connections to External Systems" capability, and configure a connection using the new driver in System > External systems > Add new external system.
  6. With the type JDBC selected, the DBMS dropdown should now display the new custom driver, with the word "Custom" prefixing the driver name and version.
    • If the driver isn't configured in either datadirectJdbc.json or customJdbc.json then it will appear in the UI with the name "CUSTOM_N (source jar file name) version" where N is an index of each unknown driver (e.g. "CUSTOM_1 (csvjdbc-1.0-36.jar) 1.0").
    • If the driver is already configured in customJdbc.json then it will appear in the UI with the name "Custom <DBMS name> <version>" (e.g. "Custom Snowflake 3.11").
  7. Follow the driver's documentation to configure the connection.
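The naming rules in step 6 can be sketched as follows (Python for illustration; the function is hypothetical, not part of Data Studio):

```python
def custom_driver_display_name(configured_name, jar_name, version, index):
    """Sketch of how a custom driver's UI display name is derived,
    per the two bullet points above."""
    if configured_name:
        # Driver has an entry in customJdbc.json.
        return f"Custom {configured_name} {version}"
    # Unconfigured driver: indexed CUSTOM_N name derived from the jar file.
    return f"CUSTOM_{index} ({jar_name}) {version}"
```

For example, an unconfigured csvjdbc jar appears as "CUSTOM_1 (csvjdbc-1.0-36.jar) 1.0", while a configured Snowflake driver appears as "Custom Snowflake 3.11".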

Native drivers

The following DataDirect drivers are included with Data Studio and are supported directly:

DBMS name Driver file name
Amazon Redshift DSredshift.jar
Apache Hive
  • Microsoft Azure HDInsight
  • Hortonworks Distribution for Apache Hadoop
  • Cloudera's Distribution Including Apache Hadoop (CDH)
  • Amazon Elastic MapReduce (Amazon EMR)
  • IBM BigInsights
  • MapR Distribution for Apache Hadoop
  • Pivotal HD Enterprise (PHD)
DShive.jar
Atlassian Jira DSjira.jar
Autonomous Rest (REST API data sources) DSautorest.jar
Cassandra DScassandra.jar
DB2 DSdb2.jar
Google BigQuery DSgooglebigquery.jar
Greenplum
  • Pivotal Greenplum
  • Pivotal HDB (HAWQ)
DSgreenplum.jar
Informix DSinformix.jar
MongoDB DSmongodb.jar
Microsoft Dynamics 365 DSdynamics365.jar
MySql DSmysql.jar
Oracle DSoracle.jar
Oracle Eloqua DSeloqua.jar
Oracle Sales Cloud DSoraclesalescloud.jar
Oracle Service Cloud DSrightnow.jar
PostgreSQL DSpostgresql.jar
Progress OpenEdge DSopenedgewp.jar
Salesforce
  • Salesforce.com
  • Veeva CRM
  • Force.com Applications
  • Financial Force
DSsforce.jar
Spark SQL DSsparksql.jar
SQL Server
  • Microsoft SQL Server
  • Microsoft SQL Azure
DSsqlserver.jar
Sybase DSsybase.jar

Full documentation for these drivers can be found on the Progress DataDirect Connectors site, under the relevant source.

Third party drivers

The following third-party native drivers are preconfigured by Data Studio in the file customJdbc.json:

DBMS name Driver file name pattern
Oracle (Thin client driver) ojdbc.*
SQL Server (Microsoft driver) (ms)?sql[-]?jdbc.*
MySQL (Connector/J driver) mysql-connector.*
DB2 db2jcc.*
Informix ifxjdbc.*
PostgreSQL postgresql.*
Apache Hive hive-jdbc.*
MongoDB mongodb.*
Amazon Redshift (Amazon Redshift driver) redshift.*
dBase dbf.*
Derby derby.*
Microsoft Dynamics (CData driver) .*dynamicscrm.*
Vertica vertica.*
MariaDB mariadb.*
HyperSQL hsqldb.*
SQLite .*sqlite.*jar
H2 h2-.*
Presto presto.*
SAPDB sapdbc.*
SAP HANA ngdbc.*
Amazon Athena Athena.*
CIS with Business Objects csjdbc.*
Neo4j neo4j.*
Google BigQuery (Magnitude Simba driver) GoogleBigQueryJDBC.*
Netezza nzjdbc.*
HDP ddhybrid.jar
Snowflake snowflake-jdbc.*

Native drivers that are not configured (i.e. have no configuration details in customJdbc.json) must have the connection string configured manually in the UI.
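The "Driver file name pattern" column above contains regular expressions that are matched against the jar file name. How Data Studio performs this matching internally is not documented; the sketch below simply illustrates how a few of the table's patterns apply to typical jar names:

```python
import re
from typing import Optional

# A few of the file-name patterns from the table above, keyed by DBMS name.
DRIVER_PATTERNS = {
    "Oracle (Thin client driver)": r"ojdbc.*",
    "SQL Server (Microsoft driver)": r"(ms)?sql[-]?jdbc.*",
    "MySQL (Connector/J driver)": r"mysql-connector.*",
    "Snowflake": r"snowflake-jdbc.*",
}

def identify_driver(jar_name: str) -> Optional[str]:
    """Return the DBMS whose pattern matches the jar file name, if any."""
    for dbms, pattern in DRIVER_PATTERNS.items():
        if re.fullmatch(pattern, jar_name):
            return dbms
    return None
```

For instance, both "mssql-jdbc-9.4.jar" and "sqljdbc42.jar" match the SQL Server pattern, because the "ms" prefix and the hyphen are both optional in the expression.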

Create a JDBC External system

In Data Studio, JDBC drivers are associated with an External system. The drivers that are currently known to Data Studio will be listed in the DBMS field when creating an External system with the "JDBC" type.

Once you've selected the DBMS, the fields you must supply to create the connection will depend on how the selected DBMS has been configured in datadirectJdbc.json or customJdbc.json. Unconfigured DBMSs will instead provide a single field for the connection string that the driver will use; the format of this string is given in the driver vendor's documentation.

Advanced settings

Data Studio will build the connection string for configured DBMSs, but occasionally you may need to override the prebuilt connection string and provide your own.

Additional optional fields are available by clicking Show advanced settings:

  1. Connection Timeout. This timeout, expressed in seconds, defines how long the driver will wait for a connection request to be accepted. If the remote database is on the end of a slow connection then use this field to increase the timeout.
  2. User Connection Properties. Add any additional driver properties here. Key is the property name, Value is the property value.
  3. Use custom connection string. The connection string is prebuilt in the format appropriate to the specific DBMS driver. Select this checkbox if you need to replace the prebuilt string with your own custom string.
  4. Debug connections. Selecting this checkbox will produce additional debug logging for the connection in the server's main log file.
  5. Schema. Some databases use a default schema. If a schema name is required and isn't the default name then use this field to specify the schema name.
  6. Force JDBC schema name. Tick this box to force the use of a schema name to fully qualify table names in JDBC queries. Use this when the user's default schema name differs from the schema you are attempting to connect to.
  7. Table Pattern. Configure the tables to display to Data Studio users. The default pattern is "%" i.e. accept all table and view names.
  8. Filter regular expression. By default all tables available with the given connection credential are exposed to Data Studio. You can further exclude irrelevant table and view names by specifying a regular expression for acceptable table names.
  9. Filter is case sensitive. Tick this box if the filter is case sensitive.
  10. Include Tables. This checkbox is ticked by default. Tables that exist in the JDBC system will be available.
  11. Include Views. Tick this checkbox to include database views in the list of available tables.
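The interaction between the Table Pattern (settings 7) and the regular-expression filter (settings 8 and 9) can be sketched conceptually. The sketch below is an illustration only: in practice the LIKE-style pattern is typically evaluated by the database, and this toy version assumes the pattern contains no regex metacharacters:

```python
import re

def visible_tables(tables, table_pattern="%", filter_regex=None, case_sensitive=False):
    """Conceptual sketch of settings 7-9 above: apply a SQL LIKE-style
    table pattern, then an optional regular-expression filter."""
    # Translate the LIKE wildcards (% and _) into a regular expression.
    like_re = table_pattern.replace("%", ".*").replace("_", ".")
    flags = 0 if case_sensitive else re.IGNORECASE
    names = [t for t in tables if re.fullmatch(like_re, t, flags)]
    if filter_regex:
        names = [t for t in names if re.fullmatch(filter_regex, t, flags)]
    return names
```

With the default pattern "%" every table and view name is accepted; narrowing either setting reduces what Data Studio users see.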

Test the connection

To test the connection, enter the necessary authentication credentials (such as a username and password) and click Test Connection. The test credentials you enter here are only used for testing the connection and are not saved.

If you receive a Success message, click Finish and add credentials to save the settings and proceed to create Connection credentials for this External system.

JDBC Connection Troubleshooting

  1. If the database uses schema names and you want to access tables in different schemas, we advise creating a separate External system connection for each schema.
  2. Any connection errors will be reported in the UI, but more detail will be available in the Data Studio server log file.
  3. The native drivers can provide detailed logging of all operations. See Spy and Snoop logging.
  4. If additional connection properties are required, click Show advanced settings, then click the + button under User connection properties to add a new key/value pair row.

In the following documentation, for the sake of brevity, a driver's connection property key/value pair may appear as PropertyName=Value.
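As a sketch of this convention, the DataDirect-style connection strings in the sections that follow are semicolon-delimited, so a PropertyName=Value pair is simply appended to the base URL (Python for illustration; the URL and property values below are examples):

```python
def with_properties(base_url, **props):
    """Append PropertyName=Value pairs to a semicolon-delimited
    connection string, following the convention described above."""
    suffix = ";".join(f"{key}={value}" for key, value in props.items())
    return f"{base_url};{suffix}" if suffix else base_url
```

For example, with_properties("jdbc:experian:oracle://dbhost:1521", SID="ORCL", LoginTimeout=15) yields "jdbc:experian:oracle://dbhost:1521;SID=ORCL;LoginTimeout=15".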

Oracle

Connection parameters

  1. Hostname: Host name of the computer where the Oracle database is installed.
  2. Port: The JDBC port number that the DBMS server uses to listen for connection requests.
  3. SID: The Oracle System Identifier (optional).

The connection string template is: jdbc:experian:oracle://hostname:port;SID=SID;LoginTimeout=15;UseServerSideUpdatableCursors=true. Additional connection properties are detailed in the DataDirect Oracle JDBC driver documentation.
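The template above can be sketched as a small URL builder (Python for illustration; the hostnames below are hypothetical). The ServiceName branch follows the "Using Service Name instead of SID" note below, where the two properties are mutually exclusive:

```python
def oracle_url(hostname, port, sid=None, service_name=None):
    """Sketch of the Oracle connection string template above.
    SID and ServiceName are mutually exclusive."""
    if sid and service_name:
        raise ValueError("SID and ServiceName are mutually exclusive")
    url = f"jdbc:experian:oracle://{hostname}:{port}"
    if sid:
        url += f";SID={sid}"
    elif service_name:
        # ServiceName is supplied as a user connection property,
        # typically for RAC systems or pluggable databases.
        url += f";ServiceName={service_name}"
    return url + ";LoginTimeout=15;UseServerSideUpdatableCursors=true"
```

Leaving the SID blank and supplying a service name produces the ServiceName form of the string.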

Authentication parameters

  1. Username: Oracle Database username
  2. Password

Using Service Name instead of SID

To connect to an Oracle server using a Service Name rather than SID, leave the SID field blank and specify the Oracle service name as a user connection property, ServiceName=my.server.com. This property is mutually exclusive with the SID property.

Use the service name to specify connections to an Oracle Real Application Cluster (RAC) because the nodes in a RAC system share a common service name. Service name is also typically used to connect to Pluggable databases (PDBs).

Troubleshooting

The specified SID (or service name) was not recognized by the Oracle server

Error messages:

  • ORA-12505 Connection refused, the specified SID () was not recognized by the Oracle server
  • ORA-12514 Connection refused, the specified service Name () was not recognized by the Oracle server

This error can occur if the Oracle database server is configured as RAC (multiple servers, with a load balancer). You may need to use the LoadBalancing=true and AlternateServers=(server2:server2port) user connection properties in the connection configuration.

Invalid identifier

Error:

  • ORA-00904: Invalid identifier

This error can occur when attempting to preview or load data into Data Studio, if the schema has not been defined in the External system connection and the table exists in more than one schema. To resolve, specify the schema in the connection's advanced settings, to avoid using the Oracle user's default schema.

Transliteration failed, reason: invalid UTF8 data

When attempting to load data from a table, no records are shown and the logs contain "Transliteration failed, reason: invalid UTF8 data". This occurs when the table being processed contains data in a code page other than UTF-8.

To resolve, set the CodePageOverride connection property, or alternatively deploy the Oracle thin driver as a custom JDBC driver.

SQL Server

Connection parameters

The following fields are required for connection:

  1. Hostname: Hostname of the computer where the SQL Server database is installed. An instance name or port number may also be specified.
  2. Database: Database name.
  3. Schema: Schema name.

The connection string template is: jdbc:experian:sqlserver://<hostname>;allowPortWithNamedInstance=true;DatabaseName=<database>;LoginTimeout=15;UseServerSideUpdatableCursors=true;SelectMethod=cursor

Additional connection properties are detailed in the DataDirect SQLServer JDBC driver documentation.

Specify port or named instance

The hostname field can be used to define a port and instance name.

  • If using the default port (1433) or instance (MSSQLSERVER), just enter the hostname.
  • To specify a port, enter the port number in the format "hostname:port".
  • To specify a named instance, enter the instance name in the format "hostname\instance".

When specifying the location of the SQL server instance, you should never provide both an instance name and port number together.
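The rules above can be sketched as a small formatting helper (Python for illustration; the function and hostnames are hypothetical):

```python
def sqlserver_host(hostname, port=None, instance=None):
    """Format the Hostname field per the rules above: a port OR a
    named instance, never both."""
    if port and instance:
        raise ValueError("specify a port or a named instance, not both")
    if port:
        return f"{hostname}:{port}"       # "hostname:port"
    if instance:
        return f"{hostname}\\{instance}"  # "hostname\instance"
    return hostname                       # defaults: just the hostname
```

So a non-default port gives "sqlhost:1450", a named instance gives "sqlhost\DEV", and the defaults need only "sqlhost".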

Azure SQL Server

If you're connecting to Microsoft Azure SQL, two additional connection properties need to be added, because Microsoft Azure SQL requires communication over SSL.

  1. EncryptionMethod=SSL
  2. ValidateServerCertificate=false. If set to true, the certificate needs to be available in the database server trust store. The certificate must be issued by a trusted Certificate Authority. If set to false, there will be no validation on the certificate.

Authentication parameters

The SQL Server driver supports several different methods for establishing a connection. The default method is SQL Server authentication with a username and password.

To connect using Windows Authentication (NTLMv2), use the connection property AuthenticationMethod=ntlm2java. When adding credentials, enter the Windows credentials that will be used to connect to the Microsoft SQL server instance.

To use Azure Active Directory (Azure AD) authentication, set AuthenticationMethod=ActiveDirectoryPassword. In addition, a value must be specified for the HostNameInCertificate property, and the connection must be configured to use SSL.
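The extra connection properties each authentication method needs can be summarized as a sketch (Python for illustration; the method labels are illustrative, only the property names and values come from the text above):

```python
def sqlserver_auth_properties(method="sqlserver", host_in_cert=None):
    """Sketch of the extra connection properties for each SQL Server
    authentication method described above."""
    if method == "windows":
        # Windows Authentication (NTLMv2).
        return {"AuthenticationMethod": "ntlm2java"}
    if method == "azure_ad":
        # Azure AD authentication also requires SSL and a value for
        # HostNameInCertificate.
        return {"AuthenticationMethod": "ActiveDirectoryPassword",
                "HostNameInCertificate": host_in_cert,
                "EncryptionMethod": "SSL"}
    # Default SQL Server authentication needs no extra properties.
    return {}
```

The default method (username and password) needs no additional properties at all.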

Troubleshooting

Cannot open database <"Database name">

If you receive this error when attempting to make a connection, the login failed. This usually occurs either because the database name is incorrect or because the connection credentials lack the required permissions.

Check that you have entered the right database name and that the credentials have valid permissions.

View Column Map Failure

This error displays when exporting data from Data Studio to SQL Server. It usually occurs when an Export step in a workflow attempts to INSERT or UPDATE data into a table where the data doesn't match the target schema.

Ensure that the data that you're exporting from Data Studio matches the data types of the target table's schema. You can use the Conversion functions in Data Studio to convert the values in your columns to the intended data type.

Column cannot be modified because it is either a computed column or is the result of a UNION operator

This error displays when exporting data from Data Studio to SQL Server. It can occur when you're trying to export (INSERT) into a SQL table where one of the columns is a computed column.

The solution is to use a Map to target step before the export and leave any computed columns unmapped.

Invalid object name

This error occurs when loading data from SQL Server into Data Studio. The credentials used may not have the necessary permission to preview and load the data for the specified schema.

Ensure that the credentials used have the required permissions to preview and load data (using SELECT) from the relevant table or view in the specified schema. If you're connecting as a user whose default schema differs from the specified schema, check the Force JDBC schema name setting in the External system's configuration.

A user name must be specified when using pure Java NTLM

This authentication error can happen when using the same Windows credentials to connect using two different external systems to different domains on the same SQL Server instance. If the error appears when testing the connection, check that you're using ntlm2java as the AuthenticationMethod connection property, and not ntlmjava.

Type datetimeoffset loaded as Alphanumeric in Data Studio

By default, the driver loads datetimeoffset values (datetime with timezone offset) as type VARCHAR, which will be alphanumeric data type in Data Studio. To interpret these values as TIMESTAMP and load as dates instead, set the user connection property FetchTSWTZAsTimestamp=true.

Timestamps are converted to UTC on load into Data Studio.

Microsoft Dynamics 365

The Microsoft Dynamics 365 driver supports two different methods of authentication. NTLM is typically used when Dynamics is hosted on premises, and requires a username and password to authenticate. OAuth is typically used when Dynamics is hosted in the cloud, and requires more configuration.

Connection parameters (NTLM)

  1. Service URL - The URL that handles service requests for the database. It should be in the format: https://myonpreminstance.sso3.dynamics365.net/api/data/v9.1/

Authentication parameters (NTLM)

  1. User name - the user name for connecting to the service.
  2. Password - the above user's password.

Connection parameters (OAuth)

These settings can be obtained from the Dynamics server:

  1. Service URL - The URL that handles service requests for the database. It should be in the format: https://sandbox.api.ds.dynamics.com/api/data/v9.1/.
  2. Authentication URI - The Microsoft endpoint that handles authentication. It should be in the format: https://login.microsoftonline.com/1234567-12345-1234-1234-123456789abc/oauth2/v2.0/authorize.
  3. Token URI - The Microsoft endpoint that handles granting of new authentication tokens. It should be in the format: https://login.microsoftonline.com/1234567-12345-1234-1234-123456789abc/oauth2/v2.0/token.
  4. Scope URI - The Dynamics endpoint that handles the permissions when authenticating. It should be in the format: https://sandbox.api.ds.dynamics.com/user_impersonation offline_access.

Authentication parameters (OAuth)

  1. Client ID - The client id for your application.
  2. Client Secret - The client secret for your application.
  3. Access Token - the value of the access token obtained from external sources.
  4. Refresh Token - the value of the refresh token obtained from external sources.

Troubleshooting

  1. The MS Dynamics 365 driver can take minutes to test. If you find that your Test button is timing out, set the JDBC timeout setting in Aperture Data Studio under Settings > Loading to 360 seconds and try again.
  2. The refresh token is not always necessary; if it isn't needed, leaving the value blank is sufficient.
  3. The Access token and Refresh token will need to be generated externally. The access token can time out quickly, so check that your token is still valid if you can no longer connect to a database.
  4. For further documentation on this driver, see the DataDirect for JDBC for Microsoft Dynamics 365 Driver documentation.

Salesforce

Connection parameters

  1. Hostname: This specifies the base Salesforce URL to use for logging in, for example um5.salesforce.com or experian.my.salesforce.com. It will be the same base URL used for browsing to the Salesforce classic web application. This driver does not support the Lightning URL. For example, um5.lightning.force.com is not a valid hostname for the driver. If you're unsure, you can use the default Hostname login.salesforce.com.
  2. Schema: Optional, default SFORCE.

The template connection string is: jdbc:experian:sforce://<hostname>;LoginTimeout=30;StmtCallLimit=0.

Additional connection properties are detailed in the DataDirect Salesforce JDBC driver documentation.

Authentication parameters

  1. Username: Specify the username that is used to connect to the Salesforce instance.
  2. Password: Specify the password to use to connect to your Salesforce instance. Typically, the username and password will be the same as the details you use to log into the Salesforce web UI.
  3. Security Token: Specifies the security token required to make a connection to a Salesforce instance that is configured for a security token. The security token is not required when Salesforce has been configured for Trusted IP Ranges and the user is logging in from a trusted IP address.

Troubleshooting

Salesforce schema has changed. Columns or tables not shown

If your Salesforce schema has changed (e.g. columns/tables have been added or removed), the driver will need to recreate the local schema mapping.

If the schema change is a one-off event, set the SchemaMap property to force the driver to create a new schema map the next time a connection is established, while retaining the old mapping for reference.

If the schema is likely to change regularly, you can configure the driver to always refresh the schema map when establishing a connection. To do this, set the following: CreateMap=forceNew. If you get an error "The specified configuration options do not match the options used to initially create the schema map", you'll have to force the driver to create a new schema map using the SchemaMap property. If you've already defined this, change it to a new value to force the creation of new schema map.

Reset your security token

To reset your security token via the Salesforce UI, browse to your personal settings and click on Reset My Security Token. An email will be sent with your new token. Some organizations will only allow administrators to reset users' tokens.

Error reading XMLStreamReader

You may be using the Salesforce lightning app hostname. Change the hostname URL to Salesforce Classic. For example, use ap160.salesforce.com instead of ap160.lightning.force.com.

The configuration options used to open the database do not match the options used to create the database

You may see this when making changes to a connection that require the mapping schema to be updated.

Use the SchemaMap connection property. The value for this property is either a name or the absolute path and name (including the .config extension) of the local schema mapping file.

Error when loading data from the Salesforce instance into Data Studio - Timeout on load (or preview) from Salesforce

There is likely to have been a timeout while the driver waits for a response to the web service request. Add the WSTimeout and/or WSRetryCount connection properties.

Control which columns are visible, and column name formatting

The custom connection property ConfigOptions controls how Salesforce table columns are mapped in the driver:

  1. The driver exposes Salesforce audit fields (e.g. ISDELETED) and the master record ID (MASTERRECORDID) field. To hide audit columns, set ConfigOptions to (AuditColumns=none).
  2. The driver exposes the names of system fields as they exist in the Salesforce data model, unchanged. To prefix system columns with Sys_, set ConfigOptions to (MapSystemColumnNames=1).
  3. The driver adds the __c suffix to custom table and column names when mapping the Salesforce data model. To remove the __c suffix from custom table and column names, set ConfigOptions to (CustomSuffix=strip).
  4. To change the casing of the column names so that they're not all uppercase, set ConfigOptions to (UppercaseIdentifiers=false).
  5. These settings can be combined by setting ConfigOptions to (AuditColumns=none;MapSystemColumnNames=1;CustomSuffix=strip;UppercaseIdentifiers=false).
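Composing a combined ConfigOptions value by hand is error-prone, so a small helper can sketch the format (Python for illustration; the function is hypothetical, the setting names come from the list above):

```python
def config_options(**settings):
    """Compose a Salesforce ConfigOptions property value from the
    settings listed above, e.g. AuditColumns=none."""
    body = ";".join(f"{name}={value}" for name, value in settings.items())
    return f"({body})"
```

For example, config_options(AuditColumns="none", CustomSuffix="strip") yields "(AuditColumns=none;CustomSuffix=strip)".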

Apache Hive

Connection parameters

  1. Hostname: Host name of the computer where the Hive Database is installed.
  2. Port: The Hive connection port number (default 9999).

The template connection string is: jdbc:experian:hive://<hostname>:9999;LoginTimeout=15. Additional connection properties are detailed in the DataDirect Hive JDBC driver documentation.

Authentication parameters

  1. Username: Specify the username that is used to connect to the Hive instance.
  2. Password: Specify the password to use to connect to your Hive instance.

Troubleshooting

Thrift errors when attempting to Insert or Update

To avoid Thrift errors when attempting to Insert or Update rows in a Hive table, try increasing the value for the connection property ArrayInsertSize.

Google BigQuery

Connection parameters

  1. Project: The project name associated with your Google BigQuery account.
  2. Dataset: The Dataset name.

The template connection string is: jdbc:experian:googlebigquery:Project=<project>;Dataset=<dataset>. Additional connection properties are detailed in the DataDirect BigQuery JDBC driver documentation.

The driver supports OAuth 2.0 and service account authentication.

Authentication using OAuth 2.0

To use OAuth 2.0, add user connection properties for the access and refresh tokens.

  1. Set AuthenticationMethod=oauth2.
  2. Set at least one of the following properties:
    • AccessToken: Set this to specify the access token you have obtained to authenticate to Google BigQuery.
    • RefreshToken: Set this to specify the refresh token you have obtained to authenticate to Google BigQuery.
  3. Set the ClientID property to specify the consumer key for your application.
  4. Set the ClientSecret property to specify the consumer secret for your application.
  5. Optionally, set the Scope property to specify the OAuth scope. It limits the permissions granted by an access token.

Authentication with a service account

A service account is a type of Google account that represents an application instead of an individual end user. For a successful service account authentication, you need:

  1. Private key file or Private key:
    • The private key file is a .json or .p12 file that contains the key required to authenticate API calls. You can download it from the Google Cloud Platform (GCP) Console.
    • The private key is contained in the private key file downloaded from the GCP Console.
  2. Service account email address: A unique email address that is provisioned while creating a service account.

To configure the driver to use service account authentication, set the following connection properties:

  1. Set AuthenticationMethod=serviceaccount.
  2. Set the ServiceAccountEmail property to specify your service account's email address.
  3. Set either the ServiceAccountKeyContent property or the ServiceAccountPrivateKey property:
    • ServiceAccountKeyContent specifies the private key required to authenticate to Google BigQuery. Use this property if you do not want to persist the private key file in your environment.
    • ServiceAccountPrivateKey specifies the full path to the .json or .p12 file that contains the private key. The driver extracts the private key value from the specified file and uses it to authenticate the user to the database. Use this property if it is preferable to persist the private key file.
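The service account steps above can be sketched as a property map (Python for illustration; the helper and the email/path values are hypothetical, the property names come from the steps above):

```python
def bigquery_service_account_props(email, key_path=None, key_content=None):
    """Sketch of the connection properties for service account
    authentication, per steps 1-3 above. Set exactly one of the key
    file path or the key content."""
    if (key_path is None) == (key_content is None):
        raise ValueError("set exactly one of key_path or key_content")
    props = {"AuthenticationMethod": "serviceaccount",
             "ServiceAccountEmail": email}
    if key_path:
        # Full path to the .json or .p12 key file.
        props["ServiceAccountPrivateKey"] = key_path
    else:
        # Private key value itself, if you don't persist the key file.
        props["ServiceAccountKeyContent"] = key_content
    return props
```

Choosing between ServiceAccountKeyContent and ServiceAccountPrivateKey comes down to whether you want the key file persisted in your environment.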

Troubleshooting

List of tables displays incorrectly, or "table … was not found" on load

This error is typically caused by an out-of-date map of the data model. To force the schema to refresh, add the property RefreshSchema=true or alternatively use CreateMap=session to store the map in memory.
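A small helper can append the refresh property to an existing connection string. The URL below is a made-up example for illustration, not a documented scheme:

```java
public class SchemaRefresh {
    // Append RefreshSchema=true unless the URL already sets the property.
    static String withSchemaRefresh(String jdbcUrl) {
        return jdbcUrl.contains("RefreshSchema=")
                ? jdbcUrl
                : jdbcUrl + ";RefreshSchema=true";
    }

    public static void main(String[] args) {
        // Hypothetical BigQuery connection string for illustration only.
        String url = "jdbc:experian:googlebigquery:Project=my-project;Dataset=sales";
        System.out.println(withSchemaRefresh(url));
    }
}
```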

Connection parameters

  1. Hostname: The name of the computer on which the MySQL database server is running.
  2. Port: The MySQL server connection port (default is 3306).
  3. Database - The database name.

The template connection string is: jdbc:experian:mysql://<hostname>:3306;DatabaseName=<database>;LoginTimeout=15

Additional connection properties are detailed in the DataDirect MySQL JDBC driver documentation.
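The three parameters slot into the template string as follows. A sketch with example values ("dbhost" and "sales" are placeholders, not real servers):

```java
public class MySqlUrl {
    // Assemble the template connection string from hostname, port and database.
    static String mysqlUrl(String hostname, int port, String database) {
        return "jdbc:experian:mysql://" + hostname + ":" + port
                + ";DatabaseName=" + database + ";LoginTimeout=15";
    }

    public static void main(String[] args) {
        System.out.println(mysqlUrl("dbhost", 3306, "sales"));
        // jdbc:experian:mysql://dbhost:3306;DatabaseName=sales;LoginTimeout=15
    }
}
```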

Authentication parameters

  1. Username
  2. Password

Troubleshooting

Connections to MySQL Community Server are not supported

The Progress DataDirect MySQL JDBC driver shipped with Data Studio does not support MySQL Community Server versions. To connect to Oracle MySQL Community Server, download the MySQL Connector/J driver and deploy the mysql-connector-java-<version>.jar file as a new custom driver.
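Once Connector/J is deployed as a custom driver, connection strings use MySQL's own jdbc:mysql: scheme rather than the DataDirect template. A sketch with placeholder host and database names:

```java
public class ConnectorJUrl {
    // Connector/J uses the jdbc:mysql: scheme with a /database path.
    static String connectorJUrl(String host, int port, String database) {
        return "jdbc:mysql://" + host + ":" + port + "/" + database;
    }

    public static void main(String[] args) {
        System.out.println(connectorJUrl("dbhost", 3306, "sales"));
        // jdbc:mysql://dbhost:3306/sales
    }
}
```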

Connection parameters

  1. Hostname - The name of the computer on which the PostgreSQL database server is running
  2. Port - The PostgreSQL server connection port (default is 5432)
  3. Database - The database name

The template connection string is: jdbc:experian:postgresql://<hostname>:5432;DatabaseName=<database>;LoginTimeout=15. Additional connection properties are detailed in the DataDirect PostgreSQL JDBC driver documentation.

Authentication parameters

  1. Username
  2. Password

Connection parameters

  1. Hostname - The name of the computer on which the Redshift database server is running
  2. Port - The Redshift server connection port (default is 5439)
  3. Database - The database name

The template connection string is: jdbc:experian:redshift://<hostname>:5439;DatabaseName=<database>;LoginTimeout=15. Additional connection properties are detailed in the DataDirect Redshift JDBC driver documentation.

Authentication parameters

  1. Username
  2. Password

Connection parameters

  1. Company - The company name that identifies your Jira instance
  2. Schema - The schema name

The template connection string is: jdbc:experian:jira:;User=<username>;Password=<password>
Additional connection properties are detailed in the DataDirect Jira JDBC driver documentation.

Authentication parameters

  1. Username
  2. Password

The Autonomous REST Connector is a JDBC driver that allows Data Studio to connect to any RESTful API data source and load data without writing code.

You can connect to the REST service either directly via an endpoint, or by using a path to a .rest file. If you're using the endpoint directly, default mappings are generated to map JSON structures to relational tables. Alternatively, a .rest file can be created to define endpoints, mappings, and paging.

Connection parameters

One of the following settings must be provided:

  1. REST configuration file: Specifies the name and location of the input REST file that contains a list of endpoints to sample, PUSH request definitions, and configuration information.
  2. REST endpoint: Specifies the endpoint to sample when not using an input REST file.
  3. AuthenticationMethod: Determines which authentication method the driver uses during the course of a session. Options are None, Basic, HttpHeader, OAuth2, UrlParameter and Custom.

The template connection string is: jdbc:experian:autorest:Config=<rest-file-path>;Sample=<endpoint>. For additional connection properties and more information on creating a REST file and the authentication options, see the DataDirect Autonomous REST JDBC driver documentation.
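The two connection styles (REST file versus direct endpoint) produce slightly different connection strings. A sketch, using a hypothetical endpoint:

```java
public class AutoRestUrl {
    // Build a connection string from either a REST config file path or a
    // direct endpoint to sample (normally exactly one is set).
    static String autoRestUrl(String configPath, String endpoint) {
        StringBuilder url = new StringBuilder("jdbc:experian:autorest:");
        if (configPath != null) {
            url.append("Config=").append(configPath).append(';');
        }
        if (endpoint != null) {
            url.append("Sample=").append(endpoint).append(';');
        }
        return url.toString();
    }

    public static void main(String[] args) {
        // Hypothetical REST endpoint for illustration.
        System.out.println(autoRestUrl(null, "https://api.example.com/v1/orders"));
        // jdbc:experian:autorest:Sample=https://api.example.com/v1/orders;
    }
}
```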

Authentication parameters

  1. Username
  2. Password

Connection parameters

  1. Hostname - Host name of the computer where the Cassandra Database is installed.
  2. Port - The Cassandra connection port number (default 9042).
  3. Keyspace - The keyspace name

The template connection string is: jdbc:experian:cassandra://<hostname>:9042;KeyspaceName=<keyspace>;LoginTimeout=15. Additional connection properties are detailed in the DataDirect Cassandra JDBC driver documentation.

Authentication parameters

  1. Username - Specify the username that is used to connect to the Cassandra database.
  2. Password - Specify the password to use to connect to your Cassandra database.

Connection parameters

  1. Hostname - Host name of the computer where the DB2 Database is installed.
  2. Port - The DB2 connection port number (default 50000).
  3. Database - The database name

The template connection string is: jdbc:experian:db2://<hostname>:50000;DatabaseName=<database>;LoginTimeout=15. Additional connection properties are detailed in the DataDirect DB2 JDBC driver documentation.

Authentication parameters

  1. Username - Specify the username that is used to connect to the DB2 database.
  2. Password - Specify the password to use to connect to your DB2 database.

Connection parameters

  1. Hostname - Host name of the computer where the Greenplum Database is installed.
  2. Port - The Greenplum connection port number (default 5432).
  3. Database - The database name

The template connection string is: jdbc:experian:greenplum://<hostname>:5432;DatabaseName=<database>;LoginTimeout=15. Additional connection properties are detailed in the DataDirect Greenplum JDBC driver documentation.

Authentication parameters

  1. Username - Specify the username that is used to connect to the Greenplum database.
  2. Password - Specify the password to use to connect to your Greenplum database.

Connection parameters

  1. Hostname - Host name of the computer where the Informix Database is installed.
  2. Port - The Informix connection port number (default 1526).
  3. Server - The name of the Informix server
  4. Database - The database name

The template connection string is: jdbc:experian:informix://<hostname>:1526;InformixServer=<server>;DatabaseName=<database>;LoginTimeout=15. Additional connection properties are detailed in the DataDirect Informix JDBC driver documentation.

Authentication parameters

  1. Username - Specify the username that is used to connect to the Informix database.
  2. Password - Specify the password to use to connect to your Informix database.

Connection parameters

  1. Hostname - Host name of the computer where the MongoDB Database is installed.
  2. Port - The MongoDB connection port number (default 27017).
  3. Schema Definition - The path of the schema definition file that maps the MongoDB data to relational tables
  4. Database - The database name

The template connection string is: jdbc:experian:mongodb://<hostname>:27017;SchemaDefinition=<schema-definition-file>;DatabaseName=<database>;LoginTimeout=15. Additional connection properties are detailed in the DataDirect MongoDB JDBC driver documentation.

Authentication parameters

  1. Username - Specify the username that is used to connect to the MongoDB database.
  2. Password - Specify the password to use to connect to your MongoDB database.

Connection parameters

  1. Hostname - Host name of the computer where the Spark SQL Database is installed.
  2. Port - The Spark SQL connection port number (default 9999).
  3. Database - The database name

The template connection string is: jdbc:experian:sparksql://<hostname>:9999;DatabaseName=<database>;LoginTimeout=15. Additional connection properties are detailed in the DataDirect Spark SQL JDBC driver documentation.

Authentication parameters

  1. Username - Specify the username that is used to connect to the Spark SQL database.
  2. Password - Specify the password to use to connect to your Spark SQL database.

Connection parameters

  1. Hostname - Host name of the computer where the Sybase Database is installed.
  2. Port - The Sybase connection port number (default 5000).
  3. Database - The database name

The template connection string is: jdbc:experian:sybase://<hostname>:5000;DatabaseName=<database>;LoginTimeout=15. Additional connection properties are detailed in the DataDirect Sybase JDBC driver documentation.

Authentication parameters

  1. Username - Specify the username that is used to connect to the Sybase database.
  2. Password - Specify the password to use to connect to your Sybase database.

Connection parameters

  1. Hostname - Host name of the computer where the OpenEdge Database is installed.
  2. Port - The OpenEdge connection port number (default 2003).
  3. Database - The database name

The template connection string is: jdbc:experian:openedge://<hostname>:2003;DatabaseName=<database>;LoginTimeout=15. Additional connection properties are detailed in the DataDirect OpenEdge JDBC driver documentation.

Authentication parameters

  1. Username - Specify the username that is used to connect to the OpenEdge database.
  2. Password - Specify the password to use to connect to your OpenEdge database.

Connection parameters

  1. Company - The company identifier issued by Oracle Eloqua during the registration process.
  2. Username
  3. Password

The template connection string is: jdbc:experian:eloqua:Company=<company>;User=<username>;Password=<password>. Additional connection properties are detailed in the DataDirect Eloqua JDBC driver documentation.

Authentication parameters

  1. Username - Specify the username that is used to connect to the Eloqua database.
  2. Password - Specify the password to use to connect to your Eloqua database.

Connection parameters

  1. Login Host - The base URL of the Oracle Sales Cloud site

The template connection string is: jdbc:experian:oraclesalescloud://<login-host>. Additional connection properties are detailed in the DataDirect Sales Cloud JDBC driver documentation.

Authentication parameters

  1. Username - Specify the username that is used to connect to the Sales Cloud database.
  2. Password - Specify the password to use to connect to your Sales Cloud database.

Connection parameters

  1. Login Host: The base URL of the Oracle Service Cloud site.
  2. Interface name: deprecated.

The template connection string is: jdbc:experian:oracleservicecloud:LoginHost=<login-host>;InterfaceName=<interface-name>;user=<username>;password=<password>;LoginTimeout=15. Additional connection properties are detailed in the DataDirect Service Cloud JDBC driver documentation.

Authentication parameters

  1. Username - Specify the username that is used to connect to the Service Cloud database.
  2. Password - Specify the password to use to connect to your Service Cloud database.

Connect to Azure Data Lake Gen2

Data Studio provides support for connecting to systems built on Microsoft Azure Blob storage, including Azure Data Lake Storage Gen2.

Data Studio uses access keys to authenticate when making requests to the Azure storage account. To configure a connection to the Data Lake:

  1. Sign in to the Azure portal and locate your storage account. In the Settings section of the storage account overview, select Access keys to view Access keys and the connection information.
  2. Create a new External system with the type Microsoft Azure Blob Storage.
  3. URL combines the storage account name and endpoint suffix, and is usually of the form <storage-account-name>.core.windows.net.
  4. Container should match the name of the container within the storage account.
  5. Select whether to Include Files In Subdirectories of that container.

To create a new credential:

  1. The Account Name should match the storage account name in the Azure portal.
  2. Key 1 and Key 2 will be the two Access keys specified in the Azure portal.
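Putting the pieces together, the External system URL is simply the storage account name joined to the standard endpoint suffix. A sketch ("mydatalake" is an example account name, not a real account):

```java
public class AzureBlobUrl {
    // Combine the storage account name with the usual Azure endpoint suffix,
    // as described in step 3 above.
    static String blobUrl(String storageAccountName) {
        return storageAccountName + ".core.windows.net";
    }

    public static void main(String[] args) {
        System.out.println(blobUrl("mydatalake")); // mydatalake.core.windows.net
    }
}
```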

Configure External System dropzone

If the Enable external system dropzone checkbox is checked when a Dataset is being created or edited, an external system folder can be specified to monitor for new file versions of the Dataset.

When a file is uploaded to this external system folder, it is automatically loaded into the corresponding Dataset as a new batch of data.

Each Dataset external system dropzone folder is continuously monitored for the arrival of new files. When a new file arrives, an attempt is made to load it.

Data Studio uses a pull model and polls the external system dropzone folder for new files. The polling interval is one minute. However, if the current load is still in progress (due to large file sizes), Data Studio will not poll for new files until the load process is complete.
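The pull model described above can be sketched as a single polling tick that skips entirely while a load is still running. This illustrates the behaviour only, not Data Studio's internal implementation; in practice the tick would be driven by a one-minute timer:

```java
import java.util.ArrayList;
import java.util.List;

public class DropzonePoller {
    final List<String> loaded = new ArrayList<>();
    volatile boolean loadInProgress = false;

    // One polling tick: skip entirely while a previous load is still running,
    // otherwise load every new file that has appeared since the last tick.
    void pollOnce(List<String> newFiles) {
        if (loadInProgress) {
            return; // try again on the next tick
        }
        for (String file : newFiles) {
            loaded.add(file); // stand-in for the real batch load
        }
    }

    public static void main(String[] args) {
        DropzonePoller poller = new DropzonePoller();
        poller.pollOnce(List.of("customers_2024.csv"));
        poller.loadInProgress = true;                   // a large load is running
        poller.pollOnce(List.of("customers_2025.csv")); // skipped this tick
        System.out.println(poller.loaded);              // [customers_2024.csv]
    }
}
```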

External dropzones are supported for:

  • SFTP
  • Amazon AWS S3
  • Microsoft Azure Blob Storage (for Azure Data Lake Storage Gen2)

The following conditions must be met for a file to be loaded:

  • The Dataset must have originally been created from an External System.
  • The file in the dropzone must have the same format as the original Dataset as determined through the schema. The new file may have a different extension.
  • The new file's column names should match the column names in the existing Dataset, unless missing columns in the file have been marked as Optional in the Dataset. You can set a Dataset's columns as Optional in the Annotate columns screen when you create or edit it. Any additional columns in the new file will be ignored.
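The column-matching rule can be expressed as a small check: every Dataset column must appear in the new file unless it is marked Optional, and extra file columns are ignored. A sketch (the column names are examples):

```java
import java.util.List;
import java.util.Set;

public class ColumnCheck {
    // A new file is loadable when it supplies every required (non-optional)
    // Dataset column; additional columns in the file are simply ignored.
    static boolean canLoad(Set<String> fileColumns,
                           List<String> datasetColumns,
                           Set<String> optionalColumns) {
        for (String col : datasetColumns) {
            if (!fileColumns.contains(col) && !optionalColumns.contains(col)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> dataset = List.of("id", "name", "email");
        // "email" has been marked Optional on the Annotate columns screen.
        System.out.println(canLoad(Set.of("id", "name", "extra"), dataset, Set.of("email"))); // true
        System.out.println(canLoad(Set.of("id"), dataset, Set.of("email")));                  // false
    }
}
```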

Folder structure

The default location for the dropzone folder on the external system is its root folder.

You can specify the folder to be used through the Folder to use as dropzone text field once the Enable external system dropzone checkbox is checked. You can also restrict which files are picked up by specifying a filename prefix in the Starts with file pattern text field; this performs a string match on the filename to filter the list of files considered.
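The Starts with file pattern is a plain prefix match on the filename, which can be sketched as (filenames below are examples):

```java
public class DropzoneFilter {
    // "Starts with" file pattern: a plain string match on the filename prefix.
    static boolean matches(String fileName, String startsWithPattern) {
        return fileName.startsWith(startsWithPattern);
    }

    public static void main(String[] args) {
        System.out.println(matches("customers_jan.csv", "customers_")); // true
        System.out.println(matches("orders_jan.csv", "customers_"));    // false
    }
}
```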

Load data using a SQL query

When loading data into a Dataset from a JDBC External system, users with the correct Credential permissions can specify a SQL query to determine the data to fetch. The query will be executed on the source system, and can be used to specify the columns or filter rows to fetch, or to join, sort, limit or otherwise manipulate the data to be loaded into Data Studio.

Loading data via a SQL query can be beneficial for many reasons:

  • Improve refresh time and reduce network traffic by only loading the values (rows and columns) you need, rather than an entire table or view.
  • Improve the efficiency of Data Studio Workflows by executing operations such as joining, grouping and sorting on the source system where they are optimized.
  • Avoid the need to have views created in the data source system.
  • Incrementally refresh deltas into a multi-batch Dataset instead of re-loading all the data.

When a Dataset is loaded from a SQL query, the query remains associated with the Dataset, and can be viewed or edited in the Edit details menu option. The query is used whenever the Dataset is refreshed.

Examples of use cases for SQL queries when importing a Dataset from a JDBC external system:

  • Select a subset of columns from a table
  • Restrict rows using TOP, LIMIT, WHERE, DISTINCT, GROUP BY, MAX or MIN
  • Join multiple tables and import the resulting Dataset

To execute a SQL query when importing a Dataset from a JDBC external system:

  1. Go to Datasets.
  2. Click Add new Dataset.
  3. Select External system as the source type.
  4. Select a system from the available ones. If no systems are listed, follow the steps to create an external system.
  5. Select a credential to be associated with the Dataset.
  6. Select Database query.
  7. Enter your SQL query into the text box.
  8. Tick the Approve query checkbox to verify that the SQL query is safe for execution.
  9. Click Next to see a preview of the data to be loaded.
  10. Continue with importing the Dataset.

Example queries:
  • SELECT * FROM customers WHERE age > 50
  • SELECT * FROM customers LIMIT 100
  • SELECT name FROM customers ORDER BY age
  • SELECT * FROM customers INNER JOIN employees ON customers.person_id = employees.person_id