DatabricksConnector

DatabricksConnector is a connector that connects to a Databricks Unity Catalog instance, extracts metadata and runtime lineage information from the workspace, and transforms and uploads the metadata to a destination application (dataspot) using the upload API.

Licensing

Use of DatabricksConnector requires a separately acquired, commercial license for this specific connector. An active license must be obtained before configuring or executing DatabricksConnector. Use without a valid license is not permitted and may constitute a violation of the licensing terms.

For pricing and trial options, please contact your dataspot representative.

Tooltip

Find DatabricksConnector configuration examples here.

Functionality

DatabricksConnector follows the general connector architecture and workflow.

The metadata is extracted from the workspace and transformed to assets in the destination application (dataspot).

| Source | Asset |
| --- | --- |
| Catalog | Collection |
| Schema | Collection |
| Table | UmlClass |
| Column | UmlAttribute |
| Foreign key | UmlAssociation |
| Datatype | UmlDatatype |

DatabricksConnector extracts and transforms runtime data lineage to Transformation and Rule assets.

Did you know?

Runtime data lineage is the automatic capture of data flow metadata during query execution on Databricks - tracking table-to-table and column-level dependencies, along with associated notebooks, jobs, and dashboards, in near real time. Databricks Unity Catalog aggregates this lineage across all attached workspaces into a unified metastore-wide graph.

The transformed metadata is uploaded to the destination application by calling the upload API. The reconciliation options of the upload API specify how uploaded metadata is reconciled with existing metadata. The workflow options of the upload API specify the workflow statuses of inserted, updated, or deleted metadata.

Filters

Across all metadata levels - catalogs, schemas, tables, columns, foreign keys, and datatypes - the filtering mechanism follows the same core principles. Filters specify matching criteria that determine whether the filter applies to a specific metadata object, as well as options for transforming the metadata to assets in the destination application.

Top-level catalog filters define which catalogs to extract and contain the nested filters used to extract schemas, tables, columns, foreign keys, and datatypes.

Note

If the top-level catalog filter list is null or empty, no metadata objects at that level (nor their subordinate objects) are extracted.

Filters are nested according to the metadata hierarchy:

  • Catalog filters contain schema filters.
  • Schema filters contain table and datatype filters.
  • Table filters contain column and foreign key filters.

Nested filters only apply to objects within the scope of their parent filter. When a parent filter matches and is applied, the nested filters are used to extract and transform the subordinate metadata objects.

Note

If a nested filter list is null, all metadata objects at that level (and their subordinate objects) are extracted and transformed. If a nested filter list is empty, no metadata objects at that level are extracted.
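The contrast between a null and an empty nested filter list can be sketched as follows (an illustrative configuration; the catalog names samples and dataspot are reused from the examples in this document):

```yaml
services:
  MyService:
    type: DatabricksConnector
    ingestion:
      catalogs:
        # 'schemas' omitted (null): all schemas of catalog 'samples',
        # including their tables and datatypes, are extracted
        - names:
            accept: samples
        # 'schemas' is an empty list: catalog 'dataspot' is extracted,
        # but none of its schemas are
        - names:
            accept: dataspot
          schemas: []
```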

Filters are evaluated in their declaration order - from top to bottom. For each metadata object, the first filter that matches is applied - the remaining filters at that level are ignored.

Tooltip

Due to the single-pass resolution, where only the first matching filter is applied, filter lists should be structured from most specific to most general. This approach ensures predictable extraction rules, allowing you to include or exclude precise subsets of the metadata and to customize how each slice of metadata is transformed to assets in the destination application.
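For instance, a table filter list ordered from most specific to most general might look like this (a sketch; the stg_ prefix is a hypothetical naming convention):

```yaml
services:
  MyService:
    type: DatabricksConnector
    ingestion:
      catalogs:
        schemas:
          tables:
            # most specific first: staging tables are extracted without a stereotype
            - names:
                accept: stg_.*
            # most general last: all remaining tables get stereotype 'table'
            - stereotype: table
```

Reversing the two filters would make the general filter match every table first, so the stg_ filter would never be applied.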

Configuration

A DatabricksConnector service is configured by defining its unique name, the service type DatabricksConnector, and the configuration.

Example: DatabricksConnector

services:
  MyService:
    type: DatabricksConnector

Tooltip

While YAML itself doesn't enforce any naming style for property names, multi-word properties (for example, client certificate) are typically specified in lowercase separated by hyphens (for example, client-certificate). This naming style - commonly referred to as kebab-case - is used in the following descriptions and examples. However, all multi-word properties can also be specified in camelCase (for example, clientCertificate).
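For instance, the following two spellings of the warehouse ID property (described below) are equivalent:

```yaml
# kebab-case
source:
  warehouse-id: b079313fa6222089
---
# equivalent camelCase
source:
  warehouseId: b079313fa6222089
```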

In addition to the general connector configuration that specifies the destination application, DatabricksConnector has the following configuration to specify the source as well as the ingestion filters.

Tooltip

Properties marked with * are required for DatabricksConnector to run.

Source

DatabricksConnector connects to a Databricks Unity Catalog instance using the specified workspace URL and authentication settings.

🔑 Property source.url *

The workspace URL of the Databricks Unity Catalog instance.

required

DatabricksConnector connects to the Databricks Unity Catalog instance specified by the workspace URL.

Example: Property source.url

services:
  MyService:
    type: DatabricksConnector
    source:
      url: https://dbc-00182f59-66eb.cloud.databricks.com

🔑 Property source.warehouse-id *

The warehouse ID.

required

DatabricksConnector extracts runtime data lineage using the specified warehouse ID.

Example: Property source.warehouse-id

services:
  MyService:
    type: DatabricksConnector
    source:
      url: https://dbc-00182f59-66eb.cloud.databricks.com
      warehouse-id: b079313fa6222089

Authentication

DatabricksConnector can be configured with authentication settings for the Databricks Unity Catalog instance.

🔑 Property source.authentication

The authentication settings of the Databricks Unity Catalog instance.

optional

The default is null (no authentication).

If authentication is defined, DatabricksConnector connects to the Databricks Unity Catalog instance with the specified authentication. Otherwise, DatabricksConnector connects without authentication.

🔑 Property source.authentication.method

The authentication method.

required

The property is required if source.authentication is specified.

DatabricksConnector supports the following authentication methods:

| Authentication method | method |
| --- | --- |
| Token | token |
| OAuth 2.0 | oauth |

Example: Property source.authentication.method

services:
  MyService:
    type: DatabricksConnector
    source:
      authentication:
        method: token

Token

DatabricksConnector can use a token for connecting to the Databricks Unity Catalog instance.

🔑 Property source.authentication.token

The personal access token (PAT).

required

The property can only be specified and is required if source.authentication.method is token.

DatabricksConnector uses the specified token for authentication.

Example: Property source.authentication.token

services:
  MyService:
    type: DatabricksConnector
    source:
      authentication:
        method: token
        token: ${databricks.pat}

OAuth 2.0

DatabricksConnector can use OAuth 2.0 authentication for connecting to the Databricks Unity Catalog instance. The application supports non-interactive (machine-to-machine) grants to obtain an access token as a client application.

🔑 Property source.authentication.client-id

The OAuth 2.0 client ID.

required

The property can only be specified and is required if source.authentication.method is oauth.

DatabricksConnector uses the client ID to authenticate.

Note

A provider URL must not be specified, as it is automatically inferred from the workspace URL.

Example: Property source.authentication.client-id

services:
  MyService:
    type: DatabricksConnector
    source:
      authentication:
        method: oauth
        client-id: 6731de76-14a6-49ae-97bc-6eba6914391e

🔑 Property source.authentication.credentials.type

The credentials type.

required

The property can only be specified and is required if source.authentication.method is oauth.

DatabricksConnector supports the following credentials types to obtain an access or ID token:

| Credentials type | type |
| --- | --- |
| Client credentials with client secret | client-secret |

Example: Property source.authentication.credentials.type

services:
  MyService:
    type: DatabricksConnector
    source:
      authentication:
        method: oauth
        client-id: 6731de76-14a6-49ae-97bc-6eba6914391e
        credentials:
          type: client-secret

Client credentials with client secret

The OAuth 2.0 client credentials grant with a client secret is a non-interactive (machine-to-machine) authentication. DatabricksConnector authenticates as a client application, rather than as an end user, to obtain an access token.

🔑 Property source.authentication.credentials.client-secret

The client secret.

required

The property can only be specified and is required if source.authentication.credentials.type is client-secret.

DatabricksConnector uses the specified client secret to authenticate.

Example: Property source.authentication.credentials.client-secret

services:
  MyService:
    type: DatabricksConnector
    source:
      authentication:
        method: oauth
        client-id: 6731de76-14a6-49ae-97bc-6eba6914391e
        credentials:
          type: client-secret
          client-secret: ${databricks.client-secret}

Catalogs

DatabricksConnector extracts catalogs from the workspace and transforms them to Collection assets.

Top-level catalog filters define which catalogs to extract and specify their transformation options.

🔑 Property ingestion.catalogs

The ordered list of top-level catalog filters that specify which catalogs to extract and their transformation options.

optional

The default is null (don't extract any catalogs).

For each catalog in the workspace, DatabricksConnector evaluates the top-level catalog filters in their declaration order - from top to bottom. The first catalog filter that matches is applied - the remaining catalog filters are ignored.

A catalog matches a catalog filter if the catalog name matches the catalog filter's names. In this case, the catalog is extracted and transformed to a Collection asset using the transformation options of the catalog filter. The schemas of the catalog are extracted using the schema filters nested in the applied catalog filter.

Attention

The property ingestion.catalogs is a top-level filter. If this list of catalog filters is null or empty, no catalogs are extracted.

Example: Property ingestion.catalogs

services:
  MyService:
    type: DatabricksConnector
    ingestion:
      # extract catalogs 'samples' and 'dataspot'
      catalogs:
        names:
          accept:
            - samples
            - dataspot

Tooltip

If the list of catalog filters is a single-value list, it can be formatted as a single value, rather than as a list with a single value.
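For instance, a single catalog filter can be written either way (a sketch reusing the samples catalog from the examples):

```yaml
# a list with a single catalog filter
ingestion:
  catalogs:
    - names:
        accept: samples
---
# the same filter formatted as a single value
ingestion:
  catalogs:
    names:
      accept: samples
```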

🔑 Property ingestion.catalogs[].names

The pattern filter applied to filter catalogs based on their catalog name.

optional

The default is null (match all catalog names).

If a pattern filter is defined, DatabricksConnector uses it to match the name of each catalog to determine whether to extract the catalog. Otherwise, the catalogs are not filtered by their catalog name - all catalog names are accepted.

Example: Property ingestion.catalogs[].names

services:
  MyService:
    type: DatabricksConnector
    ingestion:
      # extract catalogs 'samples' and 'dataspot'
      catalogs:
        names:
          accept:
            - samples
            - dataspot

🔑 Property ingestion.catalogs[].stereotype

The stereotype of the transformed Collection asset.

optional

The default is null (no stereotype).

If a stereotype is defined, DatabricksConnector sets the stereotype of the transformed Collection asset.

Example: Property ingestion.catalogs[].stereotype

services:
  MyService:
    type: DatabricksConnector
    ingestion:
      # extract catalogs and transform them with stereotype 'catalog'
      catalogs:
        stereotype: catalog

Tooltip

If the specified stereotype doesn't exist in the scheme of the destination application, the stereotype is ignored.

Schemas

DatabricksConnector extracts schemas from the workspace and transforms them to Collection assets.

Schema filters are nested in catalog filters and define which schemas to extract and specify their transformation options.

🔑 Property ingestion.catalogs[].schemas

The ordered list of schema filters that specify which schemas to extract and their transformation options.

optional

The default is null (extract all schemas as well as subordinate tables and datatypes).

For each schema in the extracted catalog, DatabricksConnector evaluates the schema filters in their declaration order - from top to bottom. The first schema filter that matches is applied - the remaining schema filters are ignored.

A schema matches a schema filter if the schema name matches the schema filter's names. In this case, the schema is extracted and transformed to a Collection asset using the transformation options of the schema filter. The tables and datatypes of the schema are extracted using the table and datatype filters nested in the applied schema filter.

Note

If the list of schema filters is null, all schemas (and subordinate tables and datatypes) are extracted. If the list of schema filters is empty, no schemas are extracted.

Example: Property ingestion.catalogs[].schemas

services:
  MyService:
    type: DatabricksConnector
    ingestion:
      catalogs:
        names:
          accept: samples
        schemas:
          # extract schemas 'sales' and 'finance' with managed tables and views
          - names:
              accept:
                - sales
                - finance
            tables:
              types:
                accept:
                  - MANAGED
                  - VIEW
          # extract schemas starting with 'kpi_' with views
          - names:
              accept: kpi_.*
            tables:
              types:
                accept: VIEW

Tooltip

If the list of schema filters is a single-value list, it can be formatted as a single value, rather than as a list with a single value.

🔑 Property ingestion.catalogs[].schemas[].names

The pattern filter applied to filter schemas based on their schema name.

optional

The default is null (match all schema names).

If a pattern filter is defined, DatabricksConnector uses it to match the name of each schema to determine whether to extract the schema. Otherwise, the schemas are not filtered by their schema name - all schema names are accepted.

Example: Property ingestion.catalogs[].schemas[].names

services:
  MyService:
    type: DatabricksConnector
    ingestion:
      catalogs:
        names:
          accept: samples
        # extract schemas 'sales' and 'finance'
        schemas:
          names:
            accept:
              - sales
              - finance

🔑 Property ingestion.catalogs[].schemas[].stereotype

The stereotype of the transformed Collection asset.

optional

The default is null (no stereotype).

If a stereotype is defined, DatabricksConnector sets the stereotype of the transformed Collection asset.

Example: Property ingestion.catalogs[].schemas[].stereotype

services:
  MyService:
    type: DatabricksConnector
    ingestion:
      catalogs:
        names:
          accept: samples
        schemas:
          # extract schemas starting with 'db_' and transform them without stereotype
          - names:
              accept: db_.*
          # extract other schemas and transform them with stereotype 'schema'
          - stereotype: schema

Tooltip

If the specified stereotype doesn't exist in the scheme of the destination application, the stereotype is ignored.

🔑 Property ingestion.catalogs[].schemas[].deployment

The deployment of the transformed assets.

optional

The default is null (no deployment).

For each UmlClass and UmlDatatype asset transformed from the extracted tables and datatypes in the schema, DatabricksConnector creates a Deployment link and sets the specified deployment system, favorite flag, and qualifier.

Example: Property ingestion.catalogs[].schemas[].deployment

services:
  MyService:
    type: DatabricksConnector
    ingestion:
      catalogs:
        names:
          accept: samples
        schemas:
          # extract schema 'finance' and transform the assets with deployments to '/Systems/Finance/FDS'
          - names:
              accept: finance
            deployment:
              deployed-in: /Systems/Finance/FDS
              favorite: true
              qualifier: SPOT
          # extract schema 'sales' and transform the assets with deployments to '/Systems/Sales/SDS'
          - names:
              accept: sales
            deployment:
              deployed-in: /Systems/Sales/SDS

🔑 Property ingestion.catalogs[].schemas[].deployment.deployed-in

The system in which the asset is deployed for execution or storage purposes.

required

The property is required if ingestion.catalogs[].schemas[].deployment is specified.

🔑 Property ingestion.catalogs[].schemas[].deployment.favorite

The flag that specifies if the deployment should be marked as favorite.

optional

The default is false (not favorite).

🔑 Property ingestion.catalogs[].schemas[].deployment.qualifier

The additional qualifier characterizing the deployment.

optional

The default is null (no qualifier).

Tables

DatabricksConnector extracts tables from the workspace and transforms them to UmlClass assets.

Table filters are nested in schema filters and define which tables to extract and specify their transformation options.

🔑 Property ingestion.catalogs[].schemas[].tables

The ordered list of table filters that specify which tables to extract and their transformation options.

optional

The default is null (extract all tables as well as subordinate columns and foreign keys).

For each table in the extracted schema, DatabricksConnector evaluates the table filters in their declaration order - from top to bottom. The first table filter that matches is applied - the remaining table filters are ignored.

A table matches a table filter if the table name matches the table filter's names and the table type matches the table filter's types. In this case, the table is extracted and transformed to a UmlClass asset using the transformation options of the table filter. The columns and foreign keys of the table are extracted using the column and foreign key filters nested in the applied table filter.

Note

If the list of table filters is null, all tables (and subordinate columns and foreign keys) are extracted. If the list of table filters is empty, no tables are extracted.

Example: Property ingestion.catalogs[].schemas[].tables

services:
  MyService:
    type: DatabricksConnector
    ingestion:
      catalogs:
        schemas:
          tables:
            # extract tables with type 'MANAGED' or 'VIEW' and transform them with stereotype 'table'
            - types:
                accept:
                  - MANAGED
                  - VIEW
              stereotype: table
            # extract other tables and transform them without a stereotype
            - names:
                accept: .*

Tooltip

If the list of table filters is a single-value list, it can be formatted as a single value, rather than as a list with a single value.

🔑 Property ingestion.catalogs[].schemas[].tables[].names

The pattern filter applied to filter tables based on their table name.

optional

The default is null (match all table names).

If a pattern filter is defined, DatabricksConnector uses it to match the name of each table to determine whether to extract the table. Otherwise, the tables are not filtered by their table name - all table names are accepted.

Example: Property ingestion.catalogs[].schemas[].tables[].names

services:
  MyService:
    type: DatabricksConnector
    ingestion:
      catalogs:
        schemas:
          # extract tables starting with 'sales_' or 'finance_'
          tables:
            names:
              accept:
                - sales_.*
                - finance_.*

🔑 Property ingestion.catalogs[].schemas[].tables[].types

The pattern filter applied to filter tables based on their table type.

optional

The default is null (match all table types).

If a pattern filter is defined, DatabricksConnector uses it to match the type of each table to determine whether to extract the table. Otherwise, the tables are not filtered by their table type - all table types are accepted.

Example: Property ingestion.catalogs[].schemas[].tables[].types

services:
  MyService:
    type: DatabricksConnector
    ingestion:
      catalogs:
        schemas:
          # extract tables with type 'MANAGED' or 'VIEW'
          tables:
            types:
              accept:
                - MANAGED
                - VIEW

Note

Databricks Unity Catalog supports the table types EXTERNAL, FOREIGN, MANAGED, MATERIALIZED_VIEW, STREAMING_TABLE, and VIEW. Refer to the documentation for details.

🔑 Property ingestion.catalogs[].schemas[].tables[].view-definitions

The pattern filter applied to extract view definitions based on their view name.

optional

The default is null (extract all view definitions).

For each extracted table, DatabricksConnector determines whether the table is a view and, if so, matches the view name against the specified pattern filter to determine whether to process the view definition. If the view name matches, the view definition is extracted and transformed to Derivation links from the underlying table's UmlAttribute assets to the view's UmlAttribute assets.

Example: Property ingestion.catalogs[].schemas[].tables[].view-definitions

services:
  MyService:
    type: DatabricksConnector
    ingestion:
      catalogs:
        schemas:
          tables:
            # don't extract view definitions
            view-definitions:
              reject: .*

🔑 Property ingestion.catalogs[].schemas[].tables[].stereotype

The stereotype of the transformed UmlClass asset.

optional

The default is null (no stereotype).

If a stereotype is defined, DatabricksConnector sets the stereotype of the transformed UmlClass asset.

Example: Property ingestion.catalogs[].schemas[].tables[].stereotype

services:
  MyService:
    type: DatabricksConnector
    ingestion:
      catalogs:
        schemas:
          tables:
            # extract tables with type 'MANAGED' and transform them with stereotype 'table'
            - types:
                accept: MANAGED
              stereotype: table
            # extract tables with type 'VIEW' and transform them with stereotype 'view'
            - types:
                accept: VIEW
              stereotype: view

Tooltip

If the specified stereotype doesn't exist in the scheme of the destination application, the stereotype is ignored.

Columns

DatabricksConnector extracts columns from the workspace and transforms them to UmlAttribute assets.

Column filters are nested in table filters and define which columns to extract and specify their transformation options.

🔑 Property ingestion.catalogs[].schemas[].tables[].columns

The ordered list of column filters that specify which columns to extract and their transformation options.

optional

The default is null (extract all columns).

For each column in the extracted table, DatabricksConnector evaluates the column filters in their declaration order - from top to bottom. The first column filter that matches is applied - the remaining column filters are ignored.

A column matches a column filter if the column name matches the column filter's names. In this case, the column is extracted and transformed to a UmlAttribute asset using the transformation options of the column filter. The datatype of the column is extracted using the datatype filters nested in the applied schema filter.

Note

If the list of column filters is null, all columns are extracted. If the list of column filters is empty, no columns are extracted.

If the restrictions of the extracted column are not stored in the UmlDatatype asset corresponding to the column's datatype, the column's restrictions - such as the base type (e.g. STRING, INTEGER, DATE), the minimal and maximal value or length, or the number of integer and decimal digits - are stored in the transformed UmlAttribute asset.

Example: Property ingestion.catalogs[].schemas[].tables[].columns

services:
  MyService:
    type: DatabricksConnector
    ingestion:
      catalogs:
        schemas:
          tables:
            # extract columns and transform them with stereotype 'column'
            columns:
              stereotype: column

Tooltip

If the list of column filters is a single-value list, it can be formatted as a single value, rather than as a list with a single value.

🔑 Property ingestion.catalogs[].schemas[].tables[].columns[].names

The pattern filter applied to filter columns based on their column name.

optional

The default is null (match all column names).

If a pattern filter is defined, DatabricksConnector uses it to match the name of each column to determine whether to extract the column. Otherwise, the columns are not filtered by their column name - all column names are accepted.

Example: Property ingestion.catalogs[].schemas[].tables[].columns[].names

services:
  MyService:
    type: DatabricksConnector
    ingestion:
      catalogs:
        schemas:
          tables:
            # don't extract columns starting with 'constraint_' or 'privilege_'
            columns:
              names:
                reject:
                  - constraint_.*
                  - privilege_.*

🔑 Property ingestion.catalogs[].schemas[].tables[].columns[].stereotype

The stereotype of the transformed UmlAttribute asset.

optional

The default is null (no stereotype).

If a stereotype is defined, DatabricksConnector sets the stereotype of the transformed UmlAttribute asset.

Example: Property ingestion.catalogs[].schemas[].tables[].columns[].stereotype

services:
  MyService:
    type: DatabricksConnector
    ingestion:
      catalogs:
        schemas:
          tables:
            # extract columns starting with 'constraint_' or 'privilege_' and transform them without stereotype
            columns:
              - names:
                  accept:
                    - constraint_.*
                    - privilege_.*
              # extract other columns and transform them with stereotype 'column'
              - stereotype: column

Tooltip

If the specified stereotype doesn't exist in the scheme of the destination application, the stereotype is ignored.

Foreign keys

DatabricksConnector extracts foreign keys from the workspace and transforms them to UmlAssociation assets.

Foreign key filters are nested in table filters and define which foreign keys to extract and specify their transformation options.

🔑 Property ingestion.catalogs[].schemas[].tables[].foreign-keys

The ordered list of foreign key filters that specify which foreign keys to extract and their transformation options.

optional

The default is null (extract all foreign keys).

For each foreign key in the extracted table, DatabricksConnector evaluates the foreign key filters in their declaration order - from top to bottom. The first foreign key filter that matches is applied - the remaining foreign key filters are ignored.

A foreign key matches a foreign key filter if the foreign key name matches the foreign key filter's names. In this case, the foreign key is extracted and transformed to a UmlAssociation asset using the transformation options of the foreign key filter.

Note

If the list of foreign key filters is null, all foreign keys are extracted. If the list of foreign key filters is empty, no foreign keys are extracted.

Example: Property ingestion.catalogs[].schemas[].tables[].foreign-keys

services:
  MyService:
    type: DatabricksConnector
    ingestion:
      catalogs:
        schemas:
          tables:
            # don't extract foreign keys
            foreign-keys:
              names:
                reject: .*

Tooltip

If the list of foreign key filters is a single-value list, it can be formatted as a single value, rather than as a list with a single value.

🔑 Property ingestion.catalogs[].schemas[].tables[].foreign-keys[].names

The pattern filter applied to filter foreign keys based on their foreign key name.

optional

The default is null (match all foreign key names).

If a pattern filter is defined, DatabricksConnector uses it to match the name of each foreign key to determine whether to extract the foreign key. Otherwise, the foreign keys are not filtered by their foreign key name - all foreign key names are accepted.

Example: Property ingestion.catalogs[].schemas[].tables[].foreign-keys[].names

services:
  MyService:
    type: DatabricksConnector
    ingestion:
      catalogs:
        schemas:
          tables:
            # extract foreign keys starting with 'fk_'
            foreign-keys:
              names:
                accept: fk_.*

🔑 Property ingestion.catalogs[].schemas[].tables[].foreign-keys[].stereotype

The stereotype of the transformed UmlAssociation asset.

optional

The default is null (no stereotype).

If a stereotype is defined, DatabricksConnector sets the stereotype of the transformed UmlAssociation asset.

Example: Property ingestion.catalogs[].schemas[].tables[].foreign-keys[].stereotype

services:
  MyService:
    type: DatabricksConnector
    ingestion:
      catalogs:
        schemas:
          tables:
            foreign-keys:
              # extract foreign keys starting with 'fk_' and transform them with stereotype 'foreign_key'
              - names:
                  accept: fk_.*
                stereotype: foreign_key
              # extract other foreign keys and transform them with stereotype 'key'
              - stereotype: key

Tooltip

If the specified stereotype doesn't exist in the scheme of the destination application, the stereotype is ignored.

Datatypes

DatabricksConnector extracts datatypes from the workspace and transforms them to UmlDatatype assets.

Datatype filters are nested in schema filters and define which datatypes to extract and specify their transformation options.

🔑 Property ingestion.catalogs[].schemas[].datatypes

The ordered list of datatype filters that specify which datatypes to extract and their transformation options.

optional

The default is null (extract all datatypes).

For each extracted column in the schema, DatabricksConnector determines whether the column's type is a datatype and, if so, evaluates the datatype filters in their declaration order - from top to bottom. The first datatype filter that matches is applied - the remaining datatype filters are ignored.

The datatype matches a datatype filter if the datatype name matches the datatype filter's names. In this case, the datatype is extracted and transformed to a UmlDatatype asset using the transformation options of the datatype filter.

Note

If the list of datatype filters is null, all datatypes are extracted. If the list of datatype filters is empty, no datatypes are extracted.

If the restrictions of the extracted column are stored in the transformed UmlDatatype asset, these restrictions are also included in the asset's label, for example VARCHAR(255), DECIMAL(15,6), or BLOB(4000).

Example: Property ingestion.catalogs[].schemas[].datatypes

services:
  MyService:
    type: DatabricksConnector
    ingestion:
      catalogs:
        schemas:
          # don't extract datatypes
          datatypes:
            names:
              reject: .*

Tooltip

If the list of datatype filters is a single-value list, it can be formatted as a single value, rather than as a list with a single value.

🔑 Property ingestion.catalogs[].schemas[].datatypes[].names

The pattern filter applied to filter datatypes based on their datatype name.

optional

The default is null (match all datatype names).

If a pattern filter is defined, DatabricksConnector uses it to match the name of each datatype to determine whether to extract the datatype. Otherwise, the datatypes are not filtered by their datatype name - all datatype names are accepted.

Example: Property ingestion.catalogs[].schemas[].datatypes[].names

services:
  MyService:
    type: DatabricksConnector
    ingestion:
      catalogs:
        schemas:
          # extract datatypes containing 'varchar' or 'decimal'
          datatypes:
            names:
              accept:
                - .*varchar.*
                - .*decimal.*

🔑 Property ingestion.catalogs[].schemas[].datatypes[].stereotype

The stereotype of the transformed UmlDatatype asset.

optional

The default is null (no stereotype).

If a stereotype is defined, DatabricksConnector sets the stereotype of the transformed UmlDatatype asset.

Example: Property ingestion.catalogs[].schemas[].datatypes[].stereotype

services:
  MyService:
    type: DatabricksConnector
    ingestion:
      catalogs:
        schemas:
          # extract datatypes and transform them with stereotype 'datatype'
          datatypes:
            stereotype: datatype

Tooltip

If the specified stereotype doesn't exist in the scheme of the destination application, the stereotype is ignored.

🔑 Property ingestion.catalogs[].schemas[].datatypes[].restricted

The flag that specifies to store the restrictions of the extracted column in the transformed UmlDatatype asset.

optional

The default is true (recommended).

If the flag is true, DatabricksConnector stores the restrictions of the extracted column - such as the base type (e.g. STRING, INTEGER, DATE), the minimal and maximal value or length, or the number of integer and decimal digits - in the transformed UmlDatatype asset. Otherwise, the restrictions are stored in the UmlAttribute asset corresponding to the extracted column.

Example: Property ingestion.catalogs[].schemas[].datatypes[].restricted

services:
  MyService:
    type: DatabricksConnector
    ingestion:
      catalogs:
        schemas:
          datatypes:
            # extract datatypes containing 'varchar' or 'decimal' and store the restrictions in the UmlDatatype
            - names:
                accept:
                  - .*varchar.*
                  - .*decimal.*
            # extract other datatypes and store the restrictions in the UmlAttribute
            - restricted: false