Mode
Overview
Mode is a business intelligence and analytics platform. Learn more in the official Mode documentation.
The DataHub integration for Mode covers BI entities such as dashboards, charts, datasets, and related ownership context. Depending on module capabilities, it can also capture features such as lineage, usage, profiling, ownership, tags, and stateful deletion detection.
Concept Mapping
A Mode-specific concept mapping is still pending; the table below shows the generic concept mapping in DataHub.
| Source Concept | DataHub Concept | Notes |
|---|---|---|
| Platform/account/project scope | Platform Instance, Container | Organizes assets within the platform context. |
| Core technical asset (for example table/view/topic/file) | Dataset | Primary ingested technical asset. |
| Schema fields / columns | SchemaField | Included when schema extraction is supported. |
| Ownership and collaboration principals | CorpUser, CorpGroup | Emitted by modules that support ownership and identity metadata. |
| Dependencies and processing relationships | Lineage edges | Available when lineage extraction is supported and enabled. |
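To make the mapping more concrete, the sketch below formats DataHub-style URNs for a dataset, a dashboard, and a chart using plain string formatting. The helper names, the example identifiers, and the `warehouse` platform instance are illustrative assumptions; in a real pipeline the connector produces these URNs itself.

```python
from typing import Optional


def dataset_urn(platform: str, name: str, env: str = "PROD",
                platform_instance: Optional[str] = None) -> str:
    """Format a dataset URN; a platform instance, if given, prefixes the name."""
    full_name = f"{platform_instance}.{name}" if platform_instance else name
    return f"urn:li:dataset:(urn:li:dataPlatform:{platform},{full_name},{env})"


def dashboard_urn(platform: str, dashboard_id: str) -> str:
    """Format a dashboard URN for a BI platform such as Mode."""
    return f"urn:li:dashboard:({platform},{dashboard_id})"


def chart_urn(platform: str, chart_id: str) -> str:
    """Format a chart URN for a BI platform such as Mode."""
    return f"urn:li:chart:({platform},{chart_id})"


# Hypothetical identifiers, for illustration only.
print(dataset_urn("postgres", "public.orders"))
print(dataset_urn("postgres", "public.orders", platform_instance="warehouse"))
print(dashboard_urn("mode", "report-123"))
print(chart_urn("mode", "chart-456"))
```

The platform instance changes the dataset URN, which is why a platform-to-instance mapping matters when lineage should point at datasets ingested by another connector.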
Module mode
Important Capabilities
| Capability | Status | Notes |
|---|---|---|
| Asset Containers | ✅ | Enabled by default. |
| Column-level Lineage | ✅ | Supported by default. |
| Descriptions | ✅ | Enabled by default. |
| Detect Deleted Entities | ✅ | Enabled by default via stateful ingestion. |
| Extract Ownership | ✅ | Enabled by default. |
| Platform Instance | ✅ | Enabled by default. |
| Table-Level Lineage | ✅ | Supported by default. |
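Deleted-entity detection relies on stateful ingestion, which is guarded by a fail-safe threshold (75.0 by default). The sketch below illustrates one plausible reading of that safeguard. The function names and the exact formula (share of previously seen entities missing from the current run) are assumptions for illustration, not DataHub's actual implementation.

```python
from typing import Set


def percent_removed(previous: Set[str], current: Set[str]) -> float:
    """Percentage of previously seen entities that are missing from the current run."""
    if not previous:
        return 0.0
    return 100.0 * len(previous - current) / len(previous)


def exceeds_fail_safe(previous: Set[str], current: Set[str],
                      fail_safe_threshold: float = 75.0) -> bool:
    """True when the relative change is above the threshold, so soft deletes should be skipped."""
    return percent_removed(previous, current) > fail_safe_threshold


# Hypothetical entity identifiers from two consecutive runs.
last_run = {"report-a", "report-b", "report-c", "report-d"}
this_run = {"report-a", "report-b"}
print(percent_removed(last_run, this_run), exceeds_fail_safe(last_run, this_run))
```

Under this reading, a misconfigured filter that suddenly hides most reports would trip the safeguard instead of soft-deleting them.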
Overview
The mode module ingests metadata from Mode into DataHub. It is intended for production ingestion workflows; module-specific capabilities are documented below.
Prerequisites
Before running ingestion, ensure network connectivity to the source, valid authentication credentials, and read permissions for metadata APIs required by this module.
Authentication
Generate an API token and password following Mode's Authentication documentation.
Mode requires a user account for authentication (no service accounts). Consider creating a dedicated user for DataHub ingestion.
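Before running a full ingestion, it can help to confirm that the token and password are accepted. The sketch below only builds a request and does not send it. It assumes the Mode API accepts HTTP Basic authentication with the token as the username and the password as the secret, and that the workspace's spaces are listed under `/api/<workspace>/spaces`; both should be checked against Mode's API documentation. The function name and the example credentials are hypothetical.

```python
import base64
import urllib.request


def build_mode_request(connect_uri: str, workspace: str,
                       token: str, password: str) -> urllib.request.Request:
    """Build (but do not send) an authenticated request for the workspace's spaces."""
    credentials = f"{token}:{password}".encode("utf-8")
    auth_header = "Basic " + base64.b64encode(credentials).decode("ascii")
    # Assumed endpoint path; filter=custom requests shared/custom collections only.
    url = f"{connect_uri.rstrip('/')}/api/{workspace}/spaces?filter=custom"
    return urllib.request.Request(
        url,
        headers={"Authorization": auth_header, "Accept": "application/json"},
    )


request = build_mode_request("https://app.mode.com", "datahub", "key-id", "secret")
print(request.full_url)
# To send it: urllib.request.urlopen(request, timeout=40)
```

A 401 response to such a request would point at the credentials, while a 403 or an empty list would more likely point at the permissions of the ingestion user.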
Permissions
DataHub ingestion requires the user to have the following permissions:
- Have at least the "Member" role.
- For each Connection, have at least "View" access.
  To check Connection permissions, navigate to "Workspace Settings" → "Manage Connections". For each connection in the list, click on the connection → "Permissions". If the default workspace access is "View" or "Query", you're all set for that connection. If it's "Restricted", you'll need to individually grant your ingestion user "View" access.
- For each Collection (Space), have at least "View" access.
  To check Collection permissions, navigate to the "My Collections" page as an Admin user. For each collection with Workspace Access set to "Restricted", the ingestion user must be manually granted "Viewer" access in the "Manage Access" dialog. Collections set to "All Members can View/Edit" do not require a manual grant.
Note that if the ingestion user has "Admin" access, then it will automatically have "View" access for all connections and collections.
Install the Plugin
pip install 'acryl-datahub[mode]'
Starter Recipe
Check out the following recipe to get started with ingestion! See below for full configuration options.
For general pointers on writing and running a recipe, see our main recipe guide.
source:
  type: mode
  config:
    # Coordinates
    connect_uri: https://app.mode.com
    # Credentials
    token: token
    password: pass
    # Options
    workspace: "datahub"
    default_schema: "public"
    owner_username_instead_of_email: False
    api_options:
      retry_backoff_multiplier: 2
      max_retry_interval: 10
      max_attempts: 5
sink:
  # sink configs
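The same recipe can also be expressed as a Python dictionary and run programmatically. The sketch below is a minimal example under stated assumptions: the helper names, the `datahub-rest` sink, and the `http://localhost:8080` server address are illustrative choices, not requirements of this module. The DataHub import is deferred so that the recipe can be built and inspected without the package installed.

```python
from typing import Any, Dict


def build_mode_recipe(token: str, password: str, workspace: str,
                      gms_server: str = "http://localhost:8080") -> Dict[str, Any]:
    """Build a recipe dictionary equivalent to the YAML starter recipe."""
    return {
        "source": {
            "type": "mode",
            "config": {
                "connect_uri": "https://app.mode.com",
                "token": token,
                "password": password,
                "workspace": workspace,
                "api_options": {
                    "retry_backoff_multiplier": 2,
                    "max_retry_interval": 10,
                    "max_attempts": 5,
                },
            },
        },
        # Assumed sink: a DataHub instance reachable over REST.
        "sink": {"type": "datahub-rest", "config": {"server": gms_server}},
    }


def run_recipe(recipe: Dict[str, Any]) -> None:
    """Run the recipe; requires `pip install 'acryl-datahub[mode]'`."""
    from datahub.ingestion.run.pipeline import Pipeline  # imported lazily

    pipeline = Pipeline.create(recipe)
    pipeline.run()
    pipeline.raise_from_status()


recipe = build_mode_recipe("key-id", "secret", "datahub")
print(recipe["source"]["type"])
# run_recipe(recipe)  # uncomment once real credentials and a sink are configured
```

Reading the token and password from environment variables or a secret store, rather than hard-coding them, is usually the safer choice.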
Config Details
Note that a . is used to denote nested fields in the YAML recipe.
| Field | Description |
|---|---|
| password ✅ string(password) | When creating a workspace API key, this is the 'Secret'. |
| token ✅ string | When creating a workspace API key, this is the 'Key ID'. |
| workspace ✅ string | The Mode workspace username. If you navigate to Workspace Settings > Details, the URL will be https://app.mode.com/organizations/<workspace-username>. This is distinct from the workspace's display name, and should be all lowercase. |
| connect_uri string | Mode host URL. Default: https://app.mode.com |
| exclude_archived boolean | Exclude archived reports. Default: False |
| exclude_personal_collections boolean | Exclude personal collections from ingestion using Mode's server-side filter (?filter=custom). When True, only shared/custom collections are fetched from the API. When False, all collections are fetched (space_pattern still applies for client-side filtering). Default: True |
| exclude_restricted boolean | Exclude restricted collections. Default: False |
| ingest_embed_url boolean | Whether to ingest the embed URL for reports. Default: True |
| max_threads integer | Maximum number of threads to use for parallel API requests. Increase to speed up ingestion for large workspaces. Setting this too high may trigger Mode API rate limiting (429 errors). Allowed range: 1 to 50. Default: 1 |
| owner_username_instead_of_email boolean | Use the username instead of the email address for the owner URN. Default: True |
| platform_instance_map One of object (string to string), null | A holder for platform -> platform_instance mappings to generate correct dataset URNs. Default: None |
| tag_measures_and_dimensions boolean | Tag measures and dimensions in the schema. Default: True |
| env string | The environment that all assets produced by this connector belong to. Default: PROD |
| api_options ModeAPIConfig | Retry/wait settings for the Mode API to avoid "Too many Requests" errors. |
| api_options.max_attempts integer | Maximum number of attempts to retry before failing. Default: 10 |
| api_options.max_retry_interval One of integer, number | Maximum interval to wait when retrying. Default: 60 |
| api_options.requests_per_minute integer | Maximum API requests per minute across all threads. Mode's API limit is ~240 req/min (4 req/s). The default of 180 leaves headroom to avoid 429 errors. Default: 180 |
| api_options.retry_backoff_multiplier One of integer, number | Multiplier for exponential backoff when waiting to retry. Default: 2 |
| api_options.timeout integer | How long to wait for the Mode REST API to send data before giving up. Default: 40 |
| space_pattern AllowDenyPattern | Regex patterns for Mode spaces to filter in ingestion (spaces named 'Personal' are filtered by default). Specify a regex that matches only the space name; for example, to ingest only the space named analytics, use the regex 'analytics'. |
| space_pattern.ignoreCase One of boolean, null | Whether to ignore case sensitivity during pattern matching. Default: True |
| stateful_ingestion One of StatefulStaleMetadataRemovalConfig, null | Default: None |
| stateful_ingestion.enabled boolean | Whether to enable stateful ingestion. Treated as True if a pipeline_name is set and either a datahub-rest sink or datahub_api is specified; otherwise False. Default: False |
| stateful_ingestion.fail_safe_threshold number | Guards against accidental changes to the source configuration: if the relative change percent in entities compared to the previous state is above the 'fail_safe_threshold', the large batch of soft deletes is not applied and the state is not committed. Default: 75.0 |
| stateful_ingestion.remove_stale_metadata boolean | Soft-deletes the entities present in the last successful run but missing in the current run when stateful_ingestion is enabled. Default: True |
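The allow/deny behaviour of space_pattern can be illustrated with a small stand-alone function. This is a sketch, not DataHub's AllowDenyPattern class: it assumes that deny patterns take precedence over allow patterns and that patterns are matched from the start of the space name with `re.match`. The function name is hypothetical; the default patterns mirror the documented defaults (allow everything, deny `^Personal$`, ignore case).

```python
import re
from typing import List, Optional


def space_allowed(name: str,
                  allow: Optional[List[str]] = None,
                  deny: Optional[List[str]] = None,
                  ignore_case: bool = True) -> bool:
    """Return True if a space name passes the allow/deny patterns."""
    allow = [".*"] if allow is None else allow
    deny = ["^Personal$"] if deny is None else deny
    flags = re.IGNORECASE if ignore_case else 0
    # Assumption: a deny match always wins over an allow match.
    if any(re.match(pattern, name, flags) for pattern in deny):
        return False
    return any(re.match(pattern, name, flags) for pattern in allow)


for space in ["Personal", "Analytics", "Marketing"]:
    print(space, space_allowed(space, allow=["analytics"]))
```

Under these assumptions an unanchored pattern such as `analytics` also matches names that merely start with it (for example `analytics_v2`); anchoring it as `^analytics$` restricts the match to that exact name.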
The JSONSchema for this configuration is inlined below.
{
"$defs": {
"AllowDenyPattern": {
"additionalProperties": false,
"description": "A class to store allow deny regexes",
"properties": {
"allow": {
"default": [
".*"
],
"description": "List of regex patterns to include in ingestion",
"items": {
"type": "string"
},
"title": "Allow",
"type": "array"
},
"deny": {
"default": [],
"description": "List of regex patterns to exclude from ingestion.",
"items": {
"type": "string"
},
"title": "Deny",
"type": "array"
},
"ignoreCase": {
"anyOf": [
{
"type": "boolean"
},
{
"type": "null"
}
],
"default": true,
"description": "Whether to ignore case sensitivity during pattern matching.",
"title": "Ignorecase"
}
},
"title": "AllowDenyPattern",
"type": "object"
},
"ModeAPIConfig": {
"additionalProperties": false,
"properties": {
"retry_backoff_multiplier": {
"anyOf": [
{
"type": "integer"
},
{
"type": "number"
}
],
"default": 2,
"description": "Multiplier for exponential backoff when waiting to retry",
"ge": 0,
"title": "Retry Backoff Multiplier"
},
"max_retry_interval": {
"anyOf": [
{
"type": "integer"
},
{
"type": "number"
}
],
"default": 60,
"description": "Maximum interval to wait when retrying",
"ge": 0,
"title": "Max Retry Interval"
},
"max_attempts": {
"default": 10,
"description": "Maximum number of attempts to retry before failing",
"minimum": 1,
"title": "Max Attempts",
"type": "integer"
},
"timeout": {
"default": 40,
"description": "Timeout setting, how long to wait for the Mode rest api to send data before giving up",
"minimum": 1,
"title": "Timeout",
"type": "integer"
},
"requests_per_minute": {
"default": 180,
"description": "Maximum API requests per minute across all threads. Mode's API limit is ~240 req/min (4 req/s). Default of 180 leaves headroom to avoid 429 errors.",
"minimum": 1,
"title": "Requests Per Minute",
"type": "integer"
}
},
"title": "ModeAPIConfig",
"type": "object"
},
"StatefulStaleMetadataRemovalConfig": {
"additionalProperties": false,
"description": "Base specialized config for Stateful Ingestion with stale metadata removal capability.",
"properties": {
"enabled": {
"default": false,
"description": "Whether or not to enable stateful ingest. Default: True if a pipeline_name is set and either a datahub-rest sink or `datahub_api` is specified, otherwise False",
"title": "Enabled",
"type": "boolean"
},
"remove_stale_metadata": {
"default": true,
"description": "Soft-deletes the entities present in the last successful run but missing in the current run with stateful_ingestion enabled.",
"title": "Remove Stale Metadata",
"type": "boolean"
},
"fail_safe_threshold": {
"default": 75.0,
"description": "Prevents large amount of soft deletes & the state from committing from accidental changes to the source configuration if the relative change percent in entities compared to the previous state is above the 'fail_safe_threshold'.",
"maximum": 100.0,
"minimum": 0.0,
"title": "Fail Safe Threshold",
"type": "number"
}
},
"title": "StatefulStaleMetadataRemovalConfig",
"type": "object"
}
},
"additionalProperties": false,
"properties": {
"env": {
"default": "PROD",
"description": "The environment that all assets produced by this connector belong to",
"title": "Env",
"type": "string"
},
"platform_instance_map": {
"anyOf": [
{
"additionalProperties": {
"type": "string"
},
"type": "object"
},
{
"type": "null"
}
],
"default": null,
"description": "A holder for platform -> platform_instance mappings to generate correct dataset urns",
"title": "Platform Instance Map"
},
"stateful_ingestion": {
"anyOf": [
{
"$ref": "#/$defs/StatefulStaleMetadataRemovalConfig"
},
{
"type": "null"
}
],
"default": null
},
"connect_uri": {
"default": "https://app.mode.com",
"description": "Mode host URL.",
"title": "Connect Uri",
"type": "string"
},
"token": {
"description": "When creating workspace API key this is the 'Key ID'.",
"title": "Token",
"type": "string"
},
"password": {
"description": "When creating workspace API key this is the 'Secret'.",
"format": "password",
"title": "Password",
"type": "string",
"writeOnly": true
},
"exclude_restricted": {
"default": false,
"description": "Exclude restricted collections",
"title": "Exclude Restricted",
"type": "boolean"
},
"exclude_personal_collections": {
"default": true,
"description": "Exclude personal collections from ingestion using Mode's server-side filter (?filter=custom). When True, only shared/custom collections are fetched from the API. When False, all collections are fetched (space_pattern still applies for client-side filtering).",
"title": "Exclude Personal Collections",
"type": "boolean"
},
"workspace": {
"description": "The Mode workspace username. If you navigate to Workspace Settings > Details, the url will be `https://app.mode.com/organizations/<workspace-username>`. This is distinct from the workspace's display name, and should be all lowercase.",
"title": "Workspace",
"type": "string"
},
"space_pattern": {
"$ref": "#/$defs/AllowDenyPattern",
"default": {
"allow": [
".*"
],
"deny": [
"^Personal$"
],
"ignoreCase": true
},
"description": "Regex patterns for mode spaces to filter in ingestion (Spaces named as 'Personal' are filtered by default.) Specify regex to only match the space name. e.g. to only ingest space named analytics, use the regex 'analytics'"
},
"owner_username_instead_of_email": {
"default": true,
"description": "Use username for owner URN instead of Email",
"title": "Owner Username Instead Of Email",
"type": "boolean"
},
"api_options": {
"$ref": "#/$defs/ModeAPIConfig",
"default": {
"retry_backoff_multiplier": 2,
"max_retry_interval": 60,
"max_attempts": 10,
"timeout": 40,
"requests_per_minute": 180
},
"description": "Retry/Wait settings for Mode API to avoid \"Too many Requests\" error. See Mode API Options below"
},
"ingest_embed_url": {
"default": true,
"description": "Whether to Ingest embed URL for Reports",
"title": "Ingest Embed Url",
"type": "boolean"
},
"tag_measures_and_dimensions": {
"default": true,
"description": "Tag measures and dimensions in the schema",
"title": "Tag Measures And Dimensions",
"type": "boolean"
},
"exclude_archived": {
"default": false,
"description": "Exclude archived reports",
"title": "Exclude Archived",
"type": "boolean"
},
"max_threads": {
"default": 1,
"description": "Maximum number of threads to use for parallel API requests. Increase to speed up ingestion for large workspaces. Setting too high may trigger Mode API rate limiting (429 errors).",
"maximum": 50,
"minimum": 1,
"title": "Max Threads",
"type": "integer"
}
},
"required": [
"token",
"password",
"workspace"
],
"title": "ModeConfig",
"type": "object"
}
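To get a feel for how the api_options values interact, the sketch below computes a capped exponential backoff schedule and the average request spacing implied by requests_per_minute. The exact retry formula used by the connector is not stated in this page; the one here (multiplier × 2^retry, capped at max_retry_interval) is an assumption, and both function names are hypothetical.

```python
from typing import List


def backoff_waits(retry_backoff_multiplier: float = 2,
                  max_retry_interval: float = 60,
                  max_attempts: int = 10) -> List[float]:
    """Waits between consecutive attempts under capped exponential backoff."""
    waits = []
    for retry in range(max_attempts - 1):  # no wait after the final attempt
        waits.append(min(retry_backoff_multiplier * 2 ** retry, max_retry_interval))
    return waits


def min_request_spacing(requests_per_minute: int = 180) -> float:
    """Average number of seconds between requests needed to stay within the budget."""
    return 60.0 / requests_per_minute


print(backoff_waits(2, 10, 5))   # values from the starter recipe
print(sum(backoff_waits()))      # total wait with the documented defaults
print(min_request_spacing(180))  # spacing at the default request budget
```

Under this assumed formula, the starter recipe's lower max_retry_interval and max_attempts fail much faster than the defaults, which may or may not be desirable for a large workspace.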
Capabilities
Use the Important Capabilities table above as the source of truth for supported features and whether additional configuration is required.
Report
Report metadata is sourced from Mode report APIs, including title, description, ownership, and chart associations.
Chart
Chart-level metadata is sourced from Mode chart APIs:
Chart Information
Extracted chart details include chart type, chart title, and chart-specific metadata used to build DataHub chart entities.
Table Information
Table result metadata from report queries is used to identify upstream dataset context and query relationships.
Pivot Table Information
Pivot result metadata is extracted when available to improve chart/dataset relationship coverage for pivot-based analyses.
Limitations
Module behavior is constrained by source APIs, permissions, and metadata exposed by the platform. Refer to capability notes for unsupported or conditional features.
Troubleshooting
If ingestion fails, validate credentials, permissions, connectivity, and scope filters first. Then review ingestion logs for source-specific errors and adjust configuration accordingly.
Code Coordinates
- Class Name: datahub.ingestion.source.mode.ModeSource
If you've got any questions on configuring ingestion for Mode, feel free to ping us on our Slack.
This page is auto-generated from the underlying source code. To make changes, please edit the relevant source files in the metadata-ingestion directory.