Omni
Overview
Omni is a cloud-native business intelligence platform. Learn more in the official Omni documentation.
The DataHub integration for Omni covers BI entities such as dashboards, charts, semantic datasets, and related ownership context. It also captures table-level and column-level lineage, schema metadata, platform instances, and stateful deletion detection; the Important Capabilities table lists what the omni module supports.
Concept Mapping
| Omni Concept | DataHub Concept | Notes |
|---|---|---|
| Folder | Container | SubType "Folder" |
| Dashboard | Dashboard | Published document with hasDashboard=true |
| Tile | Chart | Each query presentation within a dashboard |
| Topic | Dataset | SubType "Topic" (the semantic join graph entry point) |
| View | Dataset | SubType "View" (semantic layer table with dimensions and measures as schema fields) |
| Workbook | Dataset | SubType "Workbook" (unpublished personal exploration document) |
| Warehouse table | Dataset | Native platform entity (e.g. Snowflake, BigQuery); linked as upstream of Omni Views |
| Document owner | User (a.k.a. CorpUser) | Propagated as TECHNICAL_OWNER to Dashboard and Chart entities |
Module omni
Important Capabilities
| Capability | Status | Notes |
|---|---|---|
| Column-level Lineage | ✅ | Field-level lineage when include_column_lineage=true. |
| Descriptions | ✅ | Enabled by default. |
| Detect Deleted Entities | ✅ | Enabled by default via stateful ingestion. |
| Extract Ownership | ✅ | Document owner extracted from Omni API. |
| Platform Instance | ✅ | Supported via connection_to_platform_instance config. |
| Schema Metadata | ✅ | Dimensions and measures extracted as schema columns. |
| Table-Level Lineage | ✅ | Dashboard → Tile → Topic → View → DB Table. |
| Test Connection | ✅ | Enabled by default. |
Overview
The omni module ingests metadata from the Omni BI platform into DataHub. It is intended for production ingestion workflows and supports the following:
- Folders (as Containers), Dashboards, and Chart tiles
- Semantic layer: Models, Topics, and Views with schema fields (dimensions and measures)
- Physical warehouse tables with upstream lineage stitched to existing DataHub entities
- Column-level (fine-grained) lineage from semantic view fields back to warehouse columns
- Ownership propagated from the Omni document API
Lineage is emitted as a five-hop chain:
Folder → Dashboard → Chart (tile) → Topic → Semantic View → Physical Table
Prerequisites
Before running ingestion, ensure you have the following:
An Omni Organization API key with read access to models, documents, and connections. Generate API keys in Omni Admin → API Keys.
Connection mapping configuration if you want physical table lineage to stitch with existing warehouse entities in DataHub. You will need to map each Omni connection ID to the corresponding DataHub platform name, platform instance, and database name:
connection_to_platform:
  "conn_abc123": "snowflake"
connection_to_platform_instance:
  "conn_abc123": "my_snowflake_account"
connection_to_database:
  "conn_abc123": "ANALYTICS_PROD"
Connection IDs can be found by calling the Omni /v1/connections API or from the Omni Admin UI.
If the Omni API key does not have permission to list connections (403 Forbidden), the connector will fall back to the connection_to_platform config overrides and continue ingestion without failing.
Install the Plugin
pip install 'acryl-datahub[omni]'
Starter Recipe
Check out the following recipe to get started with ingestion! See below for full configuration options.
For general pointers on writing and running a recipe, see our main recipe guide.
source:
  type: omni
  config:
    # Coordinates
    base_url: "https://your-org.omniapp.co/api"
    # Credentials
    api_key: "${OMNI_API_KEY}"
    # Connection → warehouse stitching
    # Map Omni connection IDs to DataHub platform names so that physical table
    # URNs match what was ingested by your warehouse source connector.
    connection_to_platform:
      "conn_abc123": "snowflake"
    # Optional: map connection IDs to platform instances
    # connection_to_platform_instance:
    #   "conn_abc123": "my_snowflake_account"
    # Optional: override the database name inferred from the Omni connection
    # connection_to_database:
    #   "conn_abc123": "ANALYTICS_PROD"
    # Optional: include workbook-only documents (not just published dashboards)
    # include_workbook_only: false
    # Optional: filter which models to ingest
    # model_pattern:
    #   allow:
    #     - ".*"
    # Optional: filter which documents (dashboards/workbooks) to ingest
    # document_pattern:
    #   allow:
    #     - ".*"
    # Optional: disable column-level lineage (enabled by default)
    # include_column_lineage: false
    # Optional: stateful ingestion with stale entity removal
    stateful_ingestion:
      enabled: true
      remove_stale_metadata: true
sink:
  # sink configs
Config Details
Note that a . is used to denote nested fields in the YAML recipe.
| Field | Description |
|---|---|
| api_key ✅ string(password) | Omni Organization API key (not a Personal Access Token). Generate in Omni Admin → API Keys. The key must have read access to models, documents, and connections. |
| base_url ✅ string | Omni instance base URL including the /api suffix, e.g. https://myorg.omniapp.co/api. Found in your Omni organization settings. |
| connection_to_database One of map(str,string), null | Map Omni connection IDs to canonical database names used in DataHub URNs. Use when the database name in Omni differs from the name registered in DataHub. Default: None |
| connection_to_platform One of map(str,string), null | Map Omni connection IDs to DataHub platform names. Required when the platform cannot be auto-detected from the connection dialect. Example: {'abc-123': 'snowflake', 'def-456': 'bigquery'} Default: None |
| connection_to_platform_instance One of map(str,string), null | Map Omni connection IDs to DataHub platform instance names. Must exactly match the platform_instance used when ingesting the warehouse. Example: {'abc-123': 'prod_snowflake'} Default: None |
| include_column_lineage boolean | Extract column-level (fine-grained) lineage from dashboard query fields back to Omni semantic view fields. Enables precise field-level impact analysis in DataHub. Default: True |
| include_deleted boolean | Include soft-deleted Omni entities (models, documents) where the API supports it. Default: False |
| include_workbook_only boolean | Include workbook-only documents that have not been published as a dashboard. When False (default), only documents with hasDashboard=true are ingested. Default: False |
| max_requests_per_minute integer | Client-side throttle cap for Omni API requests (requests per minute). Omni's default rate limit is 60 req/min; set lower to leave headroom for other API consumers. Allowed range: 1 to 60. Default: 50 |
| normalize_snowflake_names boolean | Upper-case database, schema, and table name components in URNs when the resolved platform is Snowflake. Snowflake identifiers are case-insensitive and DataHub's Snowflake connector stores them in upper case by default. Default: True |
| page_size integer | Number of records per page for paginated Omni API endpoints. Lower values reduce memory usage; higher values speed up ingestion. Allowed range: 1 to 100. Default: 50 |
| platform_instance One of string, null | The instance of the platform that all assets produced by this recipe belong to. This should be unique within the platform. See https://docs.datahub.com/docs/platform-instances/ for more details. Default: None |
| timeout_seconds integer | HTTP request timeout in seconds for Omni API calls. Allowed range: 5 to 120. Default: 30 |
| env string | The environment that all assets produced by this connector belong to. Default: PROD |
| document_pattern AllowDenyPattern | Regex allow/deny patterns applied to Omni document identifiers. Use to restrict ingestion to specific dashboards or workbooks. |
| document_pattern.ignoreCase One of boolean, null | Whether to ignore case sensitivity during pattern matching. Default: True |
| model_pattern AllowDenyPattern | Regex allow/deny patterns applied to Omni model IDs. Use to restrict ingestion to specific models. Example: allow: ['^prod-.*'] |
| model_pattern.ignoreCase One of boolean, null | Whether to ignore case sensitivity during pattern matching. Default: True |
| stateful_ingestion One of StatefulIngestionConfig, null | Stateful Ingestion Config Default: None |
| stateful_ingestion.enabled boolean | Whether or not to enable stateful ingest. Default: True if a pipeline_name is set and either a datahub-rest sink or datahub_api is specified, otherwise False. Default: False |
The JSONSchema for this configuration is inlined below.
{
"$defs": {
"AllowDenyPattern": {
"additionalProperties": false,
"description": "A class to store allow deny regexes",
"properties": {
"allow": {
"default": [
".*"
],
"description": "List of regex patterns to include in ingestion",
"items": {
"type": "string"
},
"title": "Allow",
"type": "array"
},
"deny": {
"default": [],
"description": "List of regex patterns to exclude from ingestion.",
"items": {
"type": "string"
},
"title": "Deny",
"type": "array"
},
"ignoreCase": {
"anyOf": [
{
"type": "boolean"
},
{
"type": "null"
}
],
"default": true,
"description": "Whether to ignore case sensitivity during pattern matching.",
"title": "Ignorecase"
}
},
"title": "AllowDenyPattern",
"type": "object"
},
"StatefulIngestionConfig": {
"additionalProperties": false,
"description": "Basic Stateful Ingestion Specific Configuration for any source.",
"properties": {
"enabled": {
"default": false,
"description": "Whether or not to enable stateful ingest. Default: True if a pipeline_name is set and either a datahub-rest sink or `datahub_api` is specified, otherwise False",
"title": "Enabled",
"type": "boolean"
}
},
"title": "StatefulIngestionConfig",
"type": "object"
}
},
"additionalProperties": false,
"description": "Configuration for the Omni BI platform DataHub source.",
"properties": {
"env": {
"default": "PROD",
"description": "The environment that all assets produced by this connector belong to",
"title": "Env",
"type": "string"
},
"platform_instance": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "The instance of the platform that all assets produced by this recipe belong to. This should be unique within the platform. See https://docs.datahub.com/docs/platform-instances/ for more details.",
"title": "Platform Instance"
},
"stateful_ingestion": {
"anyOf": [
{
"$ref": "#/$defs/StatefulIngestionConfig"
},
{
"type": "null"
}
],
"default": null,
"description": "Stateful Ingestion Config"
},
"base_url": {
"description": "Omni instance base URL including the /api suffix, e.g. https://myorg.omniapp.co/api. Found in your Omni organization settings.",
"title": "Base Url",
"type": "string"
},
"api_key": {
"description": "Omni Organization API key (not a Personal Access Token). Generate in Omni Admin \u2192 API Keys. The key must have read access to models, documents, and connections.",
"format": "password",
"title": "Api Key",
"type": "string",
"writeOnly": true
},
"page_size": {
"default": 50,
"description": "Number of records per page for paginated Omni API endpoints. Lower values reduce memory usage; higher values speed up ingestion.",
"maximum": 100,
"minimum": 1,
"title": "Page Size",
"type": "integer"
},
"max_requests_per_minute": {
"default": 50,
"description": "Client-side throttle cap for Omni API requests (requests per minute). Omni's default rate limit is 60 req/min; set lower to leave headroom for other API consumers.",
"maximum": 60,
"minimum": 1,
"title": "Max Requests Per Minute",
"type": "integer"
},
"timeout_seconds": {
"default": 30,
"description": "HTTP request timeout in seconds for Omni API calls.",
"maximum": 120,
"minimum": 5,
"title": "Timeout Seconds",
"type": "integer"
},
"include_deleted": {
"default": false,
"description": "Include soft-deleted Omni entities (models, documents) where the API supports it.",
"title": "Include Deleted",
"type": "boolean"
},
"include_workbook_only": {
"default": false,
"description": "Include workbook-only documents that have not been published as a dashboard. When False (default), only documents with hasDashboard=true are ingested.",
"title": "Include Workbook Only",
"type": "boolean"
},
"model_pattern": {
"$ref": "#/$defs/AllowDenyPattern",
"default": {
"allow": [
".*"
],
"deny": [],
"ignoreCase": true
},
"description": "Regex allow/deny patterns applied to Omni model IDs. Use to restrict ingestion to specific models. Example: allow: ['^prod-.*']"
},
"document_pattern": {
"$ref": "#/$defs/AllowDenyPattern",
"default": {
"allow": [
".*"
],
"deny": [],
"ignoreCase": true
},
"description": "Regex allow/deny patterns applied to Omni document identifiers. Use to restrict ingestion to specific dashboards or workbooks."
},
"include_column_lineage": {
"default": true,
"description": "Extract column-level (fine-grained) lineage from dashboard query fields back to Omni semantic view fields. Enables precise field-level impact analysis in DataHub.",
"title": "Include Column Lineage",
"type": "boolean"
},
"connection_to_platform": {
"anyOf": [
{
"additionalProperties": {
"type": "string"
},
"type": "object"
},
{
"type": "null"
}
],
"default": null,
"description": "Map Omni connection IDs to DataHub platform names. Required when the platform cannot be auto-detected from the connection dialect. Example: {'abc-123': 'snowflake', 'def-456': 'bigquery'}",
"title": "Connection To Platform"
},
"connection_to_platform_instance": {
"anyOf": [
{
"additionalProperties": {
"type": "string"
},
"type": "object"
},
{
"type": "null"
}
],
"default": null,
"description": "Map Omni connection IDs to DataHub platform instance names. Must exactly match the platform_instance used when ingesting the warehouse. Example: {'abc-123': 'prod_snowflake'}",
"title": "Connection To Platform Instance"
},
"connection_to_database": {
"anyOf": [
{
"additionalProperties": {
"type": "string"
},
"type": "object"
},
{
"type": "null"
}
],
"default": null,
"description": "Map Omni connection IDs to canonical database names used in DataHub URNs. Use when the database name in Omni differs from the name registered in DataHub.",
"title": "Connection To Database"
},
"normalize_snowflake_names": {
"default": true,
"description": "Upper-case database, schema, and table name components in URNs when the resolved platform is Snowflake. Snowflake identifiers are case-insensitive and DataHub's Snowflake connector stores them in upper case by default.",
"title": "Normalize Snowflake Names",
"type": "boolean"
}
},
"required": [
"base_url",
"api_key"
],
"title": "OmniSourceConfig",
"type": "object"
}
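The schema above constrains a few integer options and marks two fields as required. As a minimal sketch of how a recipe's config block could be sanity-checked before running ingestion, the Python below re-states those bounds by hand. The function name `check_omni_config` and the wording of the messages are hypothetical and not part of DataHub; the connector performs its own validation when it loads the recipe.

```python
from typing import Any, Dict, List

# (minimum, maximum, default) copied by hand from the JSON schema above.
INT_BOUNDS = {
    "page_size": (1, 100, 50),
    "max_requests_per_minute": (1, 60, 50),
    "timeout_seconds": (5, 120, 30),
}
REQUIRED = ("base_url", "api_key")


def check_omni_config(config: Dict[str, Any]) -> List[str]:
    """Return human-readable problems; an empty list means none were found."""
    problems = []
    for key in REQUIRED:
        if not config.get(key):
            problems.append(f"missing required field: {key}")
    for key, (low, high, default) in INT_BOUNDS.items():
        value = config.get(key, default)
        # bool is a subclass of int in Python, so reject it explicitly.
        is_int = isinstance(value, int) and not isinstance(value, bool)
        if not is_int or not low <= value <= high:
            problems.append(
                f"{key} must be an integer in [{low}, {high}], got {value!r}"
            )
    return problems


if __name__ == "__main__":
    print(check_omni_config({"api_key": "example-key", "page_size": 500}))
```

Only the bounds and required fields listed in the schema are checked here; types of the mapping options and pattern syntax are left to the connector.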
Capabilities
Use the Important Capabilities table above as the source of truth for supported features and whether additional configuration is required.
Physical table lineage
Omni Views reference physical warehouse tables via sql_table_name in model YAML. The connector resolves each reference to a DataHub dataset URN using the connection_to_platform mapping. If normalize_snowflake_names: true (default), database, schema, and table name components are uppercased to match the casing used by the DataHub Snowflake connector.
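To make the stitching rule concrete, here is a rough Python sketch of how a table reference might be turned into a dataset URN. The function `build_table_urn` and its parameters are hypothetical and are not the connector's actual code. It assumes the reference is already split into database, schema, and table, and it uses DataHub's standard dataset URN layout, in which a platform instance is prefixed to the dataset name.

```python
from typing import Optional


def build_table_urn(
    platform: str,
    database: str,
    schema: str,
    table: str,
    platform_instance: Optional[str] = None,
    env: str = "PROD",
    normalize_snowflake_names: bool = True,
) -> str:
    """Assemble a DataHub dataset URN for a physical warehouse table."""
    parts = [database, schema, table]
    # Mirror the documented behaviour: upper-case only for Snowflake.
    if normalize_snowflake_names and platform == "snowflake":
        parts = [part.upper() for part in parts]
    name = ".".join(parts)
    # The platform instance, when set, becomes a prefix of the dataset name.
    if platform_instance:
        name = f"{platform_instance}.{name}"
    return f"urn:li:dataset:(urn:li:dataPlatform:{platform},{name},{env})"


if __name__ == "__main__":
    print(
        build_table_urn(
            "snowflake",
            "analytics_prod",
            "public",
            "orders",
            platform_instance="my_snowflake_account",
        )
    )
```

Lineage only stitches when the resulting URN is identical to the one your warehouse connector emitted, so compare a generated URN against an existing dataset in DataHub if upstream tables appear unlinked.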
Column-level lineage
When include_column_lineage: true (default), the connector emits FineGrainedLineage entries by parsing sql expressions in model YAML and matching field references to known view columns. This enables precise field-level impact analysis across the full chain:
physical_table.column → semantic_view.field → dashboard_tile.field
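The matching step can be pictured with a small Python sketch. It is an illustration under simplifying assumptions, not the connector's parser: the function `match_field_lineage` is hypothetical, the input is assumed to be a mapping of view field names to their sql expressions, and identifiers are matched by plain tokenization rather than real SQL parsing.

```python
import re
from typing import Dict, List, Set, Tuple

# Bare SQL-style identifiers; quoted identifiers are not handled.
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def match_field_lineage(
    field_sql: Dict[str, str], table_columns: Set[str]
) -> List[Tuple[str, str]]:
    """Return (upstream_column, view_field) pairs found in each expression."""
    known = {column.lower(): column for column in table_columns}
    edges = []
    for field, sql in field_sql.items():
        seen = set()
        for token in IDENTIFIER.findall(sql):
            column = known.get(token.lower())
            if column is not None and column not in seen:
                seen.add(column)
                edges.append((column, field))
    return edges


if __name__ == "__main__":
    fields = {
        "total_revenue": "SUM(price * quantity)",
        "order_day": "DATE_TRUNC('day', created_at)",
    }
    print(match_field_lineage(fields, {"id", "price", "quantity", "created_at"}))
```

A tokenizer this simple would also match a string literal that happens to share a column name, which is one reason complex or fully derived expressions may not resolve cleanly.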
Schema metadata
For each Omni Semantic View, the connector emits a SchemaMetadata aspect containing one SchemaField per dimension and measure defined in model YAML:
- Dimensions: emitted with inferred native type (string, date, timestamp, number, boolean)
- Measures: emitted with aggregation type and native type NUMBER
- Field descriptions are extracted from the YAML description attribute when present
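The mapping from a view definition to schema fields can be sketched in Python as follows. Everything about the input shape is an assumption made for illustration: the `dimensions` and `measures` mappings, the `type` and `aggregate_type` keys, and the fallback to a string type are hypothetical, and the output uses plain dictionaries rather than DataHub's SchemaField class.

```python
from typing import Any, Dict, List


def view_to_schema_fields(view: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Flatten a parsed view definition into one record per field."""
    fields = []
    for name, spec in view.get("dimensions", {}).items():
        fields.append(
            {
                "fieldPath": name,
                "kind": "dimension",
                # Assumed fallback when no type can be inferred.
                "nativeDataType": spec.get("type", "string"),
                "description": spec.get("description"),
            }
        )
    for name, spec in view.get("measures", {}).items():
        fields.append(
            {
                "fieldPath": name,
                "kind": "measure",
                "nativeDataType": "NUMBER",
                "aggregation": spec.get("aggregate_type"),
                "description": spec.get("description"),
            }
        )
    return fields


if __name__ == "__main__":
    example_view = {
        "dimensions": {"status": {"type": "string", "description": "Order status"}},
        "measures": {"total": {"aggregate_type": "sum"}},
    }
    for field in view_to_schema_fields(example_view):
        print(field)
```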
Model and document filtering
Use model_pattern and document_pattern to restrict ingestion to specific models or dashboards:
model_pattern:
  allow:
    - "^prod-.*"
  deny:
    - ".*-dev$"
document_pattern:
  allow:
    - ".*"
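To preview which model IDs a pattern pair would keep, the Python sketch below approximates allow/deny evaluation. It assumes that a deny match always wins over an allow match and that patterns are matched from the start of the name; the helper `is_allowed` is hypothetical and does not call DataHub's AllowDenyPattern class.

```python
import re
from typing import List


def is_allowed(
    name: str, allow: List[str], deny: List[str], ignore_case: bool = True
) -> bool:
    """Return True when name matches an allow pattern and no deny pattern."""
    flags = re.IGNORECASE if ignore_case else 0
    # Assumption: any deny match excludes the name outright.
    if any(re.match(pattern, name, flags) for pattern in deny):
        return False
    return any(re.match(pattern, name, flags) for pattern in allow)


if __name__ == "__main__":
    allow = ["^prod-.*"]
    deny = [".*-dev$"]
    for model_id in ["prod-sales", "prod-sales-dev", "staging-sales", "PROD-Sales"]:
        print(model_id, is_allowed(model_id, allow, deny))
```

With the patterns from the example above, prod-sales is kept, prod-sales-dev is dropped by the deny rule, and staging-sales is dropped because it matches no allow rule.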
Limitations
- Access Filters, User Attributes, and Cache schedules are not yet ingested.
- Column lineage is limited to fields that appear in model YAML sql expressions; complex or fully derived expressions may not fully resolve.
- Large organizations with many models may approach Omni API rate limits; tune max_requests_per_minute accordingly.
- True end-to-end integration tests require a live Omni environment; the test suite uses deterministic mock API responses.
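To reason about how max_requests_per_minute affects run time, the following Python sketch shows one simple form of client-side throttling: spacing requests at least 60 / N seconds apart. The class `MinuteThrottle` is hypothetical and is not the connector's implementation, which may use a different strategy; the clock and sleep functions are injectable only so the behaviour can be exercised without real waiting.

```python
import time
from typing import Callable, Optional


class MinuteThrottle:
    """Space calls at least 60 / max_per_minute seconds apart."""

    def __init__(
        self,
        max_per_minute: int,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        if max_per_minute < 1:
            raise ValueError("max_per_minute must be >= 1")
        self.interval = 60.0 / max_per_minute
        self._clock = clock
        self._sleep = sleep
        self._next_allowed: Optional[float] = None

    def wait(self) -> float:
        """Block until the next request may start; return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._next_allowed is not None and now < self._next_allowed:
            delay = self._next_allowed - now
            self._sleep(delay)
            now = self._next_allowed
        self._next_allowed = now + self.interval
        return delay


if __name__ == "__main__":
    throttle = MinuteThrottle(max_per_minute=50)
    print(f"minimum spacing: {throttle.interval:.2f}s")
```

Under this spacing model, the default of 50 requests per minute means one request every 1.2 seconds, so a run that needs 600 API calls would take roughly 12 minutes of request time at minimum.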
Troubleshooting
If ingestion fails, validate credentials, permissions, and connectivity first. Then review the ingestion report and logs for source-specific errors.
Common issues:
| Symptom | Likely Cause | Resolution |
|---|---|---|
| 403 Forbidden on /v1/connections | API key lacks connection read scope | Ingestion continues with config fallbacks; physical lineage may be incomplete |
| Physical tables not linked to warehouse entities | connection_to_platform not configured | Add connection mapping for each Omni connection ID |
| Snowflake URN mismatch | Case mismatch between Omni and DataHub Snowflake URNs | Ensure normalize_snowflake_names: true (default) |
| Column lineage empty | View YAML has no sql expressions | Expected for views using direct sql_table_name without field-level SQL |
Code Coordinates
- Class Name: datahub.ingestion.source.omni.omni.OmniSource
- Browse on GitHub
If you've got any questions on configuring ingestion for Omni, feel free to ping us on our Slack.
This page is auto-generated from the underlying source code. To make changes, please edit the relevant source files in the metadata-ingestion directory.