Omni

Overview

Omni is a cloud-native business intelligence platform. Learn more in the official Omni documentation.

The DataHub integration for Omni covers BI entities such as dashboards, charts, semantic datasets, and related ownership context. Depending on module capabilities, it can also capture features such as lineage, usage, profiling, ownership, tags, and stateful deletion detection.

Concept Mapping

| Omni Concept | DataHub Concept | Notes |
| --- | --- | --- |
| Folder | Container | SubType "Folder" |
| Dashboard | Dashboard | Published document with hasDashboard=true |
| Tile | Chart | Each query presentation within a dashboard |
| Topic | Dataset | SubType "Topic"; the semantic join graph entry point |
| View | Dataset | SubType "View"; semantic layer table with dimensions and measures as schema fields |
| Workbook | Dataset | SubType "Workbook"; unpublished personal exploration document |
| Warehouse table | Dataset | Native platform entity (e.g. Snowflake, BigQuery); linked as upstream of Omni Views |
| Document owner | User (a.k.a. CorpUser) | Propagated as TECHNICAL_OWNER to Dashboard and Chart entities |

Module omni

Incubating

Important Capabilities

| Capability | Notes |
| --- | --- |
| Column-level Lineage | Field-level lineage when include_column_lineage=true. |
| Descriptions | Enabled by default. |
| Detect Deleted Entities | Enabled by default via stateful ingestion. |
| Extract Ownership | Document owner extracted from Omni API. |
| Platform Instance | Supported via connection_to_platform_instance config. |
| Schema Metadata | Dimensions and measures extracted as schema columns. |
| Table-Level Lineage | Dashboard → Tile → Topic → View → DB Table. |
| Test Connection | Enabled by default. |

Overview

The omni module ingests metadata from the Omni BI platform into DataHub. It is intended for production ingestion workflows and supports the following:

  • Folders (as Containers), Dashboards, and Chart tiles
  • Semantic layer: Models, Topics, and Views with schema fields (dimensions and measures)
  • Physical warehouse tables with upstream lineage stitched to existing DataHub entities
  • Column-level (fine-grained) lineage from semantic view fields back to warehouse columns
  • Ownership propagated from the Omni document API

Lineage is emitted as a five-hop chain:

Folder → Dashboard → Chart (tile) → Topic → Semantic View → Physical Table

Prerequisites

Before running ingestion, ensure you have the following:

  1. An Omni Organization API key with read access to models, documents, and connections. Generate API keys in Omni Admin → API Keys.

  2. Connection mapping configuration if you want physical table lineage to stitch with existing warehouse entities in DataHub. You will need to map each Omni connection ID to the corresponding DataHub platform name, platform instance, and database name:

connection_to_platform:
  "conn_abc123": "snowflake"
connection_to_platform_instance:
  "conn_abc123": "my_snowflake_account"
connection_to_database:
  "conn_abc123": "ANALYTICS_PROD"

Connection IDs can be found by calling the Omni /v1/connections API or from the Omni Admin UI.

Note: If the Omni API key does not have permission to list connections (403 Forbidden), the connector will fall back to the connection_to_platform config overrides and continue ingestion without failing.
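
The fallback described in the note can be sketched roughly as follows. This is an illustrative sketch, not the connector's actual code: `ForbiddenError`, `resolve_platforms`, and the `fetch_connections` callable are hypothetical names, and the rule that config overrides take precedence over auto-detected platforms is an assumption.

```python
from typing import Callable, Dict, Optional


class ForbiddenError(Exception):
    """Hypothetical error standing in for an HTTP 403 from /v1/connections."""


def resolve_platforms(
    fetch_connections: Callable[[], Dict[str, str]],
    connection_to_platform: Optional[Dict[str, str]] = None,
) -> Dict[str, str]:
    """Return a connection-ID -> platform map, tolerating a 403 on the listing call."""
    overrides = dict(connection_to_platform or {})
    try:
        # Hypothetical fetcher, e.g. returning {"conn_abc123": "snowflake"}.
        detected = fetch_connections()
    except ForbiddenError:
        # No permission to list connections: rely on the config overrides only.
        return overrides
    # Assumption: explicit config overrides win over auto-detected platforms.
    return {**detected, **overrides}
```

With this shape, a key that lacks connection read scope still yields usable mappings for every connection ID listed in the recipe, while unlisted connections simply have no platform to resolve against.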

Install the Plugin

pip install 'acryl-datahub[omni]'

Starter Recipe

Check out the following recipe to get started with ingestion! See below for full configuration options.

For general pointers on writing and running a recipe, see our main recipe guide.

source:
  type: omni
  config:
    # Coordinates
    base_url: "https://your-org.omniapp.co/api"

    # Credentials
    api_key: "${OMNI_API_KEY}"

    # Connection → warehouse stitching
    # Map Omni connection IDs to DataHub platform names so that physical table
    # URNs match what was ingested by your warehouse source connector.
    connection_to_platform:
      "conn_abc123": "snowflake"

    # Optional: map connection IDs to platform instances
    # connection_to_platform_instance:
    #   "conn_abc123": "my_snowflake_account"

    # Optional: override the database name inferred from the Omni connection
    # connection_to_database:
    #   "conn_abc123": "ANALYTICS_PROD"

    # Optional: include workbook-only documents (not just published dashboards)
    # include_workbook_only: false

    # Optional: filter which models to ingest
    # model_pattern:
    #   allow:
    #     - ".*"

    # Optional: filter which documents (dashboards/workbooks) to ingest
    # document_pattern:
    #   allow:
    #     - ".*"

    # Optional: column-level lineage (enabled by default; set to false to disable)
    # include_column_lineage: true

    # Optional: stateful ingestion with stale entity removal
    stateful_ingestion:
      enabled: true
      remove_stale_metadata: true

sink:
  # sink configs

Config Details

Note that a . is used to denote nested fields in the YAML recipe.

| Field | Type | Description | Default |
| --- | --- | --- | --- |
| api_key | string(password) | Omni Organization API key (not a Personal Access Token). Generate in Omni Admin → API Keys. The key must have read access to models, documents, and connections. | |
| base_url | string | Omni instance base URL including the /api suffix, e.g. https://myorg.omniapp.co/api. Found in your Omni organization settings. | |
| connection_to_database | One of string, null | Map Omni connection IDs to canonical database names used in DataHub URNs. Use when the database name in Omni differs from the name registered in DataHub. | None |
| connection_to_platform | One of string, null | Map Omni connection IDs to DataHub platform names. Required when the platform cannot be auto-detected from the connection dialect. Example: {'abc-123': 'snowflake', 'def-456': 'bigquery'} | None |
| connection_to_platform_instance | One of string, null | Map Omni connection IDs to DataHub platform instance names. Must exactly match the platform_instance used when ingesting the warehouse. Example: {'abc-123': 'prod_snowflake'} | None |
| include_column_lineage | boolean | Extract column-level (fine-grained) lineage from dashboard query fields back to Omni semantic view fields. Enables precise field-level impact analysis in DataHub. | True |
| include_deleted | boolean | Include soft-deleted Omni entities (models, documents) where the API supports it. | False |
| include_workbook_only | boolean | Include workbook-only documents that have not been published as a dashboard. When False (default), only documents with hasDashboard=true are ingested. | False |
| max_requests_per_minute | integer | Client-side throttle cap for Omni API requests (requests per minute). Omni's default rate limit is 60 req/min; set lower to leave headroom for other API consumers. | 50 |
| normalize_snowflake_names | boolean | Upper-case database, schema, and table name components in URNs when the resolved platform is Snowflake. Snowflake identifiers are case-insensitive and DataHub's Snowflake connector stores them in upper case by default. | True |
| page_size | integer | Number of records per page for paginated Omni API endpoints. Lower values reduce memory usage; higher values speed up ingestion. | 50 |
| platform_instance | One of string, null | The instance of the platform that all assets produced by this recipe belong to. This should be unique within the platform. See https://docs.datahub.com/docs/platform-instances/ for more details. | None |
| timeout_seconds | integer | HTTP request timeout in seconds for Omni API calls. | 30 |
| env | string | The environment that all assets produced by this connector belong to. | PROD |
| document_pattern | AllowDenyPattern | Allow/deny regex patterns that select which documents (dashboards/workbooks) to ingest. | |
| document_pattern.ignoreCase | One of boolean, null | Whether to ignore case during pattern matching. | True |
| model_pattern | AllowDenyPattern | Allow/deny regex patterns that select which models to ingest. | |
| model_pattern.ignoreCase | One of boolean, null | Whether to ignore case during pattern matching. | True |
| stateful_ingestion | One of StatefulIngestionConfig, null | Stateful ingestion config. | None |
| stateful_ingestion.enabled | boolean | Whether to enable stateful ingestion. Defaults to True if a pipeline_name is set and either a datahub-rest sink or datahub_api is specified; otherwise False. | False |

Capabilities

Use the Important Capabilities table above as the source of truth for supported features and whether additional configuration is required.

Physical table lineage

Omni Views reference physical warehouse tables via sql_table_name in model YAML. The connector resolves each reference to a DataHub dataset URN using the connection_to_platform mapping. If normalize_snowflake_names: true (default), database, schema, and table name components are uppercased to match the casing used by the DataHub Snowflake connector.
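
As a rough illustration of the resolution described above, the sketch below assembles a dataset URN string from the three connection mappings. The function name `build_table_urn` and the rule for prepending the mapped database to a two-part `schema.table` reference are assumptions made for this example; the connector's real logic may differ. The URN layout itself follows DataHub's standard dataset URN format.

```python
from typing import Dict, Optional


def build_table_urn(
    connection_id: str,
    sql_table_name: str,
    connection_to_platform: Dict[str, str],
    connection_to_database: Optional[Dict[str, str]] = None,
    connection_to_platform_instance: Optional[Dict[str, str]] = None,
    normalize_snowflake_names: bool = True,
    env: str = "PROD",
) -> str:
    """Hypothetical helper: turn an Omni table reference into a dataset URN string."""
    platform = connection_to_platform[connection_id]
    parts = sql_table_name.split(".")
    database = (connection_to_database or {}).get(connection_id)
    # Assumption: prepend the mapped database when only schema.table is given.
    if database and len(parts) == 2:
        parts = [database] + parts
    # Upper-case every name component for Snowflake, as described above.
    if platform == "snowflake" and normalize_snowflake_names:
        parts = [p.upper() for p in parts]
    name = ".".join(parts)
    instance = (connection_to_platform_instance or {}).get(connection_id)
    if instance:
        # The platform instance prefixes the dataset name and is not re-cased.
        name = f"{instance}.{name}"
    return f"urn:li:dataset:(urn:li:dataPlatform:{platform},{name},{env})"
```

For the mappings used in the prerequisites example, `public.orders` on `conn_abc123` would resolve to `urn:li:dataset:(urn:li:dataPlatform:snowflake,my_snowflake_account.ANALYTICS_PROD.PUBLIC.ORDERS,PROD)`. Lineage only stitches if the warehouse connector produced exactly the same URN.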

Column-level lineage

When include_column_lineage: true (default), the connector emits FineGrainedLineage entries by parsing sql expressions in model YAML and matching field references to known view columns. This enables precise field-level impact analysis across the full chain:

physical_table.column → semantic_view.field → dashboard_tile.field
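
The matching step can be pictured with a small sketch. It assumes field references inside a `sql` expression are written as `${view_name.field_name}`; that syntax, the `FIELD_REF` pattern, and the `upstream_fields` helper are assumptions for illustration and are not taken from the connector source.

```python
import re
from typing import Dict, List, Set

# Assumed reference syntax inside a sql expression: ${view_name.field_name}
FIELD_REF = re.compile(r"\$\{([A-Za-z_]\w*)\.([A-Za-z_]\w*)\}")


def upstream_fields(sql: str, known_columns: Dict[str, Set[str]]) -> List[str]:
    """Return 'view.field' references in sql that match a known view column."""
    matches: List[str] = []
    for view, field in FIELD_REF.findall(sql):
        # Keep only references that resolve to a column we know about.
        if field in known_columns.get(view, set()):
            ref = f"{view}.{field}"
            if ref not in matches:  # de-duplicate while preserving order
                matches.append(ref)
    return matches
```

References that do not resolve to a known column are dropped rather than guessed, which is consistent with the limitation that complex or fully derived expressions may not fully resolve.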

Schema metadata

For each Omni Semantic View, the connector emits a SchemaMetadata aspect containing one SchemaField per dimension and measure defined in model YAML:

  • Dimensions: emitted with inferred native type (string, date, timestamp, number, boolean)
  • Measures: emitted with aggregation type and native type NUMBER
  • Field descriptions are extracted from the YAML description attribute when present
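
A minimal sketch of this mapping is shown below. The input shape (a dict with `dimensions` and `measures` keyed by field name, and the `type` and `aggregate_type` keys) is an assumed, simplified stand-in for parsed model YAML. The output is a list of plain dicts rather than real SchemaField objects, and reading a declared `type` stands in for the connector's type inference.

```python
from typing import Any, Dict, List


def view_to_schema_fields(view: Dict[str, Any]) -> List[Dict[str, str]]:
    """Hypothetical helper: flatten a parsed view into schema-field records."""
    fields: List[Dict[str, str]] = []
    for name, spec in view.get("dimensions", {}).items():
        fields.append({
            "fieldPath": name,
            # Assumption: use a declared type, falling back to string.
            "nativeDataType": spec.get("type", "string"),
            "description": spec.get("description", ""),
        })
    for name, spec in view.get("measures", {}).items():
        fields.append({
            "fieldPath": name,
            "nativeDataType": "NUMBER",  # measures are always numeric
            "description": spec.get("description", ""),
            "aggregation": spec.get("aggregate_type", ""),
        })
    return fields
```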

Model and document filtering

Use model_pattern and document_pattern to restrict ingestion to specific models or dashboards:

model_pattern:
  allow:
    - "^prod-.*"
  deny:
    - ".*-dev$"

document_pattern:
  allow:
    - ".*"
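
To see how such patterns behave, the sketch below mimics allow/deny evaluation in plain Python. It is an approximation, not DataHub's AllowDenyPattern class: `is_allowed` is a hypothetical helper, and it assumes that deny rules take precedence, that patterns are matched from the start of the name, and that matching ignores case by default (in line with the ignoreCase default of True).

```python
import re
from typing import List, Optional


def is_allowed(
    name: str,
    allow: Optional[List[str]] = None,
    deny: Optional[List[str]] = None,
    ignore_case: bool = True,
) -> bool:
    """Return True if name matches an allow pattern and no deny pattern."""
    flags = re.IGNORECASE if ignore_case else 0
    allow_patterns = allow if allow is not None else [".*"]
    # Deny rules are checked first and always win.
    if any(re.match(p, name, flags) for p in (deny or [])):
        return False
    return any(re.match(p, name, flags) for p in allow_patterns)
```

With the model_pattern above, a model named `prod-sales` would be ingested, while `prod-sales-dev` (denied) and `staging-sales` (never allowed) would be skipped.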

Limitations

  • Access Filters, User Attributes, and Cache schedules are not yet ingested.
  • Column lineage is limited to fields that appear in model YAML sql expressions; complex or fully derived expressions may not fully resolve.
  • Large organizations with many models may approach Omni API rate limits; tune max_requests_per_minute accordingly.
  • True end-to-end integration tests require a live Omni environment; the test suite uses deterministic mock API responses.
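
To make the effect of max_requests_per_minute concrete, here is a minimal sketch of a client-side throttle that spaces requests evenly. `RequestThrottle` is a hypothetical class written for this example; the connector's own throttling may use a different strategy. The injectable `clock` and `sleep` parameters exist only to make the sketch easy to test.

```python
import time
from typing import Callable


class RequestThrottle:
    """Hypothetical throttle: space calls at least 60 / max_rpm seconds apart."""

    def __init__(
        self,
        max_requests_per_minute: int = 50,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        if max_requests_per_minute <= 0:
            raise ValueError("max_requests_per_minute must be positive")
        self.min_interval = 60.0 / max_requests_per_minute
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = float("-inf")  # the first call never waits

    def wait(self) -> float:
        """Block until the next request is allowed; return the delay applied."""
        now = self._clock()
        delay = max(0.0, self._next_allowed - now)
        if delay:
            self._sleep(delay)
        # Schedule the following slot one interval after this one.
        self._next_allowed = max(now, self._next_allowed) + self.min_interval
        return delay
```

Calling `wait()` before each API request keeps the request rate at or below the configured cap. At the default of 50 requests per minute, consecutive requests are spaced about 1.2 seconds apart, which stays under the 60 req/min limit mentioned in the config table.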

Troubleshooting

If ingestion fails, validate credentials, permissions, and connectivity first. Then review the ingestion report and logs for source-specific errors.

Common issues:

| Symptom | Likely Cause | Resolution |
| --- | --- | --- |
| 403 Forbidden on /v1/connections | API key lacks connection read scope | Ingestion continues with config fallbacks; physical lineage may be incomplete |
| Physical tables not linked to warehouse entities | connection_to_platform not configured | Add connection mapping for each Omni connection ID |
| Snowflake URN mismatch | Case mismatch between Omni and DataHub Snowflake URNs | Ensure normalize_snowflake_names: true (default) |
| Column lineage empty | View YAML has no sql expressions | Expected for views using direct sql_table_name without field-level SQL |

Code Coordinates

  • Class Name: datahub.ingestion.source.omni.omni.OmniSource
Questions?

If you've got any questions on configuring ingestion for Omni, feel free to ping us on our Slack.
