Release Changelog

v2.0.0

Release Availability Date

25-June-2026

  • CLI/SDK: 1.6.0.4
  • Remote Executor: v2.0.0-cloud, v1.1.3-cloud, v1.0.3-cloud
  • On-Prem Versions:
    • Helm: 1.6.173
    • API Gateway: 0.7.3

Tested Remote Executor versions

DataHub Cloud v2.0.0 has been internally tested with the following Remote Executor versions:

Remote Executor version    Status    Notes
v2.0.0-cloud               Tested    Recommended.
v1.1.3-cloud               Tested    Supported with this release.
v1.0.3-cloud               Tested    Supported with this release.

New Feature Highlights

  • Search V2.5 is now the default — cross-entity ranking with name-match boosting and diversity promotion, plus latency optimizations for query understanding and multi-match.

  • Bridge-backed semantic search now covers datasets, charts, dashboards, glossary terms, and data products via hidden bridge documents in the document semantic index.

  • Action Workflows v2 — full backend and frontend rework of the authoring surface around the existing Filter (OR-of-ANDs Criterion[]) shape and a new composable DynamicSource (seed + hops + destination) for graph traversals. Adds an expression engine, dynamic actors, and cancel/quorum support.

  • MCP Audit — new audit surface for MCP tool invocations: GraphQL mcpAudit resolver, dedicated audit tab in Settings → AI, KPI charts and history table.

  • CDE Steward agent — new 10-star governance agent for Critical Data Element compliance and certification workflows.

  • Internationalization (i18n) — first-class infrastructure (feature flag + user settings), end-to-end string extraction across the application (entity tabs, search, settings, governance, ingestion source metadata, etc.), and initial DE translations.

  • Distributed rate limiting across REST, GraphQL, and OpenAPI with per-endpoint token-bucket controls and configurable jitter.

  • Domain propagation — automatic propagation of domain assignment across lineage and containment relationships, with attribution.

  • note_metadata_observation MCP tool — replaces register_feedback and note_sql_anchor_observation, and raises ANNOTATE or POST_ATTACHMENT proposals so agent-recorded observations land in the Context Hub inbox for SME review.

  • Smart-assertion inference V2 — decoupled from DATAHUB_USE_OBSERVE_MODELS via the new DATAHUB_USE_INFERENCE_V2 flag; assertions also record INIT during user-defined exclusion windows instead of producing spurious anomalies.

  • SecretService caller guard — non-system actors (including PATs with MANAGE_SECRETS) can no longer decrypt secrets; hardens the credential-access surface.

  • New ingestion sources: ThoughtSpot, TimescaleDB, Airbyte; production-ready SAP HANA with calc-view lineage, stored procedures, and query usage.

  • All changes in https://github.com/datahub-project/datahub/releases/tag/v1.6.0
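
The "OR-of-ANDs" Filter shape mentioned for Action Workflows v2 can be read as follows: a filter matches when any one group matches, and a group matches when all of its criteria match. The sketch below illustrates only that evaluation rule. The `Criterion` fields (`field`, `values`) and the equality-based matching are illustrative assumptions, not DataHub's actual model.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Sequence


@dataclass(frozen=True)
class Criterion:
    # Hypothetical shape: the real Criterion model may carry operators, negation, etc.
    field: str
    values: Sequence[Any]

    def matches(self, record: Mapping[str, Any]) -> bool:
        # Assumed semantics: the record's value must equal one of the listed values.
        return record.get(self.field) in self.values


def filter_matches(or_of_ands: Sequence[Sequence[Criterion]], record: Mapping[str, Any]) -> bool:
    """Return True if any AND-group has all of its criteria satisfied."""
    return any(all(c.matches(record) for c in group) for group in or_of_ands)
```

In this sketch an empty filter matches nothing, while an empty group matches everything; whether DataHub treats those edge cases the same way is not stated in the changelog.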

Product

  • Action Workflows v2 authoring surface — Filter-shaped step/field/entrypoint conditions, composable DynamicSource (seed + hops + destination), regex-only FieldValidation. Feature-flag-gated (actionWorkflowsV2Enabled). See Data Access Workflows.
  • Workflow form requests accept caller-supplied id. createActionWorkflowFormRequest (and createActionWorkflowFormRequestV2) accept an optional id: String to produce a stable URN urn:li:actionRequest:<id> instead of a UUID.
  • Context Generation Settings — domain is now required for enabling Context Generation; end-to-end backend wiring through the curator agent. Tracking events added for context hub features.
  • Internationalization — i18n infrastructure with feature flag and user-settings backend; first-class string extraction across entity tabs (schema, queries, observe, validations, incidents, documentation, summary), search, settings, governance (domains, glossary, structured properties), identity / permissions, ingestion source metadata, home pages, and shared components. Initial DE translation for settings pages.
  • UI redesign: nav and home hero — fully collapsed nav redesign and home hero toggler.
  • External document UI improvements — inline preview, read-only fields, last-synced indicator.
  • Document anchor pattern UX — anchor patterns flattened into per-metric cards; sort and cap anchor patterns; inline editing of anchor pattern rendered text.
  • Glossary — Redesigned cards and sidebar with semantic tokens. Added support for custom relationships between glossary terms: create a Structured Property and select "Treat as Relationship" to define a new relationship type. Then navigate to the Related Terms tab on any glossary term and choose your custom relationship type when adding a related term.
  • Schema field drawer — column statistics shown in full inline on the About tab.
  • Dataset Summary — Generate Documentation flow now reachable from the Dataset Summary page.
  • Inbox auto-redirect — Task Center inbox auto-redirects to proposals when the tasks tab is empty.
  • Tracking events — context hub feature usage tracked for product analytics.
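
To make the effect of the optional caller-supplied id on workflow form requests concrete, here is a minimal sketch of the resulting URN. The helper name `action_request_urn` is hypothetical and this is not the GraphQL mutation itself; it only mirrors the naming rule described above (stable id if supplied, UUID otherwise).

```python
import uuid
from typing import Optional


def action_request_urn(request_id: Optional[str] = None) -> str:
    """Build an actionRequest URN, preferring a caller-supplied id over a random UUID."""
    # Assumption: an omitted id falls back to a random UUID, as the changelog describes.
    suffix = request_id if request_id is not None else str(uuid.uuid4())
    return f"urn:li:actionRequest:{suffix}"
```

A stable id makes retries idempotent from the caller's point of view: re-submitting the same request id addresses the same URN rather than minting a new one.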

AI / Ask DataHub

  • MCP Audit — new GraphQL mcpAudit resolver and Settings → AI → MCP Audit tab with feature-flag gate. KPI charts, history table, session and event drawers. MCP_TELEMETRY_CAPTURE_PAYLOADS env flag controls payload capture. McpServerRequest analytics report script.
  • MCP Apps infrastructure — build infrastructure for in-product MCP App surfaces; mcp/mcp_integration/ boundary enforcement for OSS hygiene.
  • note_metadata_observation MCP tool. Replaces register_feedback and note_sql_anchor_observation with a single tool covering both metadata gaps and SQL-anchor quality observations. Update agent prompts that reference the old tool names.
  • note_metadata_observation raises ANNOTATE proposals on context docs. When called with one or more urn:li:document: URNs in related_objects, the tool raises one DOCUMENTS_PROPOSAL ActionRequest per target with proposalType: ANNOTATE. The proposal stages a draft Document carrying an ENTITY_ANNOUNCEMENT Post; until an SME accepts it from the Context Hub inbox the target doc is unchanged. On acceptance, only the Post migrates to the target.
  • note_metadata_observation raises POST_ATTACHMENT_PROPOSAL when the agent cannot pin the gap to an existing doc. Emits two coordinated MCPs: a Post with postType=AI_OBSERVATION and target=null, plus a new POST_ATTACHMENT_PROPOSAL ActionRequest. The Post is invisible across existing UI surfaces by construction; the ActionRequest shows up in the Task Center for SME triage. On accept, the Post becomes a real Comment on the chosen entity.
  • DocumentProposalService.applyDraftDocumentChange respects proposalType: ANNOTATE. ANNOTATE proposals now skip the info / properties copy and only migrate Posts; EDIT, STATE_CHANGE, and CONFLICT acceptance behavior is unchanged.
  • preview_sql_context / save_sql_context use a clean preview-then-commit flow. preview_sql_context builds the MCP App render payload without writing anything; save_sql_context commits on Approve in a single write. Cancel is handled client-side — no tombstone cleanup needed. Draft-based save flow with MCP App preview.
  • find_sql_context improvements — consumes admin-curated overrideSql; deprecates generate_sql_sketch. Looker view-text and dbt model-text fallbacks added. SQL-override prop supported on semantic-anchor docs for human edits / proposals.
  • Semantic-anchor enrichment — dataset bridge documents now include structured properties, field terms/tags, doc links, and domains. Dialect detection and Bedrock retry resilience.
  • Per-pattern question generation — semantic anchors generate distinct questions per pattern; metricKeys lookup property for metric anchors.
  • CDE Steward agent — new 10-star governance agent for Critical Data Element compliance + certification.
  • Agent authoring foundation — 10-star authoring foundation with curator Cloudsmith wiring and runtime env-flag gating for SYSTEM agents. Context Drop and Context Curator agents default to off.
  • Ingestion-agent tooling — Read/Grep/Glob source-browsing tools for connector troubleshooting from inside the agent.
  • LLM telemetry and billing: LLMCallEvent per-call telemetry (token-billing primitive), context-local cost accumulation, per-turn usage logging, surface and time-to-first-token captured. New POST /openapi/v1/billing/usage endpoint.
  • Slack bot service-account mappings. Admins can map a Slack bot_id to a DataHub corp user URN under Settings → Platform → AI → Enable Ask DataHub in Slack. When a registered bot @-mentions DataHub, the question is attributed to the mapped service account. Useful for cop-rotation bots, ticketing bots, and other automations.
  • Smart search — surface externalUrl and inject DataHub URLs into results.
  • Confluence connector — HTML → Markdown body conversion via markdownify.
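
The two proposal paths of note_metadata_observation described above can be summarized as a small routing rule. The sketch below is an illustration only: the function name and the dictionary layout are hypothetical, and it assumes that "cannot pin the gap to an existing doc" means no `urn:li:document:` URN appears in `related_objects`.

```python
from typing import Dict, List

DOCUMENT_URN_PREFIX = "urn:li:document:"


def plan_proposals(related_objects: List[str]) -> List[Dict[str, str]]:
    """Decide which proposals an observation would raise, per the rules described above."""
    doc_urns = [urn for urn in related_objects if urn.startswith(DOCUMENT_URN_PREFIX)]
    if doc_urns:
        # One DOCUMENTS_PROPOSAL per target document, each with proposalType ANNOTATE.
        return [
            {"type": "DOCUMENTS_PROPOSAL", "proposalType": "ANNOTATE", "target": urn}
            for urn in doc_urns
        ]
    # No document to pin the gap to: a single POST_ATTACHMENT_PROPOSAL with an untargeted Post.
    return [{"type": "POST_ATTACHMENT_PROPOSAL", "postType": "AI_OBSERVATION"}]
```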

Platform

  • SecretService caller guard. Non-system actors (including human users and PATs with MANAGE_SECRETS) can no longer decrypt secrets. Controlled by SECRET_SERVICE_CALLER_GUARD_MODE (ENFORCE / AUDIT / DISABLED). Components that fetch secrets at runtime use system credentials in standard deployments and are unaffected.
  • Distributed rate limiting across REST, GraphQL, and OpenAPI with per-endpoint token-bucket controls and configurable jitter (RATE_LIMITS_RETRY_AFTER_JITTER_PERCENT).
  • RFC 8693 token exchange for trusted external issuers (OAuth2 token-exchange grant type).
  • Agent lifecycle stages. Disabled SYSTEM agents are now ARCHIVED via Status.lifecycleStage instead of disappearing from the API. A startup reconciler reads agent-flags.yaml and restores the previous non-ARCHIVED stage when the flag flips back on.
  • Domain propagation — automatic propagation of domain assignment across lineage / containment relationships, with attribution.
  • Bridge-backed semantic search for datasets, charts, dashboards, glossary terms, and data products. Deploy GMS and MAE consumer together so source-entity metadata changes create, update, and delete bridge documents consistently. After enabling for a new entity type, run the bridge-document backfill and generate embeddings.
  • Search V2.5 cross-entity ranking — DisMax scoring, name-match signals, diversity promotion across query understanding, multi-match, and focused fields. kNN inner_hits chunk text surfaced for the Cohere reranker.
  • Elasticsearch 8.18+ semantic search — DataHub Cloud now supports semantic search on Elasticsearch 8.18+ deployments alongside OpenSearch. GCP deployments use a managed Vertex AI embedding provider (gemini-embedding-001); AWS deployments continue to use AWS Bedrock with Cohere Embed v3.
  • Per-entity OpenSearch / Elasticsearch mapping ceilings. New ELASTICSEARCH_INDEX_ENTITYMAPPINGLIMITS_<ENTITY>_<LIMIT> env vars configure per-entity mapping limits; the configured value is baked into the index settings at creation/reindex time and pushed to existing live indices on every system update run. Use DEFAULT as the entity name for a fallback.
  • pgQueue alternative messaging transport — pluggable transport abstraction with pgQueue as an alternative to Kafka. Includes metadata-ingestion sink and actions pg_queue event source. Kafka remains the default; transport selection is configuration-driven.
  • Event filtering framework with pre-deserialization MCL optimization — drops uninteresting MCL events before deserialization, reducing GMS/MAE consumer load.
  • Retention policies re-applied on system-update — when retention configuration changes, the system-update job re-applies policies cluster-wide.
  • Configurable retention policy refresh.
  • KubernetesController honors KEDA-aware scaling signals.
  • OTel GraphQL operation tracing — OpenTelemetry instrumentation across GraphQL resolvers.
  • RelationshipChange platform event — emitted on relationship changes for downstream consumers.
  • Domain attribution — domain assignment now records the source of the assignment (manual vs. propagated) for audit purposes.
  • GraphQL operationContext threading — per-event operationContext now threads through the MCL single-event hook path.
  • Assertion ownership — assertions support owners; assignment rule UI aligned with ownership semantics.
  • Assertion failure configuration SDK — programmatic configuration of assertion failure behavior.
  • Smart-assertion delta-space bounds in predictions; new DATAHUB_EXECUTOR_ENABLE_DELTA_BOUNDS gate.
  • Severity-escalation broadcast in Slack for incident notifications.
  • Docker — published DataHub Postgres extensions image; image tags overhaul (see Breaking Changes); bundled venv symlinks for ingestion source aliases.
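
The three SECRET_SERVICE_CALLER_GUARD_MODE values (ENFORCE / AUDIT / DISABLED) amount to a small decision rule. The sketch below models that rule in Python for illustration; it is not the SecretService implementation, the function name is hypothetical, and `PermissionError` stands in for the exception the service raises in ENFORCE mode.

```python
import logging
from enum import Enum

logger = logging.getLogger("secret_guard_sketch")


class GuardMode(Enum):
    ENFORCE = "ENFORCE"
    AUDIT = "AUDIT"
    DISABLED = "DISABLED"


def may_decrypt(is_system_actor: bool, mode: GuardMode) -> bool:
    """Return True if decryption may proceed; raise in ENFORCE mode for non-system actors."""
    if is_system_actor or mode is GuardMode.DISABLED:
        return True
    if mode is GuardMode.AUDIT:
        # AUDIT allows the call but leaves a warning behind for staged rollouts.
        logger.warning("non-system actor attempted to decrypt a secret (AUDIT mode)")
        return True
    # ENFORCE: the real service throws; PermissionError is a stand-in here.
    raise PermissionError("non-system actors may not decrypt secrets")
```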

Ingestion

New ingestion sources:

  • ThoughtSpot — new BI source connector.
  • TimescaleDB — new connector supporting self-managed and Tiger Cloud TimescaleDB.
  • Airbyte — new connector for Airbyte metadata.
  • SAP HANA — production-ready connector with calc-view lineage, stored procedures, and query usage.

Connector improvements:

  • Hex — Major in-place upgrade: upstream lineage (table-level and column-level), Project → Component links, run history (lastRefreshed), and optional AI context documents extracted directly from Hex REST APIs. New include_lineage, use_queried_tables_lineage, connection_platform_map, and include_context_documents config options. Hex Components are now ingested as Chart entities (see Breaking Changes).
  • Snowflake — Internal Marketplace support; dynamic-table lineage extracted from DYNAMIC_TABLE_GRAPH_HISTORY; private-link Snowsight base URL override; Sweden Central Azure region mapping; fix for silently-dropped views in batched SHOW VIEWS.
  • Databricks Unity Catalog — extract primary key, foreign key, and partition key constraints; opt-in Metric View ingestion; v1.1 composable lineage with agent metadata; default profiling switched to SQLAlchemy (from Great Expectations); ownership and datasetProperties emitted as standard MCPWs with incremental config; partner/DataHub user-agent for Databricks telemetry.
  • BigQuery — Workload Identity Federation (WIF) auth.
  • Tableau — support for virtual connections; Initial SQL ingested as lineage and custom property.
  • Glue — PATCH mode for dataset properties; column-level Lake Formation tags by default (extract_lakeformation_column_tags); optional propagation of database tags to tables and columns (propagate_lakeformation_tags); inherited tags marked with propagation attribution.
  • Dremio — query lineage and view-parent lineage respect schema_pattern and dataset_pattern (and skip the _accelerator_ reflection schema); platform mappings for BIGQUERY, RESTCATALOG (Polaris OSS, Nessie, AWS Glue Iceberg REST, S3 Tables, Confluent Tableflow, Microsoft OneLake), SAPHANA, SNOWFLAKEOPENCATALOG, and UNITY; domain recipe field now actually emits a Domains aspect; stateful incremental ingestion, incremental lineage / properties, profile-skip; synthetic created = epoch 0 no longer emitted when Dremio doesn't report one; remove_stale_metadata and fail_safe_threshold exposed.
  • Athena — S3 Tables (Iceberg) support.
  • dbt: skip_missing_upstreams_in_lineage config; column-level lineage restored for two-tier warehouses (catalog-prefixed SQL + v2 schema fieldPaths); test assertion entities emit an ownership aspect when the dbt test node has explicit owner metadata.
  • GCS: workload_identity auth type for GKE Workload Identity; list_objects_v2 to fix PaginationError on Hive-partitioned paths.
  • MSSQL — consolidated to a single SqlParsingAggregator.
  • Teradata — exponential backoff on transient errors; nullable / autoincrement hydration corrected for CHAR(N)-padded values.
  • Redshift — fixed late-binding view columns silently dropped due to wrong WHERE clause column name.
  • PowerBI — paginated reports with embedded RDL datasources emit lineage; CTE alias no longer leaks as upstream in native SQL lineage.
  • LookML — graceful handling of git clone failures and configurable clone timeout.
  • Mode: report_pattern (AllowDenyPattern) config; chart fetch gated on chart_count rather than explorations_count.
  • SAC — query Resources OData endpoint directly instead of via $metadata.
  • Kafka Connect — fix duplicated schema segment in sink lineage URNs.
  • Dataplex — clearer GCP permission error in project-number resolution.
  • Confluence — page-body HTML converted to Markdown via markdownify.

Ingestion infrastructure:

  • Per-connector CLI version matrix with resolution stamp; fall back to default CLI version when the configured version is unset.
  • Patch-based writes for user-editable aspects — finer-grained partial updates from ingestion.
  • Great Expectations profiler is now optional — default profiling switches to SQLAlchemy. acryl-datahub installs the GE extras only when explicitly requested.
  • sqlglot[c] tokenizer restored on 30.8.0 for performance.
  • Two-tier stored procedure ingestion — correct URN format and lineage.
  • Skip empty columns for CLL — avoids spurious column-level lineage entries.

Executor:

  • Opt-in ingestion-log garbage collector. Setting DATAHUB_EXECUTOR_LOG_GC_ENABLED=true makes both the remote executor and the coordinator's embedded worker scan /tmp/datahub/logs/ on an hourly tick and delete per-execution subdirectories older than 14 days. A size cap (default 10 GB) and a 1-hour in-flight grace window apply. Default is false for this release; expect the default to flip in a follow-up release.
  • Smart assertions record INIT during exclusion windows. Surfaces the window name on nativeResults under Exclusion Window so planned freezes, migrations, or backfills no longer produce spurious anomalies.
  • Delta-space assertion bounds gated behind DATAHUB_EXECUTOR_ENABLE_DELTA_BOUNDS (default false). Protects workers pre-dating eval-time delta anchoring during mixed-fleet rollouts.
  • V1 inference path no longer imports observe-models at module load. When DATAHUB_USE_OBSERVE_MODELS=false, slim / stripped / low-RAM executor builds no longer risk an import-time crash loop.
  • Custom SQL assertions can optionally allow stored-procedure CALL statements. DATAHUB_EXECUTOR_ALLOW_CALL_STATEMENTS=true lets a custom SQL assertion's statement be a CALL my_db.my_schema.my_proc() in addition to the read-only query shapes always allowed. Off by default; enabling accepts the risk that the procedure may perform mutations.
  • mTLS client authentication for outbound HTTPS from the executor.
  • Coordinator monitor-request handling — duplicate monitor requests skipped; polling log chatter trimmed.
  • Smart-assertion v1 inference preprocessing & training improvements; exclusion window display names include schedule descriptions.
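
The opt-in ingestion-log garbage collector combines three rules: an age cutoff, a size cap, and an in-flight grace window. The sketch below shows one way those rules could interact, as a pure selection function over directory metadata. It is not the executor's code: the `LogDir` type and function name are hypothetical, and it assumes the size cap evicts oldest directories first and that 1 MB is 1,000,000 bytes.

```python
from dataclasses import dataclass
from typing import List

DAY = 86_400


@dataclass(frozen=True)
class LogDir:
    name: str
    mtime: float      # last-modified time, seconds since epoch
    size_bytes: int


def select_for_deletion(
    dirs: List[LogDir],
    now: float,
    retention_days: int = 14,
    max_total_mb: int = 10_000,   # 0 disables the size cap
    grace_seconds: int = 3_600,
) -> List[str]:
    """Pick per-execution log directories to delete, oldest first."""
    # Directories touched within the grace window are treated as in-flight and never deleted.
    eligible = sorted((d for d in dirs if now - d.mtime > grace_seconds), key=lambda d: d.mtime)
    doomed = [d for d in eligible if now - d.mtime > retention_days * DAY]
    if max_total_mb > 0:
        cap = max_total_mb * 1_000_000
        total = sum(d.size_bytes for d in dirs) - sum(d.size_bytes for d in doomed)
        # Assumed policy: keep evicting the oldest eligible directory until under the cap.
        for d in eligible:
            if total <= cap:
                break
            if d not in doomed:
                doomed.append(d)
                total -= d.size_bytes
    return [d.name for d in doomed]
```

The defaults mirror the documented ones (14 days, 10000 MB, 3600 seconds).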

Breaking Changes

  • Search V2.5 is now enabled by default. Instances that have not set SEARCH_VERSION_V2_5_ENABLED will use V2.5 after upgrading. Existing instances still on legacy V2 may reindex search indices with the V2.5 analyzers during the upgrade. Migration: no action required to use the new default. To temporarily roll back to legacy V2, set SEARCH_VERSION_V2_5_ENABLED=false for GMS and the system-update job.
  • Dataset semantic search is no longer enabled implicitly. Enabling semantic search with the default ELASTICSEARCH_SEMANTIC_SEARCH_ENTITIES=document no longer also bridges datasets; dataset must be listed explicitly (document,dataset). Action: for instances relying on dataset semantic search, set ELASTICSEARCH_SEMANTIC_SEARCH_ENTITIES=document,dataset on GMS and the system-update job.
  • GMS rate limiting renamed. rateLimits.defaultRetryAfterSeconds / RATE_LIMITS_DEFAULT_RETRY_AFTER renamed to minRetryAfterSeconds / RATE_LIMITS_MIN_RETRY_AFTER. The value is now the minimum Retry-After floor; endpoint (token-bucket) denials may return a longer wait. Added retryAfterJitterPercent / RATE_LIMITS_RETRY_AFTER_JITTER_PERCENT (default 10) to spread endpoint retry timing.
  • Airflow plugin: Airflow 2.x dropped. acryl-datahub-airflow-plugin now requires Airflow 3.0+. The plugin always uses apache-airflow-providers-openlineage (>=2.1.0); drop openlineage-airflow from constraints. [airflow2] install extra removed; [airflow3] retained as a no-op. taskinstance URL format and patch_snowflake_schema config removed. Pin acryl-datahub-airflow-plugin <= 1.6.0 for Airflow 2.7–2.10.
  • Prefect plugin: Prefect 3.x required (>=3.0.0,<4.0.0). Entry point group changed from prefect.block to prefect.collections. Re-register the DataHub block before upgrading.
  • PowerBI Report Server: chart_pattern removed — emits a deprecation warning if set; chart-level filtering is not yet implemented for this connector.
  • Hex: Components ingested as Charts instead of Dashboards. A Hex Component defines its own visualisation that importing projects cannot override, so it maps to a Chart (analogous to a Looker Look or PowerBI Tile). Component URNs change entity type — saved views, glossary/tag/ownership assignments, and policies that targeted the old Dashboard-typed Component URNs must be manually reapplied to the new Chart URNs. Stateful-ingestion stale-removal handles most soft-deletes; component-heavy workspaces may need a one-time bulk cleanup.
  • Docker image tags: :head removed; :quickstart and :sha-<short> added. The floating :head tag is no longer published. For Compose / local quickstart: use DATAHUB_VERSION=quickstart. For Kubernetes / production: pin an immutable tag (release v* or sha-<7-char>). Bare short-SHA tags (no prefix) are no longer published; switch to the sha- prefixed form.
  • Docker / local development: legacy compose files removed. The removed files are docker/docker-compose*.yml, docker/quickstart.sh, docker/dev*.sh, docker/nuke.sh, and old quickstart bundles under docker/quickstart/ (except docker-compose.quickstart-profile.yml). Use datahub docker quickstart, ./gradlew quickstartDebug, or scripts/dev/datahub-dev.sh instead.
  • Document entity: MANAGE_DOCUMENTS privilege required for creation via RestLI / OpenAPI. Updating and deleting still accept the generic EDIT_ENTITY / DELETE_ENTITY privileges (or MANAGE_DOCUMENTS). Existing owners and editors retain access. Custom automation creating documents via the API with only CREATE_ENTITY / EDIT_ENTITY must be granted MANAGE_DOCUMENTS.
  • Executor coordinator env flags. DATAHUB_EXECUTOR_MONITORS_ENABLED and DATAHUB_EXECUTOR_TASKS_ENABLED are now hard opt-outs (skip subsystem wiring and heavy imports), not fetcher-only toggles. Use DATAHUB_EXECUTOR_INGESTION_PIPELINE_ENABLED to disable the Kafka / datahub-actions pipeline.
  • getSecretValues GraphQL query now requires system-level authentication. The MANAGE_SECRETS privilege check remains in place, but SecretService now also enforces system-actor auth. Components that fetch secrets at runtime use system credentials in standard deployments and are unaffected. Customers who configured these services with a user-issued PAT must migrate to system credentials before upgrading.
  • Agent env-flag gating now drives Status.lifecycleStage. A SYSTEM agent whose AI_AGENT_<NAME>_ENABLED env-var resolves to false is now marked ARCHIVED rather than hidden by a read-time filter. Direct URN fetches resolve normally; the existing lifecycle-stage filter keeps ARCHIVED agents out of default search. A startup reconciler restores the previous stage when the flag flips back. Callers that depended on the "null entity" behavior should switch to checking Status.lifecycleStage = ARCHIVED.
  • AIAgentInfo.enablementEnvVar and enablementDefault removed from PDL. These fields were never released. SYSTEM-agent env-flag config is now declared in agent.toml's [flag] block. Drop any reference to these fields.
  • Removed REQUEST_MINIMAL_SLACK_PERMISSIONS feature flag. Replaced by DATAHUB_SLACK_SERVER_SIDE_HISTORY_ENABLED, which marks the Slack :history scopes as optional in the install screen via bot_optional. Admins can deselect them per install rather than the choice being baked in at deploy time.
  • Subscriptions without explicit notification settings now inherit defaults dynamically. GraphQL callers that omit notificationConfig when creating a subscription through syncSubscription, or send notificationConfig without notificationSettings, will use the actor's current notification defaults at delivery time. Callers that intentionally want a no-sink subscription should send notificationConfig.notificationSettings.sinkTypes: [] explicitly.
  • Smart-assertion inference routing split into DATAHUB_USE_INFERENCE_V2. Previously DATAHUB_USE_OBSERVE_MODELS=true both enabled observe-models and routed to V2 training. These are now decoupled. Action required: executors currently running DATAHUB_USE_OBSERVE_MODELS=true to get V2 must also set DATAHUB_USE_INFERENCE_V2=true.
  • Action Workflows v2 — Filter / DynamicSource model. Feature-flag-gated (actionWorkflowsV2Enabled). The legacy v1 singleFieldValueCondition field-visibility shape is removed and on-disk values were rewritten to Filter at the model-migration boundary. FieldValidation is reduced to { pattern, errorMessage } (regex only); the v2-private expression-based validation is removed.
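
For clients adapting to the renamed rate-limit settings, the relationship between the Retry-After floor, a longer token-bucket wait, and jitter can be sketched as below. This is an illustration under assumptions: the function name is hypothetical, and the changelog does not say how jitter is applied, so the sketch assumes it only lengthens the wait by up to the configured percentage.

```python
import random
from typing import Optional


def retry_after_seconds(
    bucket_wait: float,
    min_retry_after: float,
    jitter_percent: float = 10.0,
    rng: Optional[random.Random] = None,
) -> float:
    """Compute a Retry-After value: never below the floor, with jitter added on top."""
    rng = rng or random.Random()
    # The configured minimum acts as a floor; a token-bucket denial may ask for longer.
    base = max(bucket_wait, min_retry_after)
    # Assumption: jitter only lengthens the wait, by up to jitter_percent of the base.
    return base * (1.0 + rng.uniform(0.0, jitter_percent / 100.0))
```

Clients that previously treated the configured value as the exact wait should honor whatever Retry-After the server returns instead.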

Deprecations

  • Removed ENABLE_BEDROCK_OPTIMIZED_LATENCY. AWS latency-optimized inference is only available for Claude 3.5 Haiku — not the newer model families (Haiku 4.5, Sonnet 4.x, Opus). The flag was a no-op for those models while inflating every Bedrock cost estimate by 25%. Remove the variable from any deployment config.
  • Dynamic ownership reassignment for proposals is now opt-in. Proposals continue to work as expected; existing asset owners still receive and can act on proposals. Workflows that depend on ownership reassignment automatically updating who sees proposals must enable the option in Automations.
  • Hex: legacy lineage recipe fields removed. lineage_start_time, lineage_end_time, and datahub_page_size emit a deprecation warning if set. Lineage now comes directly from the Hex REST API. Remove them from your recipe.
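
Removing the deprecated Hex fields from many recipes can be scripted. The sketch below assumes a recipe already loaded into a Python dict with the usual `source.type` / `source.config` layout and a source type of `hex`; the helper name is hypothetical. It returns a cleaned copy and reports which fields were dropped.

```python
import copy
from typing import Any, Dict, List, Tuple

REMOVED_HEX_FIELDS = ("lineage_start_time", "lineage_end_time", "datahub_page_size")


def strip_removed_hex_fields(recipe: Dict[str, Any]) -> Tuple[Dict[str, Any], List[str]]:
    """Return a copy of the recipe without the removed Hex fields, plus the names dropped."""
    cleaned = copy.deepcopy(recipe)  # never mutate the caller's recipe
    source = cleaned.get("source", {})
    if source.get("type") != "hex":
        return cleaned, []
    config = source.get("config", {})
    dropped = [name for name in REMOVED_HEX_FIELDS if name in config]
    for name in dropped:
        del config[name]
    return cleaned, dropped
```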

Security / Dependencies

  • See updating-datahub.md for the full OSS dependency changelog. SaaS-specific notes:
  • SecretService caller guard (see Platform).
  • Removed REQUEST_MINIMAL_SLACK_PERMISSIONS in favor of DATAHUB_SLACK_SERVER_SIDE_HISTORY_ENABLED for per-install Slack scope opt-out.
  • Sanitized API error responses to prevent CWE-200 information leaks.
  • CVE coverage: ujson >= 5.12.1 (CVE-2026-44660); idna >= 3.15 (CVE-2026-45409); plus ongoing dep-bump coverage in line with the OSS changelog.

Bug Fixes

  • Fixed dataset semantic search bridge backfill robustness during staged rollouts.
  • Fixed executor read of prefixed DATAHUB_EXECUTOR_EMBEDDED_WORKER_ENABLED env var.
  • Fixed AssertionRunEvent, MonitorSuiteInfo, SubscriptionInfo, and AssertionAssignmentRuleInfo schemaVersion bumps for forward compatibility.
  • Fixed Context Hub — prevent duplicate publish context-doc proposals on multiple context-generation runs.
  • Fixed Action Workflow rendering of human-readable name on selected dynamic-source URN fields.
  • Fixed CSS selector escape in SchemaTable to prevent SyntaxError on special characters.
  • Fixed lifecycle-stage migration on documents.
  • Fixed semantic-index marker migration to patch existing embedded dataset bridge chunks without requiring a Python re-embed.
  • Fixed display of all group relationships on the user profile page.
  • Fixed Ask DataHub bold rendering, mode-selector glow, follow-up suggestions.
  • Fixed built-in agents seeing system-agent backend tools.
  • Fixed lifecycle-stage issues across glossary terms and documents.
  • Fixed missing applications field on glossary term and data product GraphQL queries.
  • Fixed wrong namespace in EntityDropdown (i18n).
  • Fixed executor WorkEventProducer fail-fast init when a Kafka channel is required.
  • Fixed @KafkaMessagingEnabled annotation on KafkaAdminServiceFactory for pgQueue compatibility.

Known Issues

  • TBD

Environment Variables

  • SECRET_SERVICE_CALLER_GUARD_MODE (default: ENFORCE) — Controls how SecretService responds when a non-system actor attempts to decrypt a secret. Values: ENFORCE (throw SecurityException, recommended for production), AUDIT (allow but log a warning — staged rollout), DISABLED (no enforcement — break-glass).
  • DATAHUB_EXECUTOR_LOG_GC_ENABLED (default false), DATAHUB_EXECUTOR_LOG_DIR (default /tmp/datahub/logs), DATAHUB_EXECUTOR_LOG_GC_INTERVAL_SECONDS (default 3600), DATAHUB_EXECUTOR_LOG_GC_RETENTION_DAYS (default 14), DATAHUB_EXECUTOR_LOG_GC_MAX_DIR_SIZE_MB (default 10000; 0 to disable the size cap), DATAHUB_EXECUTOR_LOG_GC_IN_FLIGHT_GRACE_SECONDS (default 3600): Opt-in in-process ingestion-log garbage collector on remote executors and the coordinator's embedded worker.
  • DATAHUB_EXECUTOR_ALLOW_CALL_STATEMENTS (default false): When true, custom SQL assertions may use a stored-procedure CALL statement in addition to read-only queries. Requires an executor restart.
  • DATAHUB_USE_INFERENCE_V2 (default false): Routes smart-assertion training to the V2 pipeline. Requires DATAHUB_USE_OBSERVE_MODELS=true.
  • DATAHUB_EXECUTOR_ENABLE_DELTA_BOUNDS (default false): Enables differenced (boundsValueSpace=DELTA) prediction bounds in the V1 smart-assertion trainer.
  • ELASTICSEARCH_INDEX_ENTITYMAPPINGLIMITS_<ENTITY>_<LIMIT>: Per-entity OpenSearch / Elasticsearch mapping ceilings. Use DEFAULT as the entity name for a fallback applying to all entity indices.
  • MCP_TELEMETRY_CAPTURE_PAYLOADS: Enables payload capture for MCP tool invocations surfaced in the MCP Audit tab.
  • RATE_LIMITS_MIN_RETRY_AFTER (replaces RATE_LIMITS_DEFAULT_RETRY_AFTER) and RATE_LIMITS_RETRY_AFTER_JITTER_PERCENT (default 10): GMS rate-limit floor and jitter spread.
  • AUTH_GMS_SESSION_COOKIE_NAME (default SESSION): Name of the Spring Security session cookie set by GMS. Set this if your deployment overrides spring.session.servlet.cookie.name.
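
Because DATAHUB_USE_INFERENCE_V2 requires DATAHUB_USE_OBSERVE_MODELS=true, it can help to check a deployment's environment before upgrading. The sketch below is a hypothetical pre-flight check, not executor code. It assumes flags are parsed case-insensitively, default to false when unset, and that V2 without observe-models falls back to V1; the changelog does not specify that fallback.

```python
import os
from typing import Mapping


def _flag(env: Mapping[str, str], name: str) -> bool:
    # Assumption: flags are parsed case-insensitively and default to false when unset.
    return env.get(name, "false").strip().lower() == "true"


def inference_pipeline(env: Mapping[str, str] = os.environ) -> str:
    """Return "V2" only when both flags are on; otherwise "V1"."""
    use_v2 = _flag(env, "DATAHUB_USE_INFERENCE_V2")
    observe_models = _flag(env, "DATAHUB_USE_OBSERVE_MODELS")
    return "V2" if use_v2 and observe_models else "V1"
```

An executor that previously relied on DATAHUB_USE_OBSERVE_MODELS=true alone would report "V1" here until DATAHUB_USE_INFERENCE_V2=true is also set.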