Domain
Domains are curated, top-level categories for organizing data assets within an organization. They represent logical groupings that typically align with business units, departments, or functional areas. Unlike tags, which are informal labels, Domains provide a structured way to organize assets with centralized or distributed management. A data asset can belong to only one Domain at a time.
Identity
Domains are identified by a single piece of information:
- A unique domain id: This is a string identifier that uniquely identifies the domain within DataHub. The id can be either auto-generated by DataHub or manually specified during domain creation. When creating a domain via the UI or API without specifying an id, DataHub will auto-generate a UUID-based identifier. For programmatic access or when human-readable identifiers are desired, you can specify a custom id like "marketing", "engineering", or "finance".
An example of a domain identifier is urn:li:domain:marketing.
For auto-generated domains, the URN might look like urn:li:domain:6289fccc-4af2-4cbb-96ed-051e7d1de93c.
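The two identifier styles can be illustrated with a small standard-library sketch. `build_domain_urn` is a hypothetical helper written for this example, not a DataHub API, and the random UUID only stands in for whatever identifier DataHub would generate.

```python
import uuid
from typing import Optional

DOMAIN_URN_PREFIX = "urn:li:domain:"


def build_domain_urn(domain_id: Optional[str] = None) -> str:
    """Return a domain URN, generating a UUID-based id when none is given.

    Hypothetical helper for illustration only.
    """
    if domain_id is None:
        # Assumption: a random (version 4) UUID mimics an auto-generated id.
        domain_id = str(uuid.uuid4())
    return f"{DOMAIN_URN_PREFIX}{domain_id}"


print(build_domain_urn("marketing"))  # custom, human-readable id
print(build_domain_urn())  # auto-generated, UUID-based id
```

In real code, the `make_domain_urn` function used in the SDK examples on this page builds the URN from an id.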
Important Capabilities
Domain Properties
Domain properties are stored in the domainProperties aspect and contain the core metadata about a domain:
- Name: The display name of the domain (e.g., "Marketing", "Platform Engineering")
- Description: An optional detailed description of what the domain represents
- Parent Domain: Domains can be hierarchical, with child domains nested under parent domains. This allows for organizational structures like "Engineering" > "Data Engineering" > "Data Platform"
- Created Timestamp: Audit information about when the domain was created
Here is an example of creating a domain with properties:
Python SDK: Create a domain
# Inlined from /metadata-ingestion/examples/library/domain_create.py
import logging
import os
from datahub.emitter.mce_builder import make_domain_urn
from datahub.emitter.mcp import MetadataChangeProposalWrapper
from datahub.emitter.rest_emitter import DatahubRestEmitter
from datahub.metadata.schema_classes import DomainPropertiesClass
log = logging.getLogger(__name__)
logging.basicConfig(level=logging.INFO)
# Get DataHub connection details from environment
gms_server = os.getenv("DATAHUB_GMS_URL", "http://localhost:8080")
token = os.getenv("DATAHUB_GMS_TOKEN")
domain_urn = make_domain_urn("marketing")
domain_properties_aspect = DomainPropertiesClass(
    name="Marketing", description="Entities related to the marketing department"
)
event: MetadataChangeProposalWrapper = MetadataChangeProposalWrapper(
    entityUrn=domain_urn,
    aspect=domain_properties_aspect,
)
rest_emitter = DatahubRestEmitter(gms_server=gms_server, token=token)
rest_emitter.emit(event)
log.info(f"Created domain {domain_urn}")
Nested Domain Hierarchies
Domains support hierarchical organization through parent-child relationships. This enables representing organizational structures with multiple levels. For example, you might have a top-level "Engineering" domain with child domains for "Data Engineering", "ML Engineering", and "Infrastructure Engineering".
Python SDK: Create a nested domain
# Inlined from /metadata-ingestion/examples/library/domain_create_nested.py
import logging
import os
from datahub.emitter.mce_builder import make_domain_urn
from datahub.emitter.mcp import MetadataChangeProposalWrapper
from datahub.emitter.rest_emitter import DatahubRestEmitter
from datahub.metadata.schema_classes import DomainPropertiesClass
log = logging.getLogger(__name__)
logging.basicConfig(level=logging.INFO)
# The child domain needs its own id ("verticals" is an illustrative choice);
# reusing "marketing" here would make the domain its own parent.
domain_urn = make_domain_urn("verticals")
domain_properties_aspect = DomainPropertiesClass(
    name="Verticals",
    description="Entities related to the verticals sub-domain",
    parentDomain="urn:li:domain:marketing",
)
event: MetadataChangeProposalWrapper = MetadataChangeProposalWrapper(
    entityUrn=domain_urn,
    aspect=domain_properties_aspect,
)
# Get DataHub connection details from environment
gms_server = os.getenv("DATAHUB_GMS_URL", "http://localhost:8080")
token = os.getenv("DATAHUB_GMS_TOKEN")
rest_emitter = DatahubRestEmitter(gms_server=gms_server, token=token)
rest_emitter.emit(event)
log.info(f"Created domain {domain_urn}")
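To picture how a chain of parentDomain references forms a hierarchy, here is a minimal sketch that walks from a domain up to its top-level ancestor. The `PARENTS` mapping and the `domain_path` helper are hypothetical and kept in memory; they are not part of the DataHub SDK. The cycle check guards against a domain that (directly or indirectly) names itself as its parent.

```python
from typing import Dict, List, Optional

# Hypothetical in-memory view of domainProperties.parentDomain:
# child URN -> parent URN (None for a top-level domain).
PARENTS: Dict[str, Optional[str]] = {
    "urn:li:domain:engineering": None,
    "urn:li:domain:data-engineering": "urn:li:domain:engineering",
    "urn:li:domain:data-platform": "urn:li:domain:data-engineering",
}


def domain_path(urn: str, parents: Dict[str, Optional[str]]) -> List[str]:
    """Return the chain of URNs from the top-level domain down to `urn`."""
    path: List[str] = []
    seen = set()
    current: Optional[str] = urn
    while current is not None:
        if current in seen:
            raise ValueError(f"Cycle detected at {current}")
        seen.add(current)
        path.append(current)
        current = parents.get(current)
    # The walk collected child -> root, so reverse it for display order.
    return list(reversed(path))


print(" > ".join(domain_path("urn:li:domain:data-platform", PARENTS)))
```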
Ownership
Like other entities in DataHub, domains can have owners assigned to them using the ownership aspect. Domain owners are typically responsible for:
- Managing which assets belong to the domain
- Maintaining domain metadata and documentation
- Governing data quality standards within the domain
- Serving as points of contact for domain-related questions
Ownership types for domains follow the same patterns as other entities, including TECHNICAL_OWNER, BUSINESS_OWNER, DATA_STEWARD, etc.
Python SDK: Add an owner to a domain
# Inlined from /metadata-ingestion/examples/library/domain_add_owner.py
from datahub.emitter.mcp import MetadataChangeProposalWrapper
from datahub.emitter.rest_emitter import DatahubRestEmitter
from datahub.ingestion.graph.client import DatahubClientConfig, DataHubGraph
from datahub.metadata.schema_classes import (
    OwnerClass,
    OwnershipClass,
    OwnershipTypeClass,
)
from datahub.metadata.urns import CorpUserUrn, DomainUrn
graph = DataHubGraph(DatahubClientConfig(server="http://localhost:8080"))
emitter = DatahubRestEmitter(gms_server="http://localhost:8080")
domain_urn = DomainUrn(id="marketing")
# Get existing ownership
existing_ownership = graph.get_aspect(str(domain_urn), OwnershipClass)
owner_list = (
    list(existing_ownership.owners)
    if existing_ownership and existing_ownership.owners
    else []
)
# Add new owner with the TECHNICAL_OWNER type
owner_list.append(
    OwnerClass(owner=str(CorpUserUrn("jdoe")), type=OwnershipTypeClass.TECHNICAL_OWNER)
)
# Emit ownership
emitter.emit_mcp(
    MetadataChangeProposalWrapper(
        entityUrn=str(domain_urn), aspect=OwnershipClass(owners=owner_list)
    )
)
Documentation and Links
Domains support documentation through the institutionalMemory aspect, which allows linking to external resources such as:
- Confluence pages describing the domain's purpose and scope
- Documentation about data governance policies
- Team wikis or handbooks
- Onboarding guides for the domain
Python SDK: Add documentation links to a domain
# Inlined from /metadata-ingestion/examples/library/domain_add_documentation.py
import time
from datahub.emitter.mcp import MetadataChangeProposalWrapper
from datahub.emitter.rest_emitter import DatahubRestEmitter
from datahub.ingestion.graph.client import DatahubClientConfig, DataHubGraph
from datahub.metadata.schema_classes import (
    AuditStampClass,
    DomainPropertiesClass,
    InstitutionalMemoryClass,
    InstitutionalMemoryMetadataClass,
)
from datahub.metadata.urns import CorpUserUrn, DomainUrn
graph = DataHubGraph(DatahubClientConfig(server="http://localhost:8080"))
emitter = DatahubRestEmitter(gms_server="http://localhost:8080")
domain_urn = DomainUrn(id="marketing")
# Get existing properties
existing_properties = graph.get_aspect(str(domain_urn), DomainPropertiesClass)
# Update description
if existing_properties:
    existing_properties.description = (
        "The Marketing domain contains all data assets related to marketing operations, "
        "campaigns, customer analytics, and brand management."
    )
    properties = existing_properties
else:
    properties = DomainPropertiesClass(
        name="Marketing",
        description=(
            "The Marketing domain contains all data assets related to marketing operations, "
            "campaigns, customer analytics, and brand management."
        ),
    )
# Emit properties
emitter.emit_mcp(
    MetadataChangeProposalWrapper(entityUrn=str(domain_urn), aspect=properties)
)
# Get existing institutional memory
existing_memory = graph.get_aspect(str(domain_urn), InstitutionalMemoryClass)
links_list = (
    list(existing_memory.elements)
    if existing_memory and existing_memory.elements
    else []
)
# Add new links
audit_stamp = AuditStampClass(
    time=int(time.time() * 1000), actor=str(CorpUserUrn("datahub"))
)
links_list.append(
    InstitutionalMemoryMetadataClass(
        url="https://wiki.company.com/domains/marketing",
        description="Marketing Domain Wiki - Overview and Guidelines",
        createStamp=audit_stamp,
    )
)
links_list.append(
    InstitutionalMemoryMetadataClass(
        url="https://confluence.company.com/marketing-data-governance",
        description="Marketing Data Governance Policies",
        createStamp=audit_stamp,
    )
)
# Emit institutional memory
emitter.emit_mcp(
    MetadataChangeProposalWrapper(
        entityUrn=str(domain_urn), aspect=InstitutionalMemoryClass(elements=links_list)
    )
)
Assigning Assets to Domains
The primary purpose of domains is to organize data assets. Assets are assigned to domains using the domains aspect on the asset entity (not on the domain entity itself). This creates a relationship between the asset and the domain.
Python SDK: Assign a dataset to a domain
# Inlined from /metadata-ingestion/examples/library/dataset_add_domain.py
from datahub.metadata.urns import DatasetUrn, DomainUrn
from datahub.sdk import DataHubClient
client = DataHubClient.from_env()
dataset = client.entities.get(DatasetUrn(platform="snowflake", name="example_dataset"))
# If you don't know the domain urn, you can look it up:
# domain_urn = client.resolve.domain(name="marketing")
# NOTE: This will overwrite the existing domain
dataset.set_domain(DomainUrn(id="marketing"))
client.entities.update(dataset)
When you assign an asset to a domain, it will:
- Appear in the domain's entity list in the UI
- Be filterable by domain in search results
- Show the domain badge on the asset's profile page
Querying Domains
You can query domains and their associated entities using both the REST API and GraphQL API.
Fetching Domain Information via REST API
REST API: Get domain by URN
curl 'http://localhost:8080/entities/urn%3Ali%3Adomain%3Amarketing' \
-H 'Authorization: Bearer <token>'
This will return the domain entity with all its aspects, including:
- domainKey: The unique identifier
- domainProperties: Name, description, parent domain
- ownership: Owners of the domain
- institutionalMemory: Links and documentation
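The request path in the curl example contains the URN in percent-encoded form (each ":" becomes "%3A"). The sketch below shows one way to produce that path with the standard library; `entity_url` is a hypothetical helper and the host is the same local default used in the curl example.

```python
from urllib.parse import quote

GMS_SERVER = "http://localhost:8080"  # same default host as the curl example


def entity_url(urn: str, server: str = GMS_SERVER) -> str:
    """Build the /entities/<urn> URL with the URN percent-encoded."""
    # safe="" leaves no reserved characters unescaped, so ":" is encoded as %3A.
    return f"{server}/entities/{quote(urn, safe='')}"


print(entity_url("urn:li:domain:marketing"))
```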
Listing Assets in a Domain
Domains maintain relationships to all assets assigned to them. You can query these relationships to find all entities within a domain.
REST API: Find all assets in a domain
curl 'http://localhost:8080/relationships?direction=INCOMING&urn=urn%3Ali%3Adomain%3Amarketing&types=AssociatedWith' \
-H 'Authorization: Bearer <token>'
This returns all entities that have been associated with the specified domain.
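The query string of the relationships request can be assembled the same way. This is a minimal sketch using only the standard library; `relationships_url` is a hypothetical helper, and the parameter names and values are copied from the curl command above rather than from an API reference.

```python
from urllib.parse import urlencode


def relationships_url(domain_urn: str, server: str = "http://localhost:8080") -> str:
    """Build the /relationships URL that lists assets associated with a domain."""
    params = {
        "direction": "INCOMING",  # edges pointing at the domain
        "urn": domain_urn,  # urlencode percent-encodes the colons
        "types": "AssociatedWith",
    }
    return f"{server}/relationships?{urlencode(params)}"


print(relationships_url("urn:li:domain:marketing"))
```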
Python SDK: Query domain from a dataset
# Inlined from /metadata-ingestion/examples/library/dataset_query_domain.py
from datahub.sdk import DataHubClient, DatasetUrn
client = DataHubClient.from_env()
dataset = client.entities.get(
    DatasetUrn(platform="hive", name="fct_users_created", env="PROD")
)
# Print the dataset domain
print(dataset.domain)
Searching and Filtering by Domain
Once assets are assigned to domains, you can:
- Filter search results by domain using the domain filter
- Search within a specific domain to find assets
- Use domains as part of complex search queries
The domains field on assets is indexed and searchable, making it efficient to filter large datasets by domain membership.
Python SDK: Search for entities in a domain
# Inlined from /metadata-ingestion/examples/library/search_filter_by_domain.py
from datahub.sdk import DataHubClient
from datahub.sdk.search_filters import FilterDsl as F
# search for all assets in the marketing domain
client = DataHubClient.from_env()
results = client.search.get_urns(filter=F.domain("urn:li:domain:marketing"))
Integration Points
Domains integrate with several key DataHub features:
Relationship to Other Entities
Domains have relationships with:
- Data Assets: Datasets, dashboards, charts, ML models, and other data assets can be assigned to domains via the domains aspect
- Parent Domains: Domains can have parent-child relationships, creating hierarchical organizational structures
- Users and Groups: Domains have owners (via the ownership aspect) who are responsible for managing the domain
GraphQL Resolvers
The domain entity is supported by several GraphQL resolvers in the datahub-graphql-core module:
- CreateDomainResolver: Creates new domains
- SetDomainResolver: Assigns assets to domains
- UnsetDomainResolver: Removes assets from domains
- ListDomainsResolver: Lists all available domains
- DeleteDomainResolver: Deletes a domain
- DomainEntitiesResolver: Retrieves all entities within a domain
- ParentDomainsResolver: Resolves the parent hierarchy of a domain
- BatchSetDomainResolver: Assigns multiple assets to a domain in one operation
- MoveDomainResolver: Moves a domain to a different parent
Usage Patterns
Common usage patterns include:
- Data Mesh Organization: Using domains to represent different data product teams or business domains
- Departmental Structure: Organizing assets by company departments (Finance, Marketing, Engineering)
- Product Lines: Grouping assets by product or business line
- Regulatory Boundaries: Separating assets by compliance requirements or data residency rules
- Nested Structures: Creating hierarchical organizations like Region > Country > Business Unit
Integration with Ingestion
During metadata ingestion, domains can be automatically assigned using the domain configuration in ingestion recipes. This allows:
- Bulk assignment of domains based on naming patterns
- Automated domain assignment during discovery
- Consistent domain tagging across similar assets
See the Domains feature guide for detailed ingestion configuration examples.
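As a rough picture of pattern-based assignment, the sketch below maps dataset names to domains with regular expressions. It is not the recipe syntax (the feature guide covers that); `DOMAIN_PATTERNS`, `match_domain`, and the database names are hypothetical and exist only for this illustration.

```python
import re
from typing import Dict, List, Optional

# Hypothetical mapping of domain id -> allow patterns (regular expressions).
DOMAIN_PATTERNS: Dict[str, List[str]] = {
    "marketing": [r"^marketing_db\..*"],
    "finance": [r"^finance_db\..*", r".*\.billing_.*"],
}


def match_domain(dataset_name: str, patterns: Dict[str, List[str]]) -> Optional[str]:
    """Return the first domain whose allow patterns match the dataset name."""
    for domain_id, allow in patterns.items():
        if any(re.match(pattern, dataset_name) for pattern in allow):
            return domain_id
    return None  # no pattern matched: the asset stays without a domain


for name in ("marketing_db.public.campaigns", "hr_db.public.staff"):
    print(name, "->", match_domain(name, DOMAIN_PATTERNS))
```

Returning the first match keeps the result consistent with the single-domain rule: an asset never ends up in two domains even if several patterns match.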
Notable Exceptions
Single Domain Assignment
Unlike tags and glossary terms, which support multiple assignments, an asset can belong to only one domain at a time. If you assign an asset to a new domain, it is automatically removed from its previous domain.
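The replace-on-assign behavior can be modeled with a toy in-memory class. `DomainAssignments` is a hypothetical illustration of the rule and says nothing about how DataHub stores the domains aspect.

```python
from typing import Dict, List, Optional


class DomainAssignments:
    """Toy model: each asset URN maps to at most one domain URN."""

    def __init__(self) -> None:
        self._domain_of: Dict[str, str] = {}

    def set_domain(self, asset_urn: str, domain_urn: str) -> Optional[str]:
        """Assign the asset and return the domain it was removed from, if any."""
        previous = self._domain_of.get(asset_urn)
        self._domain_of[asset_urn] = domain_urn  # overwrite, never append
        return previous if previous != domain_urn else None

    def assets_in(self, domain_urn: str) -> List[str]:
        """List the assets currently assigned to the given domain."""
        return sorted(a for a, d in self._domain_of.items() if d == domain_urn)


assignments = DomainAssignments()
dataset = "urn:li:dataset:(urn:li:dataPlatform:snowflake,example_dataset,PROD)"
assignments.set_domain(dataset, "urn:li:domain:marketing")
assignments.set_domain(dataset, "urn:li:domain:finance")
print(assignments.assets_in("urn:li:domain:marketing"))  # the asset has moved out
```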
Domain Resolution During Ingestion
When using bare domain names (like "Marketing") in ingestion recipes, DataHub will attempt to resolve them to provisioned domains. The resolution process checks:
- First, for a domain with URN urn:li:domain:Marketing
- Then, for any domain with the name "Marketing"
If resolution fails, ingestion fails; this is deliberate, to protect data integrity. To avoid resolution issues, use fully-qualified domain URNs in ingestion configurations.
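The lookup order described above can be sketched as follows. This illustrates the documented order only and is not DataHub's implementation: `resolve_domain`, `DomainResolutionError`, and the in-memory `provisioned` mapping are hypothetical, and returning fully-qualified URNs unchanged is an assumption of this sketch.

```python
from typing import Dict

DOMAIN_URN_PREFIX = "urn:li:domain:"


class DomainResolutionError(Exception):
    """Raised when a bare domain name cannot be resolved."""


def resolve_domain(identifier: str, provisioned: Dict[str, str]) -> str:
    """Resolve a bare name or URN to a domain URN.

    `provisioned` maps domain URN -> display name.
    """
    if identifier.startswith(DOMAIN_URN_PREFIX):
        return identifier  # assumption: URNs are used as given
    # Step 1: is there a domain whose URN id equals the bare name?
    candidate = f"{DOMAIN_URN_PREFIX}{identifier}"
    if candidate in provisioned:
        return candidate
    # Step 2: is there a domain whose display name equals the bare name?
    for urn, name in provisioned.items():
        if name == identifier:
            return urn
    raise DomainResolutionError(f"No provisioned domain matches {identifier!r}")


provisioned = {
    "urn:li:domain:6289fccc-4af2-4cbb-96ed-051e7d1de93c": "Marketing",
    "urn:li:domain:finance": "Finance",
}
print(resolve_domain("Marketing", provisioned))  # resolved by display name
print(resolve_domain("finance", provisioned))  # resolved by URN id
```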
Hierarchical Considerations
When organizing domains hierarchically:
- Assets are assigned to the most specific (leaf) domain, not to parent domains
- Parent domains do not automatically inherit assets from child domains
- Domain hierarchies are primarily for organizational clarity in the UI
- Deleting a parent domain does not automatically delete or reassign child domains
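Because a parent domain does not automatically include its children's assets, a roll-up across a subtree has to be computed explicitly. The sketch below does this over hypothetical in-memory mappings (`parents` for parentDomain references, `asset_domains` for asset assignments); it does not call DataHub.

```python
from typing import Dict, List, Optional


def assets_in_subtree(
    root: str,
    parents: Dict[str, Optional[str]],
    asset_domains: Dict[str, str],
) -> List[str]:
    """Return assets assigned to `root` or to any domain nested under it."""

    def is_under_root(domain: Optional[str]) -> bool:
        seen = set()
        current = domain
        # Walk up the parent chain until we reach the root, the top, or a cycle.
        while current is not None and current not in seen:
            if current == root:
                return True
            seen.add(current)
            current = parents.get(current)
        return False

    return sorted(a for a, d in asset_domains.items() if is_under_root(d))


parents = {
    "urn:li:domain:engineering": None,
    "urn:li:domain:data-engineering": "urn:li:domain:engineering",
    "urn:li:domain:ml-engineering": "urn:li:domain:engineering",
}
asset_domains = {
    "asset:orders": "urn:li:domain:data-engineering",
    "asset:churn_model": "urn:li:domain:ml-engineering",
    "asset:campaigns": "urn:li:domain:marketing",
}
print(assets_in_subtree("urn:li:domain:engineering", parents, asset_domains))
```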
Permissions
Managing domains requires the "Manage Domains" platform privilege. This includes:
- Creating new domains
- Modifying domain properties
- Assigning assets to domains
- Deleting domains
Individual asset assignment can also be controlled by "Edit Domain" metadata policies on specific entity types.
Technical Reference Guide
The sections above provide an overview of how to use this entity. The following sections provide detailed technical information about how metadata is stored and represented in DataHub.
Aspects are the individual pieces of metadata that can be attached to an entity. Each aspect contains specific information (like ownership, tags, or properties) and is stored as a separate record, allowing for flexible and incremental metadata updates.
Relationships show how this entity connects to other entities in the metadata graph. These connections are derived from the fields within each aspect and form the foundation of DataHub's knowledge graph.
Reading the Field Tables
Each aspect's field table includes an Annotations column that provides additional metadata about how fields are used:
- ⚠️ Deprecated: This field is deprecated and may be removed in a future version. Check the description for the recommended alternative
- Searchable: This field is indexed and can be searched in DataHub's search interface
- Searchable (fieldname): When the field name in parentheses is shown, it indicates the field is indexed under a different name in the search index. For example, dashboardTool is indexed as tool
- → RelationshipName: This field creates a relationship to another entity. The arrow indicates this field contains a reference (URN) to another entity, and the name indicates the type of relationship (e.g., → Contains, → OwnedBy)
Fields with complex types (like Edge, AuditStamp) link to their definitions in the Common Types section below.
Aspects
domainProperties
Information about a Domain
- Fields
- Raw Schema
| Field | Type | Required | Description | Annotations |
|---|---|---|---|---|
| customProperties | map | ✓ | Custom property bag. | Searchable |
| name | string | ✓ | Display name of the Domain | Searchable |
| description | string | | Description of the Domain | Searchable |
| created | AuditStamp | | Created Audit stamp | Searchable |
| parentDomain | string | | Optional: Parent of the domain | Searchable, → IsPartOf |
{
"type": "record",
"Aspect": {
"name": "domainProperties"
},
"name": "DomainProperties",
"namespace": "com.linkedin.domain",
"fields": [
{
"Searchable": {
"/*": {
"fieldType": "TEXT",
"queryByDefault": true
}
},
"type": {
"type": "map",
"values": "string"
},
"name": "customProperties",
"default": {},
"doc": "Custom property bag."
},
{
"Searchable": {
"boostScore": 10.0,
"enableAutocomplete": true,
"fieldNameAliases": [
"_entityName"
],
"fieldType": "WORD_GRAM"
},
"type": "string",
"name": "name",
"doc": "Display name of the Domain"
},
{
"Searchable": {
"fieldType": "TEXT",
"hasValuesFieldName": "hasDescription"
},
"type": [
"null",
"string"
],
"name": "description",
"default": null,
"doc": "Description of the Domain"
},
{
"Searchable": {
"/time": {
"fieldName": "createdTime",
"fieldType": "DATETIME"
}
},
"type": [
"null",
{
"type": "record",
"name": "AuditStamp",
"namespace": "com.linkedin.common",
"fields": [
{
"type": "long",
"name": "time",
"doc": "When did the resource/association/sub-resource move into the specific lifecycle stage represented by this AuditEvent."
},
{
"java": {
"class": "com.linkedin.common.urn.Urn"
},
"type": "string",
"name": "actor",
"doc": "The entity (e.g. a member URN) which will be credited for moving the resource/association/sub-resource into the specific lifecycle stage. It is also the one used to authorize the change."
},
{
"java": {
"class": "com.linkedin.common.urn.Urn"
},
"type": [
"null",
"string"
],
"name": "impersonator",
"default": null,
"doc": "The entity (e.g. a service URN) which performs the change on behalf of the Actor and must be authorized to act as the Actor."
},
{
"type": [
"null",
"string"
],
"name": "message",
"default": null,
"doc": "Additional context around how DataHub was informed of the particular change. For example: was the change created by an automated process, or manually."
}
],
"doc": "Data captured on a resource/association/sub-resource level giving insight into when that resource/association/sub-resource moved into a particular lifecycle stage, and who acted to move it into that specific lifecycle stage."
}
],
"name": "created",
"default": null,
"doc": "Created Audit stamp"
},
{
"Relationship": {
"entityTypes": [
"domain"
],
"name": "IsPartOf"
},
"Searchable": {
"fieldName": "parentDomain",
"fieldType": "URN",
"hasValuesFieldName": "hasParentDomain"
},
"java": {
"class": "com.linkedin.common.urn.Urn"
},
"type": [
"null",
"string"
],
"name": "parentDomain",
"default": null,
"doc": "Optional: Parent of the domain"
}
],
"doc": "Information about a Domain"
}
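To make the schema concrete, the sketch below builds a plain dictionary shaped like a domainProperties record, filling in the defaults the schema declares (an empty customProperties map and null for the optional fields). `make_domain_properties` is a hypothetical helper for illustration; in practice the SDK's DomainPropertiesClass plays this role.

```python
import json
from typing import Any, Dict, Optional


def make_domain_properties(
    name: str,
    description: Optional[str] = None,
    parent_domain: Optional[str] = None,
    custom_properties: Optional[Dict[str, str]] = None,
    created: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Build a dict shaped like the domainProperties record with schema defaults."""
    if not name:
        raise ValueError("name is required and has no default in the schema")
    return {
        "customProperties": dict(custom_properties or {}),  # schema default: {}
        "name": name,
        "description": description,  # schema default: null
        "created": created,  # schema default: null
        "parentDomain": parent_domain,  # schema default: null
    }


payload = make_domain_properties(
    "Verticals",
    description="Entities related to the verticals sub-domain",
    parent_domain="urn:li:domain:marketing",
)
print(json.dumps(payload, indent=2))
```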
institutionalMemory
Institutional memory of an entity. This is a way to link to relevant documentation and provide description of the documentation. Institutional or tribal knowledge is very important for users to leverage the entity.
- Fields
- Raw Schema
| Field | Type | Required | Description | Annotations |
|---|---|---|---|---|
| elements | InstitutionalMemoryMetadata[] | ✓ | List of records that represent institutional memory of an entity. Each record consists of a link,... | |
{
"type": "record",
"Aspect": {
"name": "institutionalMemory"
},
"name": "InstitutionalMemory",
"namespace": "com.linkedin.common",
"fields": [
{
"type": {
"type": "array",
"items": {
"type": "record",
"name": "InstitutionalMemoryMetadata",
"namespace": "com.linkedin.common",
"fields": [
{
"java": {
"class": "com.linkedin.common.url.Url",
"coercerClass": "com.linkedin.common.url.UrlCoercer"
},
"type": "string",
"name": "url",
"doc": "Link to an engineering design document or a wiki page."
},
{
"type": "string",
"name": "description",
"doc": "Description of the link."
},
{
"type": {
"type": "record",
"name": "AuditStamp",
"namespace": "com.linkedin.common",
"fields": [
{
"type": "long",
"name": "time",
"doc": "When did the resource/association/sub-resource move into the specific lifecycle stage represented by this AuditEvent."
},
{
"java": {
"class": "com.linkedin.common.urn.Urn"
},
"type": "string",
"name": "actor",
"doc": "The entity (e.g. a member URN) which will be credited for moving the resource/association/sub-resource into the specific lifecycle stage. It is also the one used to authorize the change."
},
{
"java": {
"class": "com.linkedin.common.urn.Urn"
},
"type": [
"null",
"string"
],
"name": "impersonator",
"default": null,
"doc": "The entity (e.g. a service URN) which performs the change on behalf of the Actor and must be authorized to act as the Actor."
},
{
"type": [
"null",
"string"
],
"name": "message",
"default": null,
"doc": "Additional context around how DataHub was informed of the particular change. For example: was the change created by an automated process, or manually."
}
],
"doc": "Data captured on a resource/association/sub-resource level giving insight into when that resource/association/sub-resource moved into a particular lifecycle stage, and who acted to move it into that specific lifecycle stage."
},
"name": "createStamp",
"doc": "Audit stamp associated with creation of this record"
},
{
"type": [
"null",
"com.linkedin.common.AuditStamp"
],
"name": "updateStamp",
"default": null,
"doc": "Audit stamp associated with updation of this record"
},
{
"type": [
"null",
{
"type": "record",
"name": "InstitutionalMemoryMetadataSettings",
"namespace": "com.linkedin.common",
"fields": [
{
"type": "boolean",
"name": "showInAssetPreview",
"default": false,
"doc": "Show record in asset preview like on entity header and search previews"
}
],
"doc": "Settings related to a record of InstitutionalMemoryMetadata"
}
],
"name": "settings",
"default": null,
"doc": "Settings for this record"
}
],
"doc": "Metadata corresponding to a record of institutional memory."
}
},
"name": "elements",
"doc": "List of records that represent institutional memory of an entity. Each record consists of a link, description, creator and timestamps associated with that record."
}
],
"doc": "Institutional memory of an entity. This is a way to link to relevant documentation and provide description of the documentation. Institutional or tribal knowledge is very important for users to leverage the entity."
}
ownership
Ownership information of an entity.
- Fields
- Raw Schema
| Field | Type | Required | Description | Annotations |
|---|---|---|---|---|
| owners | Owner[] | ✓ | List of owners of the entity. | |
| ownerTypes | map | | Ownership type to Owners map, populated via mutation hook. | Searchable |
| lastModified | AuditStamp | ✓ | Audit stamp containing who last modified the record and when. A value of 0 in the time field indi... | |
{
"type": "record",
"Aspect": {
"name": "ownership"
},
"name": "Ownership",
"namespace": "com.linkedin.common",
"fields": [
{
"type": {
"type": "array",
"items": {
"type": "record",
"name": "Owner",
"namespace": "com.linkedin.common",
"fields": [
{
"Relationship": {
"entityTypes": [
"corpuser",
"corpGroup"
],
"name": "OwnedBy"
},
"Searchable": {
"addToFilters": true,
"fieldName": "owners",
"fieldType": "URN",
"filterNameOverride": "Owned By",
"hasValuesFieldName": "hasOwners",
"queryByDefault": false
},
"java": {
"class": "com.linkedin.common.urn.Urn"
},
"type": "string",
"name": "owner",
"doc": "Owner URN, e.g. urn:li:corpuser:ldap, urn:li:corpGroup:group_name, and urn:li:multiProduct:mp_name\n(Caveat: only corpuser is currently supported in the frontend.)"
},
{
"deprecated": true,
"type": {
"type": "enum",
"symbolDocs": {
"BUSINESS_OWNER": "A person or group who is responsible for logical, or business related, aspects of the asset.",
"CONSUMER": "A person, group, or service that consumes the data\nDeprecated! Use TECHNICAL_OWNER or BUSINESS_OWNER instead.",
"CUSTOM": "Set when ownership type is unknown or a when new one is specified as an ownership type entity for which we have no\nenum value for. This is used for backwards compatibility",
"DATAOWNER": "A person or group that is owning the data\nDeprecated! Use TECHNICAL_OWNER instead.",
"DATA_STEWARD": "A steward, expert, or delegate responsible for the asset.",
"DELEGATE": "A person or a group that overseas the operation, e.g. a DBA or SRE.\nDeprecated! Use TECHNICAL_OWNER instead.",
"DEVELOPER": "A person or group that is in charge of developing the code\nDeprecated! Use TECHNICAL_OWNER instead.",
"NONE": "No specific type associated to the owner.",
"PRODUCER": "A person, group, or service that produces/generates the data\nDeprecated! Use TECHNICAL_OWNER instead.",
"STAKEHOLDER": "A person or a group that has direct business interest\nDeprecated! Use TECHNICAL_OWNER, BUSINESS_OWNER, or STEWARD instead.",
"TECHNICAL_OWNER": "person or group who is responsible for technical aspects of the asset."
},
"deprecatedSymbols": {
"CONSUMER": true,
"DATAOWNER": true,
"DELEGATE": true,
"DEVELOPER": true,
"PRODUCER": true,
"STAKEHOLDER": true
},
"name": "OwnershipType",
"namespace": "com.linkedin.common",
"symbols": [
"CUSTOM",
"TECHNICAL_OWNER",
"BUSINESS_OWNER",
"DATA_STEWARD",
"NONE",
"DEVELOPER",
"DATAOWNER",
"DELEGATE",
"PRODUCER",
"CONSUMER",
"STAKEHOLDER"
],
"doc": "Asset owner types"
},
"name": "type",
"doc": "The type of the ownership"
},
{
"Relationship": {
"entityTypes": [
"ownershipType"
],
"name": "ownershipType"
},
"java": {
"class": "com.linkedin.common.urn.Urn"
},
"type": [
"null",
"string"
],
"name": "typeUrn",
"default": null,
"doc": "The type of the ownership\nUrn of type O"
},
{
"type": [
"null",
{
"type": "record",
"name": "OwnershipSource",
"namespace": "com.linkedin.common",
"fields": [
{
"type": {
"type": "enum",
"symbolDocs": {
"AUDIT": "Auditing system or audit logs",
"DATABASE": "Database, e.g. GRANTS table",
"FILE_SYSTEM": "File system, e.g. file/directory owner",
"ISSUE_TRACKING_SYSTEM": "Issue tracking system, e.g. Jira",
"MANUAL": "Manually provided by a user",
"OTHER": "Other sources",
"SERVICE": "Other ownership-like service, e.g. Nuage, ACL service etc",
"SOURCE_CONTROL": "SCM system, e.g. GIT, SVN"
},
"name": "OwnershipSourceType",
"namespace": "com.linkedin.common",
"symbols": [
"AUDIT",
"DATABASE",
"FILE_SYSTEM",
"ISSUE_TRACKING_SYSTEM",
"MANUAL",
"SERVICE",
"SOURCE_CONTROL",
"OTHER"
]
},
"name": "type",
"doc": "The type of the source"
},
{
"type": [
"null",
"string"
],
"name": "url",
"default": null,
"doc": "A reference URL for the source"
}
],
"doc": "Source/provider of the ownership information"
}
],
"name": "source",
"default": null,
"doc": "Source information for the ownership"
},
{
"Searchable": {
"/actor": {
"fieldName": "ownerAttributionActors",
"fieldType": "URN",
"queryByDefault": false
},
"/source": {
"fieldName": "ownerAttributionSources",
"fieldType": "URN",
"queryByDefault": false
},
"/time": {
"fieldName": "ownerAttributionDates",
"fieldType": "DATETIME",
"queryByDefault": false
}
},
"type": [
"null",
{
"type": "record",
"name": "MetadataAttribution",
"namespace": "com.linkedin.common",
"fields": [
{
"type": "long",
"name": "time",
"doc": "When this metadata was updated."
},
{
"java": {
"class": "com.linkedin.common.urn.Urn"
},
"type": "string",
"name": "actor",
"doc": "The entity (e.g. a member URN) responsible for applying the assocated metadata. This can\neither be a user (in case of UI edits) or the datahub system for automation."
},
{
"java": {
"class": "com.linkedin.common.urn.Urn"
},
"type": [
"null",
"string"
],
"name": "source",
"default": null,
"doc": "The DataHub source responsible for applying the associated metadata. This will only be filled out\nwhen a DataHub source is responsible. This includes the specific metadata test urn, the automation urn."
},
{
"type": {
"type": "map",
"values": "string"
},
"name": "sourceDetail",
"default": {},
"doc": "The details associated with why this metadata was applied. For example, this could include\nthe actual regex rule, sql statement, ingestion pipeline ID, etc."
}
],
"doc": "Information about who, why, and how this metadata was applied"
}
],
"name": "attribution",
"default": null,
"doc": "Information about who, why, and how this metadata was applied"
}
],
"doc": "Ownership information"
}
},
"name": "owners",
"doc": "List of owners of the entity."
},
{
"Searchable": {
"/*": {
"fieldType": "MAP_ARRAY",
"queryByDefault": false
}
},
"type": [
{
"type": "map",
"values": {
"type": "array",
"items": "string"
}
},
"null"
],
"name": "ownerTypes",
"default": {},
"doc": "Ownership type to Owners map, populated via mutation hook."
},
{
"type": {
"type": "record",
"name": "AuditStamp",
"namespace": "com.linkedin.common",
"fields": [
{
"type": "long",
"name": "time",
"doc": "When did the resource/association/sub-resource move into the specific lifecycle stage represented by this AuditEvent."
},
{
"java": {
"class": "com.linkedin.common.urn.Urn"
},
"type": "string",
"name": "actor",
"doc": "The entity (e.g. a member URN) which will be credited for moving the resource/association/sub-resource into the specific lifecycle stage. It is also the one used to authorize the change."
},
{
"java": {
"class": "com.linkedin.common.urn.Urn"
},
"type": [
"null",
"string"
],
"name": "impersonator",
"default": null,
"doc": "The entity (e.g. a service URN) which performs the change on behalf of the Actor and must be authorized to act as the Actor."
},
{
"type": [
"null",
"string"
],
"name": "message",
"default": null,
"doc": "Additional context around how DataHub was informed of the particular change. For example: was the change created by an automated process, or manually."
}
],
"doc": "Data captured on a resource/association/sub-resource level giving insight into when that resource/association/sub-resource moved into a particular lifecycle stage, and who acted to move it into that specific lifecycle stage."
},
"name": "lastModified",
"default": {
"actor": "urn:li:corpuser:unknown",
"impersonator": null,
"time": 0,
"message": null
},
"doc": "Audit stamp containing who last modified the record and when. A value of 0 in the time field indicates missing data."
}
],
"doc": "Ownership information of an entity."
}
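The symbolDocs in the schema above name a replacement for most deprecated ownership types. A minimal sketch of applying those hints is shown below; `normalize_ownership_type` is a hypothetical helper, and the choice to raise for CONSUMER and STAKEHOLDER (whose docs list more than one possible replacement) is a design decision of this sketch, not DataHub behavior.

```python
# Deprecated symbols whose docs point to exactly one replacement.
DEPRECATED_TO_REPLACEMENT = {
    "DATAOWNER": "TECHNICAL_OWNER",
    "DELEGATE": "TECHNICAL_OWNER",
    "DEVELOPER": "TECHNICAL_OWNER",
    "PRODUCER": "TECHNICAL_OWNER",
}

# Deprecated symbols whose docs offer several replacements; a human must choose.
AMBIGUOUS_DEPRECATED = {"CONSUMER", "STAKEHOLDER"}


def normalize_ownership_type(symbol: str) -> str:
    """Map a deprecated OwnershipType symbol to its documented replacement."""
    if symbol in AMBIGUOUS_DEPRECATED:
        raise ValueError(f"{symbol} is deprecated and has no single replacement")
    # Non-deprecated symbols pass through unchanged.
    return DEPRECATED_TO_REPLACEMENT.get(symbol, symbol)


print(normalize_ownership_type("DEVELOPER"))
print(normalize_ownership_type("DATA_STEWARD"))
```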
structuredProperties
Properties about an entity governed by StructuredPropertyDefinition
- Fields
- Raw Schema
| Field | Type | Required | Description | Annotations |
|---|---|---|---|---|
| properties | StructuredPropertyValueAssignment[] | ✓ | Custom property bag. | |
{
"type": "record",
"Aspect": {
"name": "structuredProperties"
},
"name": "StructuredProperties",
"namespace": "com.linkedin.structured",
"fields": [
{
"type": {
"type": "array",
"items": {
"type": "record",
"name": "StructuredPropertyValueAssignment",
"namespace": "com.linkedin.structured",
"fields": [
{
"java": {
"class": "com.linkedin.common.urn.Urn"
},
"type": "string",
"name": "propertyUrn",
"doc": "The property that is being assigned a value."
},
{
"type": {
"type": "array",
"items": [
"string",
"double"
]
},
"name": "values",
"doc": "The value assigned to the property."
},
{
"type": [
"null",
{
"type": "record",
"name": "AuditStamp",
"namespace": "com.linkedin.common",
"fields": [
{
"type": "long",
"name": "time",
"doc": "When did the resource/association/sub-resource move into the specific lifecycle stage represented by this AuditEvent."
},
{
"java": {
"class": "com.linkedin.common.urn.Urn"
},
"type": "string",
"name": "actor",
"doc": "The entity (e.g. a member URN) which will be credited for moving the resource/association/sub-resource into the specific lifecycle stage. It is also the one used to authorize the change."
},
{
"java": {
"class": "com.linkedin.common.urn.Urn"
},
"type": [
"null",
"string"
],
"name": "impersonator",
"default": null,
"doc": "The entity (e.g. a service URN) which performs the change on behalf of the Actor and must be authorized to act as the Actor."
},
{
"type": [
"null",
"string"
],
"name": "message",
"default": null,
"doc": "Additional context around how DataHub was informed of the particular change. For example: was the change created by an automated process, or manually."
}
],
"doc": "Data captured on a resource/association/sub-resource level giving insight into when that resource/association/sub-resource moved into a particular lifecycle stage, and who acted to move it into that specific lifecycle stage."
}
],
"name": "created",
"default": null,
"doc": "Audit stamp containing who created this relationship edge and when"
},
{
"type": [
"null",
"com.linkedin.common.AuditStamp"
],
"name": "lastModified",
"default": null,
"doc": "Audit stamp containing who last modified this relationship edge and when"
},
{
"Searchable": {
"/actor": {
"fieldName": "structuredPropertyAttributionActors",
"fieldType": "URN",
"queryByDefault": false
},
"/source": {
"fieldName": "structuredPropertyAttributionSources",
"fieldType": "URN",
"queryByDefault": false
},
"/time": {
"fieldName": "structuredPropertyAttributionDates",
"fieldType": "DATETIME",
"queryByDefault": false
}
},
"type": [
"null",
{
"type": "record",
"name": "MetadataAttribution",
"namespace": "com.linkedin.common",
"fields": [
{
"type": "long",
"name": "time",
"doc": "When this metadata was updated."
},
{
"java": {
"class": "com.linkedin.common.urn.Urn"
},
"type": "string",
"name": "actor",
"doc": "The entity (e.g. a member URN) responsible for applying the associated metadata. This can\neither be a user (in case of UI edits) or the datahub system for automation."
},
{
"java": {
"class": "com.linkedin.common.urn.Urn"
},
"type": [
"null",
"string"
],
"name": "source",
"default": null,
"doc": "The DataHub source responsible for applying the associated metadata. This will only be filled out\nwhen a DataHub source is responsible. This includes the specific metadata test urn, the automation urn."
},
{
"type": {
"type": "map",
"values": "string"
},
"name": "sourceDetail",
"default": {},
"doc": "The details associated with why this metadata was applied. For example, this could include\nthe actual regex rule, sql statement, ingestion pipeline ID, etc."
}
],
"doc": "Information about who, why, and how this metadata was applied"
}
],
"name": "attribution",
"default": null,
"doc": "Information about who, why, and how this metadata was applied"
}
]
}
},
"name": "properties",
"doc": "Custom property bag."
}
],
"doc": "Properties about an entity governed by StructuredPropertyDefinition"
}
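As a rough illustration of the shape this schema describes, the sketch below builds a structuredProperties payload from plain Python dictionaries. The helper name `make_assignment` and the `io.example.*` property URNs are hypothetical, and the code does not call the DataHub SDK; it only mirrors the field names shown in the schema above.

```python
import json
from typing import Any, Dict, List, Union

PropertyValue = Union[str, float]


def make_assignment(property_urn: str, values: List[PropertyValue]) -> Dict[str, Any]:
    """Build one StructuredPropertyValueAssignment-shaped dict (hypothetical helper)."""
    for value in values:
        # The schema allows only string or double items. bool is rejected
        # explicitly because it is a subclass of int in Python.
        if isinstance(value, bool) or not isinstance(value, (str, int, float)):
            raise TypeError(f"unsupported value type: {type(value).__name__}")
    return {
        "propertyUrn": property_urn,
        # Integers are widened to float so every number is a double.
        "values": [float(v) if isinstance(v, int) else v for v in values],
        # These three fields are optional in the schema and default to null.
        "created": None,
        "lastModified": None,
        "attribution": None,
    }


# Hypothetical property URNs, used only for illustration.
structured_properties = {
    "properties": [
        make_assignment("urn:li:structuredProperty:io.example.retentionDays", [90]),
        make_assignment("urn:li:structuredProperty:io.example.tier", ["gold"]),
    ]
}

print(json.dumps(structured_properties, indent=2))
```

Because `values` is an array whose items are either string or double, a single assignment can hold several values for the same property.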
forms
Forms that are assigned to this entity to be filled out
- Fields
- Raw Schema
| Field | Type | Required | Description | Annotations |
|---|---|---|---|---|
| incompleteForms | FormAssociation[] | ✓ | All incomplete forms assigned to the entity. | Searchable |
| completedForms | FormAssociation[] | ✓ | All complete forms assigned to the entity. | Searchable |
| verifications | FormVerificationAssociation[] | ✓ | Verifications that have been applied to the entity via completed forms. | Searchable |
{
"type": "record",
"Aspect": {
"name": "forms"
},
"name": "Forms",
"namespace": "com.linkedin.common",
"fields": [
{
"Searchable": {
"/*/completedPrompts/*/id": {
"fieldName": "incompleteFormsCompletedPromptIds",
"fieldType": "KEYWORD",
"queryByDefault": false
},
"/*/completedPrompts/*/lastModified/time": {
"fieldName": "incompleteFormsCompletedPromptResponseTimes",
"fieldType": "DATETIME",
"queryByDefault": false
},
"/*/incompletePrompts/*/id": {
"fieldName": "incompleteFormsIncompletePromptIds",
"fieldType": "KEYWORD",
"queryByDefault": false
},
"/*/urn": {
"fieldName": "incompleteForms",
"fieldType": "URN",
"queryByDefault": false
}
},
"type": {
"type": "array",
"items": {
"type": "record",
"name": "FormAssociation",
"namespace": "com.linkedin.common",
"fields": [
{
"java": {
"class": "com.linkedin.common.urn.Urn"
},
"type": "string",
"name": "urn",
"doc": "Urn of the applied form"
},
{
"type": {
"type": "array",
"items": {
"type": "record",
"name": "FormPromptAssociation",
"namespace": "com.linkedin.common",
"fields": [
{
"type": "string",
"name": "id",
"doc": "The id for the prompt. This must be GLOBALLY UNIQUE."
},
{
"type": {
"type": "record",
"name": "AuditStamp",
"namespace": "com.linkedin.common",
"fields": [
{
"type": "long",
"name": "time",
"doc": "When did the resource/association/sub-resource move into the specific lifecycle stage represented by this AuditEvent."
},
{
"java": {
"class": "com.linkedin.common.urn.Urn"
},
"type": "string",
"name": "actor",
"doc": "The entity (e.g. a member URN) which will be credited for moving the resource/association/sub-resource into the specific lifecycle stage. It is also the one used to authorize the change."
},
{
"java": {
"class": "com.linkedin.common.urn.Urn"
},
"type": [
"null",
"string"
],
"name": "impersonator",
"default": null,
"doc": "The entity (e.g. a service URN) which performs the change on behalf of the Actor and must be authorized to act as the Actor."
},
{
"type": [
"null",
"string"
],
"name": "message",
"default": null,
"doc": "Additional context around how DataHub was informed of the particular change. For example: was the change created by an automated process, or manually."
}
],
"doc": "Data captured on a resource/association/sub-resource level giving insight into when that resource/association/sub-resource moved into a particular lifecycle stage, and who acted to move it into that specific lifecycle stage."
},
"name": "lastModified",
"doc": "The last time this prompt was touched for the entity (set, unset)"
},
{
"type": [
"null",
{
"type": "record",
"name": "FormPromptFieldAssociations",
"namespace": "com.linkedin.common",
"fields": [
{
"type": [
"null",
{
"type": "array",
"items": {
"type": "record",
"name": "FieldFormPromptAssociation",
"namespace": "com.linkedin.common",
"fields": [
{
"type": "string",
"name": "fieldPath",
"doc": "The field path on a schema field."
},
{
"type": "com.linkedin.common.AuditStamp",
"name": "lastModified",
"doc": "The last time this prompt was touched for the field on the entity (set, unset)"
}
],
"doc": "Information about the status of a particular prompt for a specific schema field\non an entity."
}
}
],
"name": "completedFieldPrompts",
"default": null,
"doc": "A list of field-level prompt associations that are complete for this form."
},
{
"type": [
"null",
{
"type": "array",
"items": "com.linkedin.common.FieldFormPromptAssociation"
}
],
"name": "incompleteFieldPrompts",
"default": null,
"doc": "A list of field-level prompt associations that are not yet complete for this form."
}
],
"doc": "Information about the field-level prompt associations on a top-level prompt association."
}
],
"name": "fieldAssociations",
"default": null,
"doc": "Optional information about the field-level prompt associations."
}
],
"doc": "Information about the status of a particular prompt.\nNote that this is where we can add additional information about individual responses:\nactor, timestamp, and the response itself."
}
},
"name": "incompletePrompts",
"default": [],
"doc": "A list of prompts that are not yet complete for this form."
},
{
"type": {
"type": "array",
"items": "com.linkedin.common.FormPromptAssociation"
},
"name": "completedPrompts",
"default": [],
"doc": "A list of prompts that have been completed for this form."
}
],
"doc": "Properties of an applied form."
}
},
"name": "incompleteForms",
"doc": "All incomplete forms assigned to the entity."
},
{
"Searchable": {
"/*/completedPrompts/*/id": {
"fieldName": "completedFormsCompletedPromptIds",
"fieldType": "KEYWORD",
"queryByDefault": false
},
"/*/completedPrompts/*/lastModified/time": {
"fieldName": "completedFormsCompletedPromptResponseTimes",
"fieldType": "DATETIME",
"queryByDefault": false
},
"/*/incompletePrompts/*/id": {
"fieldName": "completedFormsIncompletePromptIds",
"fieldType": "KEYWORD",
"queryByDefault": false
},
"/*/urn": {
"fieldName": "completedForms",
"fieldType": "URN",
"queryByDefault": false
}
},
"type": {
"type": "array",
"items": "com.linkedin.common.FormAssociation"
},
"name": "completedForms",
"doc": "All complete forms assigned to the entity."
},
{
"Searchable": {
"/*/form": {
"fieldName": "verifiedForms",
"fieldType": "URN",
"queryByDefault": false
}
},
"type": {
"type": "array",
"items": {
"type": "record",
"name": "FormVerificationAssociation",
"namespace": "com.linkedin.common",
"fields": [
{
"java": {
"class": "com.linkedin.common.urn.Urn"
},
"type": "string",
"name": "form",
"doc": "The urn of the form that granted this verification."
},
{
"type": [
"null",
"com.linkedin.common.AuditStamp"
],
"name": "lastModified",
"default": null,
"doc": "An audit stamp capturing who and when verification was applied for this form."
}
],
"doc": "An association between a verification and an entity that has been granted\nvia completion of one or more forms of type 'VERIFICATION'."
}
},
"name": "verifications",
"default": [],
"doc": "Verifications that have been applied to the entity via completed forms."
}
],
"doc": "Forms that are assigned to this entity to be filled out"
}
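To make the nesting of the forms aspect concrete, here is a minimal sketch that computes how far along each incomplete form is. It works on a plain dictionary shaped like the schema above; the form URN, the prompt ids, and the `form_progress` helper are hypothetical, and treating a form with no prompts as 0% complete is an assumption of this sketch, not DataHub behavior.

```python
from typing import Any, Dict

# Placeholder audit stamp; FormPromptAssociation requires lastModified.
UNKNOWN_STAMP = {"time": 0, "actor": "urn:li:corpuser:unknown"}

# Hypothetical forms aspect with one partially completed form.
forms_aspect: Dict[str, Any] = {
    "incompleteForms": [
        {
            "urn": "urn:li:form:governance-basics",
            "incompletePrompts": [
                {"id": "prompt-owner", "lastModified": UNKNOWN_STAMP},
            ],
            "completedPrompts": [
                {"id": "prompt-description", "lastModified": UNKNOWN_STAMP},
            ],
        }
    ],
    "completedForms": [],
    "verifications": [],
}


def form_progress(form_association: Dict[str, Any]) -> float:
    """Return the fraction of prompts completed for one FormAssociation-shaped dict."""
    completed = len(form_association.get("completedPrompts", []))
    incomplete = len(form_association.get("incompletePrompts", []))
    total = completed + incomplete
    # Assumption: a form with no recorded prompts is reported as 0.0.
    return completed / total if total else 0.0


progress = {
    form["urn"]: form_progress(form) for form in forms_aspect["incompleteForms"]
}
print(progress)
```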
testResults
Information about a Test Result
- Fields
- Raw Schema
| Field | Type | Required | Description | Annotations |
|---|---|---|---|---|
| failing | TestResult[] | ✓ | Results that are failing | Searchable, → IsFailing |
| passing | TestResult[] | ✓ | Results that are passing | Searchable, → IsPassing |
{
"type": "record",
"Aspect": {
"name": "testResults"
},
"name": "TestResults",
"namespace": "com.linkedin.test",
"fields": [
{
"Relationship": {
"/*/test": {
"entityTypes": [
"test"
],
"name": "IsFailing"
}
},
"Searchable": {
"/*/test": {
"fieldName": "failingTests",
"fieldType": "URN",
"hasValuesFieldName": "hasFailingTests",
"queryByDefault": false
}
},
"type": {
"type": "array",
"items": {
"type": "record",
"name": "TestResult",
"namespace": "com.linkedin.test",
"fields": [
{
"java": {
"class": "com.linkedin.common.urn.Urn"
},
"type": "string",
"name": "test",
"doc": "The urn of the test"
},
{
"type": {
"type": "enum",
"symbolDocs": {
"FAILURE": " The Test Failed",
"SUCCESS": " The Test Succeeded"
},
"name": "TestResultType",
"namespace": "com.linkedin.test",
"symbols": [
"SUCCESS",
"FAILURE"
]
},
"name": "type",
"doc": "The type of the result"
},
{
"type": [
"null",
"string"
],
"name": "testDefinitionMd5",
"default": null,
"doc": "The md5 of the test definition that was used to compute this result.\nSee TestInfo.testDefinition.md5 for more information."
},
{
"type": [
"null",
{
"type": "record",
"name": "AuditStamp",
"namespace": "com.linkedin.common",
"fields": [
{
"type": "long",
"name": "time",
"doc": "When did the resource/association/sub-resource move into the specific lifecycle stage represented by this AuditEvent."
},
{
"java": {
"class": "com.linkedin.common.urn.Urn"
},
"type": "string",
"name": "actor",
"doc": "The entity (e.g. a member URN) which will be credited for moving the resource/association/sub-resource into the specific lifecycle stage. It is also the one used to authorize the change."
},
{
"java": {
"class": "com.linkedin.common.urn.Urn"
},
"type": [
"null",
"string"
],
"name": "impersonator",
"default": null,
"doc": "The entity (e.g. a service URN) which performs the change on behalf of the Actor and must be authorized to act as the Actor."
},
{
"type": [
"null",
"string"
],
"name": "message",
"default": null,
"doc": "Additional context around how DataHub was informed of the particular change. For example: was the change created by an automated process, or manually."
}
],
"doc": "Data captured on a resource/association/sub-resource level giving insight into when that resource/association/sub-resource moved into a particular lifecycle stage, and who acted to move it into that specific lifecycle stage."
}
],
"name": "lastComputed",
"default": null,
"doc": "The audit stamp of when the result was computed, including the actor who computed it."
}
],
"doc": "Information about a Test Result"
}
},
"name": "failing",
"doc": "Results that are failing"
},
{
"Relationship": {
"/*/test": {
"entityTypes": [
"test"
],
"name": "IsPassing"
}
},
"Searchable": {
"/*/test": {
"fieldName": "passingTests",
"fieldType": "URN",
"hasValuesFieldName": "hasPassingTests",
"queryByDefault": false
}
},
"type": {
"type": "array",
"items": "com.linkedin.test.TestResult"
},
"name": "passing",
"doc": "Results that are passing"
}
],
"doc": "Information about a Test Result"
}
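The sketch below shows one way to read a testResults aspect once it is available as a plain dictionary. The test URNs and the `summarize` helper are hypothetical; only the field names (`failing`, `passing`, `test`, `type`) come from the schema above.

```python
from typing import Any, Dict

# Hypothetical testResults aspect for a single entity.
test_results: Dict[str, Any] = {
    "failing": [
        {"test": "urn:li:test:has-owner", "type": "FAILURE"},
    ],
    "passing": [
        {"test": "urn:li:test:has-description", "type": "SUCCESS"},
        {"test": "urn:li:test:has-parent-domain", "type": "SUCCESS"},
    ],
}


def summarize(aspect: Dict[str, Any]) -> Dict[str, Any]:
    """Count passing and failing results and list the failing test URNs."""
    failing = [result["test"] for result in aspect.get("failing", [])]
    passing = [result["test"] for result in aspect.get("passing", [])]
    return {
        "failing_count": len(failing),
        "passing_count": len(passing),
        "failing_tests": failing,
    }


summary = summarize(test_results)
print(summary)
```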
displayProperties
Properties related to how the entity is displayed in the Datahub UI
- Fields
- Raw Schema
| Field | Type | Required | Description | Annotations |
|---|---|---|---|---|
| colorHex | string | | The color associated with the entity in Hex. For example #FFFFFF. | |
| icon | IconProperties | | The icon associated with the entity | |
{
"type": "record",
"Aspect": {
"name": "displayProperties"
},
"name": "DisplayProperties",
"namespace": "com.linkedin.common",
"fields": [
{
"type": [
"null",
"string"
],
"name": "colorHex",
"default": null,
"doc": "The color associated with the entity in Hex. For example #FFFFFF."
},
{
"type": [
"null",
{
"type": "record",
"name": "IconProperties",
"namespace": "com.linkedin.common",
"fields": [
{
"type": {
"type": "enum",
"symbolDocs": {
"MATERIAL": "Material UI"
},
"name": "IconLibrary",
"namespace": "com.linkedin.common",
"symbols": [
"MATERIAL"
],
"doc": "Enum of possible icon sources"
},
"name": "iconLibrary",
"doc": "The source of the icon: e.g. Antd, Material, etc"
},
{
"type": "string",
"name": "name",
"doc": "The name of the icon"
},
{
"type": "string",
"name": "style",
"doc": "Any modifier for the icon, this will be library-specific, e.g. filled/outlined, etc"
}
],
"doc": "Properties describing an icon associated with an entity"
}
],
"name": "icon",
"default": null,
"doc": "The icon associated with the entity"
}
],
"doc": "Properties related to how the entity is displayed in the Datahub UI"
}
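A minimal sketch of assembling a displayProperties payload follows. The `make_display_properties` helper, the icon name, and the default style are hypothetical, and the six-digit `#RRGGBB` check is an assumption based on the `#FFFFFF` example in the field description; the schema itself only declares `colorHex` as a string.

```python
import re
from typing import Any, Dict, Optional

# Assumption: colors are written as '#' followed by six hex digits.
HEX_COLOR = re.compile(r"#[0-9A-Fa-f]{6}")


def make_display_properties(
    color_hex: Optional[str] = None,
    icon_name: Optional[str] = None,
    icon_style: str = "outlined",
) -> Dict[str, Any]:
    """Build a DisplayProperties-shaped dict (hypothetical helper)."""
    if color_hex is not None and not HEX_COLOR.fullmatch(color_hex):
        raise ValueError(f"not a #RRGGBB color: {color_hex!r}")
    icon = None
    if icon_name is not None:
        # MATERIAL is the only symbol in the IconLibrary enum shown above.
        icon = {"iconLibrary": "MATERIAL", "name": icon_name, "style": icon_style}
    return {"colorHex": color_hex, "icon": icon}


display_properties = make_display_properties("#1890FF", icon_name="domain")
print(display_properties)
```

Both fields are optional in the schema, so calling the helper with no arguments yields a payload with `colorHex` and `icon` set to null.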
assetSettings
Settings associated with this asset
- Fields
- Raw Schema
| Field | Type | Required | Description | Annotations |
|---|---|---|---|---|
| assetSummary | AssetSummarySettings | | Information related to the asset summary for this asset | |
{
"type": "record",
"Aspect": {
"name": "assetSettings"
},
"name": "AssetSettings",
"namespace": "com.linkedin.settings.asset",
"fields": [
{
"type": [
"null",
{
"type": "record",
"name": "AssetSummarySettings",
"namespace": "com.linkedin.settings.asset",
"fields": [
{
"Relationship": {
"/*/template": {
"entityTypes": [
"dataHubPageTemplate"
],
"name": "HasSummaryTemplate"
}
},
"type": [
{
"type": "array",
"items": {
"type": "record",
"name": "AssetSummarySettingsTemplate",
"namespace": "com.linkedin.settings.asset",
"fields": [
{
"java": {
"class": "com.linkedin.common.urn.Urn"
},
"type": "string",
"name": "template",
"doc": "The urn of the template"
}
],
"doc": "Object containing the template and any additional info for asset summary settings"
}
},
"null"
],
"name": "templates",
"default": [],
"doc": "The list of templates applied to this asset in order. Right now we only expect one."
}
],
"doc": "Information related to the asset summary for this asset"
}
],
"name": "assetSummary",
"default": null,
"doc": "Information related to the asset summary for this asset"
}
],
"doc": "Settings associated with this asset"
}
Common Types
These types are used across multiple aspects in this entity.
AuditStamp
Data captured on a resource/association/sub-resource level giving insight into when that resource/association/sub-resource moved into a particular lifecycle stage, and who acted to move it into that specific lifecycle stage.
Fields:
- time (long): When did the resource/association/sub-resource move into the specific lifecyc...
- actor (string): The entity (e.g. a member URN) which will be credited for moving the resource...
- impersonator (string?): The entity (e.g. a service URN) which performs the change on behalf of the Ac...
- message (string?): Additional context around how DataHub was informed of the particular change. ...
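As a small sketch of how an AuditStamp might be interpreted in client code, the helper below converts the `time` field to a timezone-aware datetime. It assumes `time` holds milliseconds since the Unix epoch, and it applies the convention noted in the ownership schema that a value of 0 indicates missing data; the `stamp_time` name and the sample actor are hypothetical.

```python
from datetime import datetime, timezone
from typing import Any, Dict, Optional


def stamp_time(stamp: Dict[str, Any]) -> Optional[datetime]:
    """Return the stamp's time as a UTC datetime, or None when it is missing."""
    millis = stamp.get("time", 0)
    if millis == 0:
        # A time of 0 is treated as "no data" rather than 1970-01-01.
        return None
    # Assumption: the long value is epoch milliseconds.
    return datetime.fromtimestamp(millis / 1000, tz=timezone.utc)


placeholder = {"time": 0, "actor": "urn:li:corpuser:unknown"}
recorded = {"time": 1700000000000, "actor": "urn:li:corpuser:jdoe"}

print(stamp_time(placeholder))
print(stamp_time(recorded).isoformat())
```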
FormAssociation
Properties of an applied form.
Fields:
- urn (string): Urn of the applied form
- incompletePrompts (FormPromptAssociation[]): A list of prompts that are not yet complete for this form.
- completedPrompts (FormPromptAssociation[]): A list of prompts that have been completed for this form.
TestResult
Information about a Test Result
Fields:
- test (string): The urn of the test
- type (TestResultType): The type of the result
- testDefinitionMd5 (string?): The md5 of the test definition that was used to compute this result. See Test...
- lastComputed (AuditStamp?): The audit stamp of when the result was computed, including the actor who comp...
Relationships
Self
These are the relationships to itself, stored in this entity's aspects
- IsPartOf (via domainProperties.parentDomain)
Outgoing
These are the relationships stored in this entity's aspects
OwnedBy
- Corpuser via ownership.owners.owner
- CorpGroup via ownership.owners.owner
ownershipType
- OwnershipType via ownership.owners.typeUrn
IsFailing
- Test via testResults.failing
IsPassing
- Test via testResults.passing
HasSummaryTemplate
- DataHubPageTemplate via assetSettings.assetSummary.templates
Incoming
These are the relationships stored in other entities' aspects
AssociatedWith
- Dataset via domains.domains
- DataJob via domains.domains
- DataFlow via domains.domains
- Chart via domains.domains
- Dashboard via domains.domains
- Notebook via domains.domains
Global Metadata Model
