Entities
The DataHub SDK provides a set of entities that can be used to interact with DataHub’s metadata.
Dataset
Bases: HasPlatformInstance, HasSubtype, HasContainer, HasOwnership, HasInstitutionalMemory, HasTags, HasTerms, HasDomain, HasStructuredProperties, Entity
Represents a dataset in DataHub.
A dataset represents a collection of data, such as a table, view, or file. This class provides methods for managing dataset metadata including schema, lineage, and various aspects like ownership, tags, and terms.
- Parameters:
- platform (str)
- name (str)
- platform_instance (Optional[str])
- env (str)
- description (Optional[str])
- display_name (Optional[str])
- qualified_name (Optional[str])
- external_url (Optional[str])
- custom_properties (Optional[Dict[str, str]])
- created (Optional[datetime])
- last_modified (Optional[datetime])
- parent_container (ParentContainerInputType | Unset)
- subtype (Optional[str])
- owners (Optional[OwnersInputType])
- links (Optional[LinksInputType])
- tags (Optional[TagsInputType])
- terms (Optional[TermsInputType])
- domain (Optional[DomainInputType])
- schema (Optional[SchemaFieldsInputType])
- upstreams (Optional[models.UpstreamLineageClass])
- structured_properties (Optional[StructuredPropertyInputType])
- extra_aspects (ExtraAspectsType)
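A minimal sketch of constructing a Dataset with a few of the parameters listed above. It assumes the class is importable as `datahub.sdk.Dataset`; the import is guarded so the snippet still runs where the package is not installed. The platform, name, and property values are illustrative only.

```python
# Keyword arguments drawn from the parameter list above; all values are made up.
dataset_kwargs = {
    "platform": "snowflake",
    "name": "analytics.public.orders",
    "env": "PROD",
    "description": "One row per customer order.",
    "display_name": "Orders",
    "custom_properties": {"team": "analytics", "tier": "gold"},
}

try:
    # Assumption: the SDK exposes Dataset at this import path.
    from datahub.sdk import Dataset
except ImportError:
    Dataset = None  # SDK not installed; only the argument shape is shown.

# Build the entity only when the SDK is available.
dataset = Dataset(**dataset_kwargs) if Dataset is not None else None

if dataset is not None:
    # Read values back through the documented properties.
    print(dataset.urn)
    print(dataset.description)
    print(dataset.custom_properties)
```

Only the keyword names come from this reference; how the entity is later sent to a DataHub server is outside the scope of this section.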
property created : datetime | None
Get the creation timestamp of the dataset.
- Returns: The creation timestamp if set, None otherwise.
property custom_properties : Dict[str, str]
Get the custom properties of the dataset.
- Returns: Dictionary of custom properties.
property description : str | None
Get the description of the dataset.
- Returns: The description if set, None otherwise.
property display_name : str | None
Get the display name of the dataset.
- Returns: The display name if set, None otherwise.
property external_url : str | None
Get the external URL of the dataset.
- Returns: The external URL if set, None otherwise.
classmethod get_urn_type()
Get the URN type for datasets.
- Return type: Type[DatasetUrn]
- Returns: The DatasetUrn class.
property last_modified : datetime | None
Get the last modification timestamp of the dataset.
- Returns: The last modification timestamp if set, None otherwise.
property qualified_name : str | None
Get the qualified name of the dataset.
- Returns: The qualified name if set, None otherwise.
property schema : List[SchemaField]
Get the schema fields of the dataset.
- Returns: List of SchemaField objects representing the dataset’s schema.
set_created(created)
Set the creation timestamp of the dataset.
- Parameters: created (datetime): The creation timestamp to set.
- Return type: None
set_custom_properties(custom_properties)
Set the custom properties of the dataset.
- Parameters: custom_properties (Dict[str, str]): Dictionary of custom properties to set.
- Return type: None
set_description(description)
Set the description of the dataset.
- Parameters: description (str): The description to set.
- Return type: None
NOTE
If called during ingestion, this will warn if overwriting a non-ingestion description.
set_display_name(display_name)
Set the display name of the dataset.
- Parameters: display_name (str): The display name to set.
- Return type: None
set_external_url(external_url)
Set the external URL of the dataset.
- Parameters: external_url (str): The external URL to set.
- Return type: None
set_last_modified(last_modified)
- Parameters: last_modified (datetime)
- Return type: None
set_qualified_name(qualified_name)
Set the qualified name of the dataset.
- Parameters: qualified_name (str): The qualified name to set.
- Return type: None
set_upstreams(upstreams)
- Parameters: upstreams (Union[UpstreamLineageClass, List[Union[str, DatasetUrn, UpstreamClass, FineGrainedLineageClass]], Dict[Union[str, DatasetUrn], Dict[str, List[str]]]])
- Return type: None
property upstreams : UpstreamLineageClass | None
property urn : DatasetUrn
Get the entity’s URN.
- Returns: The URN that uniquely identifies this entity.
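As a rough illustration of what a dataset URN looks like in string form, the sketch below composes one by hand from `platform`, `name`, and `env`. It assumes the usual DataHub layout `urn:li:dataset:(urn:li:dataPlatform:<platform>,<name>,<env>)` and ignores `platform_instance`. `dataset_urn_string` is a hypothetical helper written for this example, not part of the SDK.

```python
def dataset_urn_string(platform: str, name: str, env: str = "PROD") -> str:
    """Hypothetical helper: compose a dataset URN string by hand."""
    # The platform is itself referenced by a URN nested inside the dataset URN.
    platform_urn = f"urn:li:dataPlatform:{platform}"
    return f"urn:li:dataset:({platform_urn},{name},{env})"


urn = dataset_urn_string("snowflake", "analytics.public.orders")
print(urn)
# urn:li:dataset:(urn:li:dataPlatform:snowflake,analytics.public.orders,PROD)
```

In practice the `urn` property returns a `DatasetUrn` object; the string form is shown here only to make the identifier's structure visible.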
SchemaField
Bases: object
- Parameters:
- parent (Dataset)
- field_path (str)
add_tag(tag)
- Parameters: tag (Union[str, TagUrn, TagAssociationClass])
- Return type: None
add_term(term)
- Parameters: term (Union[str, GlossaryTermUrn, GlossaryTermAssociationClass])
- Return type: None
property description : str | None
property field_path : str
property mapped_type : SchemaFieldDataTypeClass
property native_type : str
remove_tag(tag)
- Parameters: tag (Union[str, TagUrn, TagAssociationClass])
- Return type: None
remove_term(term)
- Parameters: term (Union[str, GlossaryTermUrn, GlossaryTermAssociationClass])
- Return type: None
set_description(description)
- Parameters: description (str)
- Return type: None
set_tags(tags)
- Parameters: tags (List[Union[str, TagUrn, TagAssociationClass]])
- Return type: None
set_terms(terms)
- Parameters: terms (List[Union[str, GlossaryTermUrn, GlossaryTermAssociationClass]])
- Return type: None
property tags : List[TagAssociationClass] | None
property terms : List[GlossaryTermAssociationClass] | None
parse_cll_mapping
- Parameters:
- upstream (Union[str, DatasetUrn])
- downstream (Union[str, DatasetUrn])
- cll_mapping (Dict[str, List[str]])
- Return type: List[FineGrainedLineageClass]
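The shape of `cll_mapping` may be easier to see with a small stand-in. The sketch below assumes the mapping goes from a downstream column name to the list of upstream column names that feed it; this reference only gives the type `Dict[str, List[str]]`, so the direction is an assumption. `expand_cll_mapping` and `schema_field_urn` are hypothetical helpers that return plain tuples of schema-field URN strings rather than `FineGrainedLineageClass` objects.

```python
from typing import Dict, List, Tuple


def schema_field_urn(dataset_urn: str, field_path: str) -> str:
    """Hypothetical helper: URN string for one column of a dataset."""
    return f"urn:li:schemaField:({dataset_urn},{field_path})"


def expand_cll_mapping(
    upstream: str,
    downstream: str,
    cll_mapping: Dict[str, List[str]],
) -> List[Tuple[List[str], str]]:
    """Stand-in for parse_cll_mapping: one (upstream fields, downstream field) pair per entry."""
    edges = []
    for downstream_col, upstream_cols in cll_mapping.items():
        upstream_fields = [schema_field_urn(upstream, col) for col in upstream_cols]
        edges.append((upstream_fields, schema_field_urn(downstream, downstream_col)))
    return edges


raw = "urn:li:dataset:(urn:li:dataPlatform:snowflake,raw.orders,PROD)"
clean = "urn:li:dataset:(urn:li:dataPlatform:snowflake,analytics.orders,PROD)"

# Assumed direction: downstream column -> upstream columns.
edges = expand_cll_mapping(raw, clean, {"order_total": ["price", "quantity"], "order_id": ["id"]})
for upstream_fields, downstream_field in edges:
    print(upstream_fields, "->", downstream_field)
```

The same nested shape appears in the `Dict[Union[str, DatasetUrn], Dict[str, List[str]]]` variant accepted by `set_upstreams`, keyed there by upstream dataset.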
Container
Bases: HasPlatformInstance, HasSubtype, HasContainer, HasOwnership, HasInstitutionalMemory, HasStructuredProperties, HasTags, HasTerms, HasDomain, Entity
- Parameters:
- container_key (ContainerKey)
- display_name (str)
- qualified_name (Optional[str])
- description (Optional[str])
- external_url (Optional[str])
- extra_properties (Optional[Dict[str, str]])
- created (Optional[datetime])
- last_modified (Optional[datetime])
- parent_container (Union[Auto, Container, ContainerKey, List[Union[Urn, str]], None])
- subtype (Optional[str])
- owners (Optional[List[Union[CorpUserUrn, CorpGroupUrn, Tuple[Union[CorpUserUrn, CorpGroupUrn], Union[str, OwnershipTypeUrn]], OwnerClass]]])
- links (Optional[Sequence[Union[str, Tuple[str, str], InstitutionalMemoryMetadataClass]]])
- tags (Optional[List[Union[str, TagUrn, TagAssociationClass]]])
- terms (Optional[List[Union[str, GlossaryTermUrn, GlossaryTermAssociationClass]]])
- domain (Union[str, DomainUrn, None])
- structured_properties (Optional[Dict[Union[str, StructuredPropertyUrn], Sequence[Union[str, float, int]]]])
- extra_aspects (Optional[List[TypeVar(Aspect, bound=_Aspect)]])
property created : datetime | None
property custom_properties : Dict[str, str] | None
property description : str | None
property display_name : str
property external_url : str | None
classmethod get_urn_type()
Get the URN type for this entity class.
- Return type: Type[ContainerUrn]
- Returns: The URN type class that corresponds to this entity type.
property last_modified : datetime | None
property qualified_name : str | None
set_created(created)
- Parameters: created (datetime)
- Return type: None
set_custom_properties(custom_properties)
- Parameters: custom_properties (Dict[str, str])
- Return type: None
set_description(description)
- Parameters: description (str)
- Return type: None
set_display_name(value)
- Parameters: value (str)
- Return type: None
set_external_url(external_url)
- Parameters: external_url (str)
- Return type: None
set_last_modified(last_modified)
- Parameters: last_modified (datetime)
- Return type: None
set_qualified_name(qualified_name)
- Parameters: qualified_name (str)
- Return type: None
MLModel
Bases: HasPlatformInstance, HasOwnership, HasInstitutionalMemory, HasTags, HasTerms, HasDomain, HasVersion, HasStructuredProperties, Entity
- Parameters:
- id (str)
- platform (str)
- version (Optional[str])
- aliases (Optional[List[str]])
- platform_instance (Optional[str])
- env (str)
- name (Optional[str])
- description (Optional[str])
- training_metrics (Union[List[MLMetricClass], Dict[str, Optional[str]], None])
- hyper_params (Union[List[MLHyperParamClass], Dict[str, Optional[str]], None])
- external_url (Optional[str])
- custom_properties (Optional[Dict[str, str]])
- created (Optional[datetime])
- last_modified (Optional[datetime])
- owners (Optional[List[Union[CorpUserUrn, CorpGroupUrn, Tuple[Union[CorpUserUrn, CorpGroupUrn], Union[str, OwnershipTypeUrn]], OwnerClass]]])
- links (Optional[Sequence[Union[str, Tuple[str, str], InstitutionalMemoryMetadataClass]]])
- tags (Optional[List[Union[str, TagUrn, TagAssociationClass]]])
- terms (Optional[List[Union[str, GlossaryTermUrn, GlossaryTermAssociationClass]]])
- domain (Union[str, DomainUrn, None])
- model_group (Union[str, MlModelGroupUrn, None])
- training_jobs (Optional[Sequence[Union[str, DataProcessInstanceUrn]]])
- downstream_jobs (Optional[Sequence[Union[str, DataProcessInstanceUrn]]])
- structured_properties (Optional[Dict[Union[str, StructuredPropertyUrn], Sequence[Union[str, float, int]]]])
- extra_aspects (Optional[List[TypeVar(Aspect, bound=_Aspect)]])
add_downstream_job(downstream_job)
- Parameters: downstream_job (Union[str, DataProcessInstanceUrn])
- Return type: None
add_hyper_params(params)
- Parameters: params (Union[List[MLHyperParamClass], Dict[str, Optional[str]]])
- Return type: None
add_training_job(training_job)
- Parameters: training_job (Union[str, DataProcessInstanceUrn])
- Return type: None
add_training_metrics(metrics)
- Parameters: metrics (Union[List[MLMetricClass], Dict[str, Optional[str]]])
- Return type: None
property created : datetime | None
property custom_properties : Dict[str, str] | None
property description : str | None
property downstream_jobs : List[str] | None
property external_url : str | None
classmethod get_urn_type()
Get the URN type for this entity class.
- Return type: Type[MlModelUrn]
- Returns: The URN type class that corresponds to this entity type.
property hyper_params : List[MLHyperParamClass] | None
property last_modified : datetime | None
property model_group : str | None
property name : str | None
remove_downstream_job(downstream_job)
- Parameters: downstream_job (Union[str, DataProcessInstanceUrn])
- Return type: None
remove_training_job(training_job)
- Parameters: training_job (Union[str, DataProcessInstanceUrn])
- Return type: None
set_created(created)
- Parameters: created (datetime)
- Return type: None
set_custom_properties(custom_properties)
- Parameters: custom_properties (Dict[str, str])
- Return type: None
set_description(description)
- Parameters: description (str)
- Return type: None
set_downstream_jobs(downstream_jobs)
- Parameters: downstream_jobs (Sequence[Union[str, DataProcessInstanceUrn]])
- Return type: None
set_external_url(external_url)
- Parameters: external_url (str)
- Return type: None
set_hyper_params(params)
- Parameters: params (Union[List[MLHyperParamClass], Dict[str, Optional[str]]])
- Return type: None
set_last_modified(last_modified)
- Parameters: last_modified (datetime)
- Return type: None
set_model_group(group)
- Parameters: group (Union[str, MlModelGroupUrn])
- Return type: None
set_name(name)
- Parameters: name (str)
- Return type: None
set_training_jobs(training_jobs)
- Parameters: training_jobs (Sequence[Union[str, DataProcessInstanceUrn]])
- Return type: None
set_training_metrics(metrics)
- Parameters: metrics (Union[List[MLMetricClass], Dict[str, Optional[str]]])
- Return type: None
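The metric and hyperparameter setters accept either a list of model objects or a plain `Dict[str, Optional[str]]`. The sketch below shows one way the two input forms could be reconciled into a single list. `Metric` is a simplified stand-in for `MLMetricClass` (only a name and a value are modeled), and `normalize_metrics` is a hypothetical helper; the SDK's actual conversion is not described in this reference.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Union


@dataclass
class Metric:
    """Simplified stand-in for MLMetricClass (assumption: name plus optional value)."""
    name: str
    value: Optional[str] = None


MetricsInput = Union[List[Metric], Dict[str, Optional[str]]]


def normalize_metrics(metrics: MetricsInput) -> List[Metric]:
    """Hypothetical helper: accept either input form and return a list of Metric."""
    if isinstance(metrics, dict):
        # Dict form: key is the metric name, value is its string value or None.
        return [Metric(name=name, value=value) for name, value in metrics.items()]
    # List form: already metric objects; copy so the caller's list is not shared.
    return list(metrics)


from_dict = normalize_metrics({"accuracy": "0.93", "f1": None})
from_list = normalize_metrics([Metric("auc", "0.90")])
print(from_dict)
print(from_list)
```

Note that values are strings in the dict form, so numeric metrics would need to be converted with `str()` before being passed in.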
property training_jobs : List[str] | None
property training_metrics : List[MLMetricClass] | None
property urn : MlModelUrn
Get the entity’s URN.
- Returns: The URN that uniquely identifies this entity.
MLModelGroup
Bases: HasPlatformInstance, HasOwnership, HasInstitutionalMemory, HasTags, HasTerms, HasDomain, HasStructuredProperties, Entity
- Parameters:
- id (str)
- platform (str)
- name (Optional[str])
- platform_instance (Optional[str])
- env (str)
- description (Optional[str])
- display_name (Optional[str])
- external_url (Optional[str])
- custom_properties (Optional[Dict[str, str]])
- created (Optional[datetime])
- last_modified (Optional[datetime])
- owners (Optional[List[Union[CorpUserUrn, CorpGroupUrn, Tuple[Union[CorpUserUrn, CorpGroupUrn], Union[str, OwnershipTypeUrn]], OwnerClass]]])
- links (Optional[Sequence[Union[str, Tuple[str, str], InstitutionalMemoryMetadataClass]]])
- tags (Optional[List[Union[str, TagUrn, TagAssociationClass]]])
- terms (Optional[List[Union[str, GlossaryTermUrn, GlossaryTermAssociationClass]]])
- domain (Union[str, DomainUrn, None])
- training_jobs (Optional[Sequence[Union[str, DataProcessInstanceUrn]]])
- downstream_jobs (Optional[Sequence[Union[str, DataProcessInstanceUrn]]])
- structured_properties (Optional[Dict[Union[str, StructuredPropertyUrn], Sequence[Union[str, float, int]]]])
- extra_aspects (Optional[List[TypeVar(Aspect, bound=_Aspect)]])
add_downstream_job(downstream_job)
- Parameters: downstream_job (Union[str, DataProcessInstanceUrn])
- Return type: None
add_training_job(training_job)
- Parameters: training_job (Union[str, DataProcessInstanceUrn])
- Return type: None
property created : datetime | None
property custom_properties : Dict[str, str] | None
property description : str | None
property downstream_jobs : List[str] | None
property external_url : str | None
classmethod get_urn_type()
Get the URN type for this entity class.
- Return type: Type[MlModelGroupUrn]
- Returns: The URN type class that corresponds to this entity type.
property last_modified : datetime | None
property name : str | None
remove_downstream_job(downstream_job)
- Parameters: downstream_job (Union[str, DataProcessInstanceUrn])
- Return type: None
remove_training_job(training_job)
- Parameters: training_job (Union[str, DataProcessInstanceUrn])
- Return type: None
set_created(created)
- Parameters: created (datetime)
- Return type: None
set_custom_properties(custom_properties)
- Parameters: custom_properties (Dict[str, str])
- Return type: None
set_description(description)
- Parameters: description (str)
- Return type: None
set_downstream_jobs(downstream_jobs)
- Parameters: downstream_jobs (Sequence[Union[str, DataProcessInstanceUrn]])
- Return type: None
set_external_url(external_url)
- Parameters: external_url (str)
- Return type: None
set_last_modified(last_modified)
- Parameters: last_modified (datetime)
- Return type: None
set_name(display_name)
- Parameters: display_name (str)
- Return type: None
set_training_jobs(training_jobs)
- Parameters: training_jobs (Sequence[Union[str, DataProcessInstanceUrn]])
- Return type: None
property training_jobs : List[str] | None
property urn : MlModelGroupUrn
Get the entity’s URN.
- Returns: The URN that uniquely identifies this entity.
Dashboard
Bases: HasPlatformInstance, HasSubtype, HasOwnership, HasContainer, HasInstitutionalMemory, HasTags, HasTerms, HasDomain, Entity
Represents a dashboard in DataHub.
- Parameters:
- name (str)
- platform (Union[str, DataPlatformUrn])
- display_name (Optional[str])
- platform_instance (Union[str, DataPlatformInstanceUrn, None])
- description (str)
- external_url (Optional[str])
- dashboard_url (Optional[str])
- custom_properties (Optional[Dict[str, str]])
- last_modified (Optional[datetime])
- last_refreshed (Optional[datetime])
- input_datasets (Optional[List[Union[str, DatasetUrn, Dataset]]])
- charts (Optional[List[Union[str, ChartUrn, Chart]]])
- dashboards (Optional[List[Union[str, DashboardUrn, Dashboard]]])
- subtype (Optional[str])
- owners (Optional[List[Union[CorpUserUrn, CorpGroupUrn, Tuple[Union[CorpUserUrn, CorpGroupUrn], Union[str, OwnershipTypeUrn]], OwnerClass]]])
- links (Optional[Sequence[Union[str, Tuple[str, str], InstitutionalMemoryMetadataClass]]])
- tags (Optional[List[Union[str, TagUrn, TagAssociationClass]]])
- terms (Optional[List[Union[str, GlossaryTermUrn, GlossaryTermAssociationClass]]])
- domain (Union[str, DomainUrn, None])
- extra_aspects (Optional[List[TypeVar(Aspect, bound=_Aspect)]])
add_chart(chart)
Add a chart to the dashboard.
add_dashboard(dashboard)
Add a dashboard to the dashboard.
- Parameters: dashboard (Union[str, DashboardUrn, Dashboard])
- Return type: None
add_input_dataset(input_dataset)
Add an input dataset to the dashboard.
- Parameters: input_dataset (Union[str, DatasetUrn, Dataset])
- Return type: None
property charts : List[ChartUrn]
Get the charts of the dashboard.
property custom_properties : Dict[str, str]
Get the custom properties of the dashboard.
property dashboard_url : str | None
Get the dashboard URL.
property dashboards : List[DashboardUrn]
Get the dashboards of the dashboard.
property description : str | None
Get the description of the dashboard.
property display_name : str | None
Get the display name of the dashboard.
property external_url : str | None
Get the external URL of the dashboard.
classmethod get_urn_type()
Get the URN type for dashboards.
- Return type: Type[DashboardUrn]
- Returns: The DashboardUrn class.
property input_datasets : List[DatasetUrn]
Get the input datasets of the dashboard.
property last_modified : datetime | None
Get the last modification timestamp of the dashboard.
property last_refreshed : datetime | None
Get the last refresh timestamp of the dashboard.
property name : str
Get the name of the dashboard.
remove_chart(chart)
Remove a chart from the dashboard.
remove_input_dataset(input_dataset)
Remove an input dataset from the dashboard.
- Parameters: input_dataset (Union[str, DatasetUrn, Dataset])
- Return type: None
set_charts(charts)
Set the charts of the dashboard.
set_custom_properties(custom_properties)
Set the custom properties of the dashboard.
- Parameters: custom_properties (Dict[str, str])
- Return type: None
set_dashboard_url(dashboard_url)
Set the dashboard URL.
- Parameters: dashboard_url (str)
- Return type: None
set_dashboards(dashboards)
Set the dashboards of the dashboard.
- Parameters: dashboards (List[Union[str, DashboardUrn, Dashboard]])
- Return type: None
set_description(description)
Set the description of the dashboard.
- Parameters: description (str)
- Return type: None
set_display_name(display_name)
Set the display name of the dashboard.
- Parameters: display_name (str)
- Return type: None
set_external_url(external_url)
Set the external URL of the dashboard.
- Parameters: external_url (str)
- Return type: None
set_input_datasets(input_datasets)
Set the input datasets of the dashboard.
- Parameters: input_datasets (List[Union[str, DatasetUrn, Dataset]])
- Return type: None
set_last_modified(last_modified)
Set the last modification timestamp of the dashboard.
- Parameters: last_modified (datetime)
- Return type: None
set_last_refreshed(last_refreshed)
Set the last refresh timestamp of the dashboard.
- Parameters: last_refreshed (datetime)
- Return type: None
set_title(title)
Set the title of the dashboard.
- Parameters: title (str)
- Return type: None
property title : str
Get the title of the dashboard.
property urn : DashboardUrn
Get the entity’s URN.
- Returns: The URN that uniquely identifies this entity.
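For orientation, the sketch below writes out dashboard and chart URNs as strings. It assumes the usual two-part layout, `urn:li:dashboard:(<platform>,<id>)` and `urn:li:chart:(<platform>,<id>)`, and leaves out any `platform_instance` handling. Both helpers and the Looker identifiers are hypothetical, introduced only for illustration.

```python
def dashboard_urn_string(platform: str, dashboard_id: str) -> str:
    """Hypothetical helper: compose a dashboard URN string by hand."""
    return f"urn:li:dashboard:({platform},{dashboard_id})"


def chart_urn_string(platform: str, chart_id: str) -> str:
    """Hypothetical helper: compose a chart URN string by hand."""
    return f"urn:li:chart:({platform},{chart_id})"


dashboard = dashboard_urn_string("looker", "sales_overview")
charts = [chart_urn_string("looker", name) for name in ("revenue_by_day", "top_customers")]

print(dashboard)
for chart in charts:
    print(chart)
```

Unlike dataset URNs, these identifiers carry no environment component, which matches the absence of an `env` parameter in the Dashboard and Chart constructors.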
Chart
Bases: HasPlatformInstance, HasSubtype, HasOwnership, HasContainer, HasInstitutionalMemory, HasTags, HasTerms, HasDomain, Entity
Represents a chart in DataHub.
- Parameters:
- name (str)
- platform (Union[str, DataPlatformUrn])
- display_name (Optional[str])
- platform_instance (Union[str, DataPlatformInstanceUrn, None])
- description (Optional[str])
- external_url (Optional[str])
- chart_url (Optional[str])
- custom_properties (Optional[Dict[str, str]])
- last_modified (Optional[datetime])
- last_refreshed (Optional[datetime])
- chart_type (Union[str, ChartTypeClass, None])
- access (Optional[str])
- subtype (Optional[str])
- owners (Optional[List[Union[CorpUserUrn, CorpGroupUrn, Tuple[Union[CorpUserUrn, CorpGroupUrn], Union[str, OwnershipTypeUrn]], OwnerClass]]])
- links (Optional[Sequence[Union[str, Tuple[str, str], InstitutionalMemoryMetadataClass]]])
- tags (Optional[List[Union[str, TagUrn, TagAssociationClass]]])
- terms (Optional[List[Union[str, GlossaryTermUrn, GlossaryTermAssociationClass]]])
- domain (Union[str, DomainUrn, None])
- input_datasets (Optional[List[Union[str, DatasetUrn, Dataset]]])
- extra_aspects (Optional[List[TypeVar(Aspect, bound=_Aspect)]])
property access : str | None
Get the access level of the chart as a string.
add_input_dataset(input_dataset)
Add an input to the chart.
- Parameters: input_dataset (Union[str, DatasetUrn, Dataset])
- Return type: None
property chart_type : str | None
Get the type of the chart as a string.
property chart_url : str | None
Get the chart URL.
property custom_properties : Dict[str, str]
Get the custom properties of the chart.
property description : str | None
Get the description of the chart.
property display_name : str | None
Get the display name of the chart.
property external_url : str | None
Get the external URL of the chart.
classmethod get_urn_type()
Get the URN type for charts.
- Return type: Type[ChartUrn]
- Returns: The ChartUrn class.
property input_datasets : List[DatasetUrn]
Get the input datasets of the chart.
property last_modified : datetime | None
Get the last modification timestamp of the chart.
property last_refreshed : datetime | None
Get the last refresh timestamp of the chart.
property name : str
Get the name of the chart.
remove_input_dataset(input_dataset)
Remove an input from the chart.
- Parameters: input_dataset (Union[str, DatasetUrn, Dataset])
- Return type: None
set_access(access)
Set the access level of the chart.
- Parameters: access (Union[str, AccessLevelClass])
- Return type: None
set_chart_type(chart_type)
Set the type of the chart.
- Parameters: chart_type (Union[str, ChartTypeClass])
- Return type: None
set_chart_url(chart_url)
Set the chart URL.
- Parameters: chart_url (str)
- Return type: None
set_custom_properties(custom_properties)
Set the custom properties of the chart.
- Parameters: custom_properties (Dict[str, str])
- Return type: None
set_description(description)
Set the description of the chart.
- Parameters: description (str)
- Return type: None
set_display_name(display_name)
Set the display name of the chart.
- Parameters: display_name (str)
- Return type: None
set_external_url(external_url)
Set the external URL of the chart.
- Parameters: external_url (str)
- Return type: None
set_input_datasets(input_datasets)
Set the input datasets of the chart.
- Parameters: input_datasets (List[Union[str, DatasetUrn, Dataset]])
- Return type: None
set_last_modified(last_modified)
Set the last modification timestamp of the chart.
- Parameters: last_modified (datetime)
- Return type: None
set_last_refreshed(last_refreshed)
Set the last refresh timestamp of the chart.
- Parameters: last_refreshed (datetime)
- Return type: None
set_title(title)
Set the title of the chart.
- Parameters: title (str)
- Return type: None
property title : str
Get the title of the chart.
property urn : ChartUrn
Get the entity’s URN.
- Returns: The URN that uniquely identifies this entity.
DataJob
Bases: HasPlatformInstance, HasSubtype, HasContainer, HasOwnership, HasInstitutionalMemory, HasTags, HasTerms, HasDomain, HasStructuredProperties, Entity
Represents a data job in DataHub. A data job is an executable unit of a data pipeline, such as an Airflow task or a Spark job.
- Parameters:
- name (str)
- flow (Optional[DataFlow])
- flow_urn (Union[str, DataFlowUrn, None])
- platform_instance (Optional[str])
- display_name (Optional[str])
- description (Optional[str])
- external_url (Optional[str])
- custom_properties (Optional[Dict[str, str]])
- created (Optional[datetime])
- last_modified (Optional[datetime])
- subtype (Optional[str])
- owners (Optional[List[Union[CorpUserUrn, CorpGroupUrn, Tuple[Union[CorpUserUrn, CorpGroupUrn], Union[str, OwnershipTypeUrn]], OwnerClass]]])
- links (Optional[Sequence[Union[str, Tuple[str, str], InstitutionalMemoryMetadataClass]]])
- tags (Optional[List[Union[str, TagUrn, TagAssociationClass]]])
- terms (Optional[List[Union[str, GlossaryTermUrn, GlossaryTermAssociationClass]]])
- domain (Union[str, DomainUrn, None])
- inlets (Optional[List[Union[str, DatasetUrn]]])
- outlets (Optional[List[Union[str, DatasetUrn]]])
- structured_properties (Optional[Dict[Union[str, StructuredPropertyUrn], Sequence[Union[str, float, int]]]])
- extra_aspects (Optional[List[TypeVar(Aspect, bound=_Aspect)]])
property created : datetime | None
Get the creation timestamp of the data job.
property custom_properties : Dict[str, str]
Get the custom properties of the data job.
property description : str | None
Get the description of the data job.
property display_name : str | None
Get the display name of the data job.
property external_url : str | None
Get the external URL of the data job.
property flow_urn : DataFlowUrn
Get the data flow associated with the data job.
classmethod get_urn_type()
Get the URN type for data jobs.
- Return type: Type[DataJobUrn]
property inlets : List[DatasetUrn]
Get the inlets of the data job.
property last_modified : datetime | None
Get the last modification timestamp of the data job.
property name : str
Get the name of the data job.
property outlets : List[DatasetUrn]
Get the outlets of the data job.
set_created(created)
Set the creation timestamp of the data job.
- Parameters: created (datetime)
- Return type: None
set_custom_properties(custom_properties)
Set the custom properties of the data job.
- Parameters: custom_properties (Dict[str, str])
- Return type: None
set_description(description)
Set the description of the data job.
- Parameters: description (str)
- Return type: None
set_display_name(display_name)
Set the display name of the data job.
- Parameters: display_name (str)
- Return type: None
set_external_url(external_url)
Set the external URL of the data job.
- Parameters: external_url (str)
- Return type: None
set_inlets(inlets)
Set the inlets of the data job.
- Parameters: inlets (List[Union[str, DatasetUrn]])
- Return type: None
set_last_modified(last_modified)
Set the last modification timestamp of the data job.
- Parameters: last_modified (datetime)
- Return type: None
set_outlets(outlets)
Set the outlets of the data job.
- Parameters: outlets (List[Union[str, DatasetUrn]])
- Return type: None
property urn : DataJobUrn
Get the entity’s URN.
- Returns: The URN that uniquely identifies this entity.
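A data job is identified relative to its flow, which is why the constructor takes `flow` or `flow_urn` rather than a platform and environment of its own. The sketch below shows the nesting in string form, assuming the usual layouts `urn:li:dataFlow:(<platform>,<flow name>,<env>)` and `urn:li:dataJob:(<flow urn>,<job name>)`. Both helpers and the Airflow names are hypothetical.

```python
def data_flow_urn_string(platform: str, name: str, env: str = "PROD") -> str:
    """Hypothetical helper: compose a data flow URN string by hand."""
    return f"urn:li:dataFlow:({platform},{name},{env})"


def data_job_urn_string(flow_urn: str, job_name: str) -> str:
    """Hypothetical helper: a job URN embeds the URN of the flow it belongs to."""
    return f"urn:li:dataJob:({flow_urn},{job_name})"


flow_urn = data_flow_urn_string("airflow", "daily_orders")
job_urn = data_job_urn_string(flow_urn, "load_orders")

print(flow_urn)
# urn:li:dataFlow:(airflow,daily_orders,PROD)
print(job_urn)
# urn:li:dataJob:(urn:li:dataFlow:(airflow,daily_orders,PROD),load_orders)
```

Because the flow URN is embedded in the job URN, two jobs with the same name in different flows remain distinct entities.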
DataFlow
Bases: HasPlatformInstance, HasSubtype, HasOwnership, HasContainer, HasInstitutionalMemory, HasTags, HasTerms, HasDomain, HasStructuredProperties, Entity
Represents a dataflow in DataHub. A dataflow is a pipeline or workflow that groups related data jobs, such as an Airflow DAG. This class provides methods for managing dataflow metadata, including aspects like ownership, tags, and terms.
- Parameters:
- name (str)
- platform (str)
- display_name (Optional[str])
- platform_instance (Optional[str])
- env (str)
- description (Optional[str])
- external_url (Optional[str])
- custom_properties (Optional[Dict[str, str]])
- created (Optional[datetime])
- last_modified (Optional[datetime])
- subtype (Optional[str])
- owners (Optional[List[Union[CorpUserUrn, CorpGroupUrn, Tuple[Union[CorpUserUrn, CorpGroupUrn], Union[str, OwnershipTypeUrn]], OwnerClass]]])
- links (Optional[Sequence[Union[str, Tuple[str, str], InstitutionalMemoryMetadataClass]]])
- tags (Optional[List[Union[str, TagUrn, TagAssociationClass]]])
- terms (Optional[List[Union[str, GlossaryTermUrn, GlossaryTermAssociationClass]]])
- domain (Union[str, DomainUrn, None])
- parent_container (Union[Container, ContainerKey, List[Union[Urn, str]], Unset])
- structured_properties (Optional[Dict[Union[str, StructuredPropertyUrn], Sequence[Union[str, float, int]]]])
- extra_aspects (Optional[List[TypeVar(Aspect, bound=_Aspect)]])
property created : datetime | None
Get the creation timestamp of the dataflow.
- Returns: The creation timestamp if set, None otherwise.
property custom_properties : Dict[str, str]
Get the custom properties of the dataflow.
- Returns: Dictionary of custom properties.
property description : str | None
Get the description of the dataflow.
- Returns: The description if set, None otherwise.
property display_name : str | None
Get the display name of the dataflow.
- Returns: The display name if set, None otherwise.
property env : str | FabricTypeClass | None
Get the environment of the dataflow.
property external_url : str | None
Get the external URL of the dataflow.
- Returns: The external URL if set, None otherwise.
classmethod get_urn_type()
Get the URN type for dataflows.
- Return type: Type[DataFlowUrn]
- Returns: The DataFlowUrn class.
property last_modified : datetime | None
Get the last modification timestamp of the dataflow.
- Returns: The last modification timestamp if set, None otherwise.
property name : str
Get the name of the dataflow.
- Returns: The name of the dataflow.
set_created(created)
Set the creation timestamp of the dataflow.
- Parameters: created (datetime): The creation timestamp to set.
- Return type: None
set_custom_properties(custom_properties)
Set the custom properties of the dataflow.
- Parameters: custom_properties (Dict[str, str]): Dictionary of custom properties to set.
- Return type: None
set_description(description)
Set the description of the dataflow.
- Parameters: description (str): The description to set.
- Return type: None
NOTE
If called during ingestion, this will warn if overwriting a non-ingestion description.
set_display_name(display_name)
Set the display name of the dataflow.
- Parameters: display_name (str): The display name to set.
- Return type: None
set_external_url(external_url)
Set the external URL of the dataflow.
- Parameters: external_url (str): The external URL to set.
- Return type: None
set_last_modified(last_modified)
- Parameters: last_modified (datetime)
- Return type: None
property urn : DataFlowUrn
Get the entity’s URN.
- Returns: The URN that uniquely identifies this entity.