While working on streamlining DP workflows, I was looking for a reliable way to tell a dataset fork from its main/parent dataset (the upstream dataset, in Crunch terms) and to return the parent's id when working with its fork.
To return the parent id if we are working on a fork:
import pycrunch.lemonpy

def parent(dataset):
    """
    Return the parent's (upstream) dataset id if the passed dataset is a fork.
    """
    try:
        # Only forks expose the upstream_delta catalog; otherwise the request fails.
        upstream = dataset.resource.actions.upstream_delta
    except pycrunch.lemonpy.ClientError:
        return None
    return upstream.value.upstream.id
To simply test if we are working on a fork:
def is_fork(dataset):
    """
    Test if the current dataset is a fork.
    """
    return parent(dataset) is not None
Adding these two would help greatly. What do you think?
I've already confirmed with Alessandro that we can use the dataset/{id}/actions/upstream_delta catalog (see https://docs.crunch.io/feature-guide/feature-versioning.html?highlight=fork), leading to the two handy methods above (they should be properties if added to the scrunch dataset object).
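To illustrate the "should be properties" idea, here is a minimal sketch of how the two helpers might look as properties. Everything in it is an assumption made for illustration: `ForkAwareDataset` is a hypothetical wrapper rather than the actual scrunch dataset class, `ClientError` is a local stand-in for `pycrunch.lemonpy.ClientError`, and the fake resources only mimic the attribute path used in the functions above so the example runs without a Crunch connection.

```python
from types import SimpleNamespace


class ClientError(Exception):
    """Local stand-in for pycrunch.lemonpy.ClientError (assumption, offline use)."""


class ForkAwareDataset:
    """Hypothetical wrapper; NOT the real scrunch dataset class."""

    def __init__(self, resource):
        self.resource = resource

    @property
    def parent_id(self):
        """Upstream dataset id if this dataset is a fork, else None."""
        try:
            upstream = self.resource.actions.upstream_delta
        except ClientError:
            return None
        return upstream.value.upstream.id

    @property
    def is_fork(self):
        """True if an upstream (parent) dataset id could be resolved."""
        return self.parent_id is not None


class _NoUpstreamActions:
    """Fake actions catalog for a non-fork: the lookup raises ClientError."""

    @property
    def upstream_delta(self):
        raise ClientError("no upstream_delta for this dataset")


def fake_fork_resource(upstream_id):
    """Build a fake resource exposing actions.upstream_delta.value.upstream.id."""
    delta = SimpleNamespace(
        value=SimpleNamespace(upstream=SimpleNamespace(id=upstream_id))
    )
    return SimpleNamespace(actions=SimpleNamespace(upstream_delta=delta))


fork = ForkAwareDataset(fake_fork_resource("abc123"))
main = ForkAwareDataset(SimpleNamespace(actions=_NoUpstreamActions()))
```

With properties, calling code would read `ds.is_fork` and `ds.parent_id` instead of passing the dataset to free functions. Whether the real class should cache the lookup to avoid a repeated request is left open here.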