Add properties: is_fork and parent #366

Open
alextanski opened this issue Jul 16, 2019 · 0 comments

@alextanski

While working on streamlining DP workflows, I was looking for a reliable way to tell a dataset fork from its main/parent dataset (the upstream dataset, in Crunch terms) and to get the parent's id when working with its fork.

I've already confirmed with Alessandro that we can use the dataset/{id}/actions/upstream_delta catalog (see https://docs.crunch.io/feature-guide/feature-versioning.html?highlight=fork). That leads to two handy helpers, which should be properties if added to the scrunch dataset object:

To return the parent id if we are working on a fork:

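import pycrunch.lemonpy  # provides ClientError

def parent(dataset):
    """
    Return the parent's (upstream) dataset id if the passed dataset is a fork,
    otherwise None.
    """
    try:
        upstream = dataset.resource.actions.upstream_delta
    except pycrunch.lemonpy.ClientError:
        # A ClientError here is taken to mean there is no upstream, i.e. not a fork.
        return None
    return upstream.value.upstream.id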

To simply test if we are working on a fork:

def is_fork(dataset):
    """
    Test if the current dataset is a fork.
    """
    return parent(dataset) is not None
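
As properties, the two helpers might look roughly like the sketch below. This is only an illustration: `ForkAwareDataset` is a hypothetical wrapper, not the actual scrunch dataset class, and the local `ClientError` plus the fake resources are stand-ins so the example runs without pycrunch or a Crunch server. The attribute path `actions.upstream_delta.value.upstream.id` is taken from the functions above.

```python
from types import SimpleNamespace


class ClientError(Exception):
    """Stand-in for pycrunch.lemonpy.ClientError (assumption, for this sketch only)."""


class ForkAwareDataset:
    """Hypothetical wrapper, not the actual scrunch dataset class."""

    def __init__(self, resource):
        self.resource = resource

    @property
    def parent(self):
        """Id of the upstream dataset, or None if this dataset is not a fork."""
        try:
            upstream = self.resource.actions.upstream_delta
        except ClientError:
            return None
        return upstream.value.upstream.id

    @property
    def is_fork(self):
        """True if an upstream (parent) dataset exists."""
        return self.parent is not None


class _NoUpstreamActions:
    """Fake actions catalog that fails the way a non-fork lookup is assumed to."""

    @property
    def upstream_delta(self):
        raise ClientError("no upstream")


# Fake resources standing in for real API responses.
fork_resource = SimpleNamespace(
    actions=SimpleNamespace(
        upstream_delta=SimpleNamespace(
            value=SimpleNamespace(upstream=SimpleNamespace(id="parent-id"))
        )
    )
)
main_resource = SimpleNamespace(actions=_NoUpstreamActions())

fork = ForkAwareDataset(fork_resource)
main = ForkAwareDataset(main_resource)
print(fork.is_fork, fork.parent)  # True parent-id
print(main.is_fork, main.parent)  # False None
```

One thing to keep in mind with this shape: `is_fork` triggers the same lookup as `parent`, so checking `is_fork` and then reading `parent` hits the catalog twice unless the result is cached.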

Adding these two would help greatly. What do you think?

alextanski added this to the Wishlist milestone Jul 16, 2019