Add validate_df task to OutlookToADLS flow #788
Please rename the new parameter to validate_df_dict.
viadot/flows/outlook_to_adls.py
Outdated
@@ -29,6 +30,7 @@ def __init__(
limit: int = 10000,
timeout: int = 3600,
if_exists: Literal["append", "replace", "skip"] = "append",
validation_df_dict: dict = None,
validate_df_dict: dict = None,
fixed
viadot/flows/outlook_to_adls.py
Outdated
@@ -54,6 +56,8 @@ def __init__(
timeout(int, optional): The amount of time (in seconds) to wait while running this task before
a timeout occurs. Defaults to 3600.
if_exists (Literal['append', 'replace', 'skip'], optional): What to do if the local file already exists. Defaults to "append".
validation_df_dict (dict, optional): An optional dictionary to verify the received dataframe.
validate_df_dict
fixed
viadot/flows/outlook_to_adls.py
Outdated
@@ -65,6 +69,9 @@ def __init__(
self.local_file_path = local_file_path
self.if_exsists = if_exists

# Validate DataFrame
self.validation_df_dict = validation_df_dict
Same as above: validate_df_dict.
fixed
viadot/flows/outlook_to_adls.py
Outdated
@@ -98,6 +105,11 @@ def gen_flow(self) -> Flow:
dfs = apply_map(self.gen_outlook_df, self.mailbox_list, flow=self)

df = union_dfs_task.bind(dfs, flow=self)

if self.validation_df_dict:
    validation = validate_df(df=df, tests=self.validation_df_dict, flow=self)
Same as above: validate_df_dict.
fixed
Summary
Adds the validate_df_dict parameter to the OutlookToADLS flow
Adds the validate_df task to the OutlookToADLS flow
Importance
New functionality to check the data
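For reviewers who want a feel for the pattern, here is a minimal, standard-library-only sketch of an optional validation step driven by a dict of tests. It is not viadot's validate_df implementation: the rows are plain dicts standing in for a DataFrame, and the names validate_rows, run_flow, ValidationError and the test keys min_row_count and required_columns are all hypothetical, chosen only for illustration.

```python
from typing import Any, Dict, List, Optional

# Stand-in for a DataFrame: one dict per row (assumption, not viadot's type).
Rows = List[Dict[str, Any]]


class ValidationError(Exception):
    """Raised when one or more checks fail."""


def validate_rows(rows: Rows, tests: Dict[str, Any]) -> None:
    """Run the checks named in `tests` and raise if any of them fail.

    The test names used here are hypothetical examples.
    """
    failures = []

    # Check 1: the dataset has at least the requested number of rows.
    if "min_row_count" in tests and len(rows) < tests["min_row_count"]:
        failures.append(
            f"expected at least {tests['min_row_count']} rows, got {len(rows)}"
        )

    # Check 2: every listed column is present in every row.
    for column in tests.get("required_columns", []):
        if any(column not in row for row in rows):
            failures.append(f"column {column!r} is missing from at least one row")

    if failures:
        raise ValidationError("; ".join(failures))


def run_flow(rows: Rows, validate_df_dict: Optional[Dict[str, Any]] = None) -> Rows:
    """Mirror the optional step in the diff: validate only when a dict is given."""
    if validate_df_dict:
        validate_rows(rows, validate_df_dict)
    return rows
```

Leaving validate_df_dict at its default of None skips the check entirely, so existing callers of the flow would be unaffected under this design.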
Checklist
This PR:
CONTRIBUTING.md
CHANGELOG.md