Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch from StrictYAML to Pydantic #870

Merged
merged 19 commits into from
Mar 8, 2024

Conversation

sfc-gh-jsikorski
Copy link
Contributor

@sfc-gh-jsikorski sfc-gh-jsikorski commented Mar 5, 2024

Pre-review checklist

  • I've confirmed that instructions included in README.md are still correct after my changes in the codebase.
  • I've added or updated automated unit tests to verify correctness of my new code.
  • I've added or updated integration tests to verify correctness of my new code.
  • I've confirmed that my changes are working by executing CLI's commands manually.
  • I've confirmed that my changes are up-to-date with the target branch.
  • I've described my changes in the release notes.
  • I've described my changes in the section below.

Changes description

Switch to Pydantic from StrictYaml. No we use Models for project definition files validation. This gives us:

  • IDE-supported, safe access to definition fields
  • Better control on types, names etc.
  • Stricter control over input files
  • No need to use hardcoded strings
  • Ability to generate JSON schemas and docs from code

@@ -794,6 +794,7 @@ def test_create_dev_app_create_new_quoted(
- src: app/streamlit/*.py
dest: ui/


Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this dest:ui/ placed here intentionally? It's an extra value, that is not used in schema? According to design doc, we should not accept extra value, so this causes an error. Should it be deleted?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sfc-gh-bgoel @sfc-gh-cgorrie can you advise?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is an indentation error -- this is part of the last entry of the artifacts collection (i.e. { src: "app/streamlit/*.py", dest: "ui/" }).

RELEASE-NOTES.md Outdated Show resolved Hide resolved
)
streamlit: Optional[StreamlitSchema] = Field(
title="Native app definitions for the project", default=None
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's add validator for definition_version

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you mean definition == 1?

Copy link
Contributor

@sfc-gh-cgorrie sfc-gh-cgorrie Mar 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another way to think about this -- when we get to project definition v2 (design currently in progress) it will look completely different. We should have a way to separate out the definitions by definition_version rather than have to have a complex base model that has to support both. I agree that for now ensuring definition_version == 1 is important, but we should make sure that when we introduce definition_version == 2 it's something that can live alongside this model rather than "within" it, if we have enough changes that it makes sense.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, so my proposition is to add definition_version==1 to field validation. If user provides another version of definition, it's pointless to proceed with parsing any fields.
Then, when we add new versions, we can change load_project_definition() logic to decide which model should be used for provided data.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, let's have ==1 for now. We will add more validation once we start working on v2.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added

DB_SCHEMA_AND_NAME = f"{IDENTIFIER}[.]{IDENTIFIER}[.]{IDENTIFIER}"
SCHEMA_AND_NAME = f"{IDENTIFIER}[.]{IDENTIFIER}"
SCHEMA_AND_NAME = f"{IDENTIFIER}[.]{IDENTIFIER_NO_LENGTH}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain this change?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was a lost artifact of long battle with regexes. Long story.

query_warehouse=streamlit["query_warehouse"],
additional_source_files=streamlit.get("additional_source_files"),
query_warehouse=streamlit.query_warehouse,
additional_source_files=streamlit.additional_source_files,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I love this change, getting rid of all those hardcoded strings rocks 🪨

Comment on lines 62 to 65
assert (
"Field required [type=missing, input_value={'name': 'underspecified'}, input_type=dict]"
in str(exc_info.value)
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This information should be as useful as the previous one. In current version I don't know if artifacts are the problem

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implemented

@sfc-gh-jsikorski sfc-gh-jsikorski force-pushed the jsikorski/1163968-Pydantic-PoC-for-Streamlit branch from 92736d8 to 14f6c3b Compare March 6, 2024 14:34
sfc-gh-astus and others added 4 commits March 6, 2024 15:41
This reverts commit 80467b3.

mraba/app-factory: create app with factory (#860)

* mraba/app-factory: create app with factory

SNOW-1043081: Adding support for qualified names for image repositories. (#823)

* SNOW-1043081: Adding support for qualified image repository names

* SNOW-1043081: Fixing test imports

* SNOW-1043081: Adding tests for getting image repository url without db or schema

SNOW-1011771: Added generic REPLACE, IF EXISTS, IF NOT EXISTS flags (#826)

* SNOW-1011771: Adding generic OR REPLACE, IF EXISTS, IF NOT EXISTS flags to flags.py

* SNOW-1011771: Using generic ReplaceOption in snowpark deploy and streamlit deploy

* SNOW-1011771: Using generic IfNotExistsOption in compute pool create and updating unit tests.

* SNOW-1011771: Using generic IfNotExistsOption in service create and updating unit tests

* SNOW-1011771: Using generic ReplaceOption and IfNotExistsOption in image-repository create.

* SNOW-1011771: Fixup

* SNOW-1011771: Update release notes

* SNOW-1011771: Update test_help_messages

* SNOW-1011771: precommit

* SNOW-1011771: Adding validation that only one create mode option can be set at once

* fixup

* SNOW-1011771: Updating tests for REPLACE AND IF NOT EXISTS case on image-repository create to throw error

* SNOW-1011771: Adding snapshots

* SNOW-1011771: Adding a new mutually_exclusive field to OverrideableOption

* formatting

* SNOW-1011771: Adding tests for OverrideableOption

* SNOW-1011771: Fixing test failures due to improperly quoted string

Add snow --help to test_help_messages (#821)

* Add snow --help to test_help_messages

* update snapshot

Avoid plain print, make sure silent is eager flag (#871)

[NADE] Update CODEOWNERS to use NADE team id. (#873)

update to using nade team in codeowners

New workflow to stop running workflows after new commit (#862)

* new workflow

* new workflow

* new workflow

* new workflow

* typo fix

* typo fix

* import fix

* import fix

* import fix

* import fix

* import fix

* import fix

* import fix

* new approach

* new approach

* new approach

* new approach

* new approach

* New approach

* added to test

* Added to more workflows

* Dummy commit

Schemas

adjusting native apps to streamlit

fixing streamlit

fixies after unit tests

fixies after unit tests

fixing for snowflake

fixing for snowflake

Fixes after review

Fixes after review

Fixes after review
@sfc-gh-cgorrie
Copy link
Contributor

@sfc-gh-jsikorski just thinking a bit ahead to IDE integration. It looks like pydantic has the ability to generate JSON schema, which we'd like to consume inside of IDEs. Eventually we should extract the project definition schemas into their own open-source project which can also host these JSON schemas as part of a CD process.

Wanted to confirm this medium-term vision is compatible with your future plans too!

Copy link
Contributor

@sfc-gh-cgorrie sfc-gh-cgorrie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some initial questions and comments while I dive deeper into the code.

@@ -100,5 +101,5 @@ def _user_definition_file_if_available(project_path: Path) -> Optional[Path]:
)

@functools.cached_property
def project_definition(self) -> dict:
def project_definition(self) -> ProjectDefinition:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Love the type safety Pydantic gives us!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Me too :)
As i cannot answer the comment above - yes, Pydantic can be used to generate JSON schemas, but i think i'll summon @sfc-gh-turbaszek to answer the long-term vision plans :D

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And we could use description and examples fields of FieldInfo, to have the schema well documented.

)
streamlit: Optional[StreamlitSchema] = Field(
title="Native app definitions for the project", default=None
)
Copy link
Contributor

@sfc-gh-cgorrie sfc-gh-cgorrie Mar 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another way to think about this -- when we get to project definition v2 (design currently in progress) it will look completely different. We should have a way to separate out the definitions by definition_version rather than have to have a complex base model that has to support both. I agree that for now ensuring definition_version == 1 is important, but we should make sure that when we introduce definition_version == 2 it's something that can live alongside this model rather than "within" it, if we have enough changes that it makes sense.

@@ -794,6 +794,7 @@ def test_create_dev_app_create_new_quoted(
- src: app/streamlit/*.py
dest: ui/


Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is an indentation error -- this is part of the last entry of the artifacts collection (i.e. { src: "app/streamlit/*.py", dest: "ui/" }).

@sfc-gh-jsikorski sfc-gh-jsikorski marked this pull request as ready for review March 6, 2024 16:42
Copy link
Contributor

@sfc-gh-cgorrie sfc-gh-cgorrie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some more comments and suggestions.

Also I have a question, is it intentional that we have a test_pydantic_schemas.py that is empty?

RELEASE-NOTES.md Outdated
@@ -10,6 +10,8 @@
## Fixes and improvements
* Adding `--image-name` option for image name argument in `spcs image-repository list-tags` for consistency with other commands.
* Fixed errors during `spcs image-registry login` not being formatted correctly.
* Project definition no longer accept extra fields. Any extra field will cause an error.
* Project definition no longer accept extra fields. Any extra field will cause an error.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duplicate line

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@sfc-gh-turbaszek
Copy link
Contributor

@sfc-gh-jsikorski just thinking a bit ahead to IDE integration. It looks like pydantic has the ability to generate JSON schema, which we'd like to consume inside of IDEs. Eventually we should extract the project definition schemas into their own open-source project which can also host these JSON schemas as part of a CD process.

Wanted to confirm this medium-term vision is compatible with your future plans too!

The ability to generate JSON schema is one of the main goals (IDE, docs, etc). In terms of vision - extracting this to separate package is not planned yet, but once we have py ecosystem then we can easily do so.

@@ -25,6 +25,7 @@ dependencies = [
"typer==0.9.0",
"urllib3>=1.21.1,<2.3",
"GitPython==3.1.42",
"pydantic==2.6.3"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we remove need for strictyml? Or should it be done in separate PR?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Un-done. Left pydantic for some uses not directly related to project definitions

@sfc-gh-jsikorski sfc-gh-jsikorski enabled auto-merge (squash) March 8, 2024 12:24
Copy link
Contributor

@sfc-gh-bgoel sfc-gh-bgoel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main changes LGTM. Left some comments about messages shown to the user which need to be in accordance with our PMM team's guidelines.


class Application(UpdatableModel):
role: Optional[str] = Field(
title="Role to use when creating the application instance and consumer-side objects",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: "application object" instead of "application instance"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

default=None,
)
name: Optional[str] = Field(
title="Name of the application created when you run the snow app run command",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"... application object created ..."

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

default=None,
)
warehouse: Optional[str] = IdentifierField(
title="Name of the application created when you run the snow app run command",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto, and for line 26 as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@classmethod
def validate_source_stage(cls, input_value: str):
if not re.match(SCHEMA_AND_NAME, input_value):
raise ValueError("Incorrect value for Native Apps source stage value")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can rephrase to:
Incorrect value for source_stage property of native_app.
The goal is to avoid using "Native Apps" as a term, which comes from our marketing teams.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, I think latter one we can add something like regex_validator(value, pattern) that would cause a generic error like Incorrect value for native_app.source_stage. Allowed values: pattern.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Contributor

@sfc-gh-bgoel sfc-gh-bgoel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM for native app and project changes, thank you!

Copy link
Contributor

@sfc-gh-bgoel sfc-gh-bgoel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re-approving :)

@sfc-gh-jsikorski sfc-gh-jsikorski merged commit 99b96c4 into main Mar 8, 2024
18 checks passed
@sfc-gh-jsikorski sfc-gh-jsikorski deleted the jsikorski/1163968-Pydantic-PoC-for-Streamlit branch March 8, 2024 15:58
sfc-gh-sichen pushed a commit that referenced this pull request Oct 17, 2024
* Revert "Bump tomlkit from 0.12.3 to 0.12.4 (#848)" (#863)

This reverts commit a3ed7cc.

mraba/app-factory: create app with factory (#860)

* mraba/app-factory: create app with factory

SNOW-1043081: Adding support for qualified names for image repositories. (#823)

* SNOW-1043081: Adding support for qualified image repository names

* SNOW-1043081: Fixing test imports

* SNOW-1043081: Adding tests for getting image repository url without db or schema

SNOW-1011771: Added generic REPLACE, IF EXISTS, IF NOT EXISTS flags (#826)

* SNOW-1011771: Adding generic OR REPLACE, IF EXISTS, IF NOT EXISTS flags to flags.py

* SNOW-1011771: Using generic ReplaceOption in snowpark deploy and streamlit deploy

* SNOW-1011771: Using generic IfNotExistsOption in compute pool create and updating unit tests.

* SNOW-1011771: Using generic IfNotExistsOption in service create and updating unit tests

* SNOW-1011771: Using generic ReplaceOption and IfNotExistsOption in image-repository create.

* SNOW-1011771: Fixup

* SNOW-1011771: Update release notes

* SNOW-1011771: Update test_help_messages

* SNOW-1011771: precommit

* SNOW-1011771: Adding validation that only one create mode option can be set at once

* fixup

* SNOW-1011771: Updating tests for REPLACE AND IF NOT EXISTS case on image-repository create to throw error

* SNOW-1011771: Adding snapshots

* SNOW-1011771: Adding a new mutually_exclusive field to OverrideableOption

* formatting

* SNOW-1011771: Adding tests for OverrideableOption

* SNOW-1011771: Fixing test failures due to improperly quoted string

Add snow --help to test_help_messages (#821)

* Add snow --help to test_help_messages

* update snapshot

Avoid plain print, make sure silent is eager flag (#871)

[NADE] Update CODEOWNERS to use NADE team id. (#873)

update to using nade team in codeowners

New workflow to stop running workflows after new commit (#862)

* new workflow

* new workflow

* new workflow

* new workflow

* typo fix

* typo fix

* import fix

* import fix

* import fix

* import fix

* import fix

* import fix

* import fix

* new approach

* new approach

* new approach

* new approach

* new approach

* New approach

* added to test

* Added to more workflows

* Dummy commit

Schemas

adjusting native apps to streamlit

fixing streamlit

fixies after unit tests

fixies after unit tests

fixing for snowflake

fixing for snowflake

Fixes after review

Fixes after review

Fixes after review

* Fixes after review

* Implemented error class

* Fixes

* Fixes

* Fixes

* Fixes

* typo fix

* Added unit test

* Added unit test

* Fixes after review

* Fixes after review

* Fixes

* Fixes

* Fixes

---------

Co-authored-by: Adam Stus <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants