From b3d161f696fa0612f8fa2601fa960c0191cfc61b Mon Sep 17 00:00:00 2001 From: Future-Outlier Date: Mon, 18 Nov 2024 10:55:54 +0800 Subject: [PATCH 01/20] [Docs] MessagePack IDL, Pydantic Support and Attribute Access Signed-off-by: Future-Outlier --- docs/user_guide/data_types_and_io/dataclass.md | 11 ++++++++++- docs/user_guide/data_types_and_io/index.md | 1 + docs/user_guide/data_types_and_io/pydantic.md | 18 ++++++++++++++++++ 3 files changed, 29 insertions(+), 1 deletion(-) create mode 100644 docs/user_guide/data_types_and_io/pydantic.md diff --git a/docs/user_guide/data_types_and_io/dataclass.md b/docs/user_guide/data_types_and_io/dataclass.md index 926c9d35b4..fef6e20569 100644 --- a/docs/user_guide/data_types_and_io/dataclass.md +++ b/docs/user_guide/data_types_and_io/dataclass.md @@ -12,7 +12,16 @@ Flytekit uses the [Mashumaro library](https://github.com/Fatal1ty/mashumaro) to serialize and deserialize dataclasses. :::{important} -If you're using Flytekit version below v1.11.1, you will need to add `from dataclasses_json import dataclass_json` to your imports and decorate your dataclass with `@dataclass_json`. +If you're using Flytekit version < v1.11.1, you will need to add `from dataclasses_json import dataclass_json` to your imports and decorate your dataclass with `@dataclass_json`. +::: + +:::{important} +Flytekit version < v1.14.0 will produce protobuf struct literal for dataclasses. + +Flytekit version >= v1.14.0 will produce msgpack bytes literal for dataclasses. + +If you're using Flytekit version >= v1.14.0 and you want to produce protobuf struct literal for dataclasses, you can +set environment variable `FLYTE_USE_OLD_DC_FORMAT` to `true`. 
::: ```{note} diff --git a/docs/user_guide/data_types_and_io/index.md b/docs/user_guide/data_types_and_io/index.md index 3280054696..d8e72dabce 100644 --- a/docs/user_guide/data_types_and_io/index.md +++ b/docs/user_guide/data_types_and_io/index.md @@ -144,6 +144,7 @@ flytefile flytedirectory structureddataset dataclass +pydantic accessing_attributes pytorch_type enum_type diff --git a/docs/user_guide/data_types_and_io/pydantic.md b/docs/user_guide/data_types_and_io/pydantic.md new file mode 100644 index 0000000000..edd60e860d --- /dev/null +++ b/docs/user_guide/data_types_and_io/pydantic.md @@ -0,0 +1,18 @@ +(pydantic)= + +# Pydantic BaseModel + +```{eval-rst} +.. tags:: Basic +``` + +When you've multiple values that you want to send across Flyte entities, and you want them to have, you can use a `pydantic.BaseModel`. +Note: +You can put Dataclass and FlyteTypes (FlyteFile, FlyteDirectory, FlyteSchema, and StructuredDataset) in a pydantic BaseModel. + +:::{important} +Pydantic BaseModle V2 only works when you are using flytekit version >= v1.14.0. + +If you're using Flytekit version >= v1.14.0 and you want to produce protobuf struct literal for pydantic basemodels, +you can set environment variable `FLYTE_USE_OLD_DC_FORMAT` to `true`. +::: From 0a3cdbe594dcd9654ca53355b4e2b8eea4bf711d Mon Sep 17 00:00:00 2001 From: Future-Outlier Date: Tue, 19 Nov 2024 10:01:28 +0800 Subject: [PATCH 02/20] support Signed-off-by: Future-Outlier --- docs/user_guide/data_types_and_io/pydantic.md | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/docs/user_guide/data_types_and_io/pydantic.md b/docs/user_guide/data_types_and_io/pydantic.md index edd60e860d..c126c6d78e 100644 --- a/docs/user_guide/data_types_and_io/pydantic.md +++ b/docs/user_guide/data_types_and_io/pydantic.md @@ -11,8 +11,11 @@ Note: You can put Dataclass and FlyteTypes (FlyteFile, FlyteDirectory, FlyteSchema, and StructuredDataset) in a pydantic BaseModel. 
:::{important} -Pydantic BaseModle V2 only works when you are using flytekit version >= v1.14.0. +Pydantic BaseModel V2 only works when you are using flytekit version >= v1.14.0. +::: -If you're using Flytekit version >= v1.14.0 and you want to produce protobuf struct literal for pydantic basemodels, +:::{important} +If you're using Flytekit version >= v1.14.0 and you want to produce protobuf struct literal for pydantic basemodels, you can set environment variable `FLYTE_USE_OLD_DC_FORMAT` to `true`. ::: + From 890639bfe04eee97af891109bb45709cbc5474ad Mon Sep 17 00:00:00 2001 From: Future-Outlier Date: Wed, 20 Nov 2024 12:48:34 +0800 Subject: [PATCH 03/20] update Signed-off-by: Future-Outlier --- .../data_types_and_io/accessing_attributes.md | 4 + docs/user_guide/data_types_and_io/pydantic.md | 73 +++++++++++++++++++ 2 files changed, 77 insertions(+) diff --git a/docs/user_guide/data_types_and_io/accessing_attributes.md b/docs/user_guide/data_types_and_io/accessing_attributes.md index f2783afacf..a08f20b79a 100644 --- a/docs/user_guide/data_types_and_io/accessing_attributes.md +++ b/docs/user_guide/data_types_and_io/accessing_attributes.md @@ -11,6 +11,10 @@ Note that while this functionality may appear to be the normal behavior of Pytho Consequently, accessing attributes in this manner is, in fact, a specially implemented feature. This functionality facilitates the direct passing of output attributes within workflows, enhancing the convenience of working with complex data structures. +```{note} +Flytekit version >= v1.14.0 supports Pydantic BaseModel V2, you can do attribute access on Pydantic BaseModel V2 as well. +``` + ```{note} To clone and run the example code on this page, see the [Flytesnacks repo][flytesnacks]. 
``` diff --git a/docs/user_guide/data_types_and_io/pydantic.md b/docs/user_guide/data_types_and_io/pydantic.md index c126c6d78e..d75e0faf48 100644 --- a/docs/user_guide/data_types_and_io/pydantic.md +++ b/docs/user_guide/data_types_and_io/pydantic.md @@ -19,3 +19,76 @@ If you're using Flytekit version >= v1.14.0 and you want to produce protobuf str you can set environment variable `FLYTE_USE_OLD_DC_FORMAT` to `true`. ::: + +```{note} +To clone and run the example code on this page, see the [Flytesnacks repo][flytesnacks]. +``` + +To begin, import the necessary dependencies: + +```{literalinclude} /examples/data_types_and_io/data_types_and_io/pydantic_basemodel.py +:caption: data_types_and_io/pydantic_basemodel.py +:lines: 1-9 +``` + +Build your custom image with ImageSpec: +```{literalinclude} /examples/data_types_and_io/data_types_and_io/pydantic_basemodel.py +:caption: data_types_and_io/pydantic_basemodel.py +:lines: 11-14 +``` + +## Python types +We define a `dataclass` with `int`, `str` and `dict` as the data types. + +```{literalinclude} /examples/data_types_and_io/data_types_and_io/pydantic_basemodel.py +:caption: data_types_and_io/pydantic_basemodel.py +:pyobject: Datum +``` + +You can send a `dataclass` between different tasks written in various languages, and input it through the Flyte console as raw JSON. + +:::{note} +All variables in a data class should be **annotated with their type**. Failure to do should will result in an error. +::: + +Once declared, a dataclass can be returned as an output or accepted as an input. + +```{literalinclude} /examples/data_types_and_io/data_types_and_io/pydantic_basemodel.py +:caption: data_types_and_io/pydantic_basemodel.py +:lines: 26-41 +``` + +## Flyte types +We also define a data class that accepts {std:ref}`StructuredDataset `, +{std:ref}`FlyteFile ` and {std:ref}`FlyteDirectory `. 
+ +```{literalinclude} /examples/data_types_and_io/data_types_and_io/pydantic_basemodel.py +:caption: data_types_and_io/pydantic_basemodel.py +:lines: 45-86 +``` + +A data class supports the usage of data associated with Python types, data classes, +flyte file, flyte directory and structured dataset. + +We define a workflow that calls the tasks created above. + +```{literalinclude} /examples/data_types_and_io/data_types_and_io/pydantic_basemodel.py +:caption: data_types_and_io/pydantic_basemodel.py +:pyobject: basemodel_wf +``` + +You can run the workflow locally as follows: + +```{literalinclude} /examples/data_types_and_io/data_types_and_io/pydantic_basemodel.py +:caption: data_types_and_io/pydantic_basemodel.py +:lines: 99-100 +``` + +To trigger a task that accepts a dataclass as an input with `pyflyte run`, you can provide a JSON file as an input: +``` +pyflyte run \ + https://raw.githubusercontent.com/flyteorg/flytesnacks/b71e01d45037cea883883f33d8d93f258b9a5023/examples/data_types_and_io/data_types_and_io/pydantic_basemodel.py \ + basemodel_wf --x 1 --y 2 +``` + +[flytesnacks]: https://github.com/flyteorg/flytesnacks/tree/master/examples/data_types_and_io/ From 0a793be4d3832fe791d3e831ea8d167b386c34fb Mon Sep 17 00:00:00 2001 From: Future-Outlier Date: Wed, 20 Nov 2024 12:52:18 +0800 Subject: [PATCH 04/20] lint Signed-off-by: Future-Outlier --- docs/user_guide/data_types_and_io/accessing_attributes.md | 2 +- docs/user_guide/data_types_and_io/dataclass.md | 2 ++ docs/user_guide/data_types_and_io/pydantic.md | 2 ++ 3 files changed, 5 insertions(+), 1 deletion(-) diff --git a/docs/user_guide/data_types_and_io/accessing_attributes.md b/docs/user_guide/data_types_and_io/accessing_attributes.md index a08f20b79a..75dad9ff3d 100644 --- a/docs/user_guide/data_types_and_io/accessing_attributes.md +++ b/docs/user_guide/data_types_and_io/accessing_attributes.md @@ -11,7 +11,7 @@ Note that while this functionality may appear to be the normal behavior of Pytho 
Consequently, accessing attributes in this manner is, in fact, a specially implemented feature. This functionality facilitates the direct passing of output attributes within workflows, enhancing the convenience of working with complex data structures. -```{note} +```{important} Flytekit version >= v1.14.0 supports Pydantic BaseModel V2, you can do attribute access on Pydantic BaseModel V2 as well. ``` diff --git a/docs/user_guide/data_types_and_io/dataclass.md b/docs/user_guide/data_types_and_io/dataclass.md index fef6e20569..6ce281450e 100644 --- a/docs/user_guide/data_types_and_io/dataclass.md +++ b/docs/user_guide/data_types_and_io/dataclass.md @@ -22,6 +22,8 @@ Flytekit version >= v1.14.0 will produce msgpack bytes literal for dataclasses. If you're using Flytekit version >= v1.14.0 and you want to produce protobuf struct literal for dataclasses, you can set environment variable `FLYTE_USE_OLD_DC_FORMAT` to `true`. + +For more details, you can refer to the MSGPACK IDL RFC: https://github.com/flyteorg/flyte/blob/master/rfc/system/5741-binary-idl-with-message-pack.md ::: ```{note} diff --git a/docs/user_guide/data_types_and_io/pydantic.md b/docs/user_guide/data_types_and_io/pydantic.md index d75e0faf48..c9647213fd 100644 --- a/docs/user_guide/data_types_and_io/pydantic.md +++ b/docs/user_guide/data_types_and_io/pydantic.md @@ -17,6 +17,8 @@ Pydantic BaseModel V2 only works when you are using flytekit version >= v1.14.0. :::{important} If you're using Flytekit version >= v1.14.0 and you want to produce protobuf struct literal for pydantic basemodels, you can set environment variable `FLYTE_USE_OLD_DC_FORMAT` to `true`. 
+ +For more details, you can refer the MSGPACK IDL RFC: https://github.com/flyteorg/flyte/blob/master/rfc/system/5741-binary-idl-with-message-pack.md ::: From 65e50cee78dd144c28ae9e9ea52091c7121683e0 Mon Sep 17 00:00:00 2001 From: Future-Outlier Date: Wed, 20 Nov 2024 21:58:53 +0800 Subject: [PATCH 05/20] Trigger CI Signed-off-by: Future-Outlier From 64832387872cf5bd4166afaac67402b86a37c476 Mon Sep 17 00:00:00 2001 From: Future-Outlier Date: Wed, 20 Nov 2024 22:21:58 +0800 Subject: [PATCH 06/20] Trigger CI Signed-off-by: Future-Outlier From 40a100ab7dfd54d1581966c1b61f3bbf833f2b97 Mon Sep 17 00:00:00 2001 From: Future-Outlier Date: Wed, 20 Nov 2024 22:41:04 +0800 Subject: [PATCH 07/20] lint Signed-off-by: Future-Outlier --- .../data_types_and_io/accessing_attributes.md | 12 ++++++------ docs/user_guide/data_types_and_io/index.md | 4 ++-- .../{pydantic.md => pydantic_basemodel.md} | 17 +++++++++-------- 3 files changed, 17 insertions(+), 16 deletions(-) rename docs/user_guide/data_types_and_io/{pydantic.md => pydantic_basemodel.md} (82%) diff --git a/docs/user_guide/data_types_and_io/accessing_attributes.md b/docs/user_guide/data_types_and_io/accessing_attributes.md index 75dad9ff3d..82b2345ad5 100644 --- a/docs/user_guide/data_types_and_io/accessing_attributes.md +++ b/docs/user_guide/data_types_and_io/accessing_attributes.md @@ -23,7 +23,7 @@ To begin, import the required dependencies and define a common task for subseque ```{literalinclude} /examples/data_types_and_io/data_types_and_io/attribute_access.py :caption: data_types_and_io/attribute_access.py -:lines: 1-10 +:lines: 1-9 ``` ## List @@ -35,7 +35,7 @@ Flyte currently does not support output promise access through list slicing. ```{literalinclude} /examples/data_types_and_io/data_types_and_io/attribute_access.py :caption: data_types_and_io/attribute_access.py -:lines: 14-23 +:lines: 13-22 ``` ## Dictionary @@ -43,7 +43,7 @@ Access the output dictionary by specifying the key. 
```{literalinclude} /examples/data_types_and_io/data_types_and_io/attribute_access.py :caption: data_types_and_io/attribute_access.py -:lines: 27-35 +:lines: 26-34 ``` ## Data class @@ -51,7 +51,7 @@ Directly access an attribute of a dataclass. ```{literalinclude} /examples/data_types_and_io/data_types_and_io/attribute_access.py :caption: data_types_and_io/attribute_access.py -:lines: 39-53 +:lines: 38-51 ``` ## Complex type @@ -59,14 +59,14 @@ Combinations of list, dict and dataclass also work effectively. ```{literalinclude} /examples/data_types_and_io/data_types_and_io/attribute_access.py :caption: data_types_and_io/attribute_access.py -:lines: 57-80 +:lines: 55-78 ``` You can run all the workflows locally as follows: ```{literalinclude} /examples/data_types_and_io/data_types_and_io/attribute_access.py :caption: data_types_and_io/attribute_access.py -:lines: 84-88 +:lines: 82-86 ``` ## Failure scenario diff --git a/docs/user_guide/data_types_and_io/index.md b/docs/user_guide/data_types_and_io/index.md index d8e72dabce..c554b08acd 100644 --- a/docs/user_guide/data_types_and_io/index.md +++ b/docs/user_guide/data_types_and_io/index.md @@ -114,7 +114,7 @@ Here's a breakdown of these mappings: - Use ``pyspark.DataFrame`` as a type hint. * - ``pydantic.BaseModel`` - ``Map`` - - To utilize the type, install the ``flytekitplugins-pydantic`` plugin. + - To utilize the type, install the ``pydantic>2`` module. - Use ``pydantic.BaseModel`` as a type hint. 
* - ``torch.Tensor`` / ``torch.nn.Module`` - File @@ -144,7 +144,7 @@ flytefile flytedirectory structureddataset dataclass -pydantic +pydantic_basemodel accessing_attributes pytorch_type enum_type diff --git a/docs/user_guide/data_types_and_io/pydantic.md b/docs/user_guide/data_types_and_io/pydantic_basemodel.md similarity index 82% rename from docs/user_guide/data_types_and_io/pydantic.md rename to docs/user_guide/data_types_and_io/pydantic_basemodel.md index c9647213fd..4191e06c8e 100644 --- a/docs/user_guide/data_types_and_io/pydantic.md +++ b/docs/user_guide/data_types_and_io/pydantic_basemodel.md @@ -1,4 +1,4 @@ -(pydantic)= +(pydantic_basemodel)= # Pydantic BaseModel @@ -6,21 +6,22 @@ .. tags:: Basic ``` -When you've multiple values that you want to send across Flyte entities, and you want them to have, you can use a `pydantic.BaseModel`. -Note: -You can put Dataclass and FlyteTypes (FlyteFile, FlyteDirectory, FlyteSchema, and StructuredDataset) in a pydantic BaseModel. +When you have multiple values that you want to send across Flyte entities, and you want them to have, you can use a `pydantic.BaseModel`. :::{important} Pydantic BaseModel V2 only works when you are using flytekit version >= v1.14.0. ::: :::{important} -If you're using Flytekit version >= v1.14.0 and you want to produce protobuf struct literal for pydantic basemodels, +If you're using Flytekit version >= v1.14.0 and you want to produce protobuf struct literal for Pydantic BaseModels, you can set environment variable `FLYTE_USE_OLD_DC_FORMAT` to `true`. -For more details, you can refer the MSGPACK IDL RFC: https://github.com/flyteorg/flyte/blob/master/rfc/system/5741-binary-idl-with-message-pack.md +For more details, you can refer the MESSAGEPACK IDL RFC: https://github.com/flyteorg/flyte/blob/master/rfc/system/5741-binary-idl-with-message-pack.md ::: +```{note} +You can put Dataclass and FlyteTypes (FlyteFile, FlyteDirectory, FlyteSchema, and StructuredDataset) in a pydantic BaseModel. 
+``` ```{note} To clone and run the example code on this page, see the [Flytesnacks repo][flytesnacks]. @@ -40,14 +41,14 @@ Build your custom image with ImageSpec: ``` ## Python types -We define a `dataclass` with `int`, `str` and `dict` as the data types. +We define a `pydantic basemodel` with `int`, `str` and `dict` as the data types. ```{literalinclude} /examples/data_types_and_io/data_types_and_io/pydantic_basemodel.py :caption: data_types_and_io/pydantic_basemodel.py :pyobject: Datum ``` -You can send a `dataclass` between different tasks written in various languages, and input it through the Flyte console as raw JSON. +You can send a `pydantic basemodel` between different tasks written in various languages, and input it through the Flyte console as raw JSON. :::{note} All variables in a data class should be **annotated with their type**. Failure to do should will result in an error. From 8ad3f64cde6355f0fc67527e8ce7b49ea5bb6f05 Mon Sep 17 00:00:00 2001 From: "Han-Ru Chen (Future-Outlier)" Date: Thu, 21 Nov 2024 08:22:51 +0800 Subject: [PATCH 08/20] Update docs/user_guide/data_types_and_io/dataclass.md Co-authored-by: David Espejo <82604841+davidmirror-ops@users.noreply.github.com> Signed-off-by: Han-Ru Chen (Future-Outlier) --- docs/user_guide/data_types_and_io/dataclass.md | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/docs/user_guide/data_types_and_io/dataclass.md b/docs/user_guide/data_types_and_io/dataclass.md index 6ce281450e..c984c7a909 100644 --- a/docs/user_guide/data_types_and_io/dataclass.md +++ b/docs/user_guide/data_types_and_io/dataclass.md @@ -9,7 +9,11 @@ When you've multiple values that you want to send across Flyte entities, you can use a `dataclass`. Flytekit uses the [Mashumaro library](https://github.com/Fatal1ty/mashumaro) -to serialize and deserialize dataclasses. +to serialize and deserialize dataclasses. 
With the 1.14 release, `flytekit` adopted `MessagePack` as the serialization format for dataclasses, overcoming a major limitation of serialization into a JSON string within a Protobuf `struct` datatype, like the previous versions do: to store `int` types, Protobuf's `struct` converts them to `float`, forcing users to write boilerplate code to workaround this issue. By default, `flytekit >= 1.14` will produce `msgpack` bytes literals when serializing dataclasses. + +:::{important} + +If you're serializing dataclasses using `flytekit` version >= v1.14.0 and you want to produce Protobuf `struct literal` instead, you can set environment variable `FLYTE_USE_OLD_DC_FORMAT` to `true`. :::{important} If you're using Flytekit version < v1.11.1, you will need to add `from dataclasses_json import dataclass_json` to your imports and decorate your dataclass with `@dataclass_json`. From 1abb6fe3eb46f01f3a7f399dfde525ca1e64642e Mon Sep 17 00:00:00 2001 From: "Han-Ru Chen (Future-Outlier)" Date: Thu, 21 Nov 2024 08:23:12 +0800 Subject: [PATCH 09/20] Update docs/user_guide/data_types_and_io/pydantic_basemodel.md Co-authored-by: David Espejo <82604841+davidmirror-ops@users.noreply.github.com> Signed-off-by: Han-Ru Chen (Future-Outlier) --- docs/user_guide/data_types_and_io/pydantic_basemodel.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/user_guide/data_types_and_io/pydantic_basemodel.md b/docs/user_guide/data_types_and_io/pydantic_basemodel.md index 4191e06c8e..f2bbaa8d54 100644 --- a/docs/user_guide/data_types_and_io/pydantic_basemodel.md +++ b/docs/user_guide/data_types_and_io/pydantic_basemodel.md @@ -6,7 +6,7 @@ .. tags:: Basic ``` -When you have multiple values that you want to send across Flyte entities, and you want them to have, you can use a `pydantic.BaseModel`. +`flytekit` version >=1.14 supports natively the `JSON` format that Pydantic `BaseModel` produces, enhancing the interoperability of Pydantic schemas with the Flyte type system. 
:::{important} Pydantic BaseModel V2 only works when you are using flytekit version >= v1.14.0. From 01276eda880a90b17f3f95b95b811ab2de3d1906 Mon Sep 17 00:00:00 2001 From: "Han-Ru Chen (Future-Outlier)" Date: Thu, 21 Nov 2024 11:48:14 +0800 Subject: [PATCH 10/20] Update docs/user_guide/data_types_and_io/pydantic_basemodel.md Co-authored-by: David Espejo <82604841+davidmirror-ops@users.noreply.github.com> Signed-off-by: Han-Ru Chen (Future-Outlier) --- docs/user_guide/data_types_and_io/pydantic_basemodel.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/docs/user_guide/data_types_and_io/pydantic_basemodel.md b/docs/user_guide/data_types_and_io/pydantic_basemodel.md index f2bbaa8d54..dca37ea1ec 100644 --- a/docs/user_guide/data_types_and_io/pydantic_basemodel.md +++ b/docs/user_guide/data_types_and_io/pydantic_basemodel.md @@ -12,7 +12,9 @@ Pydantic BaseModel V2 only works when you are using flytekit version >= v1.14.0. ::: +With the 1.14 release, `flytekit` adopted `MessagePack` as the serialization format for Pydantic `BaseModel`, overcoming a major limitation of serialization into a JSON string within a Protobuf `struct` datatype like the previous versions do: to store `int` types, Protobuf's `struct` converts them to `float`, forcing users to write boilerplate code to workaround this issue. By default, `flytekit >= 1.14` will produce `msgpack` bytes literals when serializing dataclasses, preserving the types defined in your `BaseModel` class. :::{important} +If you're serializing dataclasses using `flytekit` version >= v1.14.0 and you want to produce Protobuf `struct literal` instead, you can set environment variable `FLYTE_USE_OLD_DC_FORMAT` to `true`. If you're using Flytekit version >= v1.14.0 and you want to produce protobuf struct literal for Pydantic BaseModels, you can set environment variable `FLYTE_USE_OLD_DC_FORMAT` to `true`. 
From c1de1008f8617a848ed19793d632ee5653956834 Mon Sep 17 00:00:00 2001 From: "Han-Ru Chen (Future-Outlier)" Date: Thu, 21 Nov 2024 11:48:25 +0800 Subject: [PATCH 11/20] Update docs/user_guide/data_types_and_io/pydantic_basemodel.md Co-authored-by: David Espejo <82604841+davidmirror-ops@users.noreply.github.com> Signed-off-by: Han-Ru Chen (Future-Outlier) --- docs/user_guide/data_types_and_io/pydantic_basemodel.md | 1 - 1 file changed, 1 deletion(-) diff --git a/docs/user_guide/data_types_and_io/pydantic_basemodel.md b/docs/user_guide/data_types_and_io/pydantic_basemodel.md index dca37ea1ec..08ad7a4de8 100644 --- a/docs/user_guide/data_types_and_io/pydantic_basemodel.md +++ b/docs/user_guide/data_types_and_io/pydantic_basemodel.md @@ -16,7 +16,6 @@ With the 1.14 release, `flytekit` adopted `MessagePack` as the serialization for :::{important} If you're serializing dataclasses using `flytekit` version >= v1.14.0 and you want to produce Protobuf `struct literal` instead, you can set environment variable `FLYTE_USE_OLD_DC_FORMAT` to `true`. If you're using Flytekit version >= v1.14.0 and you want to produce protobuf struct literal for Pydantic BaseModels, -you can set environment variable `FLYTE_USE_OLD_DC_FORMAT` to `true`. 
For more details, you can refer the MESSAGEPACK IDL RFC: https://github.com/flyteorg/flyte/blob/master/rfc/system/5741-binary-idl-with-message-pack.md ::: From 9f7e45275e74aab5749b09b7db65161d141ec62e Mon Sep 17 00:00:00 2001 From: "Han-Ru Chen (Future-Outlier)" Date: Thu, 21 Nov 2024 11:48:35 +0800 Subject: [PATCH 12/20] Update docs/user_guide/data_types_and_io/pydantic_basemodel.md Co-authored-by: David Espejo <82604841+davidmirror-ops@users.noreply.github.com> Signed-off-by: Han-Ru Chen (Future-Outlier) --- docs/user_guide/data_types_and_io/pydantic_basemodel.md | 1 - 1 file changed, 1 deletion(-) diff --git a/docs/user_guide/data_types_and_io/pydantic_basemodel.md b/docs/user_guide/data_types_and_io/pydantic_basemodel.md index 08ad7a4de8..4ba92ae7db 100644 --- a/docs/user_guide/data_types_and_io/pydantic_basemodel.md +++ b/docs/user_guide/data_types_and_io/pydantic_basemodel.md @@ -15,7 +15,6 @@ Pydantic BaseModel V2 only works when you are using flytekit version >= v1.14.0. With the 1.14 release, `flytekit` adopted `MessagePack` as the serialization format for Pydantic `BaseModel`, overcoming a major limitation of serialization into a JSON string within a Protobuf `struct` datatype like the previous versions do: to store `int` types, Protobuf's `struct` converts them to `float`, forcing users to write boilerplate code to workaround this issue. By default, `flytekit >= 1.14` will produce `msgpack` bytes literals when serializing dataclasses, preserving the types defined in your `BaseModel` class. :::{important} If you're serializing dataclasses using `flytekit` version >= v1.14.0 and you want to produce Protobuf `struct literal` instead, you can set environment variable `FLYTE_USE_OLD_DC_FORMAT` to `true`. 
-If you're using Flytekit version >= v1.14.0 and you want to produce protobuf struct literal for Pydantic BaseModels, For more details, you can refer the MESSAGEPACK IDL RFC: https://github.com/flyteorg/flyte/blob/master/rfc/system/5741-binary-idl-with-message-pack.md ::: From 1f32c92afe284436483973785dd1c6f240e42b91 Mon Sep 17 00:00:00 2001 From: Future-Outlier Date: Thu, 21 Nov 2024 11:55:23 +0800 Subject: [PATCH 13/20] nit Signed-off-by: Future-Outlier --- docs/user_guide/data_types_and_io/dataclass.md | 6 ++++-- docs/user_guide/data_types_and_io/pydantic_basemodel.md | 7 +++++-- 2 files changed, 9 insertions(+), 4 deletions(-) diff --git a/docs/user_guide/data_types_and_io/dataclass.md b/docs/user_guide/data_types_and_io/dataclass.md index c984c7a909..bb9372a3f8 100644 --- a/docs/user_guide/data_types_and_io/dataclass.md +++ b/docs/user_guide/data_types_and_io/dataclass.md @@ -9,11 +9,13 @@ When you've multiple values that you want to send across Flyte entities, you can use a `dataclass`. Flytekit uses the [Mashumaro library](https://github.com/Fatal1ty/mashumaro) -to serialize and deserialize dataclasses. With the 1.14 release, `flytekit` adopted `MessagePack` as the serialization format for dataclasses, overcoming a major limitation of serialization into a JSON string within a Protobuf `struct` datatype, like the previous versions do: to store `int` types, Protobuf's `struct` converts them to `float`, forcing users to write boilerplate code to workaround this issue. By default, `flytekit >= 1.14` will produce `msgpack` bytes literals when serializing dataclasses. +to serialize and deserialize dataclasses. 
With the 1.14 release, `flytekit` adopted `MessagePack` as the +serialization format for dataclasses, overcoming a major limitation of serialization into a JSON string within a Protobuf `struct` datatype, like the previous versions do: to store `int` types, Protobuf's `struct` converts them to `float`, forcing users to write boilerplate code to work around this issue. By default, `flytekit >= 1.14` will produce `msgpack` bytes literals when serializing dataclasses. :::{important} -If you're serializing dataclasses using `flytekit` version >= v1.14.0 and you want to produce Protobuf `struct literal` instead, you can set environment variable `FLYTE_USE_OLD_DC_FORMAT` to `true`. +If you're serializing dataclasses using `flytekit` version >= v1.14.0, and you want to produce Protobuf `struct +literal` instead, you can set environment variable `FLYTE_USE_OLD_DC_FORMAT` to `true`. :::{important} If you're using Flytekit version < v1.11.1, you will need to add `from dataclasses_json import dataclass_json` to your imports and decorate your dataclass with `@dataclass_json`. diff --git a/docs/user_guide/data_types_and_io/pydantic_basemodel.md b/docs/user_guide/data_types_and_io/pydantic_basemodel.md index 4ba92ae7db..8864421c75 100644 --- a/docs/user_guide/data_types_and_io/pydantic_basemodel.md +++ b/docs/user_guide/data_types_and_io/pydantic_basemodel.md @@ -6,13 +6,16 @@ .. tags:: Basic ``` -`flytekit` version >=1.14 supports natively the `JSON` format that Pydantic `BaseModel` produces, enhancing the interoperability of Pydantic schemas with the Flyte type system. +`flytekit` version >=1.14 natively supports the `JSON` format that Pydantic `BaseModel` produces, enhancing the +interoperability of Pydantic BaseModels with the Flyte type system. :::{important} Pydantic BaseModel V2 only works when you are using flytekit version >= v1.14.0. 
::: -With the 1.14 release, `flytekit` adopted `MessagePack` as the serialization format for Pydantic `BaseModel`, overcoming a major limitation of serialization into a JSON string within a Protobuf `struct` datatype like the previous versions do: to store `int` types, Protobuf's `struct` converts them to `float`, forcing users to write boilerplate code to workaround this issue. By default, `flytekit >= 1.14` will produce `msgpack` bytes literals when serializing dataclasses, preserving the types defined in your `BaseModel` class. +With the 1.14 release, `flytekit` adopted `MessagePack` as the serialization format for Pydantic `BaseModel`, +overcoming a major limitation of serialization into a JSON string within a Protobuf `struct` datatype like the previous versions do: to store `int` types, Protobuf's `struct` converts them to `float`, forcing users to write boilerplate code to work around this issue. By default, `flytekit >= 1.14` will produce `msgpack` bytes literals when serializing dataclasses, preserving the types defined in your `BaseModel` class. + :::{important} If you're serializing dataclasses using `flytekit` version >= v1.14.0 and you want to produce Protobuf `struct literal` instead, you can set environment variable `FLYTE_USE_OLD_DC_FORMAT` to `true`. 
From 00729b6147f0629ba2df16fadb2c52620928b6b2 Mon Sep 17 00:00:00 2001 From: Future-Outlier Date: Thu, 21 Nov 2024 11:56:51 +0800 Subject: [PATCH 14/20] nit Signed-off-by: Future-Outlier --- docs/user_guide/data_types_and_io/dataclass.md | 10 ++++++++-- .../user_guide/data_types_and_io/pydantic_basemodel.md | 6 +++++- 2 files changed, 13 insertions(+), 3 deletions(-) diff --git a/docs/user_guide/data_types_and_io/dataclass.md b/docs/user_guide/data_types_and_io/dataclass.md index bb9372a3f8..ac2efe08e9 100644 --- a/docs/user_guide/data_types_and_io/dataclass.md +++ b/docs/user_guide/data_types_and_io/dataclass.md @@ -9,8 +9,14 @@ When you've multiple values that you want to send across Flyte entities, you can use a `dataclass`. Flytekit uses the [Mashumaro library](https://github.com/Fatal1ty/mashumaro) -to serialize and deserialize dataclasses. With the 1.14 release, `flytekit` adopted `MessagePack` as the -serialization format for dataclasses, overcoming a major limitation of serialization into a JSON string within a Protobuf `struct` datatype, like the previous versions do: to store `int` types, Protobuf's `struct` converts them to `float`, forcing users to write boilerplate code to work around this issue. By default, `flytekit >= 1.14` will produce `msgpack` bytes literals when serializing dataclasses. +to serialize and deserialize dataclasses. + +With the 1.14 release, `flytekit` adopted `MessagePack` as the +serialization format for dataclasses, overcoming a major limitation of serialization into a JSON string within a Protobuf `struct` datatype, like the previous versions do: + +to store `int` types, Protobuf's `struct` converts them to `float`, forcing users to write boilerplate code to work around this issue. + +By default, `flytekit >= 1.14` will produce `msgpack` bytes literals when serializing dataclasses. 
:::{important} diff --git a/docs/user_guide/data_types_and_io/pydantic_basemodel.md b/docs/user_guide/data_types_and_io/pydantic_basemodel.md index 8864421c75..cdf6500fc4 100644 --- a/docs/user_guide/data_types_and_io/pydantic_basemodel.md +++ b/docs/user_guide/data_types_and_io/pydantic_basemodel.md @@ -14,7 +14,11 @@ Pydantic BaseModel V2 only works when you are using flytekit version >= v1.14.0. ::: With the 1.14 release, `flytekit` adopted `MessagePack` as the serialization format for Pydantic `BaseModel`, -overcoming a major limitation of serialization into a JSON string within a Protobuf `struct` datatype like the previous versions do: to store `int` types, Protobuf's `struct` converts them to `float`, forcing users to write boilerplate code to work around this issue. By default, `flytekit >= 1.14` will produce `msgpack` bytes literals when serializing dataclasses, preserving the types defined in your `BaseModel` class. +overcoming a major limitation of serialization into a JSON string within a Protobuf `struct` datatype like the previous versions do: + +to store `int` types, Protobuf's `struct` converts them to `float`, forcing users to write boilerplate code to work around this issue. + +By default, `flytekit >= 1.14` will produce `msgpack` bytes literals when serializing dataclasses, preserving the types defined in your `BaseModel` class. :::{important} If you're serializing dataclasses using `flytekit` version >= v1.14.0 and you want to produce Protobuf `struct literal` instead, you can set environment variable `FLYTE_USE_OLD_DC_FORMAT` to `true`. 
From 0a1c6f7aa47ad489957ecdfd6121aea32e5f9c99 Mon Sep 17 00:00:00 2001 From: "Han-Ru Chen (Future-Outlier)" Date: Thu, 21 Nov 2024 21:40:02 +0800 Subject: [PATCH 15/20] Update docs/user_guide/data_types_and_io/dataclass.md Co-authored-by: David Espejo <82604841+davidmirror-ops@users.noreply.github.com> Signed-off-by: Han-Ru Chen (Future-Outlier) --- docs/user_guide/data_types_and_io/dataclass.md | 1 - 1 file changed, 1 deletion(-) diff --git a/docs/user_guide/data_types_and_io/dataclass.md b/docs/user_guide/data_types_and_io/dataclass.md index ac2efe08e9..6f47eeab1f 100644 --- a/docs/user_guide/data_types_and_io/dataclass.md +++ b/docs/user_guide/data_types_and_io/dataclass.md @@ -16,7 +16,6 @@ serialization format for dataclasses, overcoming a major limitation of serializ to store `int` types, Protobuf's `struct` converts them to `float`, forcing users to write boilerplate code to work around this issue. -By default, `flytekit >= 1.14` will produce `msgpack` bytes literals when serializing dataclasses. :::{important} From f5b4f1165871b6bfc93d3b5404713ed417ebd15c Mon Sep 17 00:00:00 2001 From: "Han-Ru Chen (Future-Outlier)" Date: Thu, 21 Nov 2024 21:40:11 +0800 Subject: [PATCH 16/20] Update docs/user_guide/data_types_and_io/dataclass.md Co-authored-by: David Espejo <82604841+davidmirror-ops@users.noreply.github.com> Signed-off-by: Han-Ru Chen (Future-Outlier) --- docs/user_guide/data_types_and_io/dataclass.md | 1 - 1 file changed, 1 deletion(-) diff --git a/docs/user_guide/data_types_and_io/dataclass.md b/docs/user_guide/data_types_and_io/dataclass.md index 6f47eeab1f..a26afc4183 100644 --- a/docs/user_guide/data_types_and_io/dataclass.md +++ b/docs/user_guide/data_types_and_io/dataclass.md @@ -26,7 +26,6 @@ literal` instead, you can set environment variable `FLYTE_USE_OLD_DC_FORMAT` to If you're using Flytekit version < v1.11.1, you will need to add `from dataclasses_json import dataclass_json` to your imports and decorate your dataclass with `@dataclass_json`. 
::: -:::{important} Flytekit version < v1.14.0 will produce protobuf struct literal for dataclasses. Flytekit version >= v1.14.0 will produce msgpack bytes literal for dataclasses. From 52511c3dee3af2a39a28ba05a20d8701242c4076 Mon Sep 17 00:00:00 2001 From: "Han-Ru Chen (Future-Outlier)" Date: Thu, 21 Nov 2024 21:40:26 +0800 Subject: [PATCH 17/20] Update docs/user_guide/data_types_and_io/pydantic_basemodel.md Co-authored-by: David Espejo <82604841+davidmirror-ops@users.noreply.github.com> Signed-off-by: Han-Ru Chen (Future-Outlier) --- docs/user_guide/data_types_and_io/pydantic_basemodel.md | 1 - 1 file changed, 1 deletion(-) diff --git a/docs/user_guide/data_types_and_io/pydantic_basemodel.md b/docs/user_guide/data_types_and_io/pydantic_basemodel.md index cdf6500fc4..f78057c0d8 100644 --- a/docs/user_guide/data_types_and_io/pydantic_basemodel.md +++ b/docs/user_guide/data_types_and_io/pydantic_basemodel.md @@ -18,7 +18,6 @@ overcoming a major limitation of serialization into a JSON string within a Proto to store `int` types, Protobuf's `struct` converts them to `float`, forcing users to write boilerplate code to work around this issue. -By default, `flytekit >= 1.14` will produce `msgpack` bytes literals when serializing dataclasses, preserving the types defined in your `BaseModel` class. :::{important} If you're serializing dataclasses using `flytekit` version >= v1.14.0 and you want to produce Protobuf `struct literal` instead, you can set environment variable `FLYTE_USE_OLD_DC_FORMAT` to `true`. 
From ab4f680dfa2346cbd8058936c393fcf80cb168d4 Mon Sep 17 00:00:00 2001 From: "Han-Ru Chen (Future-Outlier)" Date: Thu, 21 Nov 2024 21:40:36 +0800 Subject: [PATCH 18/20] Update docs/user_guide/data_types_and_io/pydantic_basemodel.md Co-authored-by: David Espejo <82604841+davidmirror-ops@users.noreply.github.com> Signed-off-by: Han-Ru Chen (Future-Outlier) --- docs/user_guide/data_types_and_io/pydantic_basemodel.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/docs/user_guide/data_types_and_io/pydantic_basemodel.md b/docs/user_guide/data_types_and_io/pydantic_basemodel.md index f78057c0d8..e0899bd649 100644 --- a/docs/user_guide/data_types_and_io/pydantic_basemodel.md +++ b/docs/user_guide/data_types_and_io/pydantic_basemodel.md @@ -20,7 +20,8 @@ to store `int` types, Protobuf's `struct` converts them to `float`, forcing user :::{important} -If you're serializing dataclasses using `flytekit` version >= v1.14.0 and you want to produce Protobuf `struct literal` instead, you can set environment variable `FLYTE_USE_OLD_DC_FORMAT` to `true`. +By default, `flytekit >= 1.14` will produce `msgpack` bytes literals when serializing, preserving the types defined in your `BaseModel` class. +If you're serializing `BaseModel` using `flytekit` version >= v1.14.0 and you want to produce Protobuf `struct literal` instead, you can set environment variable `FLYTE_USE_OLD_DC_FORMAT` to `true`. 
For more details, you can refer to the MESSAGEPACK IDL RFC: https://github.com/flyteorg/flyte/blob/master/rfc/system/5741-binary-idl-with-message-pack.md ::: From 621c9ba24173987a65d3c910b3f021284ee1fd4f Mon Sep 17 00:00:00 2001 From: Future-Outlier Date: Thu, 21 Nov 2024 21:51:37 +0800 Subject: [PATCH 19/20] format Signed-off-by: Future-Outlier --- docs/user_guide/data_types_and_io/dataclass.md | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/docs/user_guide/data_types_and_io/dataclass.md b/docs/user_guide/data_types_and_io/dataclass.md index a26afc4183..f2f5b3f8a4 100644 --- a/docs/user_guide/data_types_and_io/dataclass.md +++ b/docs/user_guide/data_types_and_io/dataclass.md @@ -12,20 +12,22 @@ Flytekit uses the [Mashumaro library](https://github.com/Fatal1ty/mashumaro) to serialize and deserialize dataclasses. With the 1.14 release, `flytekit` adopted `MessagePack` as the -serialization format for dataclasses, overcoming a major limitation of serialization into a JSON string within a Protobuf `struct` datatype, like the previous versions do: +serialization format for dataclasses, overcoming a major limitation of serialization into a JSON string within a Protobuf `struct` datatype, like the previous versions do: to store `int` types, Protobuf's `struct` converts them to `float`, forcing users to write boilerplate code to work around this issue. :::{important} - If you're serializing dataclasses using `flytekit` version >= v1.14.0, and you want to produce Protobuf `struct literal` instead, you can set environment variable `FLYTE_USE_OLD_DC_FORMAT` to `true`. +::: + :::{important} If you're using Flytekit version < v1.11.1, you will need to add `from dataclasses_json import dataclass_json` to your imports and decorate your dataclass with `@dataclass_json`. ::: +:::{important} Flytekit version < v1.14.0 will produce protobuf struct literal for dataclasses. Flytekit version >= v1.14.0 will produce msgpack bytes literal for dataclasses. 
From 4515e3160fdfb71ea5ae0c2b8165dc67db289c9e Mon Sep 17 00:00:00 2001 From: Future-Outlier Date: Fri, 22 Nov 2024 00:38:56 +0800 Subject: [PATCH 20/20] nit Signed-off-by: Future-Outlier --- docs/user_guide/data_types_and_io/dataclass.md | 11 ++--------- .../data_types_and_io/pydantic_basemodel.md | 3 +-- 2 files changed, 3 insertions(+), 11 deletions(-) diff --git a/docs/user_guide/data_types_and_io/dataclass.md b/docs/user_guide/data_types_and_io/dataclass.md index f2f5b3f8a4..462ba7da3a 100644 --- a/docs/user_guide/data_types_and_io/dataclass.md +++ b/docs/user_guide/data_types_and_io/dataclass.md @@ -16,23 +16,16 @@ serialization format for dataclasses, overcoming a major limitation of serializa to store `int` types, Protobuf's `struct` converts them to `float`, forcing users to write boilerplate code to work around this issue. - -:::{important} -If you're serializing dataclasses using `flytekit` version >= v1.14.0, and you want to produce Protobuf `struct -literal` instead, you can set environment variable `FLYTE_USE_OLD_DC_FORMAT` to `true`. -::: - - :::{important} If you're using Flytekit version < v1.11.1, you will need to add `from dataclasses_json import dataclass_json` to your imports and decorate your dataclass with `@dataclass_json`. ::: :::{important} -Flytekit version < v1.14.0 will produce protobuf struct literal for dataclasses. +Flytekit version < v1.14.0 will produce protobuf `struct` literal for dataclasses. Flytekit version >= v1.14.0 will produce msgpack bytes literal for dataclasses. -If you're using Flytekit version >= v1.14.0 and you want to produce protobuf struct literal for dataclasses, you can +If you're using Flytekit version >= v1.14.0 and you want to produce protobuf `struct` literal for dataclasses, you can set environment variable `FLYTE_USE_OLD_DC_FORMAT` to `true`. 
For more details, you can refer to the MSGPACK IDL RFC: https://github.com/flyteorg/flyte/blob/master/rfc/system/5741-binary-idl-with-message-pack.md diff --git a/docs/user_guide/data_types_and_io/pydantic_basemodel.md b/docs/user_guide/data_types_and_io/pydantic_basemodel.md index e0899bd649..be40672534 100644 --- a/docs/user_guide/data_types_and_io/pydantic_basemodel.md +++ b/docs/user_guide/data_types_and_io/pydantic_basemodel.md @@ -18,10 +18,9 @@ overcoming a major limitation of serialization into a JSON string within a Proto to store `int` types, Protobuf's `struct` converts them to `float`, forcing users to write boilerplate code to work around this issue. - :::{important} By default, `flytekit >= 1.14` will produce `msgpack` bytes literals when serializing, preserving the types defined in your `BaseModel` class. -If you're serializing `BaseModel` using `flytekit` version >= v1.14.0 and you want to produce Protobuf `struct literal` instead, you can set environment variable `FLYTE_USE_OLD_DC_FORMAT` to `true`. +If you're serializing `BaseModel` using `flytekit` version >= v1.14.0 and you want to produce Protobuf `struct` literal instead, you can set environment variable `FLYTE_USE_OLD_DC_FORMAT` to `true`. For more details, you can refer to the MESSAGEPACK IDL RFC: https://github.com/flyteorg/flyte/blob/master/rfc/system/5741-binary-idl-with-message-pack.md :::
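
The `int`-to-`float` limitation that these doc changes describe can be illustrated with a minimal, stdlib-only sketch. It does not call `flytekit`, Protobuf, or `msgpack`. The helper `to_struct_like` and the `Datum` dataclass are hypothetical names introduced here; the helper only mimics the fact that a Protobuf `struct` stores every number as a double, and it is not how `flytekit` is implemented.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Datum:
    x: int
    y: str


def to_struct_like(value):
    """Hypothetical stand-in for Protobuf `struct` semantics: numbers become doubles."""
    if isinstance(value, bool):
        return value  # bool is a subclass of int, so check it first and keep it as-is
    if isinstance(value, (int, float)):
        return float(value)
    if isinstance(value, dict):
        return {key: to_struct_like(item) for key, item in value.items()}
    if isinstance(value, list):
        return [to_struct_like(item) for item in value]
    return value


original = Datum(x=5, y="five")

# Simulated pre-1.14 path: dataclass -> JSON string -> struct-like dict -> dataclass.
struct_like = to_struct_like(json.loads(json.dumps(asdict(original))))
restored = Datum(**struct_like)  # restored.x is now a float, despite the `int` annotation

# The kind of boilerplate users had to write to get the declared type back.
fixed = Datum(x=int(restored.x), y=restored.y)
```

With a format that keeps integers and floats distinct, such as MessagePack, the manual `int(...)` cast in the last step should no longer be needed.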