[BUG] Attribute Access Structured Dataset from Dataclass and return it will fail #5956

Open
2 tasks done
Future-Outlier opened this issue Nov 5, 2024 · 1 comment · May be fixed by flyteorg/flytekit#2954
Labels
bug Something isn't working flytekit FlyteKit Python related issue

Comments

@Future-Outlier
Member

Describe the bug

example code:

from dataclasses import dataclass, field
from flytekit.types.structured import StructuredDataset
from flytekit import task, workflow, ImageSpec

flytekit_hash = "6e4e53bb89debbeef764d3a0a16e499e0bcd18e2" # from master branch
flytekit = f"git+https://github.com/flyteorg/flytekit.git@{flytekit_hash}"
image = ImageSpec(
    packages=[flytekit,
              "pandas",
              "pyarrow"],
    apt_packages=["git"],
    registry="localhost:30000",
)


@dataclass
class DC:
    sd: StructuredDataset = field(default_factory=lambda: StructuredDataset(uri="s3://my-s3-bucket/s3_flyte_dir/df.parquet", file_format="parquet"))

@task(container_image=image)
def t_sd_attr(sd: StructuredDataset) -> StructuredDataset:
    return sd

@workflow
def wf(dc: DC):
    t_sd_attr(sd=dc.sd)

if __name__ == "__main__":
    from flytekit.clis.sdk_in_container import pyflyte
    from click.testing import CliRunner
    import os

    input_val = '{"dc": {"sd": {"uri": "s3://my-s3-bucket/s3_flyte_dir/df.parquet", "file_format": "parquet"}}}'
    runner = CliRunner()
    path = os.path.realpath(__file__)
    result = runner.invoke(pyflyte.main, ["run", "--remote", path, "wf", "--dc", input_val])
    print("Remote Execution: ", result.output)

error message:

[ak8chhw7598h6x695dvg-n0-0] terminated with exit code (1). Reason [Error]. Message: 
^^^^^^^^^^^^^^^^^
  File "/opt/micromamba/envs/runtime/lib/python3.12/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/opt/micromamba/envs/runtime/lib/python3.12/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/micromamba/envs/runtime/lib/python3.12/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/micromamba/envs/runtime/lib/python3.12/site-packages/flytekit/bin/entrypoint.py", line 579, in execute_task_cmd
    _execute_task(
  File "/opt/micromamba/envs/runtime/lib/python3.12/site-packages/flytekit/bin/entrypoint.py", line 454, in _execute_task
    _dispatch_execute(ctx, load_task, inputs, output_prefix)
  File "/opt/micromamba/envs/runtime/lib/python3.12/site-packages/flytekit/bin/entrypoint.py", line 216, in _dispatch_execute
    utils.write_proto_to_file(v.to_flyte_idl(), os.path.join(ctx.execution_state.engine_dir, k))
                              ^^^^^^^^^^^^^^^^
  File "/opt/micromamba/envs/runtime/lib/python3.12/site-packages/flytekit/models/literals.py", line 693, in to_flyte_idl
    return _literals_pb2.LiteralMap(literals={k: v.to_flyte_idl() for k, v in self.literals.items()})
                                                 ^^^^^^^^^^^^^^^^
  File "/opt/micromamba/envs/runtime/lib/python3.12/site-packages/flytekit/models/literals.py", line 989, in to_flyte_idl
    scalar=self.scalar.to_flyte_idl() if self.scalar is not None else None,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/micromamba/envs/runtime/lib/python3.12/site-packages/flytekit/models/literals.py", line 831, in to_flyte_idl
    structured_dataset=self.structured_dataset.to_flyte_idl() if self.structured_dataset is not None else None,
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'StructuredDataset' object has no attribute 'to_flyte_idl'

Expected behavior

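The workflow should run successfully when executed remotely.
The bug may originate in the code linked below.
There, `_literal_sd` should be a `literals.StructuredDataset` (the serializable model), but it is a Python-native `StructuredDataset`, which has no `to_flyte_idl` method.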

https://github.com/flyteorg/flytekit/blob/master/flytekit/types/structured/structured_dataset.py#L672-L679
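
To make the suspected type mismatch concrete, here is a minimal sketch that reproduces the same kind of `AttributeError` without flytekit. `NativeSD`, `LiteralSD`, and `Scalar` are hypothetical stand-ins for the Python-native `StructuredDataset`, `literals.StructuredDataset`, and the scalar model from the traceback; they are not flytekit's real classes, and only mimic the one method that matters here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NativeSD:
    """Hypothetical stand-in for the Python-native StructuredDataset.

    It deliberately has no to_flyte_idl method.
    """
    uri: str
    file_format: str


@dataclass
class LiteralSD:
    """Hypothetical stand-in for literals.StructuredDataset (serializable)."""
    uri: str
    file_format: str

    def to_flyte_idl(self) -> dict:
        # A plain dict stands in for the protobuf message.
        return {"uri": self.uri, "format": self.file_format}


@dataclass
class Scalar:
    """Hypothetical stand-in for the scalar model that wraps a dataset."""
    structured_dataset: Optional[object] = None

    def to_flyte_idl(self) -> dict:
        sd = self.structured_dataset
        # Same pattern as the failing line in the traceback.
        return {"structured_dataset": sd.to_flyte_idl() if sd is not None else None}


def serialize(scalar: Scalar):
    """Serialize the scalar, returning the error text instead of raising."""
    try:
        return scalar.to_flyte_idl()
    except AttributeError as exc:
        return f"AttributeError: {exc}"


uri = "s3://my-s3-bucket/s3_flyte_dir/df.parquet"
good = serialize(Scalar(LiteralSD(uri=uri, file_format="parquet")))
bad = serialize(Scalar(NativeSD(uri=uri, file_format="parquet")))
print(good)
print(bad)
```

The wrapper accepts either object without complaint; the failure only surfaces later, when the output is serialized, which matches where the traceback above ends.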

Additional context to reproduce

No response

Screenshots

No response

Are you sure this issue hasn't been raised already?

  • Yes

Have you read the Code of Conduct?

  • Yes
@Future-Outlier Future-Outlier added bug Something isn't working untriaged This issues has not yet been looked at by the Maintainers labels Nov 5, 2024
@eapolinario eapolinario added flytekit FlyteKit Python related issue and removed untriaged This issues has not yet been looked at by the Maintainers labels Nov 7, 2024
@JiangJiaWei1103
Contributor

After some investigation, I found that `StructuredDatasetTransformerEngine.dict_to_structured_dataset` builds a `Literal` from the Python-native structured dataset instead of from `literals.StructuredDataset`, as can be seen here. This happens when input literals are converted to Python native values during Python task execution.
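
One possible direction, sketched under assumptions: normalize the value to the serializable model before it is wrapped in a literal. The classes below are hypothetical stand-ins, and `to_literal_model` and `dict_to_literal_sd` are illustrative helpers that do not exist in flytekit; the linked pull request may resolve this differently.

```python
from dataclasses import dataclass


@dataclass
class NativeSD:
    """Hypothetical stand-in for the Python-native StructuredDataset."""
    uri: str
    file_format: str


@dataclass
class LiteralSD:
    """Hypothetical stand-in for literals.StructuredDataset."""
    uri: str
    file_format: str

    def to_flyte_idl(self) -> dict:
        # A plain dict stands in for the protobuf message.
        return {"uri": self.uri, "format": self.file_format}


def to_literal_model(value) -> LiteralSD:
    """Return a serializable model, converting a native object if needed."""
    if isinstance(value, LiteralSD):
        return value
    if isinstance(value, NativeSD):
        return LiteralSD(uri=value.uri, file_format=value.file_format)
    raise TypeError(f"unsupported structured dataset type: {type(value).__name__}")


def dict_to_literal_sd(d: dict) -> LiteralSD:
    """Illustrative helper: build the native object from a dict (as happens
    for a dataclass attribute), then normalize it before it is wrapped."""
    native = NativeSD(uri=d["uri"], file_format=d["file_format"])
    return to_literal_model(native)


result = dict_to_literal_sd(
    {"uri": "s3://my-s3-bucket/s3_flyte_dir/df.parquet", "file_format": "parquet"}
)
print(result.to_flyte_idl())
```

Converting at the boundary keeps the serialization code unchanged, and raising `TypeError` for anything else makes a wrong type fail at construction time rather than deep inside `to_flyte_idl`.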

Notably, the local run succeeds but the remote run fails. I'll keep comparing the execution behavior of the two.

3 participants