Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug-4099][admin]bug fix 4 task with variable lineage #4106

Merged
merged 4 commits into from
Jan 3, 2025

Conversation

MactavishCui
Copy link
Contributor

Purpose of the pull request

#4099

Brief change log

Verify this pull request

This pull request is code cleanup without any test coverage.

(or)

This pull request is already covered by existing tests, such as (please describe tests).

(or)

This change added tests and can be verified as follows:

Before:
5389b2cb63df445d3d4baf043ed7b93

After:
5962505a4cf73365abc86edd38ed9e0

@chenlb
Copy link
Contributor

chenlb commented Jan 2, 2025

昨天也遇到了这个 bug,期待早点合并进来。

@aiwenmo
Copy link
Contributor

aiwenmo commented Jan 2, 2025

Hi. Please refer to StudioServiceImpl.getLineage for the fix.
Before calculating the lineage, it includes five steps: databaseVariable, fragmentVariable, FlinkSQLEnv, RowPermission and UdfRefer.

@MactavishCui
Copy link
Contributor Author

Hi. Please refer to StudioServiceImpl.getLineage for the fix. Before calculating the lineage, it includes five steps: databaseVariable, fragmentVariable, FlinkSQLEnv, RowPermission and UdfRefer.

Thx, I'll update this PR today.

@MactavishCui
Copy link
Contributor Author

MactavishCui commented Jan 2, 2025

Hi. Please refer to StudioServiceImpl.getLineage for the fix. Before calculating the lineage, it includes five steps: databaseVariable, fragmentVariable, FlinkSQLEnv, RowPermission and UdfRefer.

Hi~ @aiwenmo I found that JobInstanceService's lineage build is based on execute History. Before a history being persisted to database, following methods are called:

create FlinkSqlTask -> FlinkSqlTask.getJobManager -> TaskService.buildJobSubmitConfig -> TaskService.buildEnvSql

which means databaseVariable, FlinkSQLEnv, RowPermission, FragmentVariable and UdfRefer are handled, then both statement and config field are updated. As a result we can reuse the config of History rather than create a new JobConfig or use default config like follows, I have also tested that this change can avoid the bug:

Before: return LineageBuilder.getColumnLineageByLogicalPlan(history.getStatement(), ExecutorConfig.DEFAULT);

After: return LineageBuilder.getColumnLineageByLogicalPlan(history.getStatement(), history.getConfigJson());

What is your opinion?
Eager to hear back.

@MactavishCui MactavishCui marked this pull request as draft January 2, 2025 04:40
@aiwenmo
Copy link
Contributor

aiwenmo commented Jan 2, 2025

@MactavishCui I agree with you. Your discovery is wonderful.

@MactavishCui MactavishCui marked this pull request as ready for review January 2, 2025 14:29
Copy link
Contributor

@aiwenmo aiwenmo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@aiwenmo aiwenmo merged commit 7e747b0 into DataLinkDC:dev Jan 3, 2025
22 checks passed
@aiwenmo aiwenmo added the Bug Something isn't working label Jan 3, 2025
@aiwenmo aiwenmo added this to the 1.3.0 milestone Jan 3, 2025
@aiwenmo aiwenmo linked an issue Jan 3, 2025 that may be closed by this pull request
3 tasks
@MactavishCui MactavishCui deleted the dev-lineageBugFix branch January 4, 2025 13:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Something isn't working
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

[Bug] [Lineage] can't open sql lineage when use global variable
3 participants