Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Enhancement] Only enable query context cache when there are more than one related mvs #54594

Closed
wants to merge 0 commits into from

Conversation

LiShuMing
Copy link
Contributor

@LiShuMing LiShuMing commented Jan 2, 2025

Why I'm doing:

What I'm doing:

Fixes #issue

What type of PR is this:

  • BugFix
  • Feature
  • Enhancement
  • Refactor
  • UT
  • Doc
  • Tool

Does this PR entail a change in behavior?

  • Yes, this PR will result in a change in behavior.
  • No, this PR will not result in a change in behavior.

If yes, please specify the type of change:

  • Interface/UI changes: syntax, type conversion, expression evaluation, display information
  • Parameter changes: default values, similar parameters but with different default values
  • Policy changes: use new policy to replace old one, functionality automatically enabled
  • Feature removed
  • Miscellaneous: upgrade & downgrade compatibility, etc.

Checklist:

  • I have added test cases for my bug fix or my new feature
  • This pr needs user documentation (for new or modified features or behaviors)
    • I have added documentation for my new feature or new function
  • This is a backport pr

Bugfix cherry-pick branch check:

  • I have checked the version labels which the pr will be auto-backported to the target branch
    • 3.4
    • 3.3
    • 3.2
    • 3.1
    • 3.0

@LiShuMing LiShuMing requested review from a team as code owners January 2, 2025 09:32
@github-actions github-actions bot added the 3.4 label Jan 2, 2025
Copy link

sonarqubecloud bot commented Jan 2, 2025

return mvTimelinessInfos.get(mv);
}
return mvTimelinessInfos.computeIfAbsent(mv,
ignored -> MvRefreshArbiter.getMVTimelinessUpdateInfo(mv, true));
}

/**
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The most risky bug in this code is:
The getQueryCacheStats() method potentially returns a null reference when the cache should have statistics.

You can modify the code like this:

public QueryCacheStats getQueryCacheStats() {
    return queryCacheStats;
}

This change ensures that getQueryCacheStats() always returns an instance of QueryCacheStats, regardless of whether mvQueryContextCache is initialized or not.


if (queryMaterializationContext.getValidCandidateMVs().size() > 1) {
queryMaterializationContext.setEnableQueryContextCache(true);
}
} catch (Exception e) {
List<String> tableNames = queryTables.stream().map(Table::getName).collect(Collectors.toList());
logMVPrepare(connectContext, "Prepare query tables {} for mv failed:{}", tableNames, e.getMessage());
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The most risky bug in this code is:
There might be an overlooked scenario where the selectedRelatedMVs set remains empty after each filtering step, causing the method to return prematurely without executing subsequent processing steps. However, assuming no logical changes after empty checks, the overall flow could be unintended. There's potentially a redundant check of selectedRelatedMVs after filtering and before conversion, which might incorrectly lead to skips.

You can modify the code like this:

public void prepare(OptExpression queryOptExpression) {
    try {
        // 1. get related mvs for all input tables
        Set<MaterializedView> relatedMVs = getRelatedMVs(queryTables, context.getOptimizerConfig().isRuleBased());
        if (relatedMVs.isEmpty()) {
            return;
        }

        // filter mvs which is set by config: including/excluding mvs
        Set<MaterializedView> selectedRelatedMVs = getRelatedMVsByConfig(relatedMVs);
        logMVPrepare(connectContext, "Choose {}/{} mvs after user config", selectedRelatedMVs.size(), relatedMVs.size());
        // add into queryMaterializationContext for later use
        this.queryMaterializationContext.addRelatedMVs(selectedRelatedMVs);
        if (selectedRelatedMVs.isEmpty()) {
            return;
        }

        // Ensure selectedRelatedMVs is checked/handled only once for emptiness after being fully processed
        Set<MvWithPlanContext> mvWithPlanContexts;
        try (Timer t1 = Tracers.watchScope("MVChooseCandidates")) {
            selectedRelatedMVs = chooseBestRelatedMVs(queryTables, selectedRelatedMVs, queryOptExpression);
            if (selectedRelatedMVs.isEmpty()) {
                return;
            }
        
            // convert to mv with planContext
            try (Timer t2 = Tracers.watchScope("MVGenerateMvPlan")) {
                mvWithPlanContexts = getMvWithPlanContext(selectedRelatedMVs);
            }
            if (mvWithPlanContexts.isEmpty()) {
                return;
            }
        }

        // 4. process related mvs to candidates
        try (Timer t3 = Tracers.watchScope("MVPrepareRelatedMVs")) {
            prepareRelatedMVs(queryTables, mvWithPlanContexts);
        }

        if (queryOptExpression == null) {
            return;
        }

        // 5. apply view-based optimization on candidates
        try (Timer t4 = Tracers.watchScope("MVRewritePlan")) {
            processPlanWithView(queryMaterializationContext, connectContext, queryOptExpression,
                    queryColumnRefFactory, requiredColumns);
        }

        if (queryMaterializationContext.getValidCandidateMVs().size() > 1) {
            queryMaterializationContext.setEnableQueryContextCache(true);
        }
    } catch (Exception e) {
        List<String> tableNames = queryTables.stream().map(Table::getName).collect(Collectors.toList());
        logMVPrepare(connectContext, "Prepare query tables {} for mv failed:{}", tableNames, e.getMessage());
    }
}

dstAtomMaps.putAll(srcDistinctAtoms.stream().collect(Collectors.toMap(x -> x, x -> srcItem)));
PListCell newValue =
new PListCell(srcDistinctAtoms.stream().map(x -> x.getPartitionItem()).collect(Collectors.toList()));
result.put(key, newValue);
}
return result;
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The most risky bug in this code is:
There is a potential ClassCastException when casting PCell to PListCell without checking the type.

You can modify the code like this:

public static Map<String, PCell> diffList(Map<String, PCell> srcListMap,
                                           Map<String, PCell> dstListMap) {
    Map<String, PCell> result = Maps.newTreeMap();
    boolean isDstContainMultiValues = dstListMap.values()
            .stream()
            .anyMatch(x -> (x instanceof PListCell) && ((PListCell) x).getPartitionItems().size() > 1);

    if (isDstContainMultiValues) {
        Map<PListAtom, PListCell> dstAtomMaps = Maps.newHashMap();
        for (PCell cell : dstListMap.values()) {
            Preconditions.checkArgument(cell instanceof PListCell, "PListCell expected");
            PListCell listCell = (PListCell) cell;
            listCell.toAtoms().forEach(x -> dstAtomMaps.put(x, listCell));
        }
        for (Map.Entry<String, PCell> srcEntry : srcListMap.entrySet()) {
            String key = srcEntry.getKey();
            PCell value = srcEntry.getValue();
            if (!(value instanceof PListCell)) {
                continue; // Skip non-PListCell entries safely
            }
            PListCell srcItem = (PListCell) value;
            if (srcItem.equals(dstListMap.get(key))) {
                continue;
            }
            Set<PListAtom> srcAtoms = srcItem.toAtoms();
            List<PListAtom> srcDistinctAtoms = srcAtoms.stream()
                    .filter(x -> !dstAtomMaps.containsKey(x))
                    .collect(Collectors.toList());
            if (srcDistinctAtoms.isEmpty()) {
                continue;
            }
            dstAtomMaps.putAll(srcDistinctAtoms.stream().collect(Collectors.toMap(x -> x, x -> srcItem)));
            PListCell newValue =
                    new PListCell(srcDistinctAtoms.stream().map(x -> x.getPartitionItem()).collect(Collectors.toList()));
            result.put(key, newValue);
        }
    } else {
        for (Map.Entry<String, PCell> srcEntry : srcListMap.entrySet()) {
            String key = srcEntry.getKey();
            PCell value = srcEntry.getValue();
            if (!(value instanceof PListCell)) {
                continue; // Skip non-PListCell entries safely
            }
            PListCell srcItem = (PListCell) value;
            if (srcItem.equals(dstListMap.get(key))) {
                continue;
            }
            result.put(key, srcItem);
        }
    }
    return result;
}

In this revised version, I've added checks to ensure that values are instances of PListCell before casting them. This prevents ClassCastException from occurring.

Copy link

github-actions bot commented Jan 2, 2025

[Java-Extensions Incremental Coverage Report]

pass : 0 / 0 (0%)

Copy link

github-actions bot commented Jan 2, 2025

[BE Incremental Coverage Report]

pass : 0 / 0 (0%)

@LiShuMing LiShuMing closed this Jan 2, 2025
@LiShuMing LiShuMing force-pushed the fix_mv_performance_v1 branch from f8be87b to 2bce91f Compare January 2, 2025 11:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant