-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Enhancement] Only enable query context cache when there are more than one related mvs #54594
Conversation
Quality Gate passedIssues Measures |
return mvTimelinessInfos.get(mv); | ||
} | ||
return mvTimelinessInfos.computeIfAbsent(mv, | ||
ignored -> MvRefreshArbiter.getMVTimelinessUpdateInfo(mv, true)); | ||
} | ||
|
||
/** |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The most risky bug in this code is:
The getQueryCacheStats()
method potentially returns a null reference when the cache should have statistics.
You can modify the code like this:
public QueryCacheStats getQueryCacheStats() {
return queryCacheStats;
}
This change ensures that getQueryCacheStats()
always returns an instance of QueryCacheStats
, regardless of whether mvQueryContextCache
is initialized or not.
|
||
if (queryMaterializationContext.getValidCandidateMVs().size() > 1) { | ||
queryMaterializationContext.setEnableQueryContextCache(true); | ||
} | ||
} catch (Exception e) { | ||
List<String> tableNames = queryTables.stream().map(Table::getName).collect(Collectors.toList()); | ||
logMVPrepare(connectContext, "Prepare query tables {} for mv failed:{}", tableNames, e.getMessage()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The most risky bug in this code is:
There might be an overlooked scenario where the selectedRelatedMVs set remains empty after each filtering step, causing the method to return prematurely without executing subsequent processing steps. However, assuming no logical changes after empty checks, the overall flow could be unintended. There's potentially a redundant check of selectedRelatedMVs after filtering and before conversion, which might incorrectly lead to skips.
You can modify the code like this:
public void prepare(OptExpression queryOptExpression) {
try {
// 1. get related mvs for all input tables
Set<MaterializedView> relatedMVs = getRelatedMVs(queryTables, context.getOptimizerConfig().isRuleBased());
if (relatedMVs.isEmpty()) {
return;
}
// filter mvs which is set by config: including/excluding mvs
Set<MaterializedView> selectedRelatedMVs = getRelatedMVsByConfig(relatedMVs);
logMVPrepare(connectContext, "Choose {}/{} mvs after user config", selectedRelatedMVs.size(), relatedMVs.size());
// add into queryMaterializationContext for later use
this.queryMaterializationContext.addRelatedMVs(selectedRelatedMVs);
if (selectedRelatedMVs.isEmpty()) {
return;
}
// Ensure selectedRelatedMVs is checked/handled only once for emptiness after being fully processed
Set<MvWithPlanContext> mvWithPlanContexts;
try (Timer t1 = Tracers.watchScope("MVChooseCandidates")) {
selectedRelatedMVs = chooseBestRelatedMVs(queryTables, selectedRelatedMVs, queryOptExpression);
if (selectedRelatedMVs.isEmpty()) {
return;
}
// convert to mv with planContext
try (Timer t2 = Tracers.watchScope("MVGenerateMvPlan")) {
mvWithPlanContexts = getMvWithPlanContext(selectedRelatedMVs);
}
if (mvWithPlanContexts.isEmpty()) {
return;
}
}
// 4. process related mvs to candidates
try (Timer t3 = Tracers.watchScope("MVPrepareRelatedMVs")) {
prepareRelatedMVs(queryTables, mvWithPlanContexts);
}
if (queryOptExpression == null) {
return;
}
// 5. apply view-based optimization on candidates
try (Timer t4 = Tracers.watchScope("MVRewritePlan")) {
processPlanWithView(queryMaterializationContext, connectContext, queryOptExpression,
queryColumnRefFactory, requiredColumns);
}
if (queryMaterializationContext.getValidCandidateMVs().size() > 1) {
queryMaterializationContext.setEnableQueryContextCache(true);
}
} catch (Exception e) {
List<String> tableNames = queryTables.stream().map(Table::getName).collect(Collectors.toList());
logMVPrepare(connectContext, "Prepare query tables {} for mv failed:{}", tableNames, e.getMessage());
}
}
dstAtomMaps.putAll(srcDistinctAtoms.stream().collect(Collectors.toMap(x -> x, x -> srcItem))); | ||
PListCell newValue = | ||
new PListCell(srcDistinctAtoms.stream().map(x -> x.getPartitionItem()).collect(Collectors.toList())); | ||
result.put(key, newValue); | ||
} | ||
return result; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The most risky bug in this code is:
There is a potential ClassCastException
when casting PCell
to PListCell
without checking the type.
You can modify the code like this:
public static Map<String, PCell> diffList(Map<String, PCell> srcListMap,
Map<String, PCell> dstListMap) {
Map<String, PCell> result = Maps.newTreeMap();
boolean isDstContainMultiValues = dstListMap.values()
.stream()
.anyMatch(x -> (x instanceof PListCell) && ((PListCell) x).getPartitionItems().size() > 1);
if (isDstContainMultiValues) {
Map<PListAtom, PListCell> dstAtomMaps = Maps.newHashMap();
for (PCell cell : dstListMap.values()) {
Preconditions.checkArgument(cell instanceof PListCell, "PListCell expected");
PListCell listCell = (PListCell) cell;
listCell.toAtoms().forEach(x -> dstAtomMaps.put(x, listCell));
}
for (Map.Entry<String, PCell> srcEntry : srcListMap.entrySet()) {
String key = srcEntry.getKey();
PCell value = srcEntry.getValue();
if (!(value instanceof PListCell)) {
continue; // Skip non-PListCell entries safely
}
PListCell srcItem = (PListCell) value;
if (srcItem.equals(dstListMap.get(key))) {
continue;
}
Set<PListAtom> srcAtoms = srcItem.toAtoms();
List<PListAtom> srcDistinctAtoms = srcAtoms.stream()
.filter(x -> !dstAtomMaps.containsKey(x))
.collect(Collectors.toList());
if (srcDistinctAtoms.isEmpty()) {
continue;
}
dstAtomMaps.putAll(srcDistinctAtoms.stream().collect(Collectors.toMap(x -> x, x -> srcItem)));
PListCell newValue =
new PListCell(srcDistinctAtoms.stream().map(x -> x.getPartitionItem()).collect(Collectors.toList()));
result.put(key, newValue);
}
} else {
for (Map.Entry<String, PCell> srcEntry : srcListMap.entrySet()) {
String key = srcEntry.getKey();
PCell value = srcEntry.getValue();
if (!(value instanceof PListCell)) {
continue; // Skip non-PListCell entries safely
}
PListCell srcItem = (PListCell) value;
if (srcItem.equals(dstListMap.get(key))) {
continue;
}
result.put(key, srcItem);
}
}
return result;
}
In this revised version, I've added checks to ensure that values are instances of PListCell
before casting them. This prevents ClassCastException
from occurring.
[Java-Extensions Incremental Coverage Report]✅ pass : 0 / 0 (0%) |
[BE Incremental Coverage Report]✅ pass : 0 / 0 (0%) |
f8be87b
to
2bce91f
Compare
Why I'm doing:
What I'm doing:
Fixes #issue
What type of PR is this:
Does this PR entail a change in behavior?
If yes, please specify the type of change:
Checklist:
Bugfix cherry-pick branch check: