-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Feature] Support Cluster Snapshot Backup: support system table for cluster snapshot backup (part2) #54508
Conversation
return _fill_chunk(chunk); | ||
} | ||
|
||
} // namespace starrocks No newline at end of file |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The most risky bug in this code is:
Memory leak or undefined behavior due to incorrect use of Slice
with StringValue
.
You can modify the code like this:
Status SchemaClusterSnapshotJobsScanner::_fill_chunk(ChunkPtr* chunk) {
auto& slot_id_map = (*chunk)->get_slot_id_to_index_map();
const TClusterSnapshotJobsItem& info = _result.items[_index];
DatumArray datum_array{
Slice(info.snapshot_name.data(), info.snapshot_name.size()),
info.job_id,
info.created_time,
info.finished_time,
Slice(info.state.data(), info.state.size()),
Slice(info.detail_info.data(), info.detail_info.size()),
Slice(info.error_message.data(), info.error_message.size()),
};
for (const auto& [slot_id, index] : slot_id_map) {
Column* column = (*chunk)->get_column_by_slot_id(slot_id).get();
column->append_datum(datum_array[slot_id - 1]);
}
_index++;
return {};
}
In the original code, creating a Slice
from info.snapshot_name
, info.state
, etc. directly assumes these values have valid data
and size
methods or properties, which may not be true or safe, especially if they have not been previously initialized or allocated correctly. Adjusting it as shown explicitly constructs the Slice
with known data pointers and sizes.
.column("STORAGE_PATH", ScalarType.createVarchar(NAME_CHAR_LEN)) | ||
.build(), TSchemaTableType.SCH_CLUSTER_SNAPSHOTS); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The most risky bug in this code is:
TSchemaTableType.SCH_CLUSTER_SNAPSHOTS
should match the context and might be incorrect if not properly defined elsewhere.
You can modify the code like this:
public class ClusterSnapshotsTable {
public static final String NAME = "cluster_snapshots";
public static SystemTable create() {
return new SystemTable(SystemId.CLUSTER_SNAPSHOTS_ID,
NAME,
Table.TableType.SCHEMA,
builder()
.column("SNAPSHOT_NAME", ScalarType.createVarchar(NAME_CHAR_LEN))
.column("SNAPSHOT_TYPE", ScalarType.createVarchar(NAME_CHAR_LEN))
.column("CREATE_TIME", ScalarType.createType(PrimitiveType.BIGINT))
.column("FINISHED_TIME", ScalarType.createType(PrimitiveType.BIGINT))
.column("FE_JOURNAL_ID", ScalarType.createType(PrimitiveType.BIGINT))
.column("STARMGR_JOURNAL_ID", ScalarType.createType(PrimitiveType.BIGINT))
.column("PROPERTIES", ScalarType.createVarchar(NAME_CHAR_LEN))
.column("STORAGE_VOLUME", ScalarType.createVarchar(NAME_CHAR_LEN))
.column("STORAGE_PATH", ScalarType.createVarchar(NAME_CHAR_LEN))
.build(), TSchemaTableType.SCH_CLUSTER_SNAPSHOTS // Ensure this value is valid for your use case.
);
}
}
Ensure TSchemaTableType.SCH_CLUSTER_SNAPSHOTS
is accurately defined in the context of your application, as it may represent a generic or outdated placeholder that doesn't match your schema expectations.
return _fill_chunk(chunk); | ||
} | ||
|
||
} // namespace starrocks No newline at end of file |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The most risky bug in this code is:
A potential buffer overflow or incorrect memory size calculation for varchar-based fields using sizeof(StringValue)
instead of a proper max length.
You can modify the code like this:
SchemaScanner::ColumnDesc SchemaClusterSnapshotsScanner::_s_columns[] = {
{"snapshot_name", TypeDescriptor::create_varchar_type(MAX_SNAPSHOT_NAME_LENGTH), MAX_SNAPSHOT_NAME_LENGTH, true},
{"snapshot_type", TypeDescriptor::create_varchar_type(MAX_SNAPSHOT_TYPE_LENGTH), MAX_SNAPSHOT_TYPE_LENGTH, true},
{"created_time", TypeDescriptor::from_logical_type(TYPE_BIGINT), sizeof(int64_t), true},
{"finished_time", TypeDescriptor::from_logical_type(TYPE_BIGINT), sizeof(int64_t), true},
{"fe_jouranl_id", TypeDescriptor::from_logical_type(TYPE_BIGINT), sizeof(int64_t), true},
{"starmgr_jouranl_id", TypeDescriptor::from_logical_type(TYPE_BIGINT), sizeof(int64_t), true},
{"properties", TypeDescriptor::create_varchar_type(MAX_PROPERTIES_LENGTH), MAX_PROPERTIES_LENGTH, true},
{"storage_volume", TypeDescriptor::create_varchar_type(MAX_STORAGE_VOLUME_LENGTH), MAX_STORAGE_VOLUME_LENGTH, true},
{"storage_path", TypeDescriptor::create_varchar_type(MAX_STORAGE_PATH_LENGTH), MAX_STORAGE_PATH_LENGTH, true},
};
// Define appropriate constants for maximum lengths
const int MAX_SNAPSHOT_NAME_LENGTH = 256;
const int MAX_SNAPSHOT_TYPE_LENGTH = 256;
const int MAX_PROPERTIES_LENGTH = 1024;
const int MAX_STORAGE_VOLUME_LENGTH = 256;
const int MAX_STORAGE_PATH_LENGTH = 1024;
Explanation: The original code uses sizeof(StringValue)
to define the varchar type size, which is wrong because sizeof(StringValue)
returns the size of the pointer structure but does not allocate sufficient space for the actual content. This can lead to potential overflows or incorrect behavior. Proper constants should be defined and used to represent the maximum allowed sizes for each varchar field.
…snapshot backup (part2) Signed-off-by: srlch <[email protected]>
Signed-off-by: srlch <[email protected]>
ab80d38
to
7ab83ad
Compare
1d89c5b
to
0099769
Compare
Signed-off-by: srlch <[email protected]>
903e908
to
244d215
Compare
Signed-off-by: srlch <[email protected]>
Quality Gate passedIssues Measures |
[Java-Extensions Incremental Coverage Report]✅ pass : 0 / 0 (0%) |
[FE Incremental Coverage Report]✅ pass : 24 / 28 (85.71%) file detail
|
[BE Incremental Coverage Report]❌ fail : 44 / 76 (57.89%) file detail
|
https://github.com/Mergifyio backport branch-3.4 |
✅ Backports have been created
|
…luster snapshot backup (part2) (#54508) Signed-off-by: srlch <[email protected]> (cherry picked from commit f43707f) # Conflicts: # be/src/exec/schema_scanner.cpp # be/src/exec/schema_scanner/schema_helper.cpp # be/src/exec/schema_scanner/schema_helper.h # fe/fe-core/src/main/java/com/starrocks/catalog/system/SystemId.java # fe/fe-core/src/main/java/com/starrocks/catalog/system/information/InfoSchemaDb.java # fe/fe-core/src/main/java/com/starrocks/service/FrontendServiceImpl.java # gensrc/thrift/Descriptors.thrift # gensrc/thrift/FrontendService.thrift
…luster snapshot backup (part2) (StarRocks#54508) Signed-off-by: srlch <[email protected]>
…luster snapshot backup (part2) (backport #54508) (#54650) Signed-off-by: srlch <[email protected]>
What I'm doing:
Introduce two system table called
cluster_snapshots
andcluster_snapshot_jobs
to provide a interface to get the cluster snapshot info.This pr only impl the interface of these system but not involve the actually impl of
ClusterSnapshot
andClusterSnapshotJob
which should be introduced in the next pr.Fixes #53867
#53867
What type of PR is this:
Does this PR entail a change in behavior?
If yes, please specify the type of change:
Checklist:
Bugfix cherry-pick branch check: