[Feature] Support Cluster Snapshot Backup: support system table for cluster snapshot backup (part2) #54508

srlch · 2024-12-30T11:55:27Z

What I'm doing:

Introduce two system table called cluster_snapshots and cluster_snapshot_jobs to provide a interface to get the cluster snapshot info.
This pr only impl the interface of these system but not involve the actually impl of ClusterSnapshot and ClusterSnapshotJob which should be introduced in the next pr.

Fixes #53867
#53867

What type of PR is this:

Does this PR entail a change in behavior?

Yes, this PR will result in a change in behavior.
No, this PR will not result in a change in behavior.

If yes, please specify the type of change:

Interface/UI changes: syntax, type conversion, expression evaluation, display information
Parameter changes: default values, similar parameters but with different default values
Policy changes: use new policy to replace old one, functionality automatically enabled
Feature removed
Miscellaneous: upgrade & downgrade compatibility, etc.

Checklist:

I have added test cases for my bug fix or my new feature
This pr needs user documentation (for new or modified features or behaviors)
- I have added documentation for my new feature or new function
This is a backport pr

Bugfix cherry-pick branch check:

starrocks-cr · 2024-12-30T11:55:53Z

be/src/exec/schema_scanner/schema_cluster_snapshot_jobs_scanner.cpp

+    return _fill_chunk(chunk);
+}
+
+} // namespace starrocks


The most risky bug in this code is:
Memory leak or undefined behavior due to incorrect use of Slice with StringValue.

You can modify the code like this:

Status SchemaClusterSnapshotJobsScanner::_fill_chunk(ChunkPtr* chunk) { auto& slot_id_map = (*chunk)->get_slot_id_to_index_map(); const TClusterSnapshotJobsItem& info = _result.items[_index]; DatumArray datum_array{ Slice(info.snapshot_name.data(), info.snapshot_name.size()), info.job_id, info.created_time, info.finished_time, Slice(info.state.data(), info.state.size()), Slice(info.detail_info.data(), info.detail_info.size()), Slice(info.error_message.data(), info.error_message.size()), }; for (const auto& [slot_id, index] : slot_id_map) { Column* column = (*chunk)->get_column_by_slot_id(slot_id).get(); column->append_datum(datum_array[slot_id - 1]); } _index++; return {}; }

In the original code, creating a Slice from info.snapshot_name, info.state, etc. directly assumes these values have valid data and size methods or properties, which may not be true or safe, especially if they have not been previously initialized or allocated correctly. Adjusting it as shown explicitly constructs the Slice with known data pointers and sizes.

starrocks-cr · 2024-12-30T11:55:57Z

fe/fe-core/src/main/java/com/starrocks/catalog/system/information/ClusterSnapshotsTable.java

+                        .column("STORAGE_PATH", ScalarType.createVarchar(NAME_CHAR_LEN))
+                        .build(), TSchemaTableType.SCH_CLUSTER_SNAPSHOTS);
+    }
+}


The most risky bug in this code is:
TSchemaTableType.SCH_CLUSTER_SNAPSHOTS should match the context and might be incorrect if not properly defined elsewhere.

You can modify the code like this:

public class ClusterSnapshotsTable { public static final String NAME = "cluster_snapshots"; public static SystemTable create() { return new SystemTable(SystemId.CLUSTER_SNAPSHOTS_ID, NAME, Table.TableType.SCHEMA, builder() .column("SNAPSHOT_NAME", ScalarType.createVarchar(NAME_CHAR_LEN)) .column("SNAPSHOT_TYPE", ScalarType.createVarchar(NAME_CHAR_LEN)) .column("CREATE_TIME", ScalarType.createType(PrimitiveType.BIGINT)) .column("FINISHED_TIME", ScalarType.createType(PrimitiveType.BIGINT)) .column("FE_JOURNAL_ID", ScalarType.createType(PrimitiveType.BIGINT)) .column("STARMGR_JOURNAL_ID", ScalarType.createType(PrimitiveType.BIGINT)) .column("PROPERTIES", ScalarType.createVarchar(NAME_CHAR_LEN)) .column("STORAGE_VOLUME", ScalarType.createVarchar(NAME_CHAR_LEN)) .column("STORAGE_PATH", ScalarType.createVarchar(NAME_CHAR_LEN)) .build(), TSchemaTableType.SCH_CLUSTER_SNAPSHOTS // Ensure this value is valid for your use case. ); } }

Ensure TSchemaTableType.SCH_CLUSTER_SNAPSHOTS is accurately defined in the context of your application, as it may represent a generic or outdated placeholder that doesn't match your schema expectations.

starrocks-cr · 2024-12-30T11:56:05Z

be/src/exec/schema_scanner/schema_cluster_snapshots_scanner.cpp

+    return _fill_chunk(chunk);
+}
+
+} // namespace starrocks


The most risky bug in this code is:
A potential buffer overflow or incorrect memory size calculation for varchar-based fields using sizeof(StringValue) instead of a proper max length.

You can modify the code like this:

SchemaScanner::ColumnDesc SchemaClusterSnapshotsScanner::_s_columns[] = { {"snapshot_name", TypeDescriptor::create_varchar_type(MAX_SNAPSHOT_NAME_LENGTH), MAX_SNAPSHOT_NAME_LENGTH, true}, {"snapshot_type", TypeDescriptor::create_varchar_type(MAX_SNAPSHOT_TYPE_LENGTH), MAX_SNAPSHOT_TYPE_LENGTH, true}, {"created_time", TypeDescriptor::from_logical_type(TYPE_BIGINT), sizeof(int64_t), true}, {"finished_time", TypeDescriptor::from_logical_type(TYPE_BIGINT), sizeof(int64_t), true}, {"fe_jouranl_id", TypeDescriptor::from_logical_type(TYPE_BIGINT), sizeof(int64_t), true}, {"starmgr_jouranl_id", TypeDescriptor::from_logical_type(TYPE_BIGINT), sizeof(int64_t), true}, {"properties", TypeDescriptor::create_varchar_type(MAX_PROPERTIES_LENGTH), MAX_PROPERTIES_LENGTH, true}, {"storage_volume", TypeDescriptor::create_varchar_type(MAX_STORAGE_VOLUME_LENGTH), MAX_STORAGE_VOLUME_LENGTH, true}, {"storage_path", TypeDescriptor::create_varchar_type(MAX_STORAGE_PATH_LENGTH), MAX_STORAGE_PATH_LENGTH, true}, }; // Define appropriate constants for maximum lengths const int MAX_SNAPSHOT_NAME_LENGTH = 256; const int MAX_SNAPSHOT_TYPE_LENGTH = 256; const int MAX_PROPERTIES_LENGTH = 1024; const int MAX_STORAGE_VOLUME_LENGTH = 256; const int MAX_STORAGE_PATH_LENGTH = 1024;

Explanation: The original code uses sizeof(StringValue) to define the varchar type size, which is wrong because sizeof(StringValue) returns the size of the pointer structure but does not allocate sufficient space for the actual content. This can lead to potential overflows or incorrect behavior. Proper constants should be defined and used to represent the maximum allowed sizes for each varchar field.

…snapshot backup (part2) Signed-off-by: srlch <[email protected]>

Signed-off-by: srlch <[email protected]>

sonarqubecloud · 2025-01-02T10:15:09Z

Quality Gate passed

Issues
6 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

github-actions · 2025-01-02T11:56:39Z

[Java-Extensions Incremental Coverage Report]

✅ pass : 0 / 0 (0%)

github-actions · 2025-01-02T11:59:47Z

[FE Incremental Coverage Report]

✅ pass : 24 / 28 (85.71%)

file detail

	path	covered_line	new_line	coverage	not_covered_line_detail
🔵	com/starrocks/service/FrontendServiceImpl.java	0	2	00.00%	[3180, 3185]
🔵	com/starrocks/catalog/system/information/ClusterSnapshotJobsTable.java	10	11	90.91%	[26]
🔵	com/starrocks/catalog/system/information/ClusterSnapshotsTable.java	12	13	92.31%	[26]
🔵	com/starrocks/catalog/system/information/InfoSchemaDb.java	2	2	100.00%	[]

github-actions · 2025-01-02T12:09:11Z

[BE Incremental Coverage Report]

❌ fail : 44 / 76 (57.89%)

file detail

	path	covered_line	new_line	coverage	not_covered_line_detail
🔵	be/src/exec/schema_scanner/schema_cluster_snapshots_scanner.cpp	17	33	51.52%	[53, 54, 55, 56, 57, 59, 61, 62, 63, 64, 65, 67, 68, 69, 79, 80]
🔵	be/src/exec/schema_scanner/schema_cluster_snapshot_jobs_scanner.cpp	17	33	51.52%	[51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 63, 64, 65, 75, 76]
🔵	be/src/exec/schema_scanner.cpp	4	4	100.00%	[]
🔵	be/src/exec/schema_scanner/schema_helper.cpp	6	6	100.00%	[]

srlch · 2025-01-03T02:31:17Z

https://github.com/Mergifyio backport branch-3.4

mergify · 2025-01-03T02:31:26Z

backport branch-3.4

✅ Backports have been created

#54637 [Feature] Support Cluster Snapshot Backup: support system table for cluster snapshot backup (part2) (backport #54508) has been created for branch branch-3.4 but encountered conflicts

…luster snapshot backup (part2) (#54508) Signed-off-by: srlch <[email protected]> (cherry picked from commit f43707f) # Conflicts: # be/src/exec/schema_scanner.cpp # be/src/exec/schema_scanner/schema_helper.cpp # be/src/exec/schema_scanner/schema_helper.h # fe/fe-core/src/main/java/com/starrocks/catalog/system/SystemId.java # fe/fe-core/src/main/java/com/starrocks/catalog/system/information/InfoSchemaDb.java # fe/fe-core/src/main/java/com/starrocks/service/FrontendServiceImpl.java # gensrc/thrift/Descriptors.thrift # gensrc/thrift/FrontendService.thrift

…luster snapshot backup (part2) (StarRocks#54508) Signed-off-by: srlch <[email protected]>

…luster snapshot backup (part2) (backport #54508) (#54650) Signed-off-by: srlch <[email protected]>

srlch requested review from a team as code owners December 30, 2024 11:55

wanpengfei-git added the PROTO-REVIEW label Dec 30, 2024

starrocks-cr bot reviewed Dec 30, 2024

View reviewed changes

wanpengfei-git requested a review from a team December 30, 2024 11:55

starrocks-cr bot reviewed Dec 30, 2024

View reviewed changes

mergify bot assigned srlch Dec 30, 2024

starrocks-cr bot reviewed Dec 30, 2024

View reviewed changes

srlch added 2 commits December 30, 2024 22:03

Feature] Support Cluster Snapshot Backup: support system for cluster …

e095658

…snapshot backup (part2) Signed-off-by: srlch <[email protected]>

fix

7ab83ad

Signed-off-by: srlch <[email protected]>

srlch force-pushed the cluster_snapshot_part_2 branch from ab80d38 to 7ab83ad Compare December 30, 2024 14:04

xiangguangyxg approved these changes Dec 31, 2024

View reviewed changes

srlch force-pushed the cluster_snapshot_part_2 branch from 1d89c5b to 0099769 Compare December 31, 2024 07:26

srlch changed the title ~~[Feature] Support Cluster Snapshot Backup: support system for cluster snapshot backup (part2)~~ [Feature] Support Cluster Snapshot Backup: support system table for cluster snapshot backup (part2) Dec 31, 2024

gengjun-git previously approved these changes Dec 31, 2024

View reviewed changes

trueeyu previously approved these changes Jan 2, 2025

View reviewed changes

srlch dismissed stale reviews from trueeyu and gengjun-git via 5d1a41e January 2, 2025 02:44

fix

244d215

Signed-off-by: srlch <[email protected]>

srlch force-pushed the cluster_snapshot_part_2 branch from 903e908 to 244d215 Compare January 2, 2025 04:55

fix

db593a2

Signed-off-by: srlch <[email protected]>

gengjun-git approved these changes Jan 2, 2025

View reviewed changes

trueeyu approved these changes Jan 3, 2025

View reviewed changes

andyziye merged commit f43707f into StarRocks:main Jan 3, 2025
49 of 52 checks passed

github-actions bot added the version:3.5 label Jan 3, 2025

wanpengfei-git mentioned this pull request Jan 3, 2025

shared-data mode support backup restore though automated snapshot #53867

Open

6 tasks

mergify bot mentioned this pull request Jan 3, 2025

[Feature] Support Cluster Snapshot Backup: support system table for cluster snapshot backup (part2) (backport #54508) #54637

Closed

18 tasks

srlch added a commit to srlch/starrocks that referenced this pull request Jan 3, 2025

[Feature] Support Cluster Snapshot Backup: support system table for c…

3aeb881

…luster snapshot backup (part2) (StarRocks#54508) Signed-off-by: srlch <[email protected]>

gengjun-git pushed a commit that referenced this pull request Jan 3, 2025

[Feature] Support Cluster Snapshot Backup: support system table for c…

5cc60f2

…luster snapshot backup (part2) (backport #54508) (#54650) Signed-off-by: srlch <[email protected]>

github-actions bot added the 3.4-merged label Jan 3, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Feature] Support Cluster Snapshot Backup: support system table for cluster snapshot backup (part2) #54508

[Feature] Support Cluster Snapshot Backup: support system table for cluster snapshot backup (part2) #54508

srlch commented Dec 30, 2024 •

edited

Loading

starrocks-cr bot Dec 30, 2024

starrocks-cr bot Dec 30, 2024

starrocks-cr bot Dec 30, 2024

sonarqubecloud bot commented Jan 2, 2025

github-actions bot commented Jan 2, 2025

github-actions bot commented Jan 2, 2025

github-actions bot commented Jan 2, 2025

srlch commented Jan 3, 2025

mergify bot commented Jan 3, 2025 •

edited

Loading

[Feature] Support Cluster Snapshot Backup: support system table for cluster snapshot backup (part2) #54508

[Feature] Support Cluster Snapshot Backup: support system table for cluster snapshot backup (part2) #54508

Conversation

srlch commented Dec 30, 2024 • edited Loading

What I'm doing:

What type of PR is this:

Checklist:

Bugfix cherry-pick branch check:

starrocks-cr bot Dec 30, 2024

Choose a reason for hiding this comment

starrocks-cr bot Dec 30, 2024

Choose a reason for hiding this comment

starrocks-cr bot Dec 30, 2024

Choose a reason for hiding this comment

sonarqubecloud bot commented Jan 2, 2025

Quality Gate passed

github-actions bot commented Jan 2, 2025

[Java-Extensions Incremental Coverage Report]

github-actions bot commented Jan 2, 2025

[FE Incremental Coverage Report]

file detail

github-actions bot commented Jan 2, 2025

[BE Incremental Coverage Report]

file detail

srlch commented Jan 3, 2025

mergify bot commented Jan 3, 2025 • edited Loading

✅ Backports have been created

srlch commented Dec 30, 2024 •

edited

Loading

mergify bot commented Jan 3, 2025 •

edited

Loading