
[WIP] optimize append_selectivity performance #54261

Open · wants to merge 7 commits into base: main
Conversation

Seaven
Contributor

@Seaven Seaven commented Dec 24, 2024

Why I'm doing:

What I'm doing:

Fixes #issue

What type of PR is this:

  • BugFix
  • Feature
  • Enhancement
  • Refactor
  • UT
  • Doc
  • Tool

Does this PR entail a change in behavior?

  • Yes, this PR will result in a change in behavior.
  • No, this PR will not result in a change in behavior.

If yes, please specify the type of change:

  • Interface/UI changes: syntax, type conversion, expression evaluation, display information
  • Parameter changes: default values, similar parameters but with different default values
  • Policy changes: use new policy to replace old one, functionality automatically enabled
  • Feature removed
  • Miscellaneous: upgrade & downgrade compatibility, etc.

Checklist:

  • I have added test cases for my bug fix or my new feature
  • This PR needs user documentation (for new or modified features or behaviors)
    • I have added documentation for my new feature or new function
  • This is a backport PR

Bugfix cherry-pick branch check:

  • I have checked the version labels for the target branches to which this PR will be auto-backported
    • 3.4
    • 3.3
    • 3.2
    • 3.1
    • 3.0

*(dst_offsets + 1) = *(dst_offsets) + str_size;
strings::memcpy_inlined(dst_bytes, src_bytes + src_offsets[next_idx], str_size);

_bytes.resize(*(dst_offsets + 1));
}

_slices_cache = false;
The most risky bug in this code is:
An out-of-bounds memory access caused by the missing prefetch setup for the last element, which can lead to undefined behavior when `src_offsets[next_idx + 1]` is read without first ensuring `next_idx` is a valid index.

You can modify the code like this:

+        idx = indexes[from + size - 1]; // Correct placement within the loop after proper bounds check or resizing
        T str_size = src_offsets[idx + 1] - src_offsets[idx];
        *(dst_offsets + 1) = *(dst_offsets) + str_size;
        strings::memcpy_inlined(dst_bytes, src_bytes + src_offsets[idx], str_size);

        _bytes.resize(*(dst_offsets + 1));
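The bounds concern above can be illustrated outside the StarRocks codebase. Below is a minimal scalar sketch of a bounds-checked selective append for an offsets-plus-bytes binary column; `BinarySketch` and its members are hypothetical names, not the actual `BinaryColumn` API.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical sketch of a binary (offsets + bytes) column.
// offsets[i + 1] - offsets[i] is the length of row i, so offsets has
// one more entry than there are rows.
struct BinarySketch {
    std::vector<uint32_t> offsets{0};
    std::vector<uint8_t> bytes;

    void append_selective(const BinarySketch& src, const std::vector<uint32_t>& indexes,
                          size_t from, size_t size) {
        for (size_t i = 0; i < size; ++i) {
            uint32_t idx = indexes[from + i];
            // The check the review comment asks for: idx must be a valid
            // row so that src.offsets[idx + 1] is always readable.
            assert(idx + 1 < src.offsets.size());
            uint32_t str_size = src.offsets[idx + 1] - src.offsets[idx];
            size_t old = bytes.size();
            bytes.resize(old + str_size);
            std::memcpy(bytes.data() + old, src.bytes.data() + src.offsets[idx], str_size);
            offsets.push_back(offsets.back() + str_size);
        }
    }

    std::string row(size_t i) const {
        return std::string(bytes.begin() + offsets[i], bytes.begin() + offsets[i + 1]);
    }
};
```

The sketch trades the batched resize of the real implementation for per-row resizes; the point is only that every read of `src.offsets[idx + 1]` is guarded.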

std::vector<bool> orders = {true};
std::vector<bool> null_firsts = {false};
_sort_descs = SortDescs(orders, null_firsts);

_unique_metrics->add_info_string("ShuffleNumPerChannel", std::to_string(_num_shuffles_per_channel));
_unique_metrics->add_info_string("TotalShuffleNum", std::to_string(_num_shuffles));
_unique_metrics->add_info_string("PipelineLevelShuffle", _is_pipeline_level_shuffle ? "Yes" : "No");

The most risky bug in this code is:
Improper handling of the case where the chunk's size exceeds the configured chunk_size, which can lead to sorting and sending the wrong version of the chunk.

You can modify the code like this:

if (_chunks[driver_sequence]->num_rows() + size > state->chunk_size()) {
    if (config::enable_shuffle_sort) {
        Permutation _sort_permutation;
        _sort_permutation.resize(0);
        Columns orderby_columns;
        for (auto& expr : _parent->_sort_expr_ctxs) {
            ASSIGN_OR_RETURN(auto col, expr->evaluate(_chunks[driver_sequence].get()));
            orderby_columns.emplace_back(col);
        }
        RETURN_IF_ERROR(sort_and_tie_columns(state->cancelled_ref(), orderby_columns, _parent->_sort_descs,
                                             &_sort_permutation));
        auto sorted_chunk =
                _chunks[driver_sequence]->clone_empty_with_slot(_sort_permutation.size());
        materialize_by_permutation(sorted_chunk.get(), _chunks[driver_sequence].get(), _sort_permutation);
        RETURN_IF_ERROR(send_one_chunk(state, sorted_chunk.get(), driver_sequence, false));
    } else {
        RETURN_IF_ERROR(send_one_chunk(state, _chunks[driver_sequence].get(), driver_sequence, false));
    }

    // we only clear column data, because we need to reuse column schema
    _chunks[driver_sequence]->set_num_rows(0);
}

Explanation: materialize_by_permutation should operate on _chunks[driver_sequence] rather than chunk, so that the accumulated chunk is sorted and sent in its entirety while respecting the chunk-size constraint.
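The sort-then-send pattern in the suggestion can be sketched without the StarRocks types: compute a permutation that orders the rows, then materialize a new chunk by gathering rows through that permutation. `SimpleChunk` and both helper names below are hypothetical stand-ins for the real multi-column chunk and the `sort_and_tie_columns` / `materialize_by_permutation` pair.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical single-column "chunk"; a real chunk holds many columns that
// must all be reordered by the same permutation.
using SimpleChunk = std::vector<int64_t>;

// Build a permutation that sorts the order-by column ascending
// (null handling omitted in this sketch).
std::vector<uint32_t> sort_permutation(const SimpleChunk& orderby) {
    std::vector<uint32_t> perm(orderby.size());
    std::iota(perm.begin(), perm.end(), 0);
    std::stable_sort(perm.begin(), perm.end(),
                     [&](uint32_t a, uint32_t b) { return orderby[a] < orderby[b]; });
    return perm;
}

// Gather rows of src through perm into a fresh chunk, mirroring the role
// materialize_by_permutation plays in the suggested fix.
SimpleChunk materialize(const SimpleChunk& src, const std::vector<uint32_t>& perm) {
    SimpleChunk out;
    out.reserve(perm.size());
    for (uint32_t idx : perm) out.push_back(src[idx]);
    return out;
}
```

Sorting via a permutation rather than in place is what lets every column of the chunk be reordered consistently with one sort.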

    }
#endif
    // Handle the remaining elements with a scalar loop
    for (; i < size; ++i) {
        _data[orig_size + i] = src_data[indexes[from + i]];
    }
}

The most risky bug in this code is:
A potential type mismatch in the AVX2 path, where integer types are cast to int* for fsrc_data; this can cause incorrect memory access or data corruption when T is not int32_t.

You can modify the code like this:

template <typename T>
void FixedLengthColumnBase<T>::append_selective(const Column& src, const uint32_t* indexes, size_t from, size_t size) {
    const T* src_data = reinterpret_cast<const T*>(src.raw_data());
    size_t orig_size = _data.size();
    _data.resize(orig_size + size);

    size_t i = 0;
#ifdef __AVX2__
    if (config::enable_avx_gather) {
        T* store = _data.data() + orig_size;
        if constexpr (std::is_same_v<T, int32_t> || std::is_same_v<T, uint8_t> || std::is_same_v<T, int8_t> || std::is_same_v<T, int16_t>) {
            // All of these widths gather through 32-bit lanes, so the
            // gather base pointer is always const int*.
            const int* fsrc_data = reinterpret_cast<const int*>(src_data);
            
            if constexpr (std::is_same_v<T, int32_t>) {
                for (; i + 7 < size; i += 8) {
                    __m256i index_vec = _mm256_loadu_si256(reinterpret_cast<const __m256i*>(indexes + from + i));
                    __m256i data_vec = _mm256_i32gather_epi32(fsrc_data, index_vec, 4);
                    _mm256_storeu_si256(reinterpret_cast<__m256i*>(store + i), data_vec);
                }
            } else {
                constexpr int type_size = sizeof(T);
                int temp[8] = {0};
                for (; i + 7 < size; i += 8) {
                    __m256i index_vec = _mm256_loadu_si256(reinterpret_cast<const __m256i*>(indexes + from + i));
                    __m256i data_vec = _mm256_i32gather_epi32(fsrc_data, index_vec, type_size);
                    _mm256_storeu_si256(reinterpret_cast<__m256i*>(temp), data_vec);
                    for (int j = 0; j < 8; j++) {
                        store[i + j] = temp[j];
                    }
                }
            }
        } else if constexpr (std::is_same_v<T, int64_t>) {
            const long long int* fsrc_data = (const long long int*)src_data;
            for (; i + 3 < size; i += 4) {
                __m128i index_vec = _mm_loadu_si128(reinterpret_cast<const __m128i*>(indexes + from + i));
                __m256i data_vec = _mm256_i32gather_epi64(fsrc_data, index_vec, 8);
                _mm256_storeu_si256(reinterpret_cast<__m256i*>(store + i), data_vec);
            }
        } 
    }
#endif
    for (; i < size; ++i) {
        _data[orig_size + i] = src_data[indexes[from + i]];
    }
}
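Independent of the AVX2 details, the observable semantics of append_selective are just a gather: the destination receives the source rows named by `indexes[from .. from + size)`. A minimal scalar reference (hypothetical free-function form, not the StarRocks class method) pins down what any SIMD path must reproduce:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Scalar reference for append_selective: append src[indexes[from + i]]
// for i in [0, size). A vectorized gather implementation is correct
// exactly when it matches this loop for every input.
template <typename T>
void append_selective_ref(std::vector<T>& dst, const std::vector<T>& src,
                          const std::vector<uint32_t>& indexes, size_t from, size_t size) {
    size_t orig_size = dst.size();
    dst.resize(orig_size + size);
    for (size_t i = 0; i < size; ++i) {
        dst[orig_size + i] = src[indexes[from + i]];
    }
}
```

A reference like this is also the natural oracle for a unit test that compares the AVX2 path against the scalar path over random indexes.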

Signed-off-by: Seaven <[email protected]>
@wanpengfei-git wanpengfei-git requested a review from a team January 2, 2025 07:04

github-actions bot commented Jan 3, 2025

[Java-Extensions Incremental Coverage Report]

pass : 0 / 0 (0%)


github-actions bot commented Jan 3, 2025

[FE Incremental Coverage Report]

pass : 7 / 7 (100.00%)

file detail

path covered_line new_line coverage not_covered_line_detail
🔵 com/starrocks/sql/plan/PlanFragmentBuilder.java 1 1 100.00% []
🔵 com/starrocks/planner/PlanNode.java 6 6 100.00% []


github-actions bot commented Jan 3, 2025

[BE Incremental Coverage Report]

fail : 32 / 83 (38.55%)

file detail

path covered_line new_line coverage not_covered_line_detail
🔵 src/exec/aggregate/agg_profile.h 0 2 00.00% [38, 39]
🔵 src/exec/aggregate/agg_hash_set.h 0 11 00.00% [105, 106, 108, 109, 111, 112, 113, 114, 182, 283, 390]
🔵 src/exec/pipeline/exchange/exchange_sink_operator.cpp 0 12 00.00% [203, 204, 207, 209, 210, 213, 215, 217, 218, 446, 449, 450]
🔵 src/column/binary_column.cpp 21 40 52.50% [93, 95, 96, 97, 99, 100, 102, 103, 104, 105, 106, 107, 109, 111, 112, 113, 114, 115, 119]
🔵 src/column/fixed_length_column_base.cpp 8 14 57.14% [57, 59, 60, 61, 62, 63]
🔵 src/exec/aggregator.cpp 3 4 75.00% [713]
