Whisper: Fix decoder inputs for static pipeline #1469
base: master
Conversation
```diff
@@ -25,6 +25,7 @@ class WhisperPipeline::StaticWhisperPipeline : public WhisperPipeline::WhisperPi

 private:
     WhisperInitializedModels m_models;
+    std::shared_ptr<ov::Model> m_decoder_model;
```
Why do we need to store the model? Once the model is compiled, we should release the `ov::Model` to free the memory consumed by its weights.
We can't compile this model until `generate()` is called.
Well, I'd assume we'd have something like this:

```cpp
class DecoderCache {
public:
    ov::CompiledModel get_model(uint8_t input_ids_size) {
        // Get from the hash table; otherwise compile and store...
    }
private:
    // [input_ids_size -> CompiledModel]
    std::unordered_map<uint8_t, ov::CompiledModel> m_cache;
    std::shared_ptr<ov::Model> m_decoder_model; // <- this is dynamic, w/o transformations applied
};

// Whenever we need a model:
auto decoder = m_decoder_cache.get_model(input_ids_size);
```
```diff
     std::fill(input_ids_data + init_ids.size(),
               input_ids_data + input_ids_tensor.get_size(),
               static_cast<int32_t>(pad_token));
+    // std::fill(input_ids_data + init_ids.size(),
```
Remove?
```diff
-    std::fill(attention_mask_ptr, attention_mask_ptr + 3u, 0);
-    std::fill(attention_mask_ptr + 3u, attention_mask_ptr + attention_mask.get_size() - 2, 1);
+    std::fill(attention_mask_ptr, attention_mask_ptr + init_ids_size, 0);
+    std::fill(attention_mask_ptr + init_ids_size, attention_mask_ptr + attention_mask.get_size() - 2, 1);
```
Why `- 2`?
Btw, since we reshape the model anyway, we don't need `attention_mask` at all; we could probably skip the transformation that exposes it.
```diff
@@ -489,7 +489,7 @@ void preprocess_decoder(std::shared_ptr<ov::Model> model) {
         preprocessor.input("attention_mask").preprocess().convert_element_type();
     } else if (tensor.get_any_name().find("encoder_hidden_states") != std::string::npos) {
         preprocessor.input("encoder_hidden_states").tensor().set_element_type(ov::element::Type_t::f16);
-        preprocessor.input("encoder_hidden_states").preprocess().convert_element_type(ov::element::Type_t::f32);
```
Why was this removed?
```diff
@@ -654,7 +657,13 @@ WhisperDecodedResults WhisperPipeline::StaticWhisperPipeline::generate(

     // prepare init_ids just once for whole input
     if (init_ids.empty()) {
+        OPENVINO_ASSERT(m_models.decoder.get_tensor("input_ids").get_shape().back() == 1);
```
What does this check do?
Tickets: