Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Accept buffer in LLMPipeline ctor #1262

Merged

Conversation

pavel-esir
Copy link
Contributor

@pavel-esir pavel-esir commented Nov 27, 2024

Ticket: CVS-158144, CVS-158142

@github-actions github-actions bot added category: LLM LLM pipeline (stateful, static) category: sampling Sampling / Decoding algorithms category: tokenizers Tokenizer class or submodule update category: samples GenAI samples category: GenAI C++ API Changes in GenAI C++ public headers labels Nov 27, 2024
@pavel-esir pavel-esir changed the title initial Accept buffer in LLMPipeline ctor Nov 27, 2024
src/cpp/src/tokenizer.cpp Outdated Show resolved Hide resolved
src/cpp/src/utils.hpp Outdated Show resolved Hide resolved
src/cpp/src/utils.cpp Show resolved Hide resolved
src/cpp/src/tokenizer.cpp Outdated Show resolved Hide resolved
@ilya-lavrenov ilya-lavrenov added this to the 2024.6 milestone Nov 27, 2024
@github-actions github-actions bot removed the category: samples GenAI samples label Nov 28, 2024
@pavel-esir pavel-esir marked this pull request as ready for review November 29, 2024 10:59
src/cpp/include/openvino/genai/tokenizer.hpp Outdated Show resolved Hide resolved
src/cpp/include/openvino/genai/tokenizer.hpp Outdated Show resolved Hide resolved
src/cpp/src/continuous_batching_pipeline.cpp Outdated Show resolved Hide resolved
src/cpp/src/continuous_batching_pipeline.cpp Outdated Show resolved Hide resolved
@github-actions github-actions bot added no-match-files and removed category: sampling Sampling / Decoding algorithms labels Dec 4, 2024
samples/cpp/text_generation/README.md Outdated Show resolved Hide resolved
samples/cpp/text_generation/CMakeLists.txt Outdated Show resolved Hide resolved
src/cpp/include/openvino/genai/tokenizer.hpp Outdated Show resolved Hide resolved
src/cpp/src/llm_pipeline.cpp Outdated Show resolved Hide resolved
ov::Core core;
const char* ov_tokenizer_path = getenv(ScopedVar::ENVIRONMENT_VARIABLE_NAME);
OPENVINO_ASSERT(ov_tokenizer_path, "openvino_tokenizers path is not set");
core.add_extension(ov_tokenizer_path);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think that default singleton from utils.hpp can include tokenizer extension to avoid duplication of logic of multiples Cores creation.

src/cpp/src/llm_pipeline.cpp Outdated Show resolved Hide resolved
@pavel-esir pavel-esir requested a review from apaniukov December 4, 2024 13:27
samples/cpp/text_generation/encrypted_model_causal_lm.cpp Outdated Show resolved Hide resolved
// This is necessary only for NPU, for other plugins can be ommited.
ov::AnyMap model_descr_properties = {{"name_or_path", "meta-llama/Llama-2-7b-chat-hf"},
{"type", "llama"},
{"num_key_value_heads", 32}};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe we can avoid demonstrating such NPU hacks in samples?

I hope that @dmatveev and @TolyaTalamanov can come up with more generic approach which does not require reading this information from config.json. E.g. num_key_value_heads can be extracted from model itself

ov::PartialShape k_shape = parameters[kv_caches_inputs_offset]->get_partial_shape();
OPENVINO_ASSERT(k_shape.rank().get_length() == 3, "KV cache shape is expected to have rank 3, while shape is ", k_shape);
size_t num_kv_heads = k_shape[1].get_length(), head_size = k_shape[2].get_length();

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed this from the sample. In order not to forget about that created a ticket for NPU to search for a better solution.
@TolyaTalamanov @dmatveev

src/cpp/src/llm_pipeline_static.hpp Outdated Show resolved Hide resolved
@ilya-lavrenov ilya-lavrenov added the port to master PR needs to be ported to master from release branch label Dec 4, 2024
@andrei-kochin andrei-kochin added this pull request to the merge queue Dec 5, 2024
Merged via the queue into openvinotoolkit:releases/2024/5 with commit 2d4496a Dec 5, 2024
52 checks passed
@pavel-esir pavel-esir deleted the add_new_ctors branch December 5, 2024 08:30
pavel-esir pushed a commit to pavel-esir/openvino.genai that referenced this pull request Dec 10, 2024
@pavel-esir pavel-esir removed the port to master PR needs to be ported to master from release branch label Dec 10, 2024
ilya-lavrenov added a commit that referenced this pull request Dec 11, 2024
- ~#1302 (didn't
port this PR because of the issue CVS-159227)
- #1262
- #1336
- #1331

---------

Co-authored-by: Andrei Kochin <[email protected]>
Co-authored-by: Vladimir Zlobin <[email protected]>
Co-authored-by: Ilya Lavrenov <[email protected]>
sungeunk pushed a commit to sungeunk/openvino.genai that referenced this pull request Dec 16, 2024
- ~openvinotoolkit#1302 (didn't
port this PR because of the issue CVS-159227)
- openvinotoolkit#1262
- openvinotoolkit#1336
- openvinotoolkit#1331

---------

Co-authored-by: Andrei Kochin <[email protected]>
Co-authored-by: Vladimir Zlobin <[email protected]>
Co-authored-by: Ilya Lavrenov <[email protected]>
ScottZhang812 pushed a commit to ScottZhang812/_openvino.genai that referenced this pull request Dec 23, 2024
- ~openvinotoolkit/openvino.genai#1302 (didn't
port this PR because of the issue CVS-159227)
- openvinotoolkit/openvino.genai#1262
- openvinotoolkit/openvino.genai#1336
- openvinotoolkit/openvino.genai#1331

---------

Co-authored-by: Andrei Kochin <[email protected]>
Co-authored-by: Vladimir Zlobin <[email protected]>
Co-authored-by: Ilya Lavrenov <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
category: cmake / build Cmake scripts category: continuous batching Continuous batching category: GenAI C++ API Changes in GenAI C++ public headers category: GHA CI based on Github actions category: LLM LLM pipeline (stateful, static) category: samples GenAI samples category: speculative decoding Speculative decoding category: tokenizers Tokenizer class or submodule update Code Freeze no-match-files
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants