Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Port from 24.6 release to master #1356

Merged

Conversation

pavel-esir
Copy link
Contributor

@pavel-esir pavel-esir commented Dec 10, 2024

andrei-kochin and others added 3 commits December 10, 2024 11:49
…toolkit#1302)

- In some cases adding the next token can shorten the text, e.g. when
apostrophe removing regex had worked after adding new tokens. Need to
hold on before flushing if apostrophe is the last symbol.
- ticket CVS-157216
…notoolkit#1336)

- We use decode to infer tokens string representation from integer ids.
But when some tokens ids are not found in IR they are left unset (-1)
and evaluate for decode operation fails with

![image](https://github.com/user-attachments/assets/9d47dfaa-5236-47f9-999d-ad631f526544)
- Call decode only if token_id is not -1.
- Some models don't have token ids neither in IR nor in configs. E.g.
`black-forest-labs/FLUX.1-dev` don't have `pad_token_id` and that's
totally fine, we don't need to extract string representations for it.

---------

Co-authored-by: Ilya Lavrenov <[email protected]>
@pavel-esir pavel-esir added this to the 2025.0 milestone Dec 10, 2024
@github-actions github-actions bot added category: continuous batching Continuous batching category: LLM LLM pipeline (stateful, static) category: speculative decoding Speculative decoding category: GHA CI based on Github actions category: cmake / build Cmake scripts category: tokenizers Tokenizer class or submodule update category: samples GenAI samples category: GenAI C++ API Changes in GenAI C++ public headers no-match-files labels Dec 10, 2024
@ilya-lavrenov
Copy link
Contributor

Please, port fix for speculative decoding in python part https://github.com/openvinotoolkit/openvino.genai/pull/1331/files#diff-40348136edde3e69e1fb50a50b735e9d669bc1fea308bded9038b780ce1ea644

@ilya-lavrenov
Copy link
Contributor

You need to replace runner for cpp-greedy_causal_lm-redpajama-3b-chat to ubuntu-20.04-8-cores to fix that job

@pavel-esir
Copy link
Contributor Author

You need to replace runner for cpp-greedy_causal_lm-redpajama-3b-chat to ubuntu-20.04-8-cores to fix that job

yes, sure. Just did it

@pavel-esir
Copy link
Contributor Author

pavel-esir commented Dec 11, 2024

The remaining problem is caused by this #1302
In the fix we enforce streamer to not print last 4 bytes (just in case if they change after adding new tokens) this fixed a model but breaks streaming in python, because when symbols are cut in the middle they became invalid and python replaces them with

In python streamer we get

��������������������룅 encouraging룅

instead of c++ streamer out

룅튜룅튜룅튜룅튜룅튜룅 encouraging룅

Indeed the same issue now is in 2024.6, we just didn't see it because diff for mincpm is enabled only on the master #1134

@pavel-esir pavel-esir disabled auto-merge December 11, 2024 10:25
@pavel-esir
Copy link
Contributor Author

Reverted that cherry pick and created a ticket to fix that separately CVS-159227
FYI @ilya-lavrenov

@pavel-esir pavel-esir enabled auto-merge December 11, 2024 10:53
@pavel-esir pavel-esir added this pull request to the merge queue Dec 11, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Dec 11, 2024
@ilya-lavrenov ilya-lavrenov merged commit 8a74d24 into openvinotoolkit:master Dec 11, 2024
53 of 54 checks passed
@pavel-esir pavel-esir deleted the port_from_24.6_release branch December 11, 2024 16:29
sungeunk pushed a commit to sungeunk/openvino.genai that referenced this pull request Dec 16, 2024
- ~openvinotoolkit#1302 (didn't
port this PR because of the issue CVS-159227)
- openvinotoolkit#1262
- openvinotoolkit#1336
- openvinotoolkit#1331

---------

Co-authored-by: Andrei Kochin <[email protected]>
Co-authored-by: Vladimir Zlobin <[email protected]>
Co-authored-by: Ilya Lavrenov <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
category: cmake / build Cmake scripts category: continuous batching Continuous batching category: GenAI C++ API Changes in GenAI C++ public headers category: GHA CI based on Github actions category: LLM LLM pipeline (stateful, static) category: Python API Python API for GenAI category: samples GenAI samples category: speculative decoding Speculative decoding category: tokenizers Tokenizer class or submodule update no-match-files
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants