fix: refactor post_processor logic and add test #2137
Conversation
router/src/main.rs
Outdated
```rust
if let Some(class) = &tokenizer_config.tokenizer_class {
    if class == "LlamaTokenizer" || class == "LlamaTokenizerFast" {
        tracing::info!("Overriding LlamaTokenizer with TemplateProcessing to follow python override defined in https://github.com/huggingface/transformers/blob/4aa17d00690b7f82c95bb2949ea57e22c35b4336/src/transformers/models/llama/tokenization_llama_fast.py#L203-L205");
        if let Some(post_processor) = create_post_processor(tokenizer, &tokenizer_config) {
```
We need to override only if the current post_processor is None; forgot that condition.
updated and added in latest commit
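A minimal sketch of that guard (assuming the `tokenizers` crate's `get_post_processor()` accessor; not the exact committed diff):

```rust
// Override only when the tokenizer has no post-processor configured yet.
if tokenizer.get_post_processor().is_none() {
    if let Some(post_processor) = create_post_processor(tokenizer, &tokenizer_config) {
        tokenizer.with_post_processor(post_processor);
    }
}
```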
router/src/main.rs
Outdated
```rust
pub fn create_post_processor(
    tokenizer: &Tokenizer,
    tokenizer_config: &HubTokenizerConfig,
) -> Option<TemplateProcessing> {
```
Why not return a `Result` and turn all those `unwrap()` calls into `?`?
great point! updated in latest
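Sketched, the `Result`-based version could look like this (the error type and the placeholder template values are assumptions for illustration; the real function derives them from `tokenizer_config`):

```rust
use tokenizers::processors::template::TemplateProcessing;

pub fn create_post_processor(
    tokenizer: &Tokenizer,
    tokenizer_config: &HubTokenizerConfig,
) -> Result<TemplateProcessing, tokenizers::processors::template::TemplateProcessingBuilderError> {
    // Placeholder construction; the real function builds these from the
    // configured bos/eos tokens and their vocabulary ids.
    let single: Vec<String> = vec!["<s>:0".into(), "$A:0".into()];
    let pair: Vec<String> = vec!["<s>:0".into(), "$A:0".into(), "<s>:1".into(), "$B:1".into()];
    let special_tokens: Vec<(String, u32)> = vec![("<s>".into(), 1)];

    // `?` propagates template parsing and builder errors instead of
    // panicking via unwrap().
    let post_processor = TemplateProcessing::builder()
        .try_single(single)?
        .try_pair(pair)?
        .special_tokens(special_tokens)
        .build()?;

    Ok(post_processor)
}
```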
```rust
if let Some(class) = &tokenizer_config.tokenizer_class {
    if class == "LlamaTokenizer" || class == "LlamaTokenizerFast" {
        tracing::info!("Overriding LlamaTokenizer with TemplateProcessing to follow python override defined in https://github.com/huggingface/transformers/blob/4aa17d00690b7f82c95bb2949ea57e22c35b4336/src/transformers/models/llama/tokenization_llama_fast.py#L203-L205");
        if let Some(post_processor) = create_post_processor(tokenizer, &tokenizer_config) {
            tokenizer.with_post_processor(post_processor);
```
Let's log only when we actually override; here we would log even if the function fails.
updated and moved after `create_post_processor` succeeds
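i.e., roughly (a sketch with the log message shortened; the error handling shown is illustrative, not the exact committed code):

```rust
if class == "LlamaTokenizer" || class == "LlamaTokenizerFast" {
    match create_post_processor(tokenizer, &tokenizer_config) {
        Ok(post_processor) => {
            // Log only once the post-processor was actually built.
            tracing::info!("Overriding LlamaTokenizer with TemplateProcessing");
            tokenizer.with_post_processor(post_processor);
        }
        Err(err) => tracing::warn!("Could not build TemplateProcessing override: {err}"),
    }
}
```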
```rust
let mut single = String::new();
let mut pair = String::new();
let mut special_tokens = Vec::new();
```
Can we keep the actual vec? `String` is really a crutch.
yep, updated!
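For reference, the Vec-based construction could look roughly like this inside the `Result`-returning `create_post_processor` (a sketch with simplified field shapes; the real PR also handles the `add_bos_token`/`add_eos_token` flags and the eos pieces):

```rust
let mut single: Vec<String> = Vec::new();
let mut pair: Vec<String> = Vec::new();
let mut special_tokens: Vec<(String, u32)> = Vec::new();

// Prepend the bos token to both templates when the config provides one.
if let Some(bos) = &tokenizer_config.bos_token {
    let bos_id = tokenizer
        .token_to_id(bos)
        .expect("bos_token should be in the vocabulary");
    special_tokens.push((bos.clone(), bos_id));
    single.push(format!("{bos}:0"));
    pair.push(format!("{bos}:0"));
}

// "$A"/"$B" are TemplateProcessing placeholders for the two input
// sequences; the ":0"/":1" suffixes are the type ids of each piece.
single.push("$A:0".to_string());
pair.push("$A:0".to_string());
pair.push("$B:1".to_string());

let post_processor = TemplateProcessing::builder()
    .try_single(single)?
    .try_pair(pair)?
    .special_tokens(special_tokens)
    .build()?;
```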
LGTM
* fix: refactor post_processor logic and add test
* fix: remove dev comment
* fix: adjust when post_processor is overridden and improve create_post_processor
This PR updates the logic for adding a `post_processor` to a tokenizer and should resolve the issues in CI: https://github.com/huggingface/text-generation-inference/actions/runs/9698059633/job/26765404563

This PR aims to match the functionality in `transformers/models/llama/tokenization_llama_fast.py` and fixes an issue where the `single` and `pair` templates were not correctly constructed and applied. This resulted in a missing `bos_token` and subsequently an issue with accessing slots in `batch.slots[batch.slot_indices]`. A test is also added for that specific case.
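The shape of the added test, sketched (the `HubTokenizerConfig` fields and the expected templates here are assumptions for illustration, not a verbatim copy; `<s>` has id 1 in the Llama vocabulary):

```rust
#[test]
fn test_create_post_processor() {
    // Hypothetical Llama-style config: bos should be prepended.
    let tokenizer_config = HubTokenizerConfig {
        bos_token: Some("<s>".to_string()),
        eos_token: Some("</s>".to_string()),
        ..Default::default()
    };

    // Requires the `http` feature of `tokenizers` to fetch from the Hub.
    let tokenizer =
        Tokenizer::from_pretrained("TinyLlama/TinyLlama-1.1B-Chat-v1.0", None).unwrap();
    let post_processor = create_post_processor(&tokenizer, &tokenizer_config).unwrap();

    // The single template must start with the bos token; its absence is
    // what broke batch.slots[batch.slot_indices] downstream.
    let expected = TemplateProcessing::builder()
        .try_single("<s>:0 $A:0")
        .unwrap()
        .try_pair("<s>:0 $A:0 <s>:1 $B:1")
        .unwrap()
        .special_tokens(vec![("<s>".to_string(), 1)])
        .build()
        .unwrap();

    assert_eq!(post_processor, expected);
}
```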