
[Feature Extractors] Fix kwargs to pre-trained #30260

Merged

Conversation

@sanchit-gandhi (Contributor) commented Apr 15, 2024

What does this PR do?

It was reported by @osanseviero that when instantiating a feature extractor with from_pretrained, passing kwargs occasionally leads to incorrect behaviour. This is particularly the case for Whisper, where setting the argument feature_size can have no effect on the number of Mel bins:

```python
from transformers import WhisperFeatureExtractor

# initialise from kwargs -> mel filters match the feature size
feature_extractor = WhisperFeatureExtractor(feature_size=100)
print(f"From kwargs: feature size {feature_extractor.feature_size}, mel filters {feature_extractor.mel_filters.shape[1]}")

# initialise from pre-trained with kwargs -> mel filters not updated to the new feature size
feature_extractor = WhisperFeatureExtractor.from_pretrained("openai/whisper-tiny", feature_size=100)
print(f"From pre-trained: feature size {feature_extractor.feature_size}, mel filters {feature_extractor.mel_filters.shape[1]}")
```

Print output:

```
From kwargs: feature size 100, mel filters 100
From pre-trained: feature size 100, mel filters 80
```

This is because of the order in which we set the args in the feature extractor. We first pass all the arguments that we have saved in the preprocessor_config.json file:

```python
feature_extractor = cls(**feature_extractor_dict)
```

And subsequently override the corresponding attributes with any kwargs that we got from the user:

```python
for key, value in kwargs.items():
    if hasattr(feature_extractor, key):
        setattr(feature_extractor, key, value)
```

The problem with this method is that for Whisper, we set some attributes based on the values of others in the init. E.g. we set mel_filters based on the value of feature_size:

```python
self.mel_filters = mel_filter_bank(
    num_frequency_bins=1 + n_fft // 2,
    num_mel_filters=feature_size,
    ...
)
```

These won't be updated properly with our current method, since we're only updating the attribute feature_size, and not subsequently recomputing the correct mel_filters.
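To make the failure mode concrete, here is a minimal illustration (not part of this PR's diff) of overriding feature_size via setattr on an already-constructed extractor; the derived mel_filters attribute is left stale:

```python
from transformers import WhisperFeatureExtractor

# Minimal illustration (not from the PR itself): overriding `feature_size`
# after __init__ does not recompute the derived `mel_filters` attribute.
feature_extractor = WhisperFeatureExtractor()  # default feature_size is 80
setattr(feature_extractor, "feature_size", 100)

print(feature_extractor.feature_size)          # 100
print(feature_extractor.mel_filters.shape[1])  # still 80 -> stale derived attribute
```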

The simple fix is to give priority to the user kwargs and ensure these are passed to the init, so that derived attributes like mel_filters are computed from the overridden values.
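In rough terms, the approach looks like the sketch below (a simplified illustration, not the exact patch merged here): the user kwargs are folded into the dictionary loaded from preprocessor_config.json before the class is instantiated.

```python
# Sketch of the approach (simplified, not the exact diff): merge user kwargs
# that correspond to saved config keys into `feature_extractor_dict` first,
# then instantiate, so __init__ recomputes derived attributes correctly.
for key in list(kwargs):
    if key in feature_extractor_dict:
        feature_extractor_dict[key] = kwargs.pop(key)

feature_extractor = cls(**feature_extractor_dict)
```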

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@ArthurZucker (Collaborator) left a comment

LGTM good catch both!

@sanchit-gandhi sanchit-gandhi merged commit cd09a8d into huggingface:main Apr 19, 2024
21 checks passed
@sanchit-gandhi sanchit-gandhi deleted the feature-extractor-kwargs branch April 19, 2024 10:16
ArthurZucker pushed a commit that referenced this pull request Apr 22, 2024
ydshieh pushed a commit that referenced this pull request Apr 23, 2024
itazap pushed a commit that referenced this pull request May 14, 2024
@guynich commented Jun 4, 2024

Well done for this change. I was hitting the same issue with the transformers 4.39.3 package; with transformers 4.41.0 I can use kwargs as expected. Thanks.
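As a quick check after upgrading (a small verification snippet, assuming the fix is included in transformers >= 4.41.0 as reported above):

```python
from transformers import WhisperFeatureExtractor

# Verification sketch (assumes transformers >= 4.41.0, where the fix is reported to work):
feature_extractor = WhisperFeatureExtractor.from_pretrained("openai/whisper-tiny", feature_size=100)
assert feature_extractor.mel_filters.shape[1] == 100  # kwargs now propagate to derived attributes
```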
