Rebase zamba2 #2

pglorio · 2024-11-05T00:00:50Z

No description provided.


…ng (huggingface#33200) * feat: Added int conversion and unwrapping * test: added tests for post_process_keypoint_detection of SuperPointImageProcessor * docs: changed docs to include post_process_keypoint_detection method and switched from opencv to matplotlib * test: changed test to not depend on SuperPointModel forward * test: added missing require_torch decorator * docs: changed pyplot parameters for the keypoints to be more visible in the example * tests: changed import torch location to make test_flax and test_tf * Revert "tests: changed import torch location to make test_flax and test_tf" This reverts commit 39b32a2. * tests: fixed import * chore: applied suggestions from code review Co-authored-by: NielsRogge <[email protected]> * tests: fixed import * tests: fixed import (bis) * tests: fixed import (ter) * feat: added choice of type for target_size and changed tests accordingly * docs: updated code snippet to reflect the addition of target size type choice in post process method * tests: fixed imports (...) * tests: fixed imports (...) 
* style: formatting file * docs: fixed typo from image[0] to image.size[0] * docs: added output image and fixed some tests * Update docs/source/en/model_doc/superpoint.md Co-authored-by: Pavel Iakubovskii <[email protected]> * fix: included SuperPointKeypointDescriptionOutput in TYPE_CHECKING if statement and changed tests results to reflect changes to SuperPoint from absolute keypoints coordinates to relative * docs: changed SuperPoint's docs to print output instead of just accessing * style: applied make style * docs: added missing output type and precision in docstring of post_process_keypoint_detection * perf: deleted loop to perform keypoint conversion in one statement * fix: moved keypoint conversion at the end of model forward * docs: changed SuperPointInterestPointDecoder to SuperPointKeypointDecoder class name and added relative (x, y) coordinates information to its method * fix: changed type hint * refactor: removed unnecessary brackets * revert: SuperPointKeypointDecoder to SuperPointInterestPointDecoder * Update docs/source/en/model_doc/superpoint.md Co-authored-by: Pavel Iakubovskii <[email protected]> --------- Co-authored-by: Steven Bucaille <[email protected]> Co-authored-by: NielsRogge <[email protected]> Co-authored-by: Pavel Iakubovskii <[email protected]>


* potential bug fix for drop path * variable name change * forgot to rename the variables * back to original * modify dpr properly * check_copies auto fix * corresponsing swin2 changes * auto fix * linting * default value for drop_path_rate as 0.0 * Update src/transformers/models/glm/modeling_glm.py * maskformer fix * ruff format * changes made to tf code as well * lint --------- Co-authored-by: abhijit deo <[email protected]>


…34358) * Adding `optimizer_cls_and_kwargs` to `Trainer.__init__` * formatting * make fix-copies docstring * added more docs for optimizer_cls_and_kwargs * add docs for Trainer(optimizer_cls_and_kwargs) * reverting anchor names

…sion_transformer (huggingface#34420) Bump werkzeug in /examples/research_projects/decision_transformer Bumps [werkzeug](https://github.com/pallets/werkzeug) from 3.0.3 to 3.0.6. - [Release notes](https://github.com/pallets/werkzeug/releases) - [Changelog](https://github.com/pallets/werkzeug/blob/main/CHANGES.rst) - [Commits](pallets/werkzeug@3.0.3...3.0.6) --- updated-dependencies: - dependency-name: werkzeug dependency-type: direct:production ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>


…4493) * fix repr string format for tokenizer objects The repr of tokenizer tokens looks confusing and just stupid, like this: `Tokenizer(...), added_tokens_decoder={1: ..., 2: ...}`. The dict that is the value of the added_tokens_decoder attribute is outside of the parentheses of the tokenizer object, whereas all other attributes are inside the parentheses like they should be. This commit fixes this bug. * cos: add newline before closing parenthesis of repr string


…tests (huggingface#34464) * tmp commit * tmp commit * cull overwrites of deleted tests * typo * more specific docstring * make fixup * parameterize at the top? * correction * more deletions :D * tmp commit * for VLMs too * fix _check_outputs * test nit * make fixup * fix another flaky * test_generate_from_inputs_embeds -- handle missing attention mask

* fix pixtral processor * test out full length batches + remove undue ValueError * fix up processing * fix tests * fix * last fixup * style * [run-slow] pixtral * [run-slow] pixtral * fix config key * skip torchscript tests * [run-slow] pixtral * add missing key * [run-slow] pixtral * fix docs * [run-slow] pixtral * fix wrong url for integration test * [run-slow] pixtral * pixtralVisionModel does not have a lm head * [run-slow] pixtral


* add fast image processor rtdetr * add gpu/cpu test and fix docstring * remove prints * add to doc * nit docstring * avoid iterating over images/annotations several times * change torch typing * Add image processor fast documentation

…Languages.(Changes made) (huggingface#34226) * Update TRANSLATING.md * Apply suggestions from code review Co-authored-by: Steven Liu <[email protected]> * Update TRANSLATING.md --------- Co-authored-by: Steven Liu <[email protected]>


* replace total_batched_samples with step while counting grad accum step * remove unused variable * simplify condition for update step * fix format by ruff * simplify update step condition using accelerator.sync_gradients * simplify update condition using do_sync_step * remove print for test --------- Co-authored-by: Zach Mueller <[email protected]>


…e tests (huggingface#34518) * fix(DPT,Depth-Anything) Address expected_slice errors inside inference tests Signed-off-by: Phillip Kuznetsov <[email protected]> * [run_slow] dpt, depth_anything --------- Signed-off-by: Phillip Kuznetsov <[email protected]>


* Standardize image-text-to-text-models-output add post_process_image_text_to_text to chameleon and cleanup Fix legacy kwarg behavior and deprecation warning add post_process_image_text_to_text to qwen2_vl and llava_onevision Add post_process_image_text_to_text to idefics3, mllama, pixtral processor * nit var name post_process_image_text_to_text udop * nit fix deprecation warnings * Add image-text-to-text pipeline * add support for image url in chat template for pipeline * Reformat to be fully compatible with chat templates * Add tests chat template * Fix imports and tests * Add pipeline tag * change logic handling of single prompt ans multiple images * add pipeline mapping to models * fix batched inference * fix tests * Add manual batching for preprocessing * Fix outputs with nested images * Add support for all common processing kwargs * Add default padding when multiple text inputs (batch size>1) * nit change version deprecation warning * Add support for text only inference * add chat_template warnings * Add pipeline tests and add copied from post process function * Fix batched pipeline tests * nit * Fix pipeline tests blip2 * remove unnecessary max_new_tokens * revert processing kosmos2 and remove unnecessary max_new_tokens * fix pipeline tests idefics * Force try loading processor if pipeline supports it * revert load_processor change * hardcode loading only processor * remove unnecessary try except * skip imagetexttotext tests for kosmos2 as tiny model causes problems * Make code clearer * Address review comments * remove preprocessing logic from pipeline * fix fuyu * add BC resize fuyu * Move post_process_image_text_to_text to ProcessorMixin * add guard in post_process * fix zero shot object detection pipeline * add support for generator input in pipeline * nit * change default image-text-to-text model to llava onevision * fix owlv2 size dict * Change legacy deprecation warning to only show when True


…bic (huggingface#33048) * Add docs/source/ar/multilingual.md to Add_docs_source_ar_multilingual.md * Update docs/source/ar/multilingual.md Co-authored-by: Abdullah Mohammed <[email protected]> * Update docs/source/ar/multilingual.md Co-authored-by: Abdullah Mohammed <[email protected]> * Update docs/source/ar/multilingual.md Co-authored-by: Abdullah Mohammed <[email protected]> * Update docs/source/ar/multilingual.md Co-authored-by: Abdullah Mohammed <[email protected]> * Update docs/source/ar/multilingual.md Co-authored-by: Abdullah Mohammed <[email protected]> * Update docs/source/ar/multilingual.md Co-authored-by: Abdullah Mohammed <[email protected]> * Update docs/source/ar/multilingual.md Co-authored-by: Abdullah Mohammed <[email protected]> * Update docs/source/ar/multilingual.md Co-authored-by: Abdullah Mohammed <[email protected]> * Update docs/source/ar/multilingual.md Co-authored-by: Abdullah Mohammed <[email protected]> * Update docs/source/ar/multilingual.md Co-authored-by: Abdullah Mohammed <[email protected]> * Update docs/source/ar/multilingual.md Co-authored-by: Abdullah Mohammed <[email protected]> * Update docs/source/ar/multilingual.md Co-authored-by: Abdullah Mohammed <[email protected]> * Update docs/source/ar/multilingual.md Co-authored-by: Abdullah Mohammed <[email protected]> * Update docs/source/ar/multilingual.md Co-authored-by: Abdullah Mohammed <[email protected]> * Update docs/source/ar/multilingual.md Co-authored-by: Abdullah Mohammed <[email protected]> * Update docs/source/ar/multilingual.md Co-authored-by: Abdullah Mohammed <[email protected]> * Update _toctree.yml * Update _toctree.yml * Add Translated files to branch for merg * Update _toctree.yml * Update _toctree.yml * Update custom_models.md * Update chat_templating.md * Update docs/source/ar/create_a_model.md Co-authored-by: Steven Liu <[email protected]> * Update create_a_model.md * Update gguf.md * Update gguf.md * Update gguf.md * Update gguf.md --------- Co-authored-by: 
Abdullah Mohammed <[email protected]> Co-authored-by: Steven Liu <[email protected]>


* rework converter * Update modular_model_converter.py * Update modular_model_converter.py * Update modular_model_converter.py * Update modular_model_converter.py * cleaning * cleaning * finalize imports * imports * Update modular_model_converter.py * Better renaming to avoid visiting same file multiple times * start converting files * style * address most comments * style * remove unused stuff in get_needed_imports * style * move class dependency functions outside class * Move main functions outside class * style * Update modular_model_converter.py * rename func * add augmented dependencies * Update modular_model_converter.py * Add types_to_file_type + tweak annotation handling * Allow assignment dependency mapping + fix regex * style + update modular examples * fix modular_roberta example (wrong redefinition of __init__) * slightly correct order in which dependencies will appear * style * review comments * Performance + better handling of dependencies when they are imported * style * Add advanced new classes capabilities * style * add forgotten check * Update modeling_llava_next_video.py * Add prority list ordering in check_conversion as well * Update check_modular_conversion.py * Update configuration_gemma.py

* [i18n-HI] Translated accelerate page to Hindi * Update docs/source/hi/accelerate.md Co-authored-by: K.B.Dharun Krishna <[email protected]> * Update docs/source/hi/accelerate.md Co-authored-by: K.B.Dharun Krishna <[email protected]> * Update docs/source/hi/accelerate.md Co-authored-by: K.B.Dharun Krishna <[email protected]> * Update docs/source/hi/accelerate.md Co-authored-by: K.B.Dharun Krishna <[email protected]> --------- Co-authored-by: Kay <[email protected]> Co-authored-by: K.B.Dharun Krishna <[email protected]>

…proper reporting (huggingface#34511) * Update trainer for easier handling of accumulate + proper reporting * test * Fixup tests * Full fix * Fix style * rm comment * Fix tests * Minimize test + remove py 311 check * Unused import * Forward contrib credits from discussions * Fix reported metrics * Refactor, good as it's going to get * rm pad tok id check * object detection and audio are being annoying * Fin * Fin x2 --------- Co-authored-by: Gyanateet Dutta <[email protected]>

* kinda works * update * add tests * update * use special tokens in processors * typo * fix copies * fix * fix moshi after rebase * update * fix tests * update * Update docs/source/en/main_classes/tokenizer.md Co-authored-by: Arthur <[email protected]> * update docs * test for load time adding tokens * fix some more tests which are now fetched better * one more fix --------- Co-authored-by: Arthur <[email protected]>


…e#34418) * feat: add text support to TensorBoardCallback * feat: ignore long strings in trainer progress * docs: add docstring for max_str_len * style: remove trailing whitespace --------- Co-authored-by: Marc Sun <[email protected]>

* [i18n-HI] Translated TFLite page to Hindi * [i18n-HI] Translated TFLite page to Hindi * Update docs/source/hi/tflite.md Co-authored-by: K.B.Dharun Krishna <[email protected]> --------- Co-authored-by: K.B.Dharun Krishna <[email protected]>


* Update README_ko.md Delete the blank paragraph in the language selection button and Edit to synchronize with the English version of README.md * [i18n-KO] Update README_ko.md * Additional edit for keep consistency with main [documentation](https://huggingface.co/docs/transformers/v4.44.2/ko/index). (메인 문서와 일관성 유지를 위한 수정) * Update README_ko.md Additional update. * Change docs link to Korean translated page if it exists. * Change doc link to korean translated if it exists. Change the link of doc and delete a row 'migration' of the table Learn more[더 알아보기], since it does not exist in the main version of doc. * modify a link of the main README.md from `https://huggingface.co/docs/transformers/index#supported-frameworks` to `https://huggingface.co/docs/transformers/index#supported-models-and-frameworks` since the title of 'supported table' changed. * [i18n-ko] edit links and sync with main `README.md` * docs/change comment to Korean1 Change English comment to Korean Co-authored-by: Jihun Lim <[email protected]> * docs/change comment to Korean2 Change English comment to Korean Co-authored-by: Jihun Lim <[email protected]> * revise to original to seperate `edit_README_ko_md` and `README.md` * Synchronization with English documentation. Synchronization with English documentation, and translated a line of comment from English to Korean. --------- Co-authored-by: Jihun Lim <[email protected]>


zucchini-nlp and others added 30 commits October 29, 2024 07:54

LLaVA: latency issues (huggingface#34460)

fe76b60

* fix llavas * code style * green ci

Generation: fix test (huggingface#34369)

808d6c5

* fix test * fix copies

Fix CI (huggingface#34458)

63ca6d9

* fix * fix mistral

use a tinymodel to test generation config which aviod timeout (huggin…

655bec2

…gface#34482) * use a tinymodel to test generation config which aviod timeout * remove tailing whitespace

Simplify running tests in a subprocess (huggingface#34213)

439334c

* check * check * check * check * add docstring --------- Co-authored-by: ydshieh <[email protected]>

Fix perplexity computation in perplexity.md (huggingface#34387)

626c610

fix average NLL in perplexity.md

Fixes for Modular Converter on Windows (huggingface#34266)

9e3d704

* Separator in regex * Standardize separator for relative path in auto generated message * open() encoding * Replace `\` on `os.path.abspath` --------- Co-authored-by: Arthur <[email protected]>

Fix regression loading dtype (huggingface#34409)

004530a

* fix regression * add test for torchao * expected output * better fix

Bert is ExecuTorch compatible (huggingface#34424)

5392f12

Co-authored-by: Guang Yang <[email protected]>

manual head_dim for mixtral model (huggingface#34281)

8755dd2

fix-qwen2vl-no-position_ids (huggingface#33487)

0ab0a42

MobileBERT is ExecuTorch compatible (huggingface#34473)

34620e8

Co-authored-by: Guang Yang <[email protected]>

Albert is ExecuTorch compatible (huggingface#34476)

f339042

Co-authored-by: Guang Yang <[email protected]>

Fix performance in get_imports regexp (huggingface#34298)

f55595b

* fix: Fix performance in get_imports regexp * Minimize get_imports content regexp

fix incorrect warning (huggingface#34416)

e4449bb

Un-deprecate timeout arg in pipelines (huggingface#34382)

9bee9ff

* Un-deprecate timeout * Put "timeout" on the allowed list * make fixup

Roberta is ExecuTorch compatible (huggingface#34425)

cd27761

* Roberta is ExecuTorch compatible * [run_slow] roberta --------- Co-authored-by: Guang Yang <[email protected]>

Mllama: update docs (huggingface#34334)

0f764a5

* update docs * be more explicit * use avaialble methods

VLMs: fix number of image tokens (huggingface#34332)

913330c

* fix * fix tests * add tests * style * style * fix qwen after rebase * fix video llava

Use torch 2.5 in scheduled CI (huggingface#34465)

eab6c49

* torch 2.5 * try --------- Co-authored-by: ydshieh <[email protected]>

Add GGUF for Mamba (huggingface#34200)

5251fe6

* add mamba architecture for gguf * add logic for weights conversion, some fixes and refactoring * add lm_head layers, unit test refactoring * more fixes for tests * remove lm_head creation * remove unused comments

Fix super tiny extra space typo (huggingface#34440)

9f06fb0

Update training_args.py

anshumangahlot and others added 28 commits October 30, 2024 12:37

enable QA bf16 pipeline (huggingface#34483)

f385316

* enable QA bf16 pipeline * add tests

Fix: img size mismatch caused by incorrect unpadding in LLaVA-Next (h…

1b86772

…uggingface#34522) Fix: unpadding img mismatch

avoid calling gc.collect and cuda.empty_cache (huggingface#34514)

ab98f0b

* update * update * update * update * update --------- Co-authored-by: ydshieh <[email protected]>

Qwen2VL: skip base input_ids-inputs_embeds equivalence check (hug…

4ca004e

…gingface#34535) it has complex inputs_embeds computation

feat: add benchmarks pg indexes (huggingface#34536)

294c170

* feat: add benchmarks pg indexes * refactor: remove debug `df -h`

make test_eager_matches_sdpa_inference less flaky (huggingface#34512)

114dd81

* try * try * try * try * try * try * update * update * update * update * update * update * update --------- Co-authored-by: ydshieh <[email protected]>

Bug Fix for issue huggingface#34294 (huggingface#34295)

c443d8d

Update SiglipVisionEmbeddings.forward to cast input to correct dtype before embedding it.

[CLIPSeg] Make interpolate_pos_encoding default to True (huggingface#…

df8640c

…34419) * Remove interpolate_pos_encoding * Make fixup * Make interpolate_pos_encoding default to True * Reuse existing interpolation * Add integration test

update doc (huggingface#34478)

2801d7b

* update doc * Update docs/source/en/perf_train_cpu.md Co-authored-by: Steven Liu <[email protected]> * delete closing tip --------- Co-authored-by: Steven Liu <[email protected]>

Blip: get/set input embeddings correctly (huggingface#34152)

6beb3f1

* set-get embeds * add tests * fix tests * remove * return dict True * fix tests * why did i remove this * enabel torchscript tests

BLIP: enable generation tests (huggingface#34174)

4cc0813

* blip2 tests * instructblips * copies * fix slow tests * fix * uncomment this * clean up after rebase * should be model main input * fix overwritten tests * oops len should be multiple of frame number * style * fix some tests

🔴 🔴 fix query_pre_attn_scalar different of num_heads in default g…

86701f2

…emma2 config (huggingface#34540) * fix query_pre_attn_scalar different of num_heads in default config * propagate modular changes * fix copies * fix modular copies * fix copies? * correct copies fix

MPS: isin_mps_friendly can support 0D tensors (huggingface#34538)

34927b0

* apply fix * tested * make fixup

🌐 [i18n-KO] Translated perf_train_special.md to Korean (huggingface#3…

1112c54

…4590) * Translated to Ko, 1st version * updated _toctree.yml

fix TrainerState doc because num_input_tokens_seen is unused by defau… (

bfa021b

huggingface#34593) fix TrainerState doc because num_input_tokens_seen is unused by default config Co-authored-by: kangsheng <[email protected]>

Fix Whisper CI (huggingface#34541)

eb81144

update Co-authored-by: ydshieh <[email protected]>

pglorio merged commit 88c4b26 into zamba2 Nov 5, 2024
21 of 34 checks passed
