Slow tests on important models (on Push - A10)

Docs / Quantization: Replace all occurences of `load_in_8bit` with bn… #689

Workflow file for this run

.github/workflows/push-important-models.yml at f5590de

	name: Slow tests on important models (on Push - A10)

	on:
	push:
	branches: [ main ]

	env:
	IS_GITHUB_CI: "1"
	OUTPUT_SLACK_CHANNEL_ID: "C06L2SGMEEA"
	HF_HUB_READ_TOKEN: ${{ secrets.HF_HUB_READ_TOKEN }}
	HF_HOME: /mnt/cache
	TRANSFORMERS_IS_CI: yes
	OMP_NUM_THREADS: 8
	MKL_NUM_THREADS: 8
	RUN_SLOW: yes # For gated repositories, we still need to agree to share information on the Hub repo. page in order to get access. # This token is created under the bot `hf-transformers-bot`.
	SIGOPT_API_TOKEN: ${{ secrets.SIGOPT_API_TOKEN }}
	TF_FORCE_GPU_ALLOW_GROWTH: true
	RUN_PT_TF_CROSS_TESTS: 1

	jobs:
	get_modified_models:
	name: "Get all modified files"
	runs-on: ubuntu-latest
	outputs:
	matrix: ${{ steps.set-matrix.outputs.matrix }}
	steps:
	- name: Check out code
	uses: actions/checkout@v4

	- name: Get changed files
	id: changed-files
	uses: tj-actions/changed-files@3f54ebb830831fc121d3263c1857cfbdc310cdb9 #v42
	with:
	files: src/transformers/models/**

	- name: Run step if only the files listed above change
	if: steps.changed-files.outputs.any_changed == 'true'
	id: set-matrix
	env:
	ALL_CHANGED_FILES: ${{ steps.changed-files.outputs.all_changed_files }}
	run: \|
	model_arrays=()
	for file in $ALL_CHANGED_FILES; do
	model_path="${file#*models/}"
	model_path="models/${model_path%%/*}"
	if grep -qFx "$model_path" utils/important_models.txt; then
	# Append the file to the matrix string
	model_arrays+=("$model_path")
	fi
	done
	matrix_string=$(printf '"%s", ' "${model_arrays[@]}" \| sed 's/, $//')
	echo "matrix=[$matrix_string]" >> $GITHUB_OUTPUT
	test_modified_files:
	needs: get_modified_models
	name: Slow & FA2 tests
	runs-on: [single-gpu, nvidia-gpu, a10, ci]
	container:
	image: huggingface/transformers-all-latest-gpu
	options: --gpus all --privileged --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
	if: ${{ needs.get_modified_models.outputs.matrix != '[]' && needs.get_modified_models.outputs.matrix != '' && fromJson(needs.get_modified_models.outputs.matrix)[0] != null }}
	strategy:
	fail-fast: false
	matrix:
	model-name: ${{ fromJson(needs.get_modified_models.outputs.matrix) }}

	steps:
	- name: Check out code
	uses: actions/checkout@v4

	- name: Install locally transformers & other libs
	run: \|
	apt install sudo
	sudo -H pip install --upgrade pip
	sudo -H pip uninstall -y transformers
	sudo -H pip install -U -e ".[testing]"
	MAX_JOBS=4 pip install flash-attn --no-build-isolation
	pip install bitsandbytes

	- name: NVIDIA-SMI
	run: \|
	nvidia-smi

	- name: Show installed libraries and their versions
	run: pip freeze

	- name: Run FA2 tests
	id: run_fa2_tests
	run:
	pytest -rs -m "flash_attn_test" --make-reports=${{ matrix.model-name }}_fa2_tests/ tests/${{ matrix.model-name }}/test_modeling_*

	- name: "Test suite reports artifacts: ${{ matrix.model-name }}_fa2_tests"
	if: ${{ always() }}
	uses: actions/upload-artifact@v4
	with:
	name: ${{ matrix.model-name }}_fa2_tests
	path: /transformers/reports/${{ matrix.model-name }}_fa2_tests

	- name: Post to Slack
	if: always()
	uses: huggingface/hf-workflows/.github/actions/post-slack@main
	with:
	slack_channel: ${{ env.OUTPUT_SLACK_CHANNEL_ID }}
	title: 🤗 Results of the FA2 tests - ${{ matrix.model-name }}
	status: ${{ steps.run_fa2_tests.conclusion}}
	slack_token: ${{ secrets.CI_SLACK_BOT_TOKEN }}

	- name: Run integration tests
	id: run_integration_tests
	if: always()
	run:
	pytest -rs -k "IntegrationTest" --make-reports=tests_integration_${{ matrix.model-name }} tests/${{ matrix.model-name }}/test_modeling_*

	- name: "Test suite reports artifacts: tests_integration_${{ matrix.model-name }}"
	if: ${{ always() }}
	uses: actions/upload-artifact@v4
	with:
	name: tests_integration_${{ matrix.model-name }}
	path: /transformers/reports/tests_integration_${{ matrix.model-name }}

	- name: Post to Slack
	if: always()
	uses: huggingface/hf-workflows/.github/actions/post-slack@main
	with:
	slack_channel: ${{ env.OUTPUT_SLACK_CHANNEL_ID }}
	title: 🤗 Results of the Integration tests - ${{ matrix.model-name }}
	status: ${{ steps.run_integration_tests.conclusion}}
	slack_token: ${{ secrets.CI_SLACK_BOT_TOKEN }}

	- name: Tailscale # In order to be able to SSH when a test fails
	if: ${{ runner.debug == '1'}}
	uses: huggingface/tailscale-action@v1
	with:
	authkey: ${{ secrets.TAILSCALE_SSH_AUTHKEY }}
	slackChannel: ${{ secrets.SLACK_CIFEEDBACK_CHANNEL }}
	slackToken: ${{ secrets.SLACK_CIFEEDBACK_BOT_TOKEN }}
	waitForSSH: true

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Docs / Quantization: Replace all occurences of `load_in_8bit` with bn… #689

Workflow file

Docs / Quantization: Replace all occurences of `load_in_8bit` with bn… #689

Jobs

Run details

Workflow file for this run

Docs / Quantization: Replace all occurences of load_in_8bit with bn… #689

Workflow file

Workflow file for this run

Docs / Quantization: Replace all occurences of `load_in_8bit` with bn… #689