Skip failing model until issue is resolved (#638)
Depending on which machine the Llama inference test lands on, it'll either pass or fail due to system DRAM limitations. In short, compiling Llama for inference requires around 32GB of system memory, which is at the limit for most machines. Therefore, until this issue is resolved, we're skipping this test to unblock our CI.
nvukobratTT authored Nov 7, 2024
1 parent 0f1d8ee commit fcbee46
Showing 1 changed file with 1 addition and 1 deletion.
forge/test/mlir/llama/test_llama_inference.py
@@ -11,8 +11,8 @@


 @pytest.mark.parametrize("model_path", ["openlm-research/open_llama_3b", "meta-llama/Llama-3.2-1B"])
-@pytest.mark.xfail()
 @pytest.mark.push
+@pytest.mark.skip(reason="Out of system memory during compile time. Skipping until resolved")
 def test_llama_inference(model_path):
     if model_path == "meta-llama/Llama-3.2-1B":
         pytest.skip("Skipping test for Llama-3.2-1B model, waiting for new transformers version.")
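
Note: this commit skips the test unconditionally on every machine. A possible follow-up, sketched below (not part of this commit), would be to skip only on hosts that lack the required memory, using pytest.mark.skipif together with psutil. psutil is an assumed extra dependency here, and the 32GB threshold is the figure cited in the commit message.

import psutil  # assumed dependency for this sketch; not used by the test today
import pytest

REQUIRED_GIB = 32  # approximate compile-time memory need cited in the commit message
_total_gib = psutil.virtual_memory().total / (1024 ** 3)

@pytest.mark.parametrize("model_path", ["openlm-research/open_llama_3b", "meta-llama/Llama-3.2-1B"])
@pytest.mark.push
@pytest.mark.skipif(
    _total_gib < REQUIRED_GIB,
    reason=f"Llama compile needs ~{REQUIRED_GIB} GiB of system memory",
)
def test_llama_inference(model_path):
    ...

A conditional skip like this would keep coverage on large-memory CI runners while still unblocking constrained machines.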
