Skip failing model until issue is resolved (#638)
Depending on which machine the Llama inference test lands on, it'll either pass or fail due to system DRAM limitations. In short, compiling Llama for inference requires around 32GB of system memory, which is at the limit for most machines. Therefore, until this issue is resolved, we're skipping this test to unblock our CI.
nvukobratTT authored Nov 7, 2024
1 parent 0f1d8ee commit fcbee46
Showing 1 changed file with 1 addition and 1 deletion.
forge/test/mlir/llama/test_llama_inference.py
@@ -11,8 +11,8 @@


 @pytest.mark.parametrize("model_path", ["openlm-research/open_llama_3b", "meta-llama/Llama-3.2-1B"])
-@pytest.mark.xfail()
 @pytest.mark.push
+@pytest.mark.skip(reason="Out of system memory during compile time. Skipping until resolved")
 def test_llama_inference(model_path):
     if model_path == "meta-llama/Llama-3.2-1B":
         pytest.skip("Skipping test for Llama-3.2-1B model, waiting for new transformers version.")
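
Note: this commit skips the test unconditionally on every machine. A possible follow-up, sketched below (not part of this commit), would be to skip only on hosts that lack the required memory, using pytest.mark.skipif together with psutil. psutil is an assumed extra dependency here, and the 32GB threshold is the figure cited in the commit message.

import psutil  # assumed dependency for this sketch; not used by the test today
import pytest

REQUIRED_GIB = 32  # approximate compile-time memory need cited in the commit message
_total_gib = psutil.virtual_memory().total / (1024 ** 3)

@pytest.mark.parametrize("model_path", ["openlm-research/open_llama_3b", "meta-llama/Llama-3.2-1B"])
@pytest.mark.push
@pytest.mark.skipif(
    _total_gib < REQUIRED_GIB,
    reason=f"Llama compile needs ~{REQUIRED_GIB} GiB of system memory",
)
def test_llama_inference(model_path):
    ...

A conditional skip like this would keep coverage on large-memory CI runners while still unblocking constrained machines.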
