use optimum/gpt2
IlyasMoutawwakil committed Oct 16, 2023
1 parent 625162d commit 2fa9dcd
Showing 1 changed file with 2 additions and 6 deletions.
8 changes: 2 additions & 6 deletions docs/source/onnxruntime/usage_guides/gpu.mdx
@@ -309,24 +309,20 @@ For example, for text generation, the engine can be built with:

```python
>>> import os
>>> from transformers import AutoTokenizer
>>> from optimum.onnxruntime import ORTModelForCausalLM

>>> os.makedirs("tmp/trt_cache_gpt2_example", exist_ok=True)
>>> provider_options = {
... "trt_engine_cache_enable": True,
-... "trt_engine_cache_path": "tmp/trt_cache_gpt2_example"
+... "trt_engine_cache_path": "tmp/trt_cache_gpt2_example",
... "trt_profile_min_shapes": "input_ids:1x1,attention_mask:1x1,position_ids:1x1",
... "trt_profile_opt_shapes": "input_ids:1x1,attention_mask:1x1,position_ids:1x1",
... "trt_profile_max_shapes": "input_ids:1x64,attention_mask:1x64,position_ids:1x64",
... }

>>> ort_model = ORTModelForCausalLM.from_pretrained(
-... "gpt2",
-... export=True,
+... "optimum/gpt2",
... use_cache=False,
... use_merged=False,
... use_io_binding=False,
... provider="TensorrtExecutionProvider",
... provider_options=provider_options,
... )