It looks like a minimal local addition to text-generation-webui/extensions/openai/models.py would get you this.
[Edit:] Also, to make the http://localhost:5000/docs UI render correctly, add a corresponding entry to text-generation-webui/extensions/openai/typing.py.
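A minimal sketch of what such an addition could look like, written as a plain function rather than a registered route (in the real extension this would be wired up as a FastAPI endpoint in models.py; `settings` and the `truncation_length` key are assumptions about the webui's internals and may differ in your version):

```python
# Hypothetical sketch only -- the dict keys below ("model_name",
# "truncation_length") are assumptions, not the extension's actual schema.

def model_info(settings: dict) -> dict:
    """Return basic info about the loaded model, including its
    maximum context length, as a JSON-serializable dict."""
    return {
        "model_name": settings.get("model_name", "unknown"),
        "max_context_length": settings.get("truncation_length", 2048),
    }
```

A client could then call the resulting endpoint once at startup and size its prompts against the reported `max_context_length`.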
-
Hey everyone,
I am working on a client application which interfaces with text-generation-webui's new OpenAI API.
I'm currently generating text with the completions endpoint. Is there any way to retrieve the max context length of the currently loaded model via the API, so I can build my prompt accordingly and avoid exceeding that limit?
I haven't been able to find a method for that; am I missing something? What's the best way to avoid going beyond the model's context length?
Thank you!
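Until the limit is queryable from the API, one client-side workaround is to count tokens locally and trim the oldest parts of the prompt to fit. A sketch, assuming a known context limit; the whitespace token counter here is a deliberately crude stand-in, and a real client would use the model's own tokenizer (e.g. `AutoTokenizer` from transformers):

```python
def count_tokens(text: str) -> int:
    # Crude whitespace approximation; real tokenizers usually
    # produce more tokens than words, so leave extra headroom.
    return len(text.split())

def fit_prompt(lines: list[str], max_context: int, reserve_for_output: int) -> str:
    """Drop the oldest lines until the prompt fits within
    max_context - reserve_for_output tokens."""
    budget = max_context - reserve_for_output
    kept = list(lines)
    while kept and count_tokens("\n".join(kept)) > budget:
        kept.pop(0)  # drop the oldest line first
    return "\n".join(kept)
```

Reserving `reserve_for_output` tokens up front matters because the completion itself consumes context: prompt tokens plus generated tokens must stay under the model's limit.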