abstract ModelType #405

Open
Intex32 opened this issue Sep 8, 2023 · 1 comment

Intex32 commented Sep 8, 2023

Goal: generalize the ModelType class, which is currently OpenAI-specific, so that it supports multiple providers.

I consider ModelType a formal description of a model and its properties, without any capabilities. This includes, e.g., name and contextLength.

My suggestions:

  • move the file from the tokenizer module to core
  • remove the sealed class constraint
  • put the implementations of ModelType directly in the provider-specific classes that contain the instances of available models; an instance is passed as a parameter to the respective model constructor => move the OpenAI instances and create new instances with the correct parameters for gpt4all and GCP
  • move encodingType out of ModelType; GCP does not appear to publish information about the encoding it uses, so we have to make an API call to GCP to retrieve the token count of a given message; for OpenAI, the encoding is used to compute the token count locally
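
A minimal sketch of how these suggestions could fit together, assuming Kotlin. Every name below (`TokenCounter`, `LocalTokenCounter`, `RemoteTokenCounter`, `OpenAIModelType`, `GcpModelType`, `ChatModel`) is hypothetical and not taken from the codebase, and the whitespace split only stands in for a real tokenizer:

```kotlin
// Hypothetical sketch; none of these names are taken from the actual codebase.

// Provider-agnostic description of a model: properties only, no capabilities.
// A plain interface (not sealed) lets each provider module declare its own models.
interface ModelType {
    val name: String
    val contextLength: Int
}

// Token counting lives outside ModelType because providers differ in how it is done.
fun interface TokenCounter {
    fun count(text: String): Int
}

// OpenAI-style case: the encoding is known, so tokens can be counted locally.
// Assumption: the whitespace split is only a stand-in for a real tokenizer.
class LocalTokenCounter : TokenCounter {
    override fun count(text: String): Int =
        text.trim().split(Regex("\\s+")).count { it.isNotEmpty() }
}

// GCP-style case: the encoding is not public, so the count comes from the provider.
// Assumption: `remoteCall` stands in for that API request.
class RemoteTokenCounter(private val remoteCall: (String) -> Int) : TokenCounter {
    override fun count(text: String): Int = remoteCall(text)
}

// Provider-specific descriptions, declared next to each provider's model classes.
data class OpenAIModelType(
    override val name: String,
    override val contextLength: Int
) : ModelType

data class GcpModelType(
    override val name: String,
    override val contextLength: Int
) : ModelType

// The description is passed to the model constructor together with a counter.
class ChatModel(val type: ModelType, private val counter: TokenCounter) {
    // True if the prompt's token count does not exceed the model's context length.
    fun fitsContext(prompt: String): Boolean =
        counter.count(prompt) <= type.contextLength
}
```

With this shape, something like `ChatModel(OpenAIModelType("example-model", 4), LocalTokenCounter())` would count tokens locally, while a GCP-backed model would receive a `RemoteTokenCounter` wrapping the provider call. The model name and context length here are placeholders, not real values.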

Depends on #393.

Intex32 added the "help wanted" label Sep 8, 2023
Intex32 removed the "help wanted" label Sep 29, 2023
Intex32 self-assigned this Oct 3, 2023

Intex32 commented Oct 3, 2023

After further assessment, I think maintaining a separate model structure (such as ModelType) alongside the actual models with capabilities is not worth the effort. I suggest inlining the functionality of ModelType into the implementations of LLM. This is necessary because the encoding and context length are not the same for all models and providers, and handling them requires some more hierarchy. Currently, ModelType is heavily designed around OAI. This idea and its consequences are the subject of exploration in this issue.

ModelType held the following properties:

  • model name
  • context length
  • encoding type
  • magic numbers for chat models

ModelType's encoding type was used, e.g., for summarization or for adapting a prompt to the context size.
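
As a rough illustration of the inlined design, assuming Kotlin, the properties listed above could become members of the model interface itself, with each implementation deciding how tokens are counted. The `LLM` members shown, `WordCountingModel`, and `fitToContext` are hypothetical names for this sketch, and word counting stands in for a real encoding:

```kotlin
// Hypothetical sketch; names are illustrative, not the project's API.

// What ModelType used to describe is now part of the model interface itself.
interface LLM {
    val name: String
    val contextLength: Int

    // Each implementation decides how tokens are counted
    // (local encoding or a call to the provider).
    fun tokenCount(text: String): Int
}

// Stand-in implementation that counts whitespace-separated words as tokens.
class WordCountingModel(
    override val name: String,
    override val contextLength: Int
) : LLM {
    override fun tokenCount(text: String): Int =
        text.split(Regex("\\s+")).count { it.isNotEmpty() }
}

// Adapts a conversation to the context size: walks from the newest message
// backwards and keeps messages until the next one would exceed the limit.
fun LLM.fitToContext(messages: List<String>): List<String> {
    var used = 0
    val kept = mutableListOf<String>()
    for (message in messages.asReversed()) {
        val cost = tokenCount(message)
        if (used + cost > contextLength) break
        used += cost
        kept.add(0, message) // restore chronological order
    }
    return kept
}
```

Because `fitToContext` depends only on `contextLength` and `tokenCount`, it would work unchanged for a model that counts tokens locally and for one that asks its provider. A remote count would likely need to be a suspending or otherwise asynchronous call, which this sketch leaves out.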
