Commit 8c7ccf2

feat: Updated descriptions of some properties.

HavenDV committed Jul 12, 2024
1 parent f91982a commit 8c7ccf2
Showing 23 changed files with 654 additions and 41 deletions.
59 changes: 39 additions & 20 deletions docs/openapi.yaml
@@ -338,90 +338,106 @@ components:
type: integer
nullable: true
description: |
- Sets the random number seed to use for generation. Setting this to a specific number will make the model generate the same text for the same prompt. (Default: 0)
+ Sets the random number seed to use for generation. Setting this to a specific number will make the model
+ generate the same text for the same prompt. (Default: 0)
num_predict:
type: integer
nullable: true
description: |
- Maximum number of tokens to predict when generating text. (Default: 128, -1 = infinite generation, -2 = fill context)
+ Maximum number of tokens to predict when generating text.
+ (Default: 128, -1 = infinite generation, -2 = fill context)
top_k:
type: integer
nullable: true
description: |
- Reduces the probability of generating nonsense. A higher value (e.g. 100) will give more diverse answers, while a lower value (e.g. 10) will be more conservative. (Default: 40)
+ Reduces the probability of generating nonsense. A higher value (e.g. 100) will give more diverse answers,
+ while a lower value (e.g. 10) will be more conservative. (Default: 40)
top_p:
type: number
format: float
nullable: true
description: |
- Works together with top-k. A higher value (e.g., 0.95) will lead to more diverse text, while a lower value (e.g., 0.5) will generate more focused and conservative text. (Default: 0.9)
+ Works together with top-k. A higher value (e.g., 0.95) will lead to more diverse text, while a lower value
+ (e.g., 0.5) will generate more focused and conservative text. (Default: 0.9)
tfs_z:
type: number
format: float
nullable: true
description: |
- Tail free sampling is used to reduce the impact of less probable tokens from the output. A higher value (e.g., 2.0) will reduce the impact more, while a value of 1.0 disables this setting. (default: 1)
+ Tail free sampling is used to reduce the impact of less probable tokens from the output. A higher value
+ (e.g., 2.0) will reduce the impact more, while a value of 1.0 disables this setting. (default: 1)
typical_p:
type: number
format: float
nullable: true
description: |
- Typical p is used to reduce the impact of less probable tokens from the output.
+ Typical p is used to reduce the impact of less probable tokens from the output. (default: 1)
repeat_last_n:
type: integer
nullable: true
description: |
- Sets how far back for the model to look back to prevent repetition. (Default: 64, 0 = disabled, -1 = num_ctx)
+ Sets how far back for the model to look back to prevent repetition.
+ (Default: 64, 0 = disabled, -1 = num_ctx)
temperature:
type: number
format: float
nullable: true
description: |
- The temperature of the model. Increasing the temperature will make the model answer more creatively. (Default: 0.8)
+ The temperature of the model. Increasing the temperature will make the model answer more creatively.
+ (Default: 0.8)
repeat_penalty:
type: number
format: float
nullable: true
description: |
- Sets how strongly to penalize repetitions. A higher value (e.g., 1.5) will penalize repetitions more strongly, while a lower value (e.g., 0.9) will be more lenient. (Default: 1.1)
+ Sets how strongly to penalize repetitions. A higher value (e.g., 1.5) will penalize repetitions more
+ strongly, while a lower value (e.g., 0.9) will be more lenient. (Default: 1.1)
presence_penalty:
type: number
format: float
nullable: true
description: |
- Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
+ Positive values penalize new tokens based on whether they appear in the text so far, increasing the
+ model's likelihood to talk about new topics. (Default: 0)
frequency_penalty:
type: number
format: float
nullable: true
description: |
- Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
+ Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the
+ model's likelihood to repeat the same line verbatim. (Default: 0)
mirostat:
type: integer
nullable: true
description: |
- Enable Mirostat sampling for controlling perplexity. (default: 0, 0 = disabled, 1 = Mirostat, 2 = Mirostat 2.0)
+ Enable Mirostat sampling for controlling perplexity.
+ (default: 0, 0 = disabled, 1 = Mirostat, 2 = Mirostat 2.0)
mirostat_tau:
type: number
format: float
nullable: true
description: |
- Controls the balance between coherence and diversity of the output. A lower value will result in more focused and coherent text. (Default: 5.0)
+ Controls the balance between coherence and diversity of the output. A lower value will result in more
+ focused and coherent text. (Default: 5.0)
mirostat_eta:
type: number
format: float
nullable: true
description: |
- Influences how quickly the algorithm responds to feedback from the generated text. A lower learning rate will result in slower adjustments, while a higher learning rate will make the algorithm more responsive. (Default: 0.1)
+ Influences how quickly the algorithm responds to feedback from the generated text. A lower learning rate
+ will result in slower adjustments, while a higher learning rate will make the algorithm more responsive.
+ (Default: 0.1)
penalize_newline:
type: boolean
nullable: true
description: |
- Penalize newlines in the output. (Default: false)
+ Penalize newlines in the output. (Default: true)
stop:
type: array
nullable: true
- description: Sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
+ description: |
+ Sequences where the API will stop generating further tokens. The returned text will not contain the stop
+ sequence.
items:
type: string
numa:
@@ -433,17 +449,18 @@ components:
type: integer
nullable: true
description: |
- Sets the size of the context window used to generate the next token.
+ Sets the size of the context window used to generate the next token. (Default: 2048)
num_batch:
type: integer
nullable: true
description: |
- Sets the number of batches to use for generation. (Default: 1)
+ Sets the number of batches to use for generation. (Default: 512)
num_gpu:
type: integer
nullable: true
description: |
- The number of layers to send to the GPU(s). On macOS it defaults to 1 to enable metal support, 0 to disable.
+ The number of layers to send to the GPU(s).
+ On macOS it defaults to 1 to enable metal support, 0 to disable.
main_gpu:
type: integer
nullable: true
@@ -483,7 +500,9 @@ components:
type: integer
nullable: true
description: |
- Sets the number of threads to use during computation. By default, Ollama will detect this for optimal performance. It is recommended to set this value to the number of physical CPU cores your system has (as opposed to the logical number of cores).
+ Sets the number of threads to use during computation. By default, Ollama will detect this for optimal
+ performance. It is recommended to set this value to the number of physical CPU cores your system has
+ (as opposed to the logical number of cores).
ResponseFormat:
type: string
description: |
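As a rough illustration of how the options documented above surface in the generated C# SDK, the sketch below configures a completion request. It assumes the generator maps this schema to a RequestOptions class with PascalCase properties and that GenerateCompletionRequest exposes Model, Prompt, and Options members; those names are inferred from the schema, not confirmed against the generated code.

// Sketch only: type and property names are inferred from the OpenAPI schema
// above and may differ from the actual generated code.
var request = new Ollama.GenerateCompletionRequest
{
    Model = "llama3",
    Prompt = "Why is the sky blue?",
    Options = new Ollama.RequestOptions
    {
        Seed = 42,          // a fixed seed reproduces the same text for the same prompt
        Temperature = 0.8F, // spec default; higher values answer more creatively
        TopK = 40,          // spec default; lower values are more conservative
        TopP = 0.9F,        // works together with top_k
        NumCtx = 2048,      // context window size (spec default)
        NumPredict = 128,   // max tokens; -1 = infinite, -2 = fill context
        Stop = new[] { "\n\n" }, // returned text will not contain the stop sequence
    },
};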
2 changes: 1 addition & 1 deletion src/libs/Directory.Build.props
@@ -14,7 +14,7 @@
</ItemGroup>

<PropertyGroup Label="Nuget">
- <Version>1.4.1</Version>
+ <Version>1.4.2</Version>
<GeneratePackageOnBuild Condition=" '$(Configuration)' == 'Release' ">true</GeneratePackageOnBuild>
<GenerateDocumentationFile>true</GenerateDocumentationFile>
<Authors>tryAGI and contributors</Authors>
1 change: 1 addition & 0 deletions src/libs/Ollama/Generated/JsonSerializerContext.g.cs
@@ -2,6 +2,7 @@
#nullable enable

#pragma warning disable CS0618 // Type or member is obsolete
+ #pragma warning disable CS3016 // Arrays as attribute arguments is not CLS-compliant

namespace Ollama
{
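The new CS3016 suppression deserves a note: the source-generated serializer context is decorated with attributes that take array arguments, and a CLS-compliant assembly flags every such usage. The snippet below is a hypothetical illustration of the pattern that trips the warning, not the repository's actual attribute list (JsonSourceGenerationOptions.Converters requires .NET 8 or later).

// Hypothetical: any attribute argument of array type raises CS3016
// once the assembly declares CLS compliance.
[assembly: global::System.CLSCompliant(true)]

namespace Ollama
{
    [global::System.Text.Json.Serialization.JsonSourceGenerationOptions(
        Converters = new[] { typeof(global::System.Text.Json.Serialization.JsonStringEnumConverter) })] // CS3016
    [global::System.Text.Json.Serialization.JsonSerializable(typeof(global::Ollama.GenerateChatCompletionRequest))]
    internal partial class ExampleSerializerContext
        : global::System.Text.Json.Serialization.JsonSerializerContext
    {
    }
}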
@@ -5,6 +5,17 @@ namespace Ollama
{
public partial class ChatClient
{
+ partial void PrepareGenerateChatCompletionArguments(
+     global::System.Net.Http.HttpClient httpClient,
+     global::Ollama.GenerateChatCompletionRequest request);
+ partial void PrepareGenerateChatCompletionRequest(
+     global::System.Net.Http.HttpClient httpClient,
+     global::System.Net.Http.HttpRequestMessage httpRequestMessage,
+     global::Ollama.GenerateChatCompletionRequest request);
+ partial void ProcessGenerateChatCompletionResponse(
+     global::System.Net.Http.HttpClient httpClient,
+     global::System.Net.Http.HttpResponseMessage httpResponseMessage);
+
/// <summary>
/// Generate the next message in a chat with a provided model.<br/>
/// This is a streaming endpoint, so there will be a series of responses. The final response object will include statistics and additional data from the request.
@@ -18,6 +29,12 @@ public partial class ChatClient
{
request = request ?? throw new global::System.ArgumentNullException(nameof(request));

+ PrepareArguments(
+     client: _httpClient);
+ PrepareGenerateChatCompletionArguments(
+     httpClient: _httpClient,
+     request: request);
+
using var httpRequest = new global::System.Net.Http.HttpRequestMessage(
method: global::System.Net.Http.HttpMethod.Post,
requestUri: new global::System.Uri(_httpClient.BaseAddress?.AbsoluteUri.TrimEnd('/') + "/chat", global::System.UriKind.RelativeOrAbsolute));
@@ -27,10 +44,25 @@ public partial class ChatClient
encoding: global::System.Text.Encoding.UTF8,
mediaType: "application/json");

+ PrepareRequest(
+     client: _httpClient,
+     request: httpRequest);
+ PrepareGenerateChatCompletionRequest(
+     httpClient: _httpClient,
+     httpRequestMessage: httpRequest,
+     request: request);
+
using var response = await _httpClient.SendAsync(
request: httpRequest,
completionOption: global::System.Net.Http.HttpCompletionOption.ResponseHeadersRead,
cancellationToken: cancellationToken).ConfigureAwait(false);

+ ProcessResponse(
+     client: _httpClient,
+     response: response);
+ ProcessGenerateChatCompletionResponse(
+     httpClient: _httpClient,
+     httpResponseMessage: response);
response.EnsureSuccessStatusCode();

using var stream = await response.Content.ReadAsStreamAsync(cancellationToken).ConfigureAwait(false);
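Because /chat streams (note the ResponseHeadersRead completion option and the incremental stream read above), the public method presumably yields one response object per JSON line. A loose consumption sketch follows — the constructor shape, the GenerateChatCompletionAsync name, and the Message/Done members are assumptions about the generated surface, not confirmed:

// Loose sketch; names below are assumed, not taken from the generated code.
// 'request' is a GenerateChatCompletionRequest built elsewhere.
using var httpClient = new System.Net.Http.HttpClient
{
    BaseAddress = new System.Uri("http://localhost:11434/api"),
};
using var chat = new Ollama.ChatClient(httpClient);

await foreach (var chunk in chat.GenerateChatCompletionAsync(request, cancellationToken))
{
    System.Console.Write(chunk.Message?.Content); // partial tokens as they arrive
    if (chunk.Done == true) { break; }            // the final chunk carries statistics
}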
13 changes: 13 additions & 0 deletions src/libs/Ollama/Generated/Ollama.ChatClient.g.cs
@@ -39,5 +39,18 @@ public void Dispose()
{
_httpClient.Dispose();
}

+ partial void PrepareArguments(
+     global::System.Net.Http.HttpClient client);
+ partial void PrepareRequest(
+     global::System.Net.Http.HttpClient client,
+     global::System.Net.Http.HttpRequestMessage request);
+ partial void ProcessResponse(
+     global::System.Net.Http.HttpClient client,
+     global::System.Net.Http.HttpResponseMessage response);
+ partial void ProcessResponseContent(
+     global::System.Net.Http.HttpClient client,
+     global::System.Net.Http.HttpResponseMessage response,
+     ref string content);
}
}
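These generic hooks, together with the endpoint-specific ones above, form the extension surface of the generated client. Since partial methods cannot cross assembly boundaries, implementations must sit in another part of the same partial class within the same compilation — e.g. when the generated sources are included directly in a project. A sketch of wiring two of them, using the exact signatures declared above (the header name and logging are illustrative choices):

namespace Ollama
{
    public partial class ChatClient
    {
        // Attach a correlation header to every outgoing request.
        partial void PrepareRequest(
            global::System.Net.Http.HttpClient client,
            global::System.Net.Http.HttpRequestMessage request)
        {
            request.Headers.TryAddWithoutValidation(
                "X-Correlation-Id", global::System.Guid.NewGuid().ToString());
        }

        // Observe failures before EnsureSuccessStatusCode() throws.
        partial void ProcessResponse(
            global::System.Net.Http.HttpClient client,
            global::System.Net.Http.HttpResponseMessage response)
        {
            if (!response.IsSuccessStatusCode)
            {
                global::System.Console.Error.WriteLine(
                    $"Ollama /chat call failed with {(int)response.StatusCode}.");
            }
        }
    }
}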
@@ -5,6 +5,17 @@ namespace Ollama
{
public partial class CompletionsClient
{
+ partial void PrepareGenerateCompletionArguments(
+     global::System.Net.Http.HttpClient httpClient,
+     global::Ollama.GenerateCompletionRequest request);
+ partial void PrepareGenerateCompletionRequest(
+     global::System.Net.Http.HttpClient httpClient,
+     global::System.Net.Http.HttpRequestMessage httpRequestMessage,
+     global::Ollama.GenerateCompletionRequest request);
+ partial void ProcessGenerateCompletionResponse(
+     global::System.Net.Http.HttpClient httpClient,
+     global::System.Net.Http.HttpResponseMessage httpResponseMessage);
+
/// <summary>
/// Generate a response for a given prompt with a provided model.<br/>
/// The final response object will include statistics and additional data from the request.
@@ -18,6 +29,12 @@ public partial class CompletionsClient
{
request = request ?? throw new global::System.ArgumentNullException(nameof(request));

+ PrepareArguments(
+     client: _httpClient);
+ PrepareGenerateCompletionArguments(
+     httpClient: _httpClient,
+     request: request);
+
using var httpRequest = new global::System.Net.Http.HttpRequestMessage(
method: global::System.Net.Http.HttpMethod.Post,
requestUri: new global::System.Uri(_httpClient.BaseAddress?.AbsoluteUri.TrimEnd('/') + "/generate", global::System.UriKind.RelativeOrAbsolute));
@@ -27,10 +44,25 @@ public partial class CompletionsClient
encoding: global::System.Text.Encoding.UTF8,
mediaType: "application/json");

+ PrepareRequest(
+     client: _httpClient,
+     request: httpRequest);
+ PrepareGenerateCompletionRequest(
+     httpClient: _httpClient,
+     httpRequestMessage: httpRequest,
+     request: request);
+
using var response = await _httpClient.SendAsync(
request: httpRequest,
completionOption: global::System.Net.Http.HttpCompletionOption.ResponseHeadersRead,
cancellationToken: cancellationToken).ConfigureAwait(false);

+ ProcessResponse(
+     client: _httpClient,
+     response: response);
+ ProcessGenerateCompletionResponse(
+     httpClient: _httpClient,
+     httpResponseMessage: response);
response.EnsureSuccessStatusCode();

using var stream = await response.Content.ReadAsStreamAsync(cancellationToken).ConfigureAwait(false);
13 changes: 13 additions & 0 deletions src/libs/Ollama/Generated/Ollama.CompletionsClient.g.cs
@@ -39,5 +39,18 @@ public void Dispose()
{
_httpClient.Dispose();
}

+ partial void PrepareArguments(
+     global::System.Net.Http.HttpClient client);
+ partial void PrepareRequest(
+     global::System.Net.Http.HttpClient client,
+     global::System.Net.Http.HttpRequestMessage request);
+ partial void ProcessResponse(
+     global::System.Net.Http.HttpClient client,
+     global::System.Net.Http.HttpResponseMessage response);
+ partial void ProcessResponseContent(
+     global::System.Net.Http.HttpClient client,
+     global::System.Net.Http.HttpResponseMessage response,
+     ref string content);
}
}
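CompletionsClient gains the mirror image of the ChatClient hooks. Because the endpoint-specific Prepare*Arguments variant runs before the request body is serialized, it is a natural place to normalize requests centrally. A small sketch, assuming the generated GenerateCompletionRequest exposes a settable, nullable Model property (an assumption, not confirmed):

namespace Ollama
{
    public partial class CompletionsClient
    {
        partial void PrepareGenerateCompletionArguments(
            global::System.Net.Http.HttpClient httpClient,
            global::Ollama.GenerateCompletionRequest request)
        {
            // Runs before serialization, so edits made here end up on the wire.
            // The settable nullable Model property is assumed for illustration.
            request.Model ??= "llama3";
        }
    }
}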