feat(genapi): init docs #3636

Merged
merged 51 commits into from
Sep 4, 2024
Changes from 33 commits
Commits
7cf4d78
feat(genapi): init docs
bene2k1 Aug 27, 2024
f15d03c
feat(genapi): add changelogs and filters
bene2k1 Aug 27, 2024
64a3d9e
feat(genapi): update navigation
bene2k1 Aug 27, 2024
37e6d10
fix(genapi): fix typo
bene2k1 Aug 27, 2024
c13852c
fix(genai): fix broken formatting
bene2k1 Aug 27, 2024
98ce8c1
feat(genapi): add Generative APIs concepts (#3640)
bene2k1 Aug 28, 2024
95523ff
feat(ifr): update index
bene2k1 Aug 28, 2024
54722d0
feat(genapi): added quickstart (#3637)
bene2k1 Aug 28, 2024
3870a9c
feat(genapi): add how-tos (#3638)
bene2k1 Aug 28, 2024
7dba2ef
feat(ifr): add content
bene2k1 Aug 28, 2024
25b0e8e
feat(ifr): add genapi docs
bene2k1 Aug 28, 2024
ebea84e
feat(ifr): update navigation
bene2k1 Aug 28, 2024
6c86b2a
feat(ifr): update index
bene2k1 Aug 28, 2024
7ff0a9f
feat(ifr): added tutorials for generative apis (#3647)
bene2k1 Aug 30, 2024
85c7b9e
docs(ifr): update quickstart
bene2k1 Aug 28, 2024
dab2bcd
errors and models contributed
tgenaitay Sep 2, 2024
844d21a
messages and links
tgenaitay Sep 2, 2024
0d3c8a2
corrections
tgenaitay Sep 2, 2024
a89a8fa
removed openai compatibility content
tgenaitay Sep 2, 2024
08fdc71
rate limit starting point
tgenaitay Sep 2, 2024
5b09410
no playground at start
tgenaitay Sep 2, 2024
17ff66e
basics covered
tgenaitay Sep 2, 2024
72522b8
tags
tgenaitay Sep 2, 2024
a6db606
remove precision
tgenaitay Sep 3, 2024
c87d898
metas
tgenaitay Sep 3, 2024
a6b3e67
no playground at start
tgenaitay Sep 3, 2024
215f13f
expanding
tgenaitay Sep 3, 2024
466a117
update
tgenaitay Sep 3, 2024
414e020
additions
tgenaitay Sep 3, 2024
5e9809f
insertions
tgenaitay Sep 3, 2024
b903037
simplified concepts
tgenaitay Sep 3, 2024
d630411
tag
tgenaitay Sep 3, 2024
a0ffe3b
feat(ifr): corrected per reviews
tgenaitay Sep 3, 2024
23cde45
feat(genapi): review
tgenaitay Sep 4, 2024
1a47119
feat(genapi): below
tgenaitay Sep 4, 2024
99b21b5
feat(genapi): go further
tgenaitay Sep 4, 2024
32c5364
feat(genapi): dot
tgenaitay Sep 4, 2024
4d78cc3
feat(genapi): dot dot
tgenaitay Sep 4, 2024
69a04aa
feat(genapi): formatting
tgenaitay Sep 4, 2024
e7e2862
feat(genapi): check
tgenaitay Sep 4, 2024
34abaa3
feat(genapi): form
tgenaitay Sep 4, 2024
57aae0b
Apply suggestions from code review
tgenaitay Sep 4, 2024
a4a3fc2
Apply suggestions from code review
tgenaitay Sep 4, 2024
fb249e1
Apply suggestions from code review
tgenaitay Sep 4, 2024
d05487a
Apply suggestions from code review
tgenaitay Sep 4, 2024
d188aee
feat(genapi): edited per review
tgenaitay Sep 4, 2024
a34bf4f
feat(genapi): project
tgenaitay Sep 4, 2024
c60e876
feat(genapi): slack chan
tgenaitay Sep 4, 2024
c64dde2
feat(genapi): clarify content type for POST
tgenaitay Sep 4, 2024
899e698
feat(genapi): clarify 5xx errors in SSE
tgenaitay Sep 4, 2024
5ed4616
feat(genapi): bring request samples
tgenaitay Sep 4, 2024
8 changes: 8 additions & 0 deletions ai-data/generative-apis/api-cli/index.mdx
@@ -0,0 +1,8 @@
---
meta:
title: Generative APIs - API/CLI
description: Generative APIs API/CLI
content:
h1: Generative APIs - API/CLI
paragraph: Generative APIs API/CLI
---
37 changes: 37 additions & 0 deletions ai-data/generative-apis/api-cli/understanding-errors.mdx
@@ -0,0 +1,37 @@
---
meta:
title: Understanding errors
description: This page explains how to understand errors with Generative APIs
content:
h1: Understanding errors
paragraph: This page explains how to understand errors with Generative APIs
tags: generative-apis ai-data understanding-data
dates:
validation: 2024-09-02
posted: 2024-09-02
---

Scaleway uses conventional HTTP response codes to indicate the success or failure of an API request.
In general, codes in the 2xx range indicate success, codes in the 4xx range indicate an error in the information provided, and codes in the 5xx range indicate an error on Scaleway's servers.

If the response code is not within the 2xx range, the response will contain an error object structured as follows:

```
{
"error": string,
"status": number,
"message": string
}
```

The following are common HTTP error codes:

- 400 - **Bad Request**: The format or content of your payload is incorrect. The body may be too large or fail to parse, or the content type may be mismatched.
- 401 - **Unauthorized**: The `authorization` header is missing. Find the required headers on [this page](/generative-apis/api-cli/using-generative-apis/).
- 403 - **Forbidden**: Your API key doesn't exist or does not have the necessary permissions to access the requested resource. Find the required permission sets on [this page](/generative-apis/api-cli/using-generative-apis/).
- 404 - **Route Not Found**: The requested resource could not be found. Check that your request is being made to the correct endpoint.
- 422 - **Model Not Found**: The `model` key is present in the request payload, but the corresponding model is not found.
- 422 - **Missing Model**: The `model` key is missing from the request payload.
- 500 - **API error**: An unexpected internal error has occurred within Scaleway's systems. If the issue persists, please open a support ticket.

For streaming responses via SSE, errors may occur after a 200 response has been returned.
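
As a minimal sketch of how a client might act on these codes, the Python helper below parses the error object described above and flags whether a retry seems worthwhile. The name `classify_error`, the sample body, and the rule "retry only on 5xx" are assumptions made for illustration, not documented Scaleway behavior.

```python
import json


def classify_error(status_code: int, body: str) -> dict:
    """Summarize a non-2xx response using the error object fields shown above."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        payload = {}  # body was not JSON; fall back to the status code alone
    if not isinstance(payload, dict):
        payload = {}
    return {
        "status": status_code,
        "error": payload.get("error", "unknown"),
        "message": payload.get("message", ""),
        # Assumption: 4xx means the request must be fixed, 5xx may be transient.
        "retryable": 500 <= status_code <= 599,
    }


# Hypothetical body: the exact strings returned by the API are not specified here.
example = classify_error(
    422, '{"error": "Model Not Found", "status": 422, "message": "unknown model"}'
)
print(example["error"], example["retryable"])
```

Keeping the decision in one function makes it easy to adjust if your application treats certain codes differently.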
88 changes: 88 additions & 0 deletions ai-data/generative-apis/api-cli/using-chat-api.mdx
@@ -0,0 +1,88 @@
---
meta:
title: Using Chat API
description: This page explains how to use the Chat API to query models
content:
h1: Using Chat API
paragraph: This page explains how to use the Chat API to query models
tags: generative-apis ai-data chat-api
dates:
validation: 2024-09-03
posted: 2024-09-03
---

Scaleway Generative APIs are designed as a drop-in replacement for the OpenAI APIs. If you have an LLM-driven application that uses one of OpenAI's client libraries, you can easily configure it to point to the Scaleway Chat API and run your existing applications on open-weight instruct models hosted at Scaleway.

## Create chat completion

Creates a model response for the given chat conversation.

```
curl --request POST \
  --url https://api.scaleway.ai/v1/chat/completions \
  --header "Authorization: Bearer ${SCW_SECRET_KEY}" \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "llama-3.1-8b-instruct",
    "messages": [
      {
        "role": "system",
        "content": "<string>"
      },
      {
        "role": "user",
        "content": "<string>"
      }
    ],
    "max_tokens": integer,
    "temperature": float,
    "top_p": float,
    "presence_penalty": float,
    "stop": "<string>",
    "stream": boolean
  }'
```
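
The same request can be assembled in Python. The sketch below uses only the standard library and builds the request without sending it. `build_chat_request` is a hypothetical helper name, and the sample values for `max_tokens` and `temperature` are illustrative only.

```python
import json
import os
import urllib.request

CHAT_URL = "https://api.scaleway.ai/v1/chat/completions"


def build_chat_request(api_key: str, model: str, messages: list, **params) -> urllib.request.Request:
    """Build (but do not send) a POST request for the chat completions endpoint."""
    body = {"model": model, "messages": messages, **params}
    return urllib.request.Request(
        CHAT_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


chat_request = build_chat_request(
    os.environ.get("SCW_SECRET_KEY", "<your-secret-key>"),
    "llama-3.1-8b-instruct",
    [{"role": "user", "content": "Hello"}],
    max_tokens=64,    # illustrative value
    temperature=0.7,  # illustrative value
)
# To actually send it: urllib.request.urlopen(chat_request)
```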


## Headers

Find the required headers on [this page](/generative-apis/api-cli/using-generative-apis/).

## Body

### Required parameters

| Param | Type | Description |
| ------------- |-------------|-------------|
| **messages*** | array of objects | A list of messages comprising the conversation so far. |
| **model*** | string | The name of the model to query. |

Our chat API is OpenAI compatible. Use OpenAI’s [API reference](https://platform.openai.com/docs/api-reference/chat/create) for more detailed information on the usage.

### Supported parameters

- temperature
- top_p
- max_tokens
- stream
- presence_penalty
- logprobs
- stop
- seed

### Unsupported parameters

- response_format
- frequency_penalty
- n
- top_logprobs
- tools
- tool_choice
- logit_bias
- user

If you have a use case requiring one of these unsupported parameters, please [contact us via Slack](https://slack.scaleway.com/).
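
When migrating an existing OpenAI-based application, it can help to detect unsupported parameters before sending a request. The sketch below splits a payload using the two lists above. How the API itself reacts to an unsupported parameter is not stated on this page, so stripping them client-side is an assumption about one reasonable approach, and `split_params` is a hypothetical helper name.

```python
# Parameter names copied from the "Required" and "Supported" lists above.
REQUIRED = {"model", "messages"}
SUPPORTED = {
    "temperature", "top_p", "max_tokens", "stream",
    "presence_penalty", "logprobs", "stop", "seed",
}


def split_params(payload: dict) -> tuple:
    """Return (payload restricted to known keys, sorted list of dropped keys)."""
    allowed = REQUIRED | SUPPORTED
    kept = {key: value for key, value in payload.items() if key in allowed}
    dropped = sorted(key for key in payload if key not in allowed)
    return kept, dropped


kept, dropped = split_params(
    {"model": "llama-3.1-8b-instruct", "messages": [], "n": 2, "temperature": 0.1, "user": "x"}
)
print(dropped)
```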

<Message type="note">
To go further, see these [Python code examples](/ai-data/generative-apis/how-to/query-text-models/#querying-text-models-via-api) for querying text models using Scaleway's Chat API.
</Message>
54 changes: 54 additions & 0 deletions ai-data/generative-apis/api-cli/using-embeddings-api.mdx
@@ -0,0 +1,54 @@
---
meta:
title: Using Embeddings API
description: This page explains how to use the Embeddings API
content:
h1: Using Embeddings API
paragraph: This page explains how to use the Embeddings API
tags: generative-apis ai-data embeddings-api
dates:
validation: 2024-09-03
posted: 2024-09-03
---

Scaleway Generative APIs are designed as a drop-in replacement for the OpenAI APIs. If you have clustering or classification tasks that already use one of OpenAI's client libraries, you can easily configure them to point to the Scaleway Embeddings API and run your existing applications with open-weight embedding models hosted at Scaleway.

## Create embeddings

Get a vector representation of a given input that can be easily consumed by machine learning models and algorithms.

```
curl --request POST \
  --url https://api.scaleway.ai/v1/embeddings \
  --header "Authorization: Bearer ${SCW_SECRET_KEY}" \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "sentence-t5-xxl",
    "input": "<string>"
  }'
```
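
A Python equivalent might look like the sketch below, which builds the request with the standard library without sending it. `build_embeddings_request` is a hypothetical helper name. It accepts either a single string or a list of strings, matching the `input` types described on this page.

```python
import json
import os
import urllib.request

EMBEDDINGS_URL = "https://api.scaleway.ai/v1/embeddings"


def build_embeddings_request(api_key: str, model: str, text) -> urllib.request.Request:
    """Build (but do not send) a POST request for the embeddings endpoint."""
    if text == "" or text == []:
        raise ValueError("input cannot be empty")
    body = {"model": model, "input": text}
    return urllib.request.Request(
        EMBEDDINGS_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


embeddings_request = build_embeddings_request(
    os.environ.get("SCW_SECRET_KEY", "<your-secret-key>"),
    "sentence-t5-xxl",
    ["first sentence", "second sentence"],
)
# To actually send it: urllib.request.urlopen(embeddings_request)
```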

## Headers

Find the required headers on [this page](/generative-apis/api-cli/using-generative-apis/).

## Body

### Required parameters

| Param | Type | Description |
| ------------- |-------------|-------------|
| **input*** | string or array | Input text to embed, encoded as a string or array of strings. Cannot be an empty string. |
| **model*** | string | The name of the model to query. |

Our embeddings API is OpenAI compatible. Use OpenAI’s [API reference](https://platform.openai.com/docs/api-reference/embeddings) for more detailed information on the usage.

### Unsupported parameters

- encoding_format (default float)
- dimensions

If you have a use case requiring one of these unsupported parameters, please [contact us via Slack](https://slack.scaleway.com/).

<Message type="note">
See these [Python code examples](/ai-data/generative-apis/how-to/query-embedding-models/#querying-embedding-models-via-api) for querying embedding models using Scaleway's Embeddings API.
</Message>
65 changes: 65 additions & 0 deletions ai-data/generative-apis/api-cli/using-generative-apis.mdx
@@ -0,0 +1,65 @@
---
meta:
title: Using Generative APIs
description: This page explains how to use Generative APIs
content:
h1: Using Generative APIs
paragraph: This page explains how to use Generative APIs
tags: generative-apis ai-data embeddings-api
dates:
validation: 2024-08-28
posted: 2024-08-28
---

## Access

- Access to this service is restricted while in beta. You can request access to the product via [this form](https://www.scaleway.com/en/betas/#generative-api).
- A valid [API key](/identity-and-access-management/iam/how-to/create-api-keys/) is required.

## Authentication

All requests to the Scaleway Generative APIs must include an `Authorization` HTTP header with your API key prefixed by `Bearer`.

We recommend exporting your secret key as an environment variable, which you can then pass directly in your curl request as follows. Remember to replace the example value with your own API secret key.

```
export SCW_SECRET_KEY=720438f9-fcb9-4ebb-80a7-808ebf15314b
```

Curl request:

```
curl -X GET \
-H "Authorization: Bearer ${SCW_SECRET_KEY}" \
-H "Content-Type: application/json" \
"https://api.scaleway.ai/v1/models"
```

When using the OpenAI Python SDK, the API key is set once during client initialization, and the SDK automatically manages the inclusion of the Authorization header in all API requests.
In contrast, when directly integrating with the Scaleway Generative APIs, you are responsible for manually setting the Authorization header with the API key for each request to ensure proper authentication.
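
When calling the API directly, one way to avoid repeating the header logic is to build it in a single place. The sketch below reads the key from the `SCW_SECRET_KEY` environment variable exported earlier. `auth_headers` is a hypothetical helper name, not part of any SDK.

```python
import os


def auth_headers(env_var: str = "SCW_SECRET_KEY") -> dict:
    """Read the secret key from the environment and build the request headers."""
    api_key = os.environ.get(env_var)
    if not api_key:
        raise RuntimeError(f"Environment variable {env_var} is not set")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }

# Usage: pass auth_headers() as the headers of every request you send.
```

Failing early when the variable is missing avoids sending a request that would be rejected for lacking the `Authorization` header.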

## Content types

Scaleway Generative APIs accept JSON in request bodies and return JSON in response bodies.
Send the `Content-Type: application/json` HTTP header in your requests.

## Permissions

Permissions define the actions a user or an application can perform on Scaleway Generative APIs. They are managed using Scaleway’s [Identity and Access Management](/identity-and-access-management/iam/quickstart/) interface.

[Owner](/identity-and-access-management/iam/concepts/#owner) status or certain [IAM permissions](/identity-and-access-management/iam/concepts/#permission) allow you to perform actions in the intended Organization.

Querying AI models hosted by Scaleway Generative APIs requires one of the following [permission sets](/identity-and-access-management/iam/concepts/#permission-set):

- **GenerativeApisModelAccess**
- **GenerativeApisFullAccess**
- **AllProductsFullAccess**

## Projects






25 changes: 25 additions & 0 deletions ai-data/generative-apis/api-cli/using-models-api.mdx
@@ -0,0 +1,25 @@
---
meta:
title: Using Models API
description: This page explains how to use the Models API
content:
h1: Using Models API
paragraph: This page explains how to use the Models API
tags: generative-apis ai-data embeddings-api
dates:
validation: 2024-09-02
posted: 2024-09-02
---

Scaleway Generative APIs are designed as a drop-in replacement for the OpenAI APIs.
The Models API lets you list the AI models available at Scaleway.

## List models

Lists the available models and provides basic information about each one.

```
curl -s \
--url "https://api.scaleway.ai/v1/models" \
--header "Authorization: Bearer ${SCW_SECRET_KEY}"
```
65 changes: 65 additions & 0 deletions ai-data/generative-apis/concepts.mdx
@@ -0,0 +1,65 @@
---
meta:
title: Generative APIs - Concepts
description: This page explains all the concepts related to Generative APIs
content:
h1: Generative APIs - Concepts
paragraph: This page explains all the concepts related to Generative APIs
tags:
dates:
validation: 2024-08-27
categories:
- ai-data
---

## API rate limits

API rate limits define the maximum number of requests a user can make to the Generative APIs within a specific time frame. Rate limiting helps to manage resource allocation, prevent abuse, and ensure fair access for all users. Understanding and adhering to these limits is essential for maintaining optimal application performance using these APIs.
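
A common client-side way to stay within rate limits is to wait progressively longer between retries. The sketch below computes such an exponential backoff schedule. The function name and the default values are assumptions chosen for illustration, and the actual limits and the response returned when a limit is exceeded are not specified here.

```python
def backoff_delays(max_retries: int = 5, base: float = 0.5, cap: float = 8.0) -> list:
    """Return the wait time in seconds before each retry, doubling up to a cap."""
    return [min(cap, base * (2 ** attempt)) for attempt in range(max_retries)]


delays = backoff_delays()
print(delays)
```

In practice, adding a small random jitter to each delay helps prevent many clients from retrying at the same instant.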

## Context window

The context window is the maximum amount of prompt data the model considers when generating a response. Models with a larger context window let you provide more information to generate relevant responses. The context window is measured in tokens.

## Embeddings

Embeddings are numerical representations of text data that capture semantic information in a dense vector format. In Generative APIs, embeddings are essential for tasks such as similarity matching, clustering, and serving as inputs for downstream models. These vectors enable the model to understand and generate text based on the underlying meaning rather than just the surface-level words.
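
Similarity matching on embeddings is often done with cosine similarity, which compares the direction of two vectors. The sketch below is a plain-Python illustration of that measure using toy vectors. It is not tied to any specific model, and real embedding vectors have many more dimensions.

```python
import math


def cosine_similarity(a: list, b: list) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("vectors must be non-zero")
    return dot / (norm_a * norm_b)


# Toy vectors: the second points in the same direction as the first.
same_direction = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
```

Values close to 1 indicate similar direction (and, for embeddings, similar meaning), while values near 0 indicate unrelated vectors.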

## Error handling

Error handling refers to the strategies and mechanisms in place to manage and respond to errors during API requests. This includes handling network issues, invalid inputs, or server-side errors. Proper error handling ensures that applications using Generative APIs can gracefully recover from failures and provide meaningful feedback to users.

## Parameters

Parameters are settings that control the behavior and performance of generative models. These include temperature, max tokens, and top-p sampling, among others. Adjusting parameters allows users to tweak the model's output, balancing factors like creativity, accuracy, and response length to suit specific use cases.

## Inter-token Latency (ITL)

The inter-token latency corresponds to the average time elapsed between two generated tokens. It is usually expressed in milliseconds.

## Prompt Engineering

Prompt engineering involves crafting specific and well-structured inputs (prompts) to guide the model towards generating the desired output. Effective prompt design is crucial for generating relevant responses, particularly in complex or creative tasks. It often requires experimentation to find the right balance between specificity and flexibility.

## Retrieval Augmented Generation (RAG)

Retrieval Augmented Generation (RAG) is a technique that enhances generative models by integrating information retrieval methods. By fetching relevant data from external sources before generating a response, RAG ensures that the output is more accurate and contextually relevant, especially in scenarios requiring up-to-date or specific information.

## Stop words

Stop words are a parameter that tells the model to stop generating further tokens once one of the chosen strings has been generated. This is useful for controlling the end of the model output, as it will cut off at the first occurrence of any of these strings.
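
To illustrate the effect, the sketch below applies the same cut-off rule to a string on the client side. It assumes the stop string itself is excluded from the output, which is how OpenAI-compatible APIs usually behave, and `apply_stop` is a hypothetical helper, not an API function.

```python
def apply_stop(text: str, stop: list) -> str:
    """Cut `text` at the first occurrence of any stop string (stop string excluded)."""
    positions = [text.find(marker) for marker in stop if marker and marker in text]
    return text[: min(positions)] if positions else text


truncated = apply_stop("Step 1\nStep 2\nEND extra text", ["END", "\n\n"])
```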

## Streaming

Streaming is a parameter that allows responses to be delivered in real time, showing parts of the output as they are generated rather than waiting for the full response. Scaleway follows the [Server-sent events](https://html.spec.whatwg.org/multipage/server-sent-events.html#server-sent-events) standard. This behavior usually enhances user experience by providing immediate feedback and a more interactive conversation.
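
As a rough sketch of consuming such a stream, the code below extracts the payload of each `data:` line and joins the text fragments. The chunk layout (`choices[0].delta.content`) and the `[DONE]` sentinel are assumptions based on the OpenAI streaming format, and the sample lines are hypothetical. The sketch also assumes one `data:` line per event, which is a simplification of the full Server-sent events format.

```python
import json


def iter_sse_data(lines):
    """Yield the decoded JSON payload of each `data:` line in an SSE stream."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments and other fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":  # assumption: OpenAI-style end-of-stream sentinel
            return
        yield json.loads(payload)


# Hypothetical stream content, for illustration only.
sample_stream = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "",
    "data: [DONE]",
]
streamed_text = "".join(
    chunk["choices"][0]["delta"].get("content", "")
    for chunk in iter_sse_data(sample_stream)
)
```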

## Temperature

Temperature is a parameter that controls the randomness of the model's output during text generation. A higher temperature produces more creative and diverse outputs, while a lower temperature makes the model's responses more deterministic and focused. Adjusting the temperature allows users to balance creativity with coherence in the generated text.
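
In the usual formulation, temperature divides the model's raw scores (logits) before they are converted to probabilities. The sketch below shows that effect on toy logits. It illustrates the general technique and is not a description of how any particular hosted model implements sampling.

```python
import math


def softmax_with_temperature(logits: list, temperature: float) -> list:
    """Convert logits to probabilities after scaling them by 1 / temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [value / temperature for value in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]


toy_logits = [2.0, 1.0, 0.0]
low_temp = softmax_with_temperature(toy_logits, 0.5)   # sharper distribution
high_temp = softmax_with_temperature(toy_logits, 2.0)  # flatter distribution
```

With the lower temperature, more probability mass concentrates on the highest-scoring token, which is why outputs become more deterministic.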

## Time to First Token (TTFT)

Time to First Token (TTFT) measures the time elapsed from the moment a request is made to the point when the first token of the generated text is returned. TTFT is a crucial performance metric for evaluating the responsiveness of generative models, especially in interactive applications where users expect immediate feedback.
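
Both TTFT and the inter-token latency defined earlier on this page can be derived from timestamps recorded on the client while streaming. The sketch below assumes you have stored the time the request was sent and the arrival time of each token, in seconds. `latency_metrics` is a hypothetical helper name.

```python
def latency_metrics(request_time: float, token_times: list) -> dict:
    """Compute TTFT and average inter-token latency, both in milliseconds."""
    if not token_times:
        raise ValueError("at least one token timestamp is required")
    ttft = token_times[0] - request_time
    gaps = [later - earlier for earlier, later in zip(token_times, token_times[1:])]
    itl = sum(gaps) / len(gaps) if gaps else 0.0
    return {"ttft_ms": ttft * 1000, "itl_ms": itl * 1000}


# Illustrative timestamps in seconds, relative to the moment the request was sent.
metrics = latency_metrics(0.0, [0.25, 0.5, 0.75])
```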

## Tokens

Tokens are the basic units of text that a generative model processes. Depending on the tokenization strategy, these can be words, subwords, or even characters. The number of tokens directly affects the context window size and the computational cost of using the model. Understanding token usage is essential for optimizing API requests and managing costs effectively.
8 changes: 8 additions & 0 deletions ai-data/generative-apis/how-to/index.mdx
@@ -0,0 +1,8 @@
---
meta:
title: Generative APIs - How Tos
description: Generative APIs How Tos
content:
h1: Generative APIs - How Tos
paragraph: Generative APIs How Tos
---