Commit

more fixes
mkozakov authored Aug 23, 2024
1 parent 085539b commit 5d21654
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions fern/pages/get-started/datasets.mdx
@@ -175,8 +175,8 @@ The following table describes the types of datasets supported by the Dataset API

| Dataset Type | Description | Schema | Rules | Task Type | Status | File Types Supported | Are Metadata Fields Supported? | Sample File |
|----------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------|---------------------------|--------------------------------|--------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
-| `single-label-classification-finetune-input` | A file containing text and a single label (class) for each text | `text:string label:string` | You must include 40 valid train examples, \nwith five examples per label. A label cannot be present in all examples \nThere must be 24 valid evaluation examples. | Classification Fine-tuning | Supported | `csv` and `jsonl` | No | [Art classification file](https://drive.google.com/file/d/15-CchSiALUQwto4b-yAMWhdUqz8vfwQ1/view?usp=drive_link) |
-| `multi-label-classification-finetune-input` | A file containing text and an array of label(s) (class) for each text | `text:string label:list[string]` | You must include 40 valid train examples, with five examples per label \nA label cannot be present in all examples. There must be 24 valid evaluation examples. | Classification Fine-tuning | Supported | `jsonl` | No | n/a |
+| `single-label-classification-finetune-input` | A file containing text and a single label (class) for each text | `text:string label:string` | You must include 40 valid train examples, with five examples per label. A label cannot be present in all examples There must be 24 valid evaluation examples. | Classification Fine-tuning | Supported | `csv` and `jsonl` | No | [Art classification file](https://drive.google.com/file/d/15-CchSiALUQwto4b-yAMWhdUqz8vfwQ1/view?usp=drive_link) |
+| `multi-label-classification-finetune-input` | A file containing text and an array of label(s) (class) for each text | `text:string label:list[string]` | You must include 40 valid train examples, with five examples per label. A label cannot be present in all examples. There must be 24 valid evaluation examples. | Classification Fine-tuning | Supported | `jsonl` | No | n/a |
| `reranker-finetune-input` | A file containing queries and an array of passages relevant to the query. There must also be "hard negatives", passages semantically similar but ultimately not relevant. | `query:string relevant_passages:list[string] hard_negatives:list[string]` | There must be 256 train examples and at least 64 evaluation examples. There must be at least one relevant passage, with no overlap between relevant passage and hard negatives. | Rerank Fine-tuning | Supported | `jsonl` | No | [train_valid.json](https://drive.google.com/file/d/1CmXWfQRedVyWBDCsSkeF9g8gyqmpUA7C/view?usp=drive_link) |
| `chat-finetune-input` | A file containing conversations | `messages: list[{role: string, content: string}]` | There must be two valid train examples and one valid evaluation example. | Chat Fine-tuning | In progress/not supported | `jsonl` | No | [train_celestial_fox.json](https://drive.google.com/file/d/19x6sOPXNWoZj9Jo989h09wd4IJ6Su9by/view?usp=drive_link) |
| `embed-input` | A file containing text to be embedded | `text:string` | None of the rows in the file can be empty. | Embed job | Supported | `csv` and `jsonl` | Yes | [embed_jobs_sample_data.jsonl](https://raw.githubusercontent.com/cohere-ai/notebooks/main/notebooks/data/embed_jobs_sample_data.jsonl) / [embed_jobs_sample_data.csv](https://github.com/cohere-ai/notebooks/blob/main/notebooks/data/embed_jobs_sample_data.csv) |
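
The training rules in the `single-label-classification-finetune-input` row can be checked locally before uploading a file. The following is only a minimal sketch, not part of the Dataset API: the function name `check_single_label_train` is hypothetical, and it assumes that "40 valid train examples" and "five examples per label" are minimums and that a row is valid when it has non-empty string `text` and `label` fields.

```python
import json
from collections import Counter

MIN_TRAIN_EXAMPLES = 40  # assumption: the table's "40" is a minimum
MIN_PER_LABEL = 5        # assumption: the table's "five" is a minimum


def check_single_label_train(jsonl_text: str) -> list[str]:
    """Return a list of rule violations; an empty list means none were found."""
    rows = [json.loads(line) for line in jsonl_text.splitlines() if line.strip()]

    # Assumption: a "valid" row has non-empty string `text` and `label` fields.
    valid = [
        r for r in rows
        if isinstance(r, dict)
        and isinstance(r.get("text"), str) and r["text"].strip()
        and isinstance(r.get("label"), str) and r["label"].strip()
    ]
    counts = Counter(r["label"] for r in valid)

    problems = []
    if len(valid) < MIN_TRAIN_EXAMPLES:
        problems.append(
            f"found {len(valid)} valid examples, expected at least {MIN_TRAIN_EXAMPLES}"
        )
    for label, n in sorted(counts.items()):
        if n < MIN_PER_LABEL:
            problems.append(
                f"label {label!r} has {n} examples, expected at least {MIN_PER_LABEL}"
            )
        if n == len(valid):
            problems.append(f"label {label!r} is present in all examples")
    return problems


if __name__ == "__main__":
    # Two labels, 20 examples each: satisfies the rules as interpreted above.
    sample = "\n".join(
        json.dumps({"text": f"example {i}", "label": "a" if i % 2 else "b"})
        for i in range(40)
    )
    print(check_single_label_train(sample))
```

The evaluation-set rule (24 valid examples) could be checked the same way with a different threshold.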