cohere-ai · fern-support · Aug 12, 2024
@@ -168,20 +168,84 @@ Datasets of type `chat-finetune-input`, for example, are expected to have a json
       "role": "Chatbot",
       "content": "Time magazines top 10 cover stories in the last 10 years were:\\n\\n1. Volodymyr Zelenskyy\\n2. Elon Musk\\n3. Martin Luther King Jr.\\n4. How Earth Survived\\n5. Her Lasting Impact\\n6. Nothing to See Here\\n7. Meltdown\\n8. Deal With It\\n9. The Top of America\\n10. Bitter Pill"
     }
+  ]
 }
 ```
 
 The following table describes the types of datasets supported by the Dataset API:
 
-| Dataset Type                           | Description                                                                                                                                                                     | Schema                                                                                     | Rules                                                                                                                                                                      | Task Type                   | Status                    | File Types Supported           | Are Metadata Fields Supported? | Sample File                                                                                                                                                                               |
-|----------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------|---------------------------|--------------------------------|--------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
-| `single-label-classification-finetune-input` | A file containing text and a single label (class) for each text                                                                                                                 | `text:string  \nlabel:string`                                                              | You must include 40 valid train examples,  \nwith five examples per label. A label cannot be present in all examples  \nThere must be 24 valid evaluation examples.        | Classification Fine-tuning  | Supported                 | `csv` and `jsonl`               | No                             | [Art classification file](https://drive.google.com/file/d/15-CchSiALUQwto4b-yAMWhdUqz8vfwQ1/view?usp=drive_link)                                                                          |
-| `multi-label-classification-finetune-input`  | A file containing text and an array of label(s) (class) for each text                                                                                                            | `text:string  \nlabel:list[string]`                                                        | You must include 40 valid train examples, with five  examples per label  \nA label cannot be present in all examples. There must be 24 valid evaluation examples.          | Classification Fine-tuning  | Supported                 | `jsonl`                          | No                             | n/a                                                                                                                                                                                         |
-| `reranker-finetune-input`              | A file containing queries and an array of passages relevant to the query. There must also be "hard negatives", passages semantically similar but ultimately not relevant.        | `query:string  \nrelevant_passages:list[string]  \nhard_negatives:list[string]`            | There must be 256 train examples and at least 64 evaluation examples. There must be  at least one relevant passage, with no overlap between relevant passage and hard  negatives. | Rerank Fine-tuning          | Supported                 | `jsonl`                          | No                             | [train_valid.json](https://drive.google.com/file/d/1CmXWfQRedVyWBDCsSkeF9g8gyqmpUA7C/view?usp=drive_link)                                                                                  |
-| `chat-finetune-input`                  | A file containing conversations                                                                                                                                                 | `messages: list[Message]`  \n  \n`- Message -  \n  role: text  \n  context: text`          | There must be two valid train examples and one valid evaluation example.                                                                                                     | Chat Fine-tuning            | In progress/not supported | `jsonl`                          | No                             | [train_celestial_fox.json](https://drive.google.com/file/d/19x6sOPXNWoZj9Jo989h09wd4IJ6Su9by/view?usp=drive_link)                                                                         |
-| `embed-input`                          | A file containing text to be embedded                                                                                                                                           | `text:string`                                                                              | None of the rows in the file can be empty.                                                                                                                                | Embed job                   | Supported                 | `csv` and `jsonl`               | Yes                            | [embed_jobs_sample_data.jsonl](https://raw.githubusercontent.com/cohere-ai/notebooks/main/notebooks/data/embed_jobs_sample_data.jsonl) / [embed_jobs_sample_data.csv](https://github.com/cohere-ai/notebooks/blob/main/notebooks/data/embed_jobs_sample_data.csv) |
-
-
+<table class="fern-table" style={{ whiteSpace: 'nowrap', display: 'block', overflow: 'auto' }}>
+  <thead>
+    <tr>
+      <th>Dataset Type</th>
+      <th>Description</th>
+      <th>Schema</th>
+      <th>Rules</th>
+      <th>Task Type</th>
+      <th>Status</th>
+      <th>File Types Supported</th>
+      <th>Are Metadata Fields Supported?</th>
+      <th>Sample File</th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+      <td><code>single-label-classification-finetune-input</code></td>
+      <td>A file containing text and a single label (class) for each text</td>
+      <td><code>text:string<br/>label:string</code></td>
+      <td>You must include 40 valid train examples, with five examples per label.<br/>A label cannot be present in all examples.<br/>There must be 24 valid evaluation examples.</td>
+      <td>Classification Fine-tuning</td>
+      <td>Supported</td>
+      <td><code>csv</code> and <code>jsonl</code></td>
+      <td>No</td>
+      <td><a href="https://drive.google.com/file/d/15-CchSiALUQwto4b-yAMWhdUqz8vfwQ1/view?usp=drive_link">Art classification file</a></td>
+    </tr>
+    <tr>
+      <td><code>multi-label-classification-finetune-input</code></td>
+      <td>A file containing text and an array of one or more labels (classes) for each text</td>
+      <td><code>text:string<br/>label:list[string]</code></td>
+      <td>You must include 40 valid train examples, with five examples per label.<br/>A label cannot be present in all examples.<br/>There must be 24 valid evaluation examples.</td>
+      <td>Classification Fine-tuning</td>
+      <td>Supported</td>
+      <td><code>jsonl</code></td>
+      <td>No</td>
+      <td>n/a</td>
+    </tr>
+    <tr>
+      <td><code>reranker-finetune-input</code></td>
+      <td>A file containing queries and, for each query, an array of relevant passages. Each query must also include "hard negatives": passages that are semantically similar but ultimately not relevant.</td>
+      <td><code>query:string<br/>relevant_passages:list[string]<br/>hard_negatives:list[string]</code></td>
+      <td>There must be 256 train examples and at least 64 evaluation examples.<br/>Each example must have at least one relevant passage, with no overlap between relevant passages and hard negatives.</td>
+      <td>Rerank Fine-tuning</td>
+      <td>Supported</td>
+      <td><code>jsonl</code></td>
+      <td>No</td>
+      <td><a href="https://drive.google.com/file/d/1CmXWfQRedVyWBDCsSkeF9g8gyqmpUA7C/view?usp=drive_link">train_valid.json</a></td>
+    </tr>
+    <tr>
+      <td><code>chat-finetune-input</code></td>
+      <td>A file containing conversations</td>
+      <td><code>messages: list[Message]<br/><br/>- Message -<br/>role: text<br/>content: text</code></td>
+      <td>There must be two valid train examples and one valid evaluation example.</td>
+      <td>Chat Fine-tuning</td>
+      <td>In progress/not supported</td>
+      <td><code>jsonl</code></td>
+      <td>No</td>
+      <td><a href="https://drive.google.com/file/d/19x6sOPXNWoZj9Jo989h09wd4IJ6Su9by/view?usp=drive_link">train_celestial_fox.json</a></td>
+    </tr>
+    <tr>
+      <td><code>embed-input</code></td>
+      <td>A file containing text to be embedded</td>
+      <td><code>text:string</code></td>
+      <td>None of the rows in the file can be empty.</td>
+      <td>Embed job</td>
+      <td>Supported</td>
+      <td><code>csv</code> and <code>jsonl</code></td>
+      <td>Yes</td>
+      <td><a href="https://raw.githubusercontent.com/cohere-ai/notebooks/main/notebooks/data/embed_jobs_sample_data.jsonl">embed_jobs_sample_data.jsonl</a> / <a href="https://github.com/cohere-ai/notebooks/blob/main/notebooks/data/embed_jobs_sample_data.csv">embed_jobs_sample_data.csv</a></td>
+    </tr>
+  </tbody>
+</table>
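
The rules in the table can be checked locally before uploading a file. The following is a minimal sketch for `reranker-finetune-input` only, not part of the Dataset API: the function names `check_reranker_record` and `check_reranker_jsonl` are hypothetical, and the sketch assumes that "256 train examples" means "at least 256" and that overlap is tested by exact string match. The validation performed by the API itself may be stricter or differ in detail.

```python
import json


def check_reranker_record(record: dict) -> list[str]:
    """Return the rule violations found in one reranker-finetune-input record."""
    problems = []
    if not isinstance(record.get("query"), str):
        problems.append("query must be a string")
    relevant = record.get("relevant_passages") or []
    negatives = record.get("hard_negatives") or []
    if len(relevant) < 1:
        problems.append("at least one relevant passage is required")
    # Assumption: overlap means an identical string in both lists.
    overlap = set(relevant) & set(negatives)
    if overlap:
        problems.append(f"{len(overlap)} passage(s) appear in both lists")
    return problems


def check_reranker_jsonl(lines, min_examples=256):
    """Check an iterable of JSONL lines; return (passes, errors_by_line_number)."""
    errors = {}
    valid = 0
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # skip blank lines
        problems = check_reranker_record(json.loads(line))
        if problems:
            errors[number] = problems
        else:
            valid += 1
    # Assumption: the train-set minimum from the table is a lower bound.
    return valid >= min_examples and not errors, errors
```

To check a train file, pass an open file handle, for example `check_reranker_jsonl(open("train.jsonl"))`; for an evaluation file, pass `min_examples=64`.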
 
 ### Downloading a dataset