From 8255fdfd5b9bb4eb8740fc2c9e7fdcd2f74b1378 Mon Sep 17 00:00:00 2001
From: Hadrien Mary <hadrien.mary@gmail.com>
Date: Sat, 28 Oct 2023 08:55:39 -0400
Subject: [PATCH] add hf links to readme

---
 README.md     | 11 ++++++++---
 docs/index.md | 11 ++++++++---
 2 files changed, 16 insertions(+), 6 deletions(-)
diff --git a/README.md b/README.md
index 850c56e..4569f22 100644
--- a/README.md
+++ b/README.md
@@ -16,6 +16,9 @@
   </a> |
   <a href="https://huggingface.co/datamol-io/safe" target="_blank">
     🤗 Model
+  </a> |
+  <a href="https://huggingface.co/datasets/datamol-io/safe-gpt" target="_blank">
+    🤗 Training Dataset
   </a>
 </p>
 
@@ -73,9 +76,11 @@ mamba install -c conda-forge safe-mol
 
 ### Datasets and Models
 
-We provided a pretained GPT2 model (XX M parameters) using the SAFE molecular representation that has been trained on 1.1 billion molecules from Unichem (0.1B) + Zinc (1B):
-
-- _Safe-XXM_ TODO
+| Type    | Name                                                                           | Info       | Size   | Comment              |
+| ------- | ------------------------------------------------------------------------------ | ---------- | ------ | -------------------- |
+| Model   | [datamol-io/safe-gpt](https://huggingface.co/datamol-io/safe-gpt)              | 87M params | 350 MB | Default model        |
+| Dataset | [datamol-io/safe-gpt](https://huggingface.co/datasets/datamol-io/safe-gpt)     | 1.1B rows  | 250 GB | Training dataset     |
+| Dataset | [datamol-io/safe-drugs](https://huggingface.co/datasets/datamol-io/safe-drugs) | 26 rows    | 20 kB  | Benchmarking dataset |
 
 ## Usage
 
diff --git a/docs/index.md b/docs/index.md
index 54b7478..617c596 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -16,6 +16,9 @@
   </a> |
   <a href="https://huggingface.co/datamol-io/safe" target="_blank">
     🤗 Model
+  </a> |
+  <a href="https://huggingface.co/datasets/datamol-io/safe-gpt" target="_blank">
+    🤗 Training Dataset
   </a>
 </p>
 
@@ -73,9 +76,11 @@ mamba install -c conda-forge safe-mol
 
 ### Datasets and Models
 
-We provided a pretained GPT2 model (XX M parameters) using the SAFE molecular representation that has been trained on 1.1 billion molecules from Unichem (0.1B) + Zinc (1B):
-
-- _Safe-XXM_ TODO
+| Type    | Name                                                                           | Info       | Size   | Comment              |
+| ------- | ------------------------------------------------------------------------------ | ---------- | ------ | -------------------- |
+| Model   | [datamol-io/safe-gpt](https://huggingface.co/datamol-io/safe-gpt)              | 87M params | 350 MB | Default model        |
+| Dataset | [datamol-io/safe-gpt](https://huggingface.co/datasets/datamol-io/safe-gpt)     | 1.1B rows  | 250 GB | Training dataset     |
+| Dataset | [datamol-io/safe-drugs](https://huggingface.co/datasets/datamol-io/safe-drugs) | 26 rows    | 20 kB  | Benchmarking dataset |
 
 ## Usage