
Commit

minor changes
ccmao1130 committed Dec 19, 2024
1 parent 4fab582 commit be595a0
Showing 4 changed files with 8 additions and 5 deletions.
6 changes: 3 additions & 3 deletions docs-v2/integrations/huggingface.md
@@ -1,8 +1,8 @@
 # Hugging Face Datasets

-Daft is able to read datasets directly from Huggingface via the `hf://datasets/` protocol.
+Daft is able to read datasets directly from Hugging Face via the `hf://datasets/` protocol.

-Since Huggingface will [automatically convert](https://huggingface.co/docs/dataset-viewer/en/parquet) all public datasets to parquet format, we can read these datasets using the [`read_parquet`](https://www.getdaft.io/projects/docs/en/stable/api_docs/doc_gen/io_functions/daft.read_parquet.html) method.
+Since Hugging Face will [automatically convert](https://huggingface.co/docs/dataset-viewer/en/parquet) all public datasets to parquet format, we can read these datasets using the [`read_parquet`](https://www.getdaft.io/projects/docs/en/stable/api_docs/doc_gen/io_functions/daft.read_parquet.html) method.

 !!! warning "Warning"

@@ -52,7 +52,7 @@ For authenticated datasets:
 ```

 It's important to note that this will not work with standard tier private datasets.
-Huggingface does not auto convert private datasets to parquet format, so you will need to specify the path to the files you want to read.
+Hugging Face does not auto convert private datasets to parquet format, so you will need to specify the path to the files you want to read.

 === "🐍 Python"

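Since the page above is shown only as a diff, here is a minimal sketch of the usage the changed lines describe, assuming a public dataset repo (`username/dataset_name` is a placeholder):

```python
import daft

# Public datasets are auto-converted to Parquet by Hugging Face,
# so the whole dataset can be read through the hf:// protocol.
# "username/dataset_name" is a placeholder repo id.
df = daft.read_parquet("hf://datasets/username/dataset_name")
df.show()
```

For a standard tier private dataset, which is not auto-converted, the file paths must be given explicitly. A hedged sketch, assuming bearer-token authentication via `daft.io.HTTPConfig` as covered in the authenticated-datasets section that is collapsed in this diff (the token, repo id, and file name are placeholders):

```python
import daft
from daft.io import IOConfig, HTTPConfig

# Private repos are not auto-converted to Parquet, so point at the
# actual file(s) inside the repo and pass credentials explicitly.
# The token and path below are placeholders.
io_config = IOConfig(http=HTTPConfig(bearer_token="hf_..."))
df = daft.read_parquet(
    "hf://datasets/username/my_private_dataset/file.parquet",
    io_config=io_config,
)
```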
2 changes: 1 addition & 1 deletion docs-v2/integrations/sql.md
@@ -12,7 +12,7 @@ Daft currently supports:

 ## Installing Daft with SQL Support

-Install Daft with the `getdaft[sql]` extra, or manually install the required packages: `ConnectorX <https://sfu-db.github.io/connector-x/databases.html>`__, `SQLAlchemy <https://docs.sqlalchemy.org/en/20/orm/quickstart.html>`__, and `SQLGlot <https://sqlglot.com/sqlglot.html>`__.
+Install Daft with the `getdaft[sql]` extra, or manually install the required packages: [ConnectorX](https://sfu-db.github.io/connector-x/databases.html), [SQLAlchemy](https://docs.sqlalchemy.org/en/20/orm/quickstart.html) and [SQLGlot](https://sqlglot.com/sqlglot.html).

 ```bash
 pip install -U "getdaft[sql]"
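For reference, once the extra is installed the SQL integration is used through `daft.read_sql`; a minimal sketch, assuming a local SQLite database (the connection URL and table name are placeholders):

```python
import daft

# read_sql runs the query through ConnectorX (or SQLAlchemy, depending
# on the dialect) and returns a Daft DataFrame.
# The connection URL and table name below are placeholders.
df = daft.read_sql(
    "SELECT * FROM my_table",
    "sqlite://path/to/my_database.db",
)
df.show()
```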
3 changes: 3 additions & 0 deletions docs-v2/resources/architecture.md
@@ -1,5 +1,8 @@
 # Architecture

+!!! failure "todo(docs): Add information about where Daft fits into the ecosystem or architecture of a system"
+
+
 ## High Level Overview

 ![Architecture diagram for the Daft library spanning the User API, Planning, Scheduling and Execution layers](../img/architecture.png)
2 changes: 1 addition & 1 deletion mkdocs.yml
@@ -38,7 +38,7 @@ nav:
 - Microsoft Azure: integrations/azure.md
 - Amazon Web Services: integrations/aws.md
 - SQL: integrations/sql.md
-- Huggingface Datasets: integrations/huggingface.md
+- Hugging Face Datasets: integrations/huggingface.md
 - Resources:
 - Architecture: resources/architecture.md
 - DataFrame Comparison: resources/dataframe_comparison.md
