Update documentation for vdb_upload to use realistic source data with the `--file_source` flag (#1800)

* Replace `"./morpheus/data/*"`, which is not a valid data source, as the `--file_source` value in the `vdb_upload` examples.
* Add new serialized dataframes to `examples/data/vdb_upload`
  * `doca_guides.jsonlines`: Serialized Dataframe from data in `examples/doca/vdb_realtime/sender/dataset`
  * `nvidia_blogs.jsonlines`: Serialized Dataframe from two NVIDIA blog posts:
    - https://blogs.nvidia.com/blog/mlperf-training-benchmarks/
    - https://blogs.nvidia.com/blog/ai-security-steps/
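
For readers unfamiliar with the `.jsonlines` layout used by the new files: JSON Lines stores one JSON object per line, so each dataframe row becomes one line. The following is a minimal sketch using only the Python standard library, with plain dicts standing in for dataframe rows. The column names (`title`, `content`) and the helper function names are hypothetical, since the actual schema of the new files is not shown in this commit.

```python
import json


def rows_to_jsonlines(rows):
    """Serialize a list of row dicts to JSON Lines text (one object per line)."""
    return "".join(json.dumps(row) + "\n" for row in rows)


def jsonlines_to_rows(text):
    """Parse JSON Lines text back into a list of row dicts, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


# Hypothetical rows; the real files may use different columns.
rows = [
    {"title": "Example guide", "content": "First document body."},
    {"title": "Example blog post", "content": "Second document body."},
]

text = rows_to_jsonlines(rows)
restored = jsonlines_to_rows(text)
```

With pandas, the same layout is typically produced by `DataFrame.to_json(path, orient="records", lines=True)` and read back with `pandas.read_json(path, lines=True)`.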

Closes #1790

## By Submitting this PR I confirm:
- I am familiar with the [Contributing Guidelines](https://github.com/nv-morpheus/Morpheus/blob/main/docs/source/developer_guide/contributing.md).
- When the PR is ready for review, new or existing tests cover these changes.
- When the PR is ready for review, the documentation is up to date with these changes.

Authors:
  - David Gardner (https://github.com/dagardner-nv)

Approvers:
  - Michael Demoret (https://github.com/mdemoret-nv)

URL: #1800
dagardner-nv authored Jul 17, 2024
1 parent d58563a commit b72ca4c
Showing 3 changed files with 8 additions and 2 deletions.
3 changes: 3 additions & 0 deletions examples/data/vdb_upload/doca_guides.jsonlines
Git LFS file not shown
3 changes: 3 additions & 0 deletions examples/data/vdb_upload/nvidia_blogs.jsonlines
Git LFS file not shown
4 changes: 2 additions & 2 deletions examples/llm/vdb_upload/README.md
@@ -214,7 +214,7 @@ python examples/llm/main.py vdb_upload pipeline \
```bash
python examples/llm/main.py vdb_upload pipeline \
--source_type filesystem \
- --file_source "./morpheus/data/*" \
+ --file_source="./examples/data/vdb_upload/*.jsonlines" \
--enable_monitors \
--embedding_model_name all-MiniLM-L6-v2
```
@@ -224,7 +224,7 @@ python examples/llm/main.py vdb_upload pipeline \
```bash
python examples/llm/main.py vdb_upload pipeline \
--source_type rss --source_type filesystem \
- --file_source "./morpheus/data/*" \
+ --file_source="./examples/data/vdb_upload/*.jsonlines" \
--interval_secs 600 \
--enable_cache \
--enable_monitors \