Commit 2671be2: merge master

gtarpenning committed Aug 16, 2024
2 parents e3c966a + 6c74337
Showing 155 changed files with 10,855 additions and 3,031 deletions.
50 changes: 50 additions & 0 deletions docs/CONTRIBUTING_DOCS.md
@@ -18,6 +18,9 @@ Satisfy the following dependencies to create, build, and locally serve Weave Doc
```shell
npm install --global yarn
```
- Make sure your Python environment is set up by running the following from the repo root:
- `pip install -r requirements.dev.txt`
- `pip install -e .`
- Install an IDE (e.g. VS Code) or Text Editor (e.g. Sublime)

 
@@ -77,3 +80,50 @@ git push origin <your-feature-branch>
```

8. Open a pull request from the new branch to the original repo.

## DocGen
Currently, we have 3 forms of doc generation:
1. Python Doc Gen
2. Service Doc Gen
3. Notebook Doc Gen

Assuming you have node and python packages installed, these can all be generated by running `make generate_reference_docs`.

Let's review some details about each process:

### Python Doc Gen

See: `docs/scripts/generate_python_sdk_docs.py` and `./docs/reference/python-sdk`

Python doc gen uses `lazydocs` as the core library for building markdown docs from our symbols. There are a few things to keep in mind:

1. `docs/scripts/generate_python_sdk_docs.py` contains an allow-list of modules to document. Since the Weave codebase is massive, it is far easier to just select what modules are useful for docs.
2. If the module does not have a `__docspec__` list of symbols, all non-underscore symbols will be documented. If it does have a `__docspec__`, documentation is narrowed to just those symbols.
3. Documentation itself:
1. Module-level: Put a triple double quote (""") comment as the first line of the module to add module-level documentation
2. Classes: Put a triple double quote (""") comment as the first line of the class to add class-level docs
3. Methods / Functions: Put a triple double quote (""") comment as the first line of the implementation to add method/function-level docs
1. Attributes are not automatically documented; use the `@property` pattern to expose them instead.
2. For classes that inherit from `BaseModel`, a field list is generated automatically to work around this limitation.

### Service Doc Gen

See `docs/scripts/generate_service_api_spec.py` and `./docs/reference/service-api`

Service doc generation loads the `openapi.json` file describing the server, processes it, then uses the `docusaurus-plugin-openapi-docs` plugin to generate markdown files from that specification.

To improve these docs, follow FastAPI's guidance for producing good Swagger docs: add field-level and endpoint-level descriptions using its APIs. Once you have made changes, `docs/scripts/generate_service_api_spec.py` needs a running server to point at. You can either deploy to prod, or run the server locally and point the script at it. From there, `docs/scripts/generate_service_api_spec.py` will download the spec, clean it up, and build the docs.

### Notebook Doc Gen

See `docs/scripts/generate_notebooks.py`, `./docs/notebooks`, and `./docs/reference/gen_notebooks`.

This script loads every notebook in `./docs/notebooks` and converts it into a markdown doc in `./docs/reference/gen_notebooks`, which Docusaurus can then reference like any other markdown file. If you need header metadata, add a markdown block at the top of your notebook with:
```
<!-- docusaurus_head_meta::start
---
title: Head Metadata
---
docusaurus_head_meta::end -->
```

10 changes: 9 additions & 1 deletion docs/Makefile
@@ -1,13 +1,21 @@
generate_service_api_docs:
mkdir -p ./docs/reference/service-api
rm -rf ./docs/reference/service-api
mkdir -p ./docs/reference/service-api
python scripts/generate_service_api_spec.py
yarn docusaurus gen-api-docs all

generate_python_sdk_docs:
mkdir -p ./docs/reference/python-sdk
rm -rf ./docs/reference/python-sdk
mkdir -p ./docs/reference/python-sdk
python scripts/generate_python_sdk_docs.py

generate_reference_docs: generate_service_api_docs generate_python_sdk_docs
generate_notebooks_docs:
mkdir -p ./docs/reference/gen_notebooks
rm -rf ./docs/reference/gen_notebooks
mkdir -p ./docs/reference/gen_notebooks
python scripts/generate_notebooks.py

generate_reference_docs: generate_service_api_docs generate_python_sdk_docs generate_notebooks_docs
yarn build
10 changes: 5 additions & 5 deletions docs/docs/guides/integrations/llamaindex.md
@@ -31,7 +31,7 @@ In the example above, we are creating a simple LlamaIndex chat engine which unde

## Tracing

LlamaIndex is known for it's ease of connecting data with LLM. A simple RAG application requires an embedding step, retrieval step and a response synthesis step. With the increasing complexity, it becomes important to store traces of individual steps in a central database during both development and production.
LlamaIndex is known for its ease of connecting data with LLM. A simple RAG application requires an embedding step, retrieval step and a response synthesis step. With the increasing complexity, it becomes important to store traces of individual steps in a central database during both development and production.

These traces are essential for debugging and improving your application. Weave automatically tracks all calls made through the LlamaIndex library, including prompt templates, LLM calls, tools, and agent steps. You can view the traces in the Weave web interface.

@@ -68,7 +68,7 @@ Our integration leverages this capability of LlamaIndex and automatically sets [

Organizing and evaluating LLMs in applications for various use cases is challenging with multiple components, such as prompts, model configurations, and inference parameters. Using the [`weave.Model`](/guides/core-types/models), you can capture and organize experimental details like system prompts or the models you use, making it easier to compare different iterations.

The following example demonstrates building a LlamaIndex query engine in a `WeaveModel`:
The following example demonstrates building a LlamaIndex query engine in a `WeaveModel`, using data that can be found in the [weave/data](https://github.com/wandb/weave/tree/master/data) folder:

```python
import weave
Expand All @@ -84,7 +84,7 @@ You are given with relevant information about Paul Graham. Answer the user query
User Query: {query_str}
Context: {context_str}
Answer:
Answer:
"""

# highlight-next-line
@@ -123,11 +123,12 @@ class SimpleRAGPipeline(weave.Model):
llm=llm,
text_qa_template=prompt_template,
)

# highlight-next-line
@weave.op()
def predict(self, query: str):
llm = self.get_llm()
query_engine = self.get_query_engine(
# This data can be found in the weave repo under data/paul_graham
"data/paul_graham",
)
response = query_engine.query(query)
@@ -145,7 +146,6 @@ This `SimpleRAGPipeline` class subclassed from `weave.Model` organizes the impor

[![llamaindex_model.png](imgs/llamaindex_model.png)](https://wandb.ai/wandbot/test-llamaindex-weave/weave/calls?filter=%7B%22traceRootsOnly%22%3Atrue%7D&peekPath=%2Fwandbot%2Ftest-llamaindex-weave%2Fcalls%2Fa82afbf4-29a5-43cd-8c51-603350abeafd%3Ftracetree%3D1)


## Doing Evaluation with `weave.Evaluation`

Evaluations help you measure the performance of your applications. By using the [`weave.Evaluation`](/guides/core-types/evaluations) class, you can capture how well your model performs on specific tasks or datasets, making it easier to compare different models and iterations of your application. The following example demonstrates how to evaluate the model we created: