Rule-based Retrieval

The Rule-based Retrieval package is a Python package that enables you to create and manage Retrieval Augmented Generation (RAG) applications with advanced filtering capabilities. It seamlessly integrates with OpenAI for text generation and Pinecone for efficient vector database management.

Installation

Prerequisites

Python 3.10 or higher
OpenAI API key
Pinecone API key

Install from PyPI

You can install the package directly from PyPI using pip:

pip install rule-based-retrieval

export OPENAI_API_KEY=<your open ai api key>
export PINECONE_API_KEY=<your pinecone api key>

Install from GitHub

Alternatively, you can clone the repo and install the package:

git clone [email protected]:whyhow-ai/rule-based-retrieval.git
cd rule-based-retrieval
pip install .

export OPENAI_API_KEY=<your open ai api key>
export PINECONE_API_KEY=<your pinecone api key>

Developer Install

For a developer installation, use an editable install and include the development dependencies:

pip install -e .[dev]

For ZSH:

pip install -e ".[dev]"

If you want to install the package directly without explicitly cloning yourself run

pip install git+ssh://[email protected]/whyhow-ai/rule-based-retrieval

Documentation

Documentation can be found here.

To serve the docs locally run

pip install -e .[docs]
mkdocs serve

For ZSH:

pip install -e ".[docs]"
mkdocs serve

Navigate to http://127.0.0.1:8000/ in your browser to view the documentation.

Examples

Check out the examples/ directory for sample scripts demonstrating how to use the Rule-based Retrieval package.

How to

Demo

Create index & upload

from whyhow import Client

# Configure parameters
index_name = "whyhow-demo"
namespace = "demo"
pdfs = ["harry_potter_book_1.pdf"]

# Initialize client
client = Client()

# Create index
index = client.get_index(index_name)

# Upload, split, chunk, and vectorize documents in Pinecone
client.upload_documents(index=index, documents=pdfs, namespace=namespace)

Query with rules

from whyhow import Client, Rule

# Configure query parameters
index_name = "whyhow-demo"
namespace = "demo"
question = "What does Harry wear?"
top_k = 5

# Initialize client
client = Client()

# Create rules
rules = [
    Rule(
        filename="harry_potter_book_1.pdf",
        page_numbers=[21, 22, 23]
    ),
    Rule(
        filename="harry_potter_book_1.pdf",
        page_numbers=[151, 152, 153, 154]
    )
]

# Run query
result = client.query(
    question=question,
    index=index,
    namespace=namespace,
    rules=rules,
    top_k=top_k,
)

answer = result["answer"]
used_contexts = [
    result["matches"][i]["metadata"]["text"] for i in result["used_contexts"]
]
print(f"Answer: {answer}")
print(
    f"The model used {len(used_contexts)} chunk(s) from the DB to answer the question"
)

Query with keywords

from whyhow import Client, Rule

client = Client()

index = client.get_index("amazing-index")
namespace = "books"

question = "What does Harry Potter like to eat?"

rule = Rule(
    filename="harry-potter.pdf",
    keywords=["food", "favorite", "likes to eat"]
)

result = client.query(
    question=question,
    index=index,
    namespace=namespace,
    rules=[rule],
    keyword_trigger=True
)

print(result["answer"])
print(result["matches"])
print(result["used_contexts"])

Query each rule seperately

from whyhow import Client, Rule

client = Client()

index = client.get_index("amazing-index")
namespace = "books"

question = "What is Harry Potter's favorite food?"

rule_1 = Rule(
    filename="harry-potter.pdf",
    page_numbers=[120, 121, 150]
)

rule_2 = Rule(
    filename="harry-potter-volume-2.pdf",
    page_numbers=[80, 81, 82]
)

result = client.query(
    question=question,
    index=index,
    namespace=namespace,
    rules=[rule_1, rule_2],
    process_rules_separately=True
)

print(result["answer"])
print(result["matches"])
print(result["used_contexts"])

Contributing

We welcome contributions to improve the Rule-based Retrieval package! If you have any ideas, bug reports, or feature requests, please open an issue on the GitHub repository.

If you'd like to contribute code, please follow these steps:

Fork the repository
Create a new branch for your feature or bug fix
Make your changes and commit them with descriptive messages
Push your changes to your forked repository
Open a pull request to the main repository

License

This project is licensed under the MIT License.

Support

WhyHow.AI is building tools to help developers bring more determinism and control to their RAG pipelines using graph structures. If you're thinking about, in the process of, or have already incorporated knowledge graphs in RAG, we’d love to chat at [email protected], or follow our newsletter at WhyHow.AI. Join our discussions about rules, determinism and knowledge graphs in RAG on our newly-created Discord.

Name		Name	Last commit message	Last commit date
Latest commit History 5 Commits
.github/workflows		.github/workflows
docs		docs
examples		examples
src		src
tests		tests
.DS_Store		.DS_Store
.flake8		.flake8
.gitignore		.gitignore
CHANGELOG.md		CHANGELOG.md
CONTRIBUTING.md		CONTRIBUTING.md
LICENCE.md		LICENCE.md
README.md		README.md
mkdocs.yml		mkdocs.yml
pyproject.toml		pyproject.toml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Rule-based Retrieval

Installation

Prerequisites

Install from PyPI

Install from GitHub

Developer Install

Documentation

Examples

How to

Demo

Create index & upload

Query with rules

Query with keywords

Query each rule seperately

Contributing

License

Support

About

Releases

Packages

Languages

License

recchris/rule-based-retrieval

Folders and files

Latest commit

History

Repository files navigation

Rule-based Retrieval

Installation

Prerequisites

Install from PyPI

Install from GitHub

Developer Install

Documentation

Examples

How to

Demo

Create index & upload

Query with rules

Query with keywords

Query each rule seperately

Contributing

License

Support

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages