A generative, conversational workflow and multi-agent system using PDL and mlx
generative_redfoot takes a minimal Prompt Declaration Language (PDL) file and generates a finite-state generative machine as Python objects, covering a subset of the PDL language. These objects (the "programs" in particular) can be executed.
It was mainly motivated by supporting a use case described in the PDL documentation/paper.
The Model class can be extended and incorporated into how a dispatcher creates the PDL Python objects from a PDL file. Such an extension supplies the functionality for evaluating prompts against the models specified in PDL, using any accumulated conversational context and generation parameters (sampling parameters, for example), and it can optionally update the context as program execution continues. This is how mlx is used to implement model loading and inference.
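A minimal sketch of what such an extension might look like is below. The MLXModel class, its evaluate method, and the list-of-messages context are hypothetical stand-ins for generative_redfoot's actual classes; only the mlx_lm load and generate calls are real APIs:

    # A minimal, illustrative sketch; class and method names are hypothetical.
    from mlx_lm import load, generate

    class MLXModel:  # stand-in for the object the dispatcher would create
        def __init__(self, model_path, parameters=None):
            self.parameters = parameters or {}
            # mlx performs the actual weight loading and tokenizer setup
            self.model, self.tokenizer = load(model_path)

        def evaluate(self, context):
            """Run the accumulated conversational context through the model."""
            # Render the chat-style context into a prompt for generation
            prompt = self.tokenizer.apply_chat_template(
                context, add_generation_prompt=True
            )
            response = generate(
                self.model,
                self.tokenizer,
                prompt=prompt,
                max_tokens=self.parameters.get("max_tokens", 256),
            )
            # Optionally fold the response back into the running context
            context.append({"role": "assistant", "content": response})
            return response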
However, the PDL language can also be extended with additional custom functionality, and other LLM systems can handle the evaluation.
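As a sketch of how the dispatch could be extended (every name below is hypothetical, not generative_redfoot's actual API; only the PyYAML call is real), a registry might map YAML keys to custom block classes:

    # A hypothetical sketch of dispatching PDL YAML keys to Python program
    # objects; none of these names come from generative_redfoot itself.
    import yaml

    BLOCK_DISPATCH = {}  # maps a YAML key to the class that handles it

    def register(key):
        def decorator(cls):
            BLOCK_DISPATCH[key] = cls
            return cls
        return decorator

    @register("shout")
    class ShoutBlock:
        """A custom (non-PDL) block that upper-cases its text into the context."""
        def __init__(self, spec):
            self.text = spec["shout"]

        def execute(self, context):
            context.append({"role": "user", "content": self.text.upper()})

    def build_program(pdl_source):
        """Turn each item of a PDL text: sequence into a program object."""
        document = yaml.safe_load(pdl_source)
        program = []
        for item in document["text"]:
            for key in item:
                if key in BLOCK_DISPATCH:
                    program.append(BLOCK_DISPATCH[key](item))
                    break
        return program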
It depends on the PyYAML and click third-party Python libraries, as well as mlx.
    Usage: generative_redfoot [OPTIONS] PDL_FILE

    Options:
      -t, --temperature FLOAT
      -rp, --repetition-penalty FLOAT
                                      The penalty factor for repeating tokens
                                      (none if not used)
      --top_k INTEGER                 Sampling top_k
      --max_tokens INTEGER            Max tokens
      --min-p FLOAT                   Sampling min-p
      --verbose / --no-verbose
      -v, --variables <TEXT TEXT>...
      --help                          Show this message and exit.
It can be run this way, where document.pdl is a PDL file:

    generative_redfoot.py document.pdl
The main argument is a PDL document, possibly with extensions of the language implemented by generative_redfoot.
You can also specify default values for the sampling parameters that mlx uses for LLM calls during program execution.
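For example, an invocation along the following lines (the values are illustrative) sets those defaults with the options listed above:

    generative_redfoot.py -t 0.7 --top_k 40 --min-p 0.05 --max_tokens 512 document.pdl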
The parameters directive of a PDL model block can be used to specify the following mlx generation parameters: temperature, top_k, min_p, max_tokens, and top_p:
    description: ...
    text:
    - read:
      message: |
        What is your query?
      contribute: [context]
    - model: .. model ..
      parameters:
        temperature: 0.6
        min_p: .03
        max_tokens: 200
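For a sense of how these line up with mlx, the sketch below shows a plausible (illustrative, not generative_redfoot's actual) translation: temperature, top_p, min_p, and top_k are knobs accepted by mlx_lm's make_sampler, while max_tokens is passed to generation instead:

    # Illustrative mapping of PDL `parameters` onto mlx_lm sampling controls.
    from mlx_lm.sample_utils import make_sampler

    parameters = {"temperature": 0.6, "min_p": 0.03, "max_tokens": 200}
    sampler = make_sampler(
        temp=parameters.get("temperature", 0.0),  # PDL: temperature
        top_p=parameters.get("top_p", 0.0),       # PDL: top_p (0.0 disables)
        min_p=parameters.get("min_p", 0.0),       # PDL: min_p
        top_k=parameters.get("top_k", -1),        # PDL: top_k (-1 disables)
    )
    # max_tokens is passed to mlx_lm.generate(..., max_tokens=...) rather
    # than to the sampler.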
Below is an example of a PDL file that constructs the message contexts for prompts to chained LLM calls from fragments in a Wordloom library, providing a clean separation of concerns between prompt language management, prompt construction, and LLM workflow management and orchestration. The keys in the YAML file shown in black are standard PDL; those in red are generative_redfoot extensions, listed here in order of appearance: (mlx) prefix caching, chain-of-thought (COT) few-shot loading, reading from a Wordloom file, using Google's google/gemma-7b-aps-it model to perform "abstractive proposition segmentation" of LLM output, etc.: