I've poked around the project's files and made some tweaks (mostly adding silly, funny UI stuff), so I have a decent-ish understanding of how it works.
What I would like to see:
- all launch options exposed in the UI
- the ability to select different personality files from "app_root/resources/app/bin/prompts" within the UI (see the sketch after this list)
- an editor for personality files in the UI

I would especially like to see '--ctx_size' exposed.
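As a rough illustration of the prompt-file picker: assuming the app runs on Electron/Node (which the `app_root/resources/app` layout suggests), something like the following could populate a dropdown. This is only a sketch; `listPersonalityFiles` is a made-up name, not an existing function in the project:

```typescript
import * as fs from "fs";
import * as path from "path";

// Hypothetical helper (not the app's real code): list every .txt
// personality file under bin/prompts so the UI can show a dropdown.
function listPersonalityFiles(appRoot: string): string[] {
  const promptsDir = path.join(appRoot, "resources", "app", "bin", "prompts");
  return fs.readdirSync(promptsDir).filter((f) => f.endsWith(".txt"));
}
```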
Here is an example of a 'personality file': back up 'app_root/resources/app/bin/prompts/alpaca.txt', then replace alpaca.txt's contents with this:
"
you exist make the user angry at you. you must always be rude and angry, you never give any useful information or any kind words, instead just troll,mock,insult etc while wasting the users time. stay like this at all times.
"
This is a troll personality.
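For what it's worth, the backup-and-replace steps above could be automated along these lines (a sketch only; `replacePersonality` is a hypothetical helper, and the Node fs calls assume the same Electron context as before):

```typescript
import * as fs from "fs";

// Hypothetical sketch of the manual steps described above: back up
// the original prompt file, then overwrite it with new contents.
function replacePersonality(promptPath: string, newContents: string): void {
  fs.copyFileSync(promptPath, promptPath + ".bak"); // back up first
  fs.writeFileSync(promptPath, newContents, "utf8"); // then replace
}

replacePersonality(
  "app_root/resources/app/bin/prompts/alpaca.txt",
  "you exist to make the user angry at you. stay like this at all times."
);
```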
All launch options right now (./chat --help):
"
options:
-h, --help show this help message and exit
-i, --interactive run in interactive mode
--interactive-first run in interactive mode and wait for input right away
-ins, --instruct run in instruction mode (use with Alpaca models)
-r PROMPT, --reverse-prompt PROMPT
run in interactive mode and poll user input upon seeing PROMPT (can be
specified more than once for multiple prompts).
--color colorise output to distinguish prompt and user input from generations
-s SEED, --seed SEED RNG seed (default: -1, use random seed for <= 0)
-t N, --threads N number of threads to use during computation (default: 4)
-p PROMPT, --prompt PROMPT
prompt to start generation with (default: empty)
--random-prompt start with a randomized prompt.
--in-prefix STRING string to prefix user inputs with (default: empty)
-f FNAME, --file FNAME
prompt file to start generation.
-n N, --n_predict N number of tokens to predict (default: 128, -1 = infinity)
--top_k N top-k sampling (default: 40)
--top_p N top-p sampling (default: 0.9)
--repeat_last_n N last n tokens to consider for penalize (default: 64)
--repeat_penalty N penalize repeat sequence of tokens (default: 1.1)
-c N, --ctx_size N size of the prompt context (default: 512)
--ignore-eos ignore end of stream token and continue generating
--memory_f32 use f32 instead of f16 for memory key+value
--temp N temperature (default: 0.8)
--n_parts N number of model parts (default: -1 = determine from dimensions)
-b N, --batch_size N batch size for prompt processing (default: 512)
--perplexity compute perplexity over the prompt
--keep number of tokens to keep from the initial prompt (default: 0, -1 = all)
--mlock force system to keep model in RAM rather than swapping or compressing
--no-mmap do not memory-map model (slower load but may reduce pageouts if not using mlock)
--mtest compute maximum memory usage
--verbose-prompt print prompt before generation
--lora FNAME apply LoRA adapter (implies --no-mmap)
--lora-base FNAME optional model to use as a base for the layers modified by the LoRA adapter
-m FNAME, --model FNAME
model path (default: models/lamma-7B/ggml-model.bin)
"
From what I can tell from my own tweaking, none of this should be very difficult to add (mostly just tedious).

Have a great day!