Large memory usage? #21
Take the given test:

```php
<?php

use Yethee\Tiktoken\EncoderProvider;

require_once 'vendor/autoload.php';

$provider = new EncoderProvider();
$encoder = $provider->get('<encoding>');
```

Top of memory usage: `Vocab::fromStream()`

Encoding: cl100k_base
Encoding: o200k_base
26 MB seems a bit much, no? Especially considering the cached vocab is only 3.6 MB.
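The report above doesn't include the measurement harness itself, so here is a minimal sketch of how figures like these can be gathered with PHP's built-in memory counters. The `EncoderProvider` API and the encoding names come from the issue; the loop and the reporting format are assumptions about the original test:

```php
<?php

use Yethee\Tiktoken\EncoderProvider;

require_once 'vendor/autoload.php';

// Sketch only: measure how much memory loading each encoder adds.
// Note that memory_get_peak_usage() is monotonic, so the peak shown
// for the second encoding also covers the first one.
foreach (['cl100k_base', 'o200k_base'] as $encoding) {
    $before = memory_get_usage(true);

    $provider = new EncoderProvider();
    $encoder = $provider->get($encoding);

    printf(
        "Encoding: %s | added: %.1f MB | peak: %.1f MB\n",
        $encoding,
        (memory_get_usage(true) - $before) / 1048576,
        memory_get_peak_usage(true) / 1048576
    );

    // Drop references so the next iteration starts (mostly) clean.
    unset($encoder, $provider);
    gc_collect_cycles();
}
```

The `true` argument requests the memory allocated from the system rather than just the portion currently in use by PHP, which is closer to what a process monitor would report.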
Maintainer response:

The token dictionary takes up most of the allocated memory. We need to keep the entire dictionary in memory so that encoding text into tokens, and vice versa, stays efficient. Currently, the built-in Profile…
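To make that trade-off concrete: a BPE vocabulary has to be queryable in both directions, byte sequence to rank for encoding and rank to byte sequence for decoding. The sketch below is illustrative only, not the library's actual `Vocab` internals; it assumes a `.tiktoken`-style file where each line holds a base64-encoded token followed by its integer rank:

```php
<?php

/**
 * Illustrative loader for a ".tiktoken"-style vocab file.
 * Builds both lookup directions, which is roughly what an
 * encoder needs to keep resident in memory.
 *
 * @return array{0: array<string, int>, 1: array<int, string>}
 */
function loadVocab(string $path): array
{
    $lines = file($path, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    if ($lines === false) {
        throw new RuntimeException("Cannot read vocab file: $path");
    }

    $tokenToRank = []; // bytes => rank, used when encoding
    $rankToToken = []; // rank => bytes, used when decoding

    foreach ($lines as $line) {
        [$b64, $rank] = explode(' ', $line, 2);
        $token = base64_decode($b64);

        $tokenToRank[$token] = (int) $rank;
        $rankToToken[(int) $rank] = $token;
    }

    return [$tokenToRank, $rankToToken];
}
```

With on the order of 100k–200k entries per direction, PHP's per-element array overhead (hash buckets, zvals, string headers) dwarfs the raw token bytes, which is why the in-memory footprint can be several times larger than the 3.6 MB cache file on disk.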