
[Feature-Request] Set a parallel translation limit for Custom Translator #507

Open
saturnsky opened this issue Aug 19, 2024 · 2 comments
Labels
questions Further information is requested

Comments

@saturnsky

Description

Some local translators use local computing power, which means they can't translate an unlimited amount of text simultaneously. For example:

  • CPU-based local translators may run into performance problems if asked to translate more texts at once than the CPU has threads.
  • Translators using local LLMs often struggle to handle a large number of concurrent requests.

Considering these limitations, it's necessary to restrict the number of translation requests processed at any given moment.

Proposed Solution

Implement an option to limit concurrent translation requests. For example:

  • If the request limit is set to 8, the system would:
    1. Initially send translation requests for 8 sentences.
    2. As each translation completes, send an additional request.

This approach would maintain a constant number of active translation requests without overloading the system.
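
To illustrate, here is a minimal sketch (TypeScript) of that refill-on-completion behavior; `translateWithLimit` and `translateOne` are just placeholders for this example, not part of Linguist's actual API:

```ts
// Minimal sketch of a concurrency-limited translation pool.
// `translateOne` is a hypothetical single-sentence translation call.
async function translateWithLimit(
  sentences: string[],
  limit: number,
  translateOne: (text: string) => Promise<string>,
): Promise<string[]> {
  const results = new Array<string>(sentences.length);
  let nextIndex = 0;

  // Each worker repeatedly picks up the next pending sentence,
  // so at most `limit` requests are in flight at any moment.
  const worker = async () => {
    while (nextIndex < sentences.length) {
      const index = nextIndex++;
      results[index] = await translateOne(sentences[index]);
    }
  };

  await Promise.all(
    Array.from({ length: Math.min(limit, sentences.length) }, worker),
  );

  return results;
}
```

With a limit of 8, this matches the description above: 8 requests start immediately, and each completed translation triggers the next one.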

Benefits

This option would be beneficial for many types of Custom Translators, especially those with limited computing resources or those using local models.

@vitonsky
Collaborator

@saturnsky what about the getRequestsTimeout method? You can use it to control the time between requests.

Also, you can control how requests are handled in your custom module's method implementations. Just collect a queue and process it as slowly as you wish.

One more note: you should implement the translateBatch method as efficiently as possible to avoid problems. If you just call the translation for every text in the array, you may end up with 3-9k requests in the queue when you click translate on an average web page.
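
For example, a rough sketch of what I mean (the class shape, delimiter, and chunk size here are just placeholders, not the exact custom translator API):

```ts
// Rough sketch: instead of one request per text, join texts into chunks
// and send each chunk as a single request, draining the queue sequentially.
class MyTranslator {
  public async translateBatch(texts: string[], from: string, to: string) {
    const delimiter = '\n\n'; // assumes the backend preserves this delimiter
    const chunkSize = 50; // placeholder value

    const results: string[] = [];
    for (let i = 0; i < texts.length; i += chunkSize) {
      const chunk = texts.slice(i, i + chunkSize);
      const translated = await this.translate(chunk.join(delimiter), from, to);
      results.push(...translated.split(delimiter));
    }

    return results;
  }

  public async translate(text: string, from: string, to: string) {
    // one request to your translation backend goes here
    return text;
  }
}
```

This way an average page produces a handful of requests instead of thousands, and you decide how fast the queue drains.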

Have you optimized your code in these respects?

@vitonsky added the "questions (Further information is requested)" label Aug 19, 2024
@saturnsky
Author

Thank you for your detailed response. I'd like to elaborate on my thoughts regarding the proposed feature and address the points you've raised.

Regarding getRequestsTimeout

While getRequestsTimeout is indeed helpful for managing rate limits with online translators, it may not be as effective for offline translators. The unpredictable nature of translation time for offline translators (e.g., 100ms for one sentence, 1 second for another) makes it challenging to set a fixed timeout. For offline translators, a more appropriate approach would be to send the next request as soon as a translation result is received, rather than managing request frequency.

On translateBatch

I agree that translateBatch can be beneficial for online translators, where combining multiple sentences into a single chunk for server transmission and then splitting the results is efficient. However, for many offline translators, this approach may not be ideal:

  1. Offline translators often have computational costs proportional to sentence length.
  2. translateBatch in this context might only increase latency until results appear on screen, without improving overall throughput.
  3. Error handling becomes more complex and costly if issues occur during the translation of batched sentences.

Given these considerations, I think the most appropriate approach would be:

  1. Adding an option to disable translateBatch for Custom Translators.
  2. Implementing a parallel request limit using a mechanism such as a semaphore.

This approach would likely yield the best results for offline translators.

Of course, limiting parallel requests can be implemented directly by individual users in their Custom Translators using semaphores and the like, so it is a lower-priority issue than disabling translateBatch.
However, since this scenario is common when using offline translators, I thought it would be more convenient to provide it as an option in linguist.
There is already an issue about Custom Translators not using translateBatch (issue #236), which is why I only filed this issue about the parallel request limit.
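
For reference, this is roughly the user-side workaround I have in mind: a small counting semaphore wrapped around the translate call (all names here are illustrative, not part of the Linguist API):

```ts
// Tiny counting semaphore to cap how many translations run at once.
class Semaphore {
  private waiters: Array<() => void> = [];

  constructor(private available: number) {}

  async acquire(): Promise<void> {
    if (this.available > 0) {
      this.available--;
      return;
    }
    await new Promise<void>((resolve) => this.waiters.push(resolve));
  }

  release(): void {
    const next = this.waiters.shift();
    if (next) next(); // hand the permit directly to the next waiter
    else this.available++;
  }
}

// At most 8 translate() calls are in flight at any moment.
const semaphore = new Semaphore(8);

async function limitedTranslate(
  text: string,
  translate: (text: string) => Promise<string>,
): Promise<string> {
  await semaphore.acquire();
  try {
    return await translate(text);
  } finally {
    semaphore.release();
  }
}
```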
