Replies: 2 comments 1 reply
-
This is not currently supported, but it would be nice to implement. I was wondering if there is a simple workaround where a memory buffer containing the GGUF model could be mapped to a virtual file, which could then be loaded through the existing file-based API.
-
I have a draft PR doing exactly this, but at the time I couldn't find any valid use case for it outside of WebAssembly. The PR is here: #9125. Mobile apps, as you mentioned, are another valid use case, so I think it's now worth finishing. I'll get back to my PR in the near future.
-
Hi,
In whisper.cpp, there is a method whisper_init_from_buffer_with_params that allows loading a model directly from a memory buffer. This feature is particularly helpful for scenarios like mobile applications, where the model file is packaged and needs to be accessed in-memory without file I/O.
I’m wondering if llama.cpp could support a similar approach for loading models directly from a memory buffer. This would simplify usage in environments where file-based access is constrained or not feasible.
Is this feature already supported or something planned for the future? If not, could it be considered for implementation?