Replies: 2 comments 1 reply
-
This is not currently supported, but it would be nice to implement. I was wondering if there is a simple workaround where a memory buffer containing the GGUF model could be mapped to a virtual file, which could then be loaded through the existing file-based API.
-
I have a draft PR doing exactly this, but at the time I couldn't find any valid use case for it outside of WebAssembly. The PR is here: #9125. Mobile apps, as you mentioned, are another valid use case, so I think it's now worth finishing. I'll get back to my PR in the near future.
-
Hi,
In whisper.cpp, there is a method whisper_init_from_buffer_with_params that allows loading a model directly from a memory buffer. This feature is particularly helpful for scenarios like mobile applications, where the model file is packaged and needs to be accessed in-memory without file I/O.
I’m wondering if llama.cpp could support a similar approach for loading models directly from a memory buffer. This would simplify usage in environments where file-based access is constrained or not feasible.
Is this feature already supported or something planned for the future? If not, could it be considered for implementation?