
Prompt model:-None Loaded error Older gaming PC #1

Open
Willxiam opened this issue Jan 29, 2024 · 10 comments

Comments

@Willxiam

When testing this out (thank you for making it available) on an older gaming PC:
Windows 10
16 GB RAM
i5 processor
NVIDIA GeForce® GTX 1070 GAMING PCI Express video card

I receive a "Prompt-None Loaded error" when I submit a query.

Other details:
Release v0.4.0
Manually downloaded the recommended model tinyllama-1.1b by following the links and copied it into the model directory.
Selected the generic prompt template for the model.
Downloading via the Model manager did not seem to work.

Have tried reloading the model and the force reload tool in the main menu window.

@Willxiam Willxiam changed the title Prompt model:-None Loaded error Prompt model:-None Loaded error Older gaming PC Jan 29, 2024
@nathanlesage
Owner

nathanlesage commented Jan 29, 2024

So in 0.4.0 there was a bug where, with new conversations, you had to manually force reload the model once; in 0.5.0 this should happen automatically. So I'd recommend updating and then (just for good measure) force reloading the model (the reload button in the status bar).

Does this solve the issue?


EDIT: Because I saw you explicitly added "older gaming PC" to the issue title: I just double-checked, and the GTX 1070 supports CUDA and is apparently backwards compatible, so it's likely not an issue with your GPU.
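For anyone who wants to verify locally that the driver exposes the GPU at all, here is a minimal sketch (assuming nvidia-smi, which ships with the NVIDIA driver, is on the PATH; this is just a diagnostic, not part of LocalChat):

```ts
// Quick diagnostic, not part of LocalChat: ask nvidia-smi (installed with
// the NVIDIA driver) whether a GPU is visible and which driver it runs.
import { execFile } from "node:child_process";

execFile(
  "nvidia-smi",
  ["--query-gpu=name,driver_version", "--format=csv,noheader"],
  (error, stdout) => {
    if (error) {
      console.error("nvidia-smi not found or no GPU visible:", error.message);
    } else {
      console.log("GPU detected:", stdout.trim());
    }
  }
);
```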

@Willxiam
Author

Willxiam commented Feb 2, 2024

Updated to Release v0.7.1 and decided to try the built-in downloader again; I realized I was using the wrong download link. See the image below. It continued not to work after updating to the latest version and force reloading the model in both the manage model window and the main window. Along the bottom of the application it says "provider not initialized (unknown)". There is a button to force reload. When I click it, it says "Could not reload model: None loaded".

[Screenshot: the wrong download link to copy]

@Willxiam
Author

Willxiam commented Feb 2, 2024

Update: After rebooting, it will now initialize the provider, showing "Model ready" along the bottom.

EDIT: It takes quite some time to generate anything. In both instances where I have used this, I have thought it would be useful to have a stop generating button.

@nathanlesage
Owner

@Willxiam Exactly, I don't know why HuggingFace forces you to click on that download link to actually download the file, but that's the way it is …

Regarding the stop generation button: while the model is generating, the "Force reload" button turns into a "Stop generating" button. If you click it, wait until the next token has been generated and the model will automatically stop generating.
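For context on why it stops only "at the next token": generation is a loop that can only check for cancellation between tokens. Here is a minimal sketch of that pattern in generic TypeScript (the token stream below is a stand-in, not LocalChat's actual generation code):

```ts
// Generic sketch of cooperative cancellation: the abort flag can only be
// checked between tokens, which is why the UI stops "at the next token".
// The fake stream below stands in for the real llama.cpp binding.
const controller = new AbortController();

async function* fakeTokenStream(prompt: string): AsyncGenerator<string> {
  for (const word of `Echoing: ${prompt}`.split(" ")) {
    await new Promise((resolve) => setTimeout(resolve, 200)); // simulate slow inference
    yield word + " ";
  }
}

async function generate(prompt: string): Promise<void> {
  for await (const token of fakeTokenStream(prompt)) {
    process.stdout.write(token);
    if (controller.signal.aborted) break; // checked only after each token
  }
}

// Simulate clicking "Stop generating" one second into the run.
setTimeout(() => controller.abort(), 1000);
void generate("What is the capital of France?");
```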

@Willxiam
Author

Willxiam commented Feb 5, 2024

I have used the force reload and sometimes the program will hang. I have also found that sometimes entering a new query will generate an error that helps it quit out of the hung state. I have experienced this on both of the systems I have installed on, one being the older one, where it is harder to gauge what is due to the application and what is due to the system.

@nathanlesage
Owner

Mhmh, could be. I just today released a new version that also allows for CUDA support. I haven't yet enabled that flag, so 0.8.0 should run exclusively on the CPU. This does not make the most use of the system, BUT it should at least get you up and running. The Node bindings are still in beta, though, so there will probably be more improvements over time.
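As a rough illustration of what such a CPU/GPU switch could look like with the node-llama-cpp bindings (the model path and layer count below are made up, and exact option names may differ between versions):

```ts
// Rough sketch, not LocalChat's actual code: gating GPU offload behind a
// flag with node-llama-cpp. gpuLayers: 0 keeps inference entirely on the
// CPU; a positive value offloads that many layers to CUDA (when the
// binding was built with CUDA support). Path and layer count are made up.
import { LlamaModel, LlamaContext, LlamaChatSession } from "node-llama-cpp";

const useCuda = false; // assumption: would come from app configuration

const model = new LlamaModel({
  modelPath: "models/tinyllama-1.1b.gguf", // illustrative file name
  gpuLayers: useCuda ? 32 : 0,
});
const context = new LlamaContext({ model });
const session = new LlamaChatSession({ context });

console.log(await session.prompt("What is 1+1?"));
```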

@Willxiam
Author

Willxiam commented Feb 7, 2024

Thanks. I will give the latest release a go; I cannot get to it until after work, though. After I got LocalChat to recognize the model (not sure why it was not doing so before), it would generate, but it would take quite some time. If I remember correctly, about 15 to 30 minutes for a quick prompt. And sometimes it would seem to get stuck in the attempt to generate. I am assuming that running just on the CPU might be slower, unless there is some issue interfacing with my GPU. It has occurred to me that maybe my GPU needs an updated set of drivers, or I could put a newer GPU in this system to get it working better.

@Willxiam
Author

Willxiam commented Feb 7, 2024

Ok, so I installed the latest release and noticed that the reading of the model's metadata was much faster. Generation also seems faster and more stable. The force reload was nearly instantaneous.

I also did some tests:

7 minutes to generate an answer to: What is the capital of France?
Details: generated "the capital of france is Paris" at about the 450.1s mark.
Stopped generating at 534.9s.

12 minutes for: What is 1+1?
Generated "1+1=" at about 330.5s.
Got the correct answer, 2, at around 450.5s.
Then finished generating at 765s.

So this is an improvement.
I pulled these test prompts from some forum someplace, but maybe there are better ones.
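For anyone who wants to reproduce timings like these, here is a small generic sketch that wraps a token stream and reports time-to-first-token and tokens per second (the stream argument is whatever the model binding yields; nothing here is LocalChat-specific):

```ts
// Generic timing sketch: wraps any async token stream and reports
// time-to-first-token and tokens per second. Nothing LocalChat-specific.
async function timeGeneration(tokens: AsyncIterable<string>): Promise<void> {
  const start = performance.now();
  let firstTokenAt: number | null = null;
  let count = 0;

  for await (const token of tokens) {
    if (firstTokenAt === null) firstTokenAt = performance.now();
    count += 1;
    process.stdout.write(token);
  }

  const totalSeconds = (performance.now() - start) / 1000;
  if (firstTokenAt !== null) {
    console.log(`\nfirst token after ${((firstTokenAt - start) / 1000).toFixed(1)}s`);
  }
  console.log(`${count} tokens in ${totalSeconds.toFixed(1)}s ` +
              `(${(count / totalSeconds).toFixed(2)} tok/s)`);
}
```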

@nathanlesage
Owner

Regarding the slow speed: I haven't used an i5 in quite some time, so I don't know if these numbers are odd or to be expected, but it's definitely not decent. I don't yet have a great idea of how to implement configuration, but once I do, I'll enable the option to switch to the CUDA version of llama.cpp, which should increase inference speed drastically.
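Purely as an illustration of the kind of persisted setting this could become (none of these keys exist in LocalChat today):

```ts
// Illustration only: none of these keys are real LocalChat config.
// A persisted setting for choosing the llama.cpp backend, with a safe
// CPU-only default when the file is missing or malformed.
import { readFileSync, writeFileSync } from "node:fs";

interface AppConfig {
  backend: "cpu" | "cuda"; // hypothetical switch discussed above
}

function loadConfig(path: string): AppConfig {
  try {
    return JSON.parse(readFileSync(path, "utf8")) as AppConfig;
  } catch {
    return { backend: "cpu" }; // default: the current CPU-only behaviour
  }
}

function saveConfig(path: string, config: AppConfig): void {
  writeFileSync(path, JSON.stringify(config, null, 2));
}
```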

@Willxiam
Author

Willxiam commented Feb 8, 2024

This is the reason I put "older gaming PC" in the thread title. I do not know either, but I will keep leaving feedback, because I suspect many others will be trying to figure out the same. Thanks for all of your help. I will try to test the models on the newer machine I have tomorrow.
