
Add QELM to list. #850

Open · wants to merge 1 commit into main

Conversation

github-actions[bot] (Contributor) commented:

Add QELM to list.


Closes #849

@github-actions bot added the "submission" (Project submission) label on Dec 11, 2024
@CLAassistant commented Dec 11, 2024:

CLA assistant check
All committers have signed the CLA.

@Trivarian left a comment:

Ted was here

@frankharkins (Member) left a comment:

Thank you!

@@ -0,0 +1,10 @@
name = "QELM"
url = "https://github.com/R-D-BioTech-Alaska/Qelm"
description = "An innovative project that merges the power of quantum computing with natural language processing to create a next-generation language model. Leveraging Qiskit and Qiskit Aer, QELM explores the potential of quantum circuits in enhancing language understanding and generation capabilities. Utilizing Qubits, Qelm can train a model that would normally take up gigabytes worth of data in llm files all the way down to miniature sizes. With this size, llm's can run instantly with no loss of capabilities or intelligence and on small computers instead requiring data centers to run single models. The goal is to create llm's that are instant, smarter and can be utilized anywhere and by anyone."
A Member commented on the diff:

This description needs to be <135 characters, preferably focussing on what users can do with this package. Take a look at the other ecosystem projects for examples.

Also:

Utilizing Qubits, Qelm can train a model that would normally take up gigabytes worth of data in llm files all the way down to miniature sizes. With this size, llm's can run instantly with no loss of capabilities or intelligence and on small computers instead requiring data centers to run single models.

Can you explain what you mean by "miniature sizes"? It would also be good for ecosystem users to see benchmarks supporting the size / performance claims.
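
To make the length requirement concrete, a trimmed entry drawn only from the existing description might look something like this (a sketch; the final wording is of course up to the submitter):

```toml
name = "QELM"
url = "https://github.com/R-D-BioTech-Alaska/Qelm"
# A description under the 135-character limit, reusing claims from the original text
description = "Train and run compact quantum-enhanced language models built on Qiskit and Qiskit Aer."
```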

The QELM author replied:

The best way to describe "miniature sizes" in this context is ultra-compact models. So far, we have trained two basic quantum-enhanced language models (QELMs) using quantum-inspired parameters, with a third in progress. For testing purposes, we used CPU threading with 12 threads, ran 2 epochs, and used a learning rate of 0.05; each training session averaged about 8 hours. Although we could leverage a GPU, we are deliberately running on the CPU to "bottleneck" the process.
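
For concreteness, here is a minimal sketch of that test configuration, assuming a conventional PyTorch-style loop with a placeholder model and toy data (this illustrates the stated settings only; it is not QELM's actual training code):

```python
import torch
import torch.nn as nn

torch.set_num_threads(12)  # pin training to 12 CPU threads, as in the test runs

model = nn.Linear(16, 4)  # placeholder standing in for the real model
optimizer = torch.optim.SGD(model.parameters(), lr=0.05)  # learning rate 0.05
loss_fn = nn.MSELoss()

# Toy data standing in for the real training corpus
inputs = torch.randn(256, 16)
targets = torch.randn(256, 4)

for epoch in range(2):  # 2 epochs, matching the description
    for i in range(0, len(inputs), 32):  # simple mini-batching
        batch_x, batch_y = inputs[i:i + 32], targets[i:i + 32]
        optimizer.zero_grad()
        loss = loss_fn(model(batch_x), batch_y)
        loss.backward()
        optimizer.step()
    print(f"epoch {epoch}: loss = {loss.item():.4f}")
```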

Currently, the models produce nonsensical outputs because they were trained on small, basic datasets and have not yet been properly tokenized or configured for inference. Our primary goal was to test the training process and ensure it functioned as expected. We've recently started training with a real dataset based on Global Health Statistics, and our next step is to move on to the RedPajama 1T dataset for more comprehensive testing.

When comparing model sizes, we observed significant differences between a traditional LLM and our QELM. For example, training on a small, fabricated dataset with a traditional LLM yielded a model size of approximately 125 MB, whereas the same dataset trained with QELM resulted in a model size of just 238 KB. Despite this compact size, both models had similar training times and produced equally nonsensical outputs, highlighting QELM's early stage of development. However, the QELM responded instantly without any CPU throttling, whereas the LLM throttled slightly.
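
A quick back-of-the-envelope check on those reported sizes (assuming decimal units, i.e. 1 MB = 1000 KB):

```python
# Compression ratio implied by the reported checkpoint sizes
traditional_llm_kb = 125 * 1000  # ~125 MB expressed in KB
qelm_kb = 238                    # ~238 KB

print(f"{traditional_llm_kb / qelm_kb:.0f}x smaller")  # roughly 525x smaller
```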

This project is still in its infancy, and as it's being handled primarily by me, with assistance from a friend, we haven't reached the benchmarking phase yet. However, we hope to conduct performance benchmarks soon. We are also working on a dedicated website for the project.

Additionally, the 135-character input limitation issue has been resolved. If you need further details or have any questions, feel free to reach out. Thank you!

Successfully merging this pull request may close these issues: [Submission]: Qelm

5 participants