add solar pro support #9541

mxyng · 2024-09-18T22:38:06Z

solar pro introduces block skip connections where blocks are connected to other, non-sequential blocks with a scale multiple

this change adds 4 new keys to store the skip connections and one new tensor to store the scalar. the scalar is implemented as a 1-dimensional tensor with 2 elements derived from the model's bskcn_tv configuration. in general, the values are (bskcn_tv, 1 - bskcn_tv)
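As a rough illustration of the scale pair described above, the sketch below builds (bskcn_tv, 1 - bskcn_tv) and uses it to blend a saved block input with the current one. The helper names, the plain-list "tensors", and the value 0.5 are all hypothetical; weighting the current input by the second element is an assumption, not something the description states.

```python
def bskcn_scale(bskcn_tv):
    """Return the two-element scale (bskcn_tv, 1 - bskcn_tv)."""
    return (bskcn_tv, 1.0 - bskcn_tv)


def blend(skipped, current, scale):
    """Weighted sum of a saved block input and the current input.

    Assumption: the saved input is weighted by scale[0] and the
    current input by scale[1].
    """
    return [s * scale[0] + c * scale[1] for s, c in zip(skipped, current)]


scale = bskcn_scale(0.5)                      # illustrative value only
mixed = blend([2.0, 4.0], [0.0, 8.0], scale)
print(scale, mixed)
```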

slaren · 2024-09-20T00:08:14Z

src/llama.cpp

@@ -2538,6 +2565,14 @@ struct llama_hparams {
            return ssm_d_state * ssm_d_inner;
        }
    }
+
+    bool n_bskcn(uint32_t n, uint32_t il = 0) const {


The n_ prefix implies that this returns an integer, but it returns a boolean.

SteelPh0enix · 2024-09-25T17:46:41Z

is this PR active and maintained?
it'd be nice to see this merged

vignesh1507

I agree with the changes.

compilade · 2024-10-06T20:13:20Z

convert_hf_to_gguf.py

+    def prepare_tensors(self):
+        if bskcn_tv := self.find_hparam(['bskcn_tv'], optional=True):
+          # use bskcn_tv[1] for inference since bskcn_tv[0] is for training
+          self.gguf_writer.add_tensor(self.format_tensor_name(gguf.MODEL_TENSOR.BSKCN_TV), np.array([bskcn_tv[1], 1 - bskcn_tv[1]], dtype=np.float32))
+
+        super().prepare_tensors()


I think this should override generate_extra_tensors instead of prepare_tensors. Otherwise LoRA conversion will not work properly, at least since #9396.
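A minimal sketch of the shape this suggestion could take, assuming generate_extra_tensors is a generator of (name, tensor) pairs. BaseModelSketch is a hypothetical stand-in for the converter's base class, the tensor name "bskcn_tv" stands in for self.format_tensor_name(gguf.MODEL_TENSOR.BSKCN_TV), and a plain list stands in for the real tensor type.

```python
class BaseModelSketch:
    """Hypothetical stand-in for the converter's base model class."""

    def __init__(self, hparams):
        self.hparams = hparams

    def find_hparam(self, keys, optional=False):
        # Return the first matching hyperparameter, or None when optional.
        for key in keys:
            if key in self.hparams:
                return self.hparams[key]
        if optional:
            return None
        raise KeyError(f"could not find any of: {keys}")

    def generate_extra_tensors(self):
        # The base class contributes no extra tensors in this sketch.
        return ()


class SolarSketch(BaseModelSketch):
    def generate_extra_tensors(self):
        if bskcn_tv := self.find_hparam(["bskcn_tv"], optional=True):
            # use bskcn_tv[1] for inference since bskcn_tv[0] is for training
            tv = bskcn_tv[1]
            # "bskcn_tv" is a placeholder name; a list replaces the tensor.
            yield ("bskcn_tv", [tv, 1 - tv])
        yield from super().generate_extra_tensors()


# Made-up hyperparameter values, for illustration only.
extra = list(SolarSketch({"bskcn_tv": [0.9, 0.8]}).generate_extra_tensors())
print(extra)
```

With this shape, prepare_tensors stays untouched and the extra tensor is produced through the same path as any other generated tensor.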

compilade · 2024-10-06T20:19:40Z

src/llama.cpp

+            if (hparams.n_bskcn(2, il)) {
+                inpSA = ggml_add(
+                   ctx0,
+                   ggml_mul(ctx0, bskcn_1, ggml_view_1d(ctx0, model.layers[il].bskcn_tv, 1, 0)),


bskcn_1 is not necessarily initialized here, because a model file could be crafted to make hparams.n_bskcn(2, il) return true while making hparams.n_bskcn(1, il) always return false.
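One way to guard against this is to validate the skip-connection masks when the model is loaded. The sketch below is Python rather than the C++ of src/llama.cpp, and it is only an illustration of the check: first_unsafe_layer, save_mask, and use_mask are hypothetical names, and it assumes that within a single layer the save step runs before the use step.

```python
def first_unsafe_layer(save_mask, use_mask):
    """Return the first layer that reads the skip tensor before any
    layer has saved it, or None if every read is preceded by a save.

    save_mask[il] is non-zero when layer il stores the skip tensor;
    use_mask[il] is non-zero when layer il consumes it.
    Assumption: within one layer, the save happens before the use.
    """
    saved = False
    for il, (save, use) in enumerate(zip(save_mask, use_mask)):
        if save:
            saved = True
        if use and not saved:
            return il
    return None


# A well-formed pair of masks: layer 1 saves, layer 3 consumes.
ok = first_unsafe_layer([0, 1, 0, 0], [0, 0, 0, 1])
# A crafted pair: layer 2 consumes, but nothing ever saves.
bad = first_unsafe_layer([0, 0, 0, 0], [0, 0, 1, 0])
print(ok, bad)
```

A loader could reject the file when the result is not None, so that the graph never dereferences an unset tensor.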

compilade · 2024-10-06T20:30:44Z

convert_hf_to_gguf.py

+        for i, bskcn in enumerate(self.hparams[k] for k in self.hparams.keys() if k.startswith("bskcn_") and k != 'bskcn_tv'):
+            # store the skip connections as a layer index where a non-zero value indicates a skip connection
+            # this approach simplifies lookup at inference time
+            self.gguf_writer.add_block_skip_connection(i, [1 if n in bskcn else 0 for n in range(self.block_count)])


This assumes bskcn_{n} are in the correct order in config.json. Why not instead iterate them by their names?
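A sketch of iterating by name instead, assuming the keys follow the pattern bskcn_<integer> and that their numeric suffix defines the intended order. ordered_skip_connections and layer_mask are hypothetical helpers, and the hyperparameter values in the example are made up.

```python
import re


def ordered_skip_connections(hparams):
    """Return the bskcn_<n> values sorted by n, ignoring bskcn_tv."""
    pattern = re.compile(r"bskcn_(\d+)")
    found = []
    for key, value in hparams.items():
        match = pattern.fullmatch(key)
        if match:  # bskcn_tv has no numeric suffix, so it is skipped
            found.append((int(match.group(1)), value))
    return [value for _, value in sorted(found, key=lambda pair: pair[0])]


def layer_mask(layers, block_count):
    """Per-layer mask where a non-zero entry marks a skip connection."""
    return [1 if n in layers else 0 for n in range(block_count)]


# Keys deliberately out of order, as they might appear in config.json.
hparams = {"bskcn_2": [3], "bskcn_tv": [0.9, 0.8], "bskcn_1": [1]}
ordered = ordered_skip_connections(hparams)
masks = [layer_mask(layers, 4) for layers in ordered]
print(ordered, masks)
```

Sorting on the parsed integer rather than the raw string also keeps keys such as bskcn_10 after bskcn_2.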

Nexesenex · 2024-10-13T17:50:54Z

@mxyng Is this PR still on?

github-actions bot added the python (python script changes) label Sep 18, 2024

slaren reviewed Sep 20, 2024

vignesh1507 approved these changes Oct 6, 2024

compilade reviewed Oct 6, 2024

brankoradovanovic-mcom mentioned this pull request Oct 12, 2024

Upstage Solar Pro Preview model is not supported nomic-ai/gpt4all#2960

Open

mxyng closed this by deleting the head repository Dec 2, 2024
