jinaai/jina-reranker-v1-*-en does not work with optimum #362

Open
2 of 4 tasks
rawsh opened this issue Sep 13, 2024 · 1 comment

Comments

@rawsh
Contributor

rawsh commented Sep 13, 2024

System Info

py3.10
infinity-emb 0.0.55

INFO     2024-09-13 15:19:59,927 datasets INFO: PyTorch version 2.4.0 available.                                                            config.py:59
INFO:     Started server process [76898]
INFO:     Waiting for application startup.
INFO     2024-09-13 15:20:01,042 infinity_emb INFO: model=`jinaai/jina-reranker-v1-tiny-en` selected, using engine=`optimum` and device=`cpu`    select_model.py:62
INFO     2024-09-13 15:20:01,393 infinity_emb INFO: Found 7 onnx files: [PosixPath('onnx/model.onnx'), PosixPath('onnx/model_bnb4.onnx'), PosixPath('onnx/model_fp16.onnx'), PosixPath('onnx/model_int8.onnx'), PosixPath('onnx/model_q4.onnx'), PosixPath('onnx/model_quantized.onnx'), PosixPath('onnx/model_uint8.onnx')]    utils_optimum.py:217
INFO     2024-09-13 15:20:01,401 infinity_emb INFO: Using onnx/model_quantized.onnx as the model    utils_optimum.py:221
INFO     2024-09-13 15:20:01,412 infinity_emb INFO: Optimized model found at /Users/robert/.cache/huggingface/hub/infinity_onnx/CPUExecutionProvider/jinaai/jina-reranker-v1-tiny-en/model_quantized_optimized.onnx, skipping optimization    utils_optimum.py:120
The ONNX file model_quantized_optimized.onnx is not a regular name used in optimum.onnxruntime, the ORTModel might not behave as expected.
ERROR:    Traceback (most recent call last):
  File "/Users/robert/Library/Caches/pypoetry/virtualenvs/genai-toolbox-JUYepP8o-py3.10/lib/python3.10/site-packages/starlette/routing.py", line 693, in lifespan
    async with self.lifespan_context(app) as maybe_state:
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/contextlib.py", line 199, in __aenter__
    return await anext(self.gen)
  File "/Users/robert/Library/Caches/pypoetry/virtualenvs/genai-toolbox-JUYepP8o-py3.10/lib/python3.10/site-packages/infinity_emb/infinity_server.py", line 63, in lifespan
    app.engine_array = AsyncEngineArray.from_args(engine_args_list)  # type: ignore
  File "/Users/robert/Library/Caches/pypoetry/virtualenvs/genai-toolbox-JUYepP8o-py3.10/lib/python3.10/site-packages/infinity_emb/engine.py", line 259, in from_args
    return cls(engines=tuple(engines))
  File "/Users/robert/Library/Caches/pypoetry/virtualenvs/genai-toolbox-JUYepP8o-py3.10/lib/python3.10/site-packages/infinity_emb/engine.py", line 67, in from_args
    engine = cls(**engine_args.to_dict(), _show_deprecation_warning=False)
  File "/Users/robert/Library/Caches/pypoetry/virtualenvs/genai-toolbox-JUYepP8o-py3.10/lib/python3.10/site-packages/infinity_emb/engine.py", line 53, in __init__
    self._model, self._min_inference_t, self._max_inference_t = select_model(
  File "/Users/robert/Library/Caches/pypoetry/virtualenvs/genai-toolbox-JUYepP8o-py3.10/lib/python3.10/site-packages/infinity_emb/inference/select_model.py", line 76, in select_model
    loaded_engine.warmup(batch_size=engine_args.batch_size, n_tokens=1)
  File "/Users/robert/Library/Caches/pypoetry/virtualenvs/genai-toolbox-JUYepP8o-py3.10/lib/python3.10/site-packages/infinity_emb/transformer/abstract.py", line 86, in warmup
    return run_warmup(self, inp)
  File "/Users/robert/Library/Caches/pypoetry/virtualenvs/genai-toolbox-JUYepP8o-py3.10/lib/python3.10/site-packages/infinity_emb/transformer/abstract.py", line 180, in run_warmup
    model.encode_post(embed)
  File "/Users/robert/Library/Caches/pypoetry/virtualenvs/genai-toolbox-JUYepP8o-py3.10/lib/python3.10/site-packages/infinity_emb/transformer/quantization/interface.py", line 141, in wrapper
    embeddings = func(self, *args, **kwargs)
  File "/Users/robert/Library/Caches/pypoetry/virtualenvs/genai-toolbox-JUYepP8o-py3.10/lib/python3.10/site-packages/infinity_emb/transformer/embedder/optimum.py", line 105, in encode_post
    return normalize(embedding).astype(np.float32)
  File "/Users/robert/Library/Caches/pypoetry/virtualenvs/genai-toolbox-JUYepP8o-py3.10/lib/python3.10/site-packages/infinity_emb/transformer/utils_optimum.py", line 47, in normalize
    norm = np.linalg.norm(input_array, ord=p, axis=dim, keepdims=True)
  File "/Users/robert/Library/Caches/pypoetry/virtualenvs/genai-toolbox-JUYepP8o-py3.10/lib/python3.10/site-packages/numpy/linalg/linalg.py", line 2583, in norm
    return sqrt(add.reduce(s, axis=axis, keepdims=keepdims))
numpy.exceptions.AxisError: axis 1 is out of bounds for array of dimension 1

ERROR:    Application startup failed. Exiting.
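
The last frames of the traceback can be reproduced in isolation. The sketch below is not infinity's code: `l2_normalize` is a hypothetical stand-in that only mirrors the `np.linalg.norm(input_array, ord=p, axis=dim, keepdims=True)` call shown above, and the input shapes are assumptions chosen to show that a 2-D `(batch, hidden)` array normalizes fine while a 1-D array has no axis 1.

```python
import numpy as np


def l2_normalize(input_array, p=2, dim=1, eps=1e-12):
    """Hypothetical stand-in mirroring the norm call from the traceback."""
    norm = np.linalg.norm(input_array, ord=p, axis=dim, keepdims=True)
    return input_array / np.maximum(norm, eps)


# Assumed shapes, for illustration only.
batch_2d = np.ones((2, 4), dtype=np.float32)  # (batch, hidden): axis 1 exists
flat_1d = np.ones(4, dtype=np.float32)        # 1-D: there is no axis 1

normalized = l2_normalize(batch_2d)

try:
    l2_normalize(flat_1d)
    raised = False
except ValueError as err:  # numpy's AxisError subclasses ValueError
    raised = True
    print(type(err).__name__, err)
```

An embedder-style post-processing step expects a 2-D array, so an output that arrives as 1-D fails at this point regardless of the values it contains.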

Information

  • Docker
  • The CLI directly via pip

Tasks

  • An officially supported command
  • My own modifications

Reproduction

infinity_emb v2 --model-id jinaai/jina-reranker-v1-tiny-en --device cpu --engine optimum

Expected behavior

The ONNX (optimum) engine loads the jinaai/jina-reranker-v1-*-en models and the server starts without errors.

@wirthual
Collaborator

wirthual commented Oct 31, 2024

Hi @rawsh,
This seems to be the same issue as #325.

I managed to get this model running by setting the architecture type to JinaBertForSequenceClassification and adding the parameter "num_labels": 1.

Here is the local config I used:

{
  "_name_or_path": "/home/wirthual/development/jina-reranker-v1-tiny-en/",
  "architectures": [
    "JinaBertForSequenceClassification"
  ],
  "num_labels": 1,
  "attention_probs_dropout_prob": 0.1,
  "attn_implementation": null,
  "auto_map": {
    "AutoConfig": "configuration_bert.JinaBertConfig",
    "AutoModel": "modeling_bert.JinaBertModel",
    "AutoModelForMaskedLM": "modeling_bert.JinaBertForMaskedLM",
    "AutoModelForQuestionAnswering": "modeling_bert.JinaBertForQuestionAnswering",
    "AutoModelForSequenceClassification": "modeling_bert.JinaBertForSequenceClassification",
    "AutoModelForTokenClassification": "modeling_bert.JinaBertForTokenClassification"
  },
  "classifier_dropout": null,
  "emb_pooler": "mean",
  "feed_forward_type": "geglu",
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 384,
  "initializer_range": 0.02,
  "intermediate_size": 1536,
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 8192,
  "model_type": "bert",
  "num_attention_heads": 12,
  "num_hidden_layers": 4,
  "pad_token_id": 0,
  "position_embedding_type": "alibi",
  "torch_dtype": "float16",
  "transformers_version": "4.44.2",
  "type_vocab_size": 2,
  "use_cache": true,
  "vocab_size": 61056
}

With this config, the model was correctly detected as a rerank model.
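
For a local copy of the model, the same two changes can be applied with a small script. This is a minimal sketch, not part of infinity or the model repository: `patch_reranker_config` is a hypothetical helper, and the demo runs against a throwaway config.json whose starting "architectures" value is a placeholder rather than the real upstream value.

```python
import json
import tempfile
from pathlib import Path


def patch_reranker_config(config_path):
    """Hypothetical helper: apply the two config changes described above."""
    path = Path(config_path)
    config = json.loads(path.read_text())
    # Mark the model as a sequence classifier with a single output label.
    config["architectures"] = ["JinaBertForSequenceClassification"]
    config["num_labels"] = 1
    path.write_text(json.dumps(config, indent=2) + "\n")
    return config


# Demo on a temporary file; the initial contents are placeholders.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "config.json"
    demo_path.write_text(
        json.dumps({"architectures": ["PlaceholderArchitecture"], "model_type": "bert"})
    )
    patched = patch_reranker_config(demo_path)
    reloaded = json.loads(demo_path.read_text())
```

All other keys are left untouched, so the rest of the config stays as published.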

I created a PR for it so you can try it out using the --revision flag:
infinity_emb v2 --model-id jinaai/jina-reranker-v1-tiny-en --device cpu --engine optimum --revision refs/pr/9
