Xinference

Xorbits Inference (Xinference) empowers you to unleash the full potential of cutting-edge AI models.

Install

pip install "xinference[all]"
Alternatively, install Xinference with Docker.

To start a local instance of Xinference, run the following command:

$ xinference-local --host 0.0.0.0 --port 9997

Launch Xinference

Decide which LLM you want to deploy (see the Xinference list of supported LLMs), for example mistral. Run the following command to launch the model, replacing ${quantization} with your chosen quantization method:

$ xinference launch -u mistral --model-name mistral-v0.1 --size-in-billions 7 --model-format pytorch --quantization ${quantization}
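If you launch models from a script rather than by hand, the command above can be assembled programmatically. The following is a minimal sketch, not part of Xinference itself: the helper name `build_launch_command` is hypothetical, the flags are copied from the command above, and the quantization value is left for you to supply.

```python
import shlex


def build_launch_command(quantization, model_uid="mistral",
                         model_name="mistral-v0.1", size_in_billions=7,
                         model_format="pytorch"):
    """Return the `xinference launch` command shown above as an argument list.

    Hypothetical helper: it only mirrors the flags from the documented
    command and does not validate the quantization value.
    """
    return [
        "xinference", "launch",
        "-u", model_uid,
        "--model-name", model_name,
        "--size-in-billions", str(size_in_billions),
        "--model-format", model_format,
        "--quantization", quantization,
    ]


if __name__ == "__main__":
    # Replace the placeholder with the quantization method you chose.
    cmd = build_launch_command("<your-quantization>")
    # Print the command for review; to execute it, pass `cmd` to
    # subprocess.run(cmd, check=True) on a machine where Xinference is installed.
    print(shlex.join(cmd))
```

Building an argument list instead of a single string avoids shell quoting problems if a model name or quantization value ever contains special characters.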

Use Xinference in RAGFlow

Go to 'Settings > Model Providers > Models to be added > Xinference'.

Base URL: Enter the base URL where the Xinference service is accessible, for example http://<your-xinference-endpoint-domain>:9997/v1.

Use Xinference Models.
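Before entering the base URL in RAGFlow, it can help to confirm that the endpoint answers. The sketch below builds the URL in the form shown above and queries it. It assumes the service exposes an OpenAI-compatible `GET /v1/models` route; the function names `xinference_base_url` and `list_models` are hypothetical and not part of Xinference or RAGFlow.

```python
import json
import urllib.request


def xinference_base_url(host, port=9997, scheme="http"):
    """Build a base URL of the form http://<host>:9997/v1."""
    return f"{scheme}://{host}:{port}/v1"


def list_models(base_url, timeout=5.0):
    """Fetch {base_url}/models and return the parsed JSON response.

    Assumption: the server offers an OpenAI-compatible model listing
    at this path. Raises urllib.error.URLError if it is unreachable.
    """
    with urllib.request.urlopen(f"{base_url}/models", timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Replace the placeholder host with your own Xinference endpoint,
    # then call list_models(base_url) to check that it responds.
    base_url = xinference_base_url("<your-xinference-endpoint-domain>")
    print(base_url)
```

If `list_models` raises a connection error, check that the server was started with a host setting reachable from the machine running RAGFlow.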
