Feature request
Saw a blog post where together.ai is advertising 3x inference performance via their API; I'm sure there are some optimization techniques they're using that this repo could benefit from: https://www.together.ai/blog/together-inference-engine-v1
Motivation
Faster inference!
Your contribution
Happy to help if there is overlap with my skillset
They are using Medusa, which is just a different model.
It's going to get support very soon (#1308), but making it fast will require creating those Medusa models, and there are very few open source ones currently (although we hope people will add more, since the speedup is quite significant, even more than advertised here).
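For context, the core idea of Medusa is to train a few small extra heads on top of the base model's last hidden state, each predicting a token one further step ahead, and then verify those candidates in a single forward pass. A minimal sketch of what such heads look like (class names, block structure, and the SiLU residual block are illustrative, not the actual Medusa code):

```python
import torch.nn as nn

class ResBlock(nn.Module):
    """One residual feed-forward block, used inside each Medusa head."""
    def __init__(self, hidden_size):
        super().__init__()
        self.linear = nn.Linear(hidden_size, hidden_size)
        self.act = nn.SiLU()

    def forward(self, x):
        return x + self.act(self.linear(x))

class MedusaHeads(nn.Module):
    """n_heads extra LM heads; head i predicts the token at offset i + 1
    beyond the base model's own next-token prediction."""
    def __init__(self, hidden_size, vocab_size, n_heads=4):
        super().__init__()
        self.heads = nn.ModuleList(
            nn.Sequential(
                ResBlock(hidden_size),
                nn.Linear(hidden_size, vocab_size, bias=False),
            )
            for _ in range(n_heads)
        )

    def forward(self, last_hidden_state):
        # Each head produces logits from the same hidden state, giving
        # candidate tokens for several future positions at once.
        return [head(last_hidden_state) for head in self.heads]
```

Only these heads need to be trained (the base model stays frozen), which is why each architecture needs its own Medusa checkpoint before the speedup applies.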
The PR will also add support for regular speculative decoding, which should give you a significant speedup on any model without needing any modifications.
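For anyone unfamiliar with regular speculative decoding, here is a minimal greedy sketch, assuming a small drafter and a large target model that share a tokenizer (the model pair, helper name, and `k` are illustrative; this is not the PR's actual implementation):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")                   # shared tokenizer
draft = AutoModelForCausalLM.from_pretrained("gpt2").eval()         # small, fast drafter
target = AutoModelForCausalLM.from_pretrained("gpt2-large").eval()  # model being served

@torch.no_grad()
def speculative_step(input_ids, k=4):
    """Draft k tokens with the small model, verify them in one big-model pass."""
    # 1) Draft: greedily propose k candidate tokens with the small model.
    draft_ids = input_ids
    for _ in range(k):
        logits = draft(draft_ids).logits[:, -1, :]
        draft_ids = torch.cat([draft_ids, logits.argmax(-1, keepdim=True)], dim=-1)

    # 2) Verify: a single target forward pass over prompt + drafted tokens.
    logits = target(draft_ids).logits
    prompt_len = input_ids.shape[1]
    # Target's greedy choice at each drafted position, plus one bonus position.
    verify = logits[:, prompt_len - 1 :, :].argmax(-1)   # shape [1, k + 1]
    drafted = draft_ids[:, prompt_len:]                  # shape [1, k]

    # 3) Keep the longest prefix where draft and target agree, then append
    #    the target's own token at the first mismatch (or the bonus token).
    n_accept = 0
    while n_accept < k and verify[0, n_accept] == drafted[0, n_accept]:
        n_accept += 1
    return torch.cat(
        [draft_ids[:, : prompt_len + n_accept], verify[:, n_accept : n_accept + 1]],
        dim=-1,
    )

ids = tokenizer("Speculative decoding works by", return_tensors="pt").input_ids
for _ in range(8):                 # each step emits between 1 and k + 1 tokens
    ids = speculative_step(ids)
print(tokenizer.decode(ids[0]))
```

The output matches plain greedy decoding of the target model exactly; the speedup comes from the target verifying several drafted tokens per forward pass instead of producing one token at a time.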
@nikshepsvn it seems together-inference-engine-v1 is not an open source project? We can only host our models on their cloud but cannot deploy the engine locally.