Roadmap

Functionality

  • Explore tree sparsity
  • Fine-tune Medusa heads together with LM head from scratch
  • Distill from any model without access to the original training data
  • Batched inference
  • Fine-grained KV cache management

Integration

Local Deployment

Serving

Research

  • Optimize the tree-based attention to reduce additional computation
  • Improve the acceptance scheme to generate more diverse sequences