Sukima is a ready-to-deploy container that implements a REST API for Language Models, designed for easy deployment and scalability.
- models : Fetch a list of ready-to-use Language Models for inference.
- load : Allocate a Language Model.
- generate : Use a Language Model to generate tokens.
- classify : Use a Language Model to classify tokens and retrieve scores.
For more information on API usage, see the /docs endpoint.
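As a rough illustration of how a client might talk to the endpoints listed above, here is a minimal Python sketch using only the standard library. The base URL, the exact route paths, the HTTP methods, and the payload fields (`model`, `prompt`) are assumptions made for illustration and are not taken from Sukima itself; check the /docs endpoint of a running instance for the real routes and request schemas.

```python
import json
from urllib import request

# Assumption: the container is reachable locally on port 8000.
BASE_URL = "http://localhost:8000"


def build_request(endpoint, payload=None, base_url=BASE_URL):
    """Build a request for one of the endpoints (models, load, generate, classify).

    Assumption: requests without a payload are sent as GET, and requests
    with a payload are sent as POST with a JSON body.
    """
    url = f"{base_url.rstrip('/')}/{endpoint.lstrip('/')}"
    if payload is None:
        return request.Request(url, method="GET")
    body = json.dumps(payload).encode("utf-8")
    return request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(req, timeout=30):
    """Send a prepared request and decode the JSON response."""
    with request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Hypothetical requests; the field names below are placeholders.
list_models = build_request("models")
load_model = build_request("load", {"model": "example-model"})
generate = build_request("generate", {"model": "example-model", "prompt": "Hello"})
```

With a running instance, `send(list_models)` would fetch the model list, and `send(load_model)` followed by `send(generate)` would allocate a model and request tokens from it, provided the assumed routes and fields match the deployed API.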
- Autoscaling
- HTTPS Support
- Rate Limiting
- Support for other Language Modeling tasks such as Sentiment Analysis and Named Entity Recognition.