The SuperSONIC project implements common server infrastructure for GPU inference-as-a-service to accelerate machine learning algorithms at large high-energy physics (HEP) and multi-messenger astrophysics (MMA) experiments. The server infrastructure is designed for deployment on Kubernetes clusters equipped with GPUs.
The main components of SuperSONIC are:
- NVIDIA Triton inference servers
- Dynamic multi-purpose Envoy Proxy:
  - Load balancing
  - GPU saturation prevention
  - Token-based authentication (optional)
- Load-based autoscaling via KEDA
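
As a rough illustration of what load-based autoscaling involves: KEDA feeds an external metric to a Kubernetes Horizontal Pod Autoscaler, which, per the Kubernetes documentation, targets `ceil(currentReplicas * currentMetric / targetMetric)` replicas, ignores deviations within a small tolerance, and clamps the result to the configured bounds. The Python sketch below mirrors only that rule. It is not SuperSONIC code; the function name, the choice of metric, and the default bounds are hypothetical and chosen for illustration.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Sketch of the HPA-style scaling rule (hypothetical helper).

    current_metric and target_metric stand for any load signal, e.g. an
    assumed per-server request queue latency; the units only need to match.
    """
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    if current_replicas < 1:
        raise ValueError("current_replicas must be at least 1")

    ratio = current_metric / target_metric

    # Within the tolerance band, keep the current replica count.
    if abs(ratio - 1.0) <= tolerance:
        raw = current_replicas
    else:
        raw = math.ceil(current_replicas * ratio)

    # Clamp to the configured bounds.
    return max(min_replicas, min(max_replicas, raw))


if __name__ == "__main__":
    # Two servers at three times the target load.
    print(desired_replicas(2, current_metric=300, target_metric=100))
```

With these assumed numbers, two servers observing three times the target load would be scaled out to six, while a load within 10% of the target leaves the replica count unchanged.
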

| Cluster | CMS | ATLAS | IceCube |
|---|---|---|---|
| Geddes cluster (Purdue) | ✅ | - | - |
| Nautilus cluster (NRP) | ✅ | ⏳ | ✅ |