    Repositories list

    • GPTQModel

      Public
      Production-ready LLM compression/quantization toolkit with accelerated inference support for both CPU and GPU via HF, vLLM, and SGLang.
      Python
      Apache License 2.0
      3014294 Updated Dec 11, 2024
    • Self-contained Python library with zero dependencies that gives you unified device properties for GPU, CPU, and NPU. No more calling separate tools such as nvidia-smi or reading /proc/cpuinfo and parsing the output yourself.
      Python
      Apache License 2.0
      0903 Updated Dec 10, 2024