This part of the HairX image generation software sets up and runs a Stable Diffusion installation in a RunPod serverless environment. Once set up, it exposes a public Stable Diffusion API that we can call to generate images with our self-hosted Stable Diffusion environment, which is based on the Auto1111 SD WebUI.
This setup requires two main components:
- A RunPod serverless endpoint: Runs the Docker image and exposes the API.
- A Network Volume: Stores models and other necessary files.
This repository provides:
- All data needed to build the Docker image for the serverless worker
- Instructions for installing data on the Network Volume
Key points:
- The Network Volume is set up once, with models and data installed. It persists between worker runs and incurs ongoing storage costs.
- The RunPod serverless endpoint is configured once but spawns workers on demand:
  - When needed, a worker pulls the Docker image and starts a container.
  - When idle, the worker terminates and the container stops.
To optimize performance:
- The Docker image is kept relatively small
- Stable Diffusion Web UI and models are stored on the Network Volume
- This approach trades faster worker startup for a slight delay when reading files over the network connection between the RunPod worker and the Network Volume
- Overall, this method is faster than using a very large Docker image
1.1 Log in to https://www.runpod.io/
1.2 Ensure your account has a balance of at least $20-30.
1.3 Create a new, empty network volume (50+ GB).
1.4 Deploy a temporary pod to install data on the network volume. Select a lightweight template such as "RunPod Pytorch 2.1" and choose your network volume.
1.5 For the pod, select "Connect to JupyterLab [8888]".
1.6 Open a terminal.
1.7 In the current folder, install the data onto the volume by running the install script:
    wget https://raw.githubusercontent.com/dubtor/hairx-runpod-worker-a1111/main/scripts/install.sh
    chmod +x install.sh
    ./install.sh
    NOTE: this script is part of this repository. Installation takes around 30 minutes. New models need to be added to the script.
1.8 Wait until the terminal output ends with "Model loaded in XXXXs".
1.9 Press Ctrl+C and close the terminal.
1.10 Terminate the pod. The network volume is now ready.
- Build the Docker image from this repository (only if anything has changed):
    docker build -t dubtor/hairx-runpod-worker-a1111:3.x.x .
- Push the image to Docker Hub (only if anything has changed):
    docker push dubtor/hairx-runpod-worker-a1111:3.x.x
- In RunPod, create a new template using the image:
  - Select "serverless".
  - 5 GB of storage is enough.
  - Docker container image (public on Docker Hub): dubtor/hairx-runpod-worker-a1111:3.x.x
- Select the template you created.
- Select the Docker image you previously created.
- Write down the serverless endpoint ID and URL. Example endpoint ID: "p1on5b85l3dlqu".
- You can now send requests to the endpoint. Open the Postman collection in the repo, enter your serverless endpoint ID, and test the endpoints.
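As a rough illustration of how the endpoint ID maps to request URLs, here is a minimal Python sketch. The base URL `https://api.runpod.ai/v2` reflects RunPod's public serverless API as commonly documented and is an assumption here; check the URL shown for your endpoint in the RunPod console. The helper name `endpoint_urls` is hypothetical and not part of this repository.

```python
# Assumption: RunPod serverless endpoints are served under this base URL.
RUNPOD_API_BASE = "https://api.runpod.ai/v2"


def endpoint_urls(endpoint_id: str) -> dict:
    """Build the request URLs for one serverless endpoint (hypothetical helper)."""
    base = f"{RUNPOD_API_BASE}/{endpoint_id}"
    return {
        "run": f"{base}/run",            # asynchronous submission
        "runsync": f"{base}/runsync",    # synchronous submission
        # The two entries below are templates; fill in the job ID with .format().
        "status": f"{base}/status/{{job_id}}",
        "cancel": f"{base}/cancel/{{job_id}}",
    }


urls = endpoint_urls("p1on5b85l3dlqu")  # example ID from the step above
print(urls["runsync"])  # https://api.runpod.ai/v2/p1on5b85l3dlqu/runsync
```

The `status` and `cancel` entries are templates, so a job ID is filled in with, for example, `urls["status"].format(job_id="...")`.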
This is the source code for a RunPod Serverless worker that uses the Automatic1111 Stable Diffusion API for inference.
Important: The A1111 1.9.0 API format has changed dramatically and is not backwards compatible. If you require backwards compatibility, check out the 2.5.0 release of this worker and make sure that you are using A1111 1.8.0, not version 1.9.0.
The model(s) for inference will be loaded from a RunPod Network Volume.
This worker includes the following A1111 extensions:
- Install Automatic1111 Web UI on your Network Volume
- Building the Docker image
- Deploying on RunPod Serverless
- Frequently Asked Questions
You can send requests to your RunPod API Endpoint using the /run or /runsync endpoints.
Requests sent to the /run endpoint are handled asynchronously and are non-blocking operations. Your first response status will always be IN_QUEUE. You need to send subsequent requests to the /status endpoint to get further status updates, and eventually the COMPLETED status will be returned if your request is successful.
Requests sent to the /runsync endpoint are handled synchronously and are blocking operations. If they are processed by a worker within 90 seconds, the result is returned in the response. If the processing time exceeds 90 seconds, you need to request status updates from the /status endpoint until you receive the COMPLETED status, which indicates that your request was successful.
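The asynchronous flow described above (submit to /run, then poll /status) can be sketched in Python roughly as follows. This is a minimal sketch using only the standard library, not the client used by this project. It assumes the job is wrapped as `{"input": ...}`, that the API key is sent as a Bearer token, and that `base_url` is the full endpoint URL up to and including the endpoint ID. The names `http_json` and `run_and_wait` are hypothetical.

```python
import json
import time
import urllib.request


def http_json(method, url, api_key, body=None):
    """Send one JSON request and decode the JSON response."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(
        url,
        data=data,
        method=method,
        headers={
            "Authorization": f"Bearer {api_key}",  # assumption: Bearer token auth
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req, timeout=120) as resp:
        return json.loads(resp.read().decode("utf-8"))


def run_and_wait(base_url, api_key, job_input, poll_seconds=2.0,
                 max_polls=300, request=http_json, sleep=time.sleep):
    """Submit a job to /run, then poll /status until it leaves the queue.

    `request` and `sleep` are injectable so the loop can be tested offline.
    """
    job = request("POST", f"{base_url}/run", api_key, {"input": job_input})
    for _ in range(max_polls):
        # IN_QUEUE and IN_PROGRESS are the only non-final statuses.
        if job.get("status") not in ("IN_QUEUE", "IN_PROGRESS"):
            return job
        sleep(poll_seconds)
        job = request("GET", f"{base_url}/status/{job['id']}", api_key)
    raise TimeoutError(f"job {job.get('id')} did not finish after {max_polls} polls")
```

The same polling loop also covers a /runsync request that exceeds the 90-second window, since the job ID from that response can be passed to /status in the same way.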
- Get ControlNet Models
- Get Embeddings
- Get Extensions
- Get Face Restorers
- Get Hypernetworks
- Get Loras
- Get Latent Upscale Modes
- Get Memory
- Get Models
- Get Options
- Get Prompt Styles
- Get Real-ESRGAN Models
- Get Samplers
- Get Schedulers
- Get Script Info
- Get Scripts
- Get Upscalers
- Get VAE
- Image to Image
- Image to Image with ControlNet
- Interrogate
- Refresh Checkpoints
- Refresh Embeddings
- Refresh Loras
- Refresh VAE
- Set Model
- Set VAE
- Text to Image
- Text to Image with ReActor
- Text to Image with ADetailer
- Text to Image with InstantID
You can optionally Enable a Webhook.
Status | Description |
---|---|
IN_QUEUE | Request is in the queue waiting to be picked up by a worker. You can call the /status endpoint to check for status updates. |
IN_PROGRESS | Request is currently being processed by a worker. You can call the /status endpoint to check for status updates. |
FAILED | The request failed, most likely due to encountering an error. |
CANCELLED | The request was cancelled. This usually happens when you call the /cancel endpoint to cancel the request. |
TIMED_OUT | The request timed out. This usually happens when your handler throws some kind of exception that does not return a valid response. |
COMPLETED | The request completed successfully and the output is available in the output field of the response. |
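A client can act on these statuses along the lines of the following Python sketch. The status names and the `output` field come from the table above; the `error` field read in the failure case is an assumption about the response shape, and `JobFailed` and `job_output` are hypothetical names.

```python
PENDING = {"IN_QUEUE", "IN_PROGRESS"}
FAILURES = {"FAILED", "CANCELLED", "TIMED_OUT"}


class JobFailed(RuntimeError):
    """Raised when a job ends in a non-successful final status."""


def job_output(response: dict):
    """Return the output of a completed job, or None while it is still pending."""
    status = response.get("status")
    if status == "COMPLETED":
        return response.get("output")
    if status in PENDING:
        return None  # caller should check /status again later
    if status in FAILURES:
        # Assumption: failed responses may carry details in an "error" field.
        raise JobFailed(f"job ended with status {status}: {response.get('error')}")
    raise ValueError(f"unexpected status: {status!r}")
```

Raising on the three failure statuses keeps them from being mistaken for an empty result.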
The serverless handler (rp_handler.py) is a Python script that handles the API requests to your Endpoint using the runpod Python library. It defines a function handler(event) that takes an API request (event), runs the inference using the model(s) from your Network Volume with the input, and returns the output in the JSON response.
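To make the shape of such a handler concrete, here is a simplified Python sketch. It is not the actual rp_handler.py. The local A1111 address, the `endpoint` and `payload` input fields, and the helper `call_a1111` are assumptions for illustration only. The one call taken from the runpod library is `runpod.serverless.start`, which registers the handler function.

```python
import json
import urllib.request

# Assumption: the A1111 API listens locally inside the container at this address.
A1111_URL = "http://127.0.0.1:3000"


def call_a1111(path, payload):
    """POST a JSON payload to the local A1111 API (hypothetical helper)."""
    req = urllib.request.Request(
        f"{A1111_URL}{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=600) as resp:
        return json.loads(resp.read().decode("utf-8"))


def handler(event, forward=call_a1111):
    """Take the request's input, run inference, and return the output."""
    job_input = event.get("input") or {}
    path = job_input.get("endpoint")        # hypothetical field name
    payload = job_input.get("payload", {})  # hypothetical field name
    if not path:
        return {"error": "missing 'endpoint' in input"}
    try:
        return forward(path, payload)
    except Exception as exc:  # report the failure instead of crashing the worker
        return {"error": str(exc)}


def main():
    # Imported here so the sketch can be read and tested without the package.
    import runpod
    runpod.serverless.start({"handler": handler})

# In a real worker, main() would be called at the bottom of the script.
```

The `forward` parameter exists only so the handler can be exercised without a running A1111 instance; the runpod library calls the handler with the event alone.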
- Postman Collection for this Worker
- Generative Labs YouTube Tutorials
- Getting Started With RunPod Serverless
- Serverless | Create a Custom Basic API
Pull requests and issues on GitHub are welcome. Bug fixes and new features are encouraged.