Ankit Gupta edited this page Apr 17, 2018 · 35 revisions

ServiceQ

ServiceQ is an HTTP load balancer and request queue. All incoming HTTP/HTTPS requests are load balanced across multiple endpoints based on an intuitive algorithm and are buffered in case of network failures or API errors. The buffering assures clients that their requests will be executed irrespective of the present state of the system.

Configuration


All configuration is handled by the file sq.properties. The permanent location of this file is /opt/serviceq/config/sq.properties. There are 4 mandatory properties - LISTENER_PORT, PROTO, ENDPOINTS and CONCURRENCY_PEAK. The rest are optional or can be left at their defaults. The default LISTENER_PORT is 5252 and the default PROTO is http. Every property goes on a separate line and can be commented out with a # prefix.

LISTENER_PORT=5252

PROTO=http (same for http and https)
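The file format described above (one KEY=VALUE pair per line, # for comments) can be read with a few lines of code. The following is a minimal sketch in Python, not ServiceQ's own loader. The function names parse_properties and missing_mandatory are hypothetical and exist only for illustration.

```python
# Hypothetical helper names; this is an illustration, not ServiceQ's loader.
MANDATORY = ("LISTENER_PORT", "PROTO", "ENDPOINTS", "CONCURRENCY_PEAK")


def parse_properties(text):
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a KEY=VALUE line; ignored in this sketch
        props[key.strip()] = value.strip()
    return props


def missing_mandatory(props):
    """Return the mandatory property names that are absent."""
    return [key for key in MANDATORY if key not in props]


sample = """\
# Lines starting with '#' are comments.
LISTENER_PORT=5252
PROTO=http
ENDPOINTS=https://api.server0.com:8000,https://api.server1.com:8001
# CONCURRENCY_PEAK=2048
"""
props = parse_properties(sample)
print(props["LISTENER_PORT"])    # 5252
print(missing_mandatory(props))  # ['CONCURRENCY_PEAK']
```

Values stay as strings here. A real loader would presumably convert LISTENER_PORT and CONCURRENCY_PEAK to integers.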

Service Endpoints


The group of upstream services (one or many) can be deployed on a set of servers/ports. They are added as a comma-separated list to ENDPOINTS. Make sure every endpoint includes a scheme; the port is optional. If the port is not provided, ServiceQ assumes that http and https endpoints are running on 80 and 443 respectively. Although it is not recommended, the endpoints list can contain a mix of services running on both http and https schemes.

ENDPOINTS=https://api.server2.com,https://api.server0.com:8000,https://api.server1.com:8001
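To make the default-port rule concrete, here is a small Python sketch that splits an ENDPOINTS value and fills in 80 or 443 when the port is omitted. The function name parse_endpoints and the tuple layout are assumptions made for this example. ServiceQ's internal representation may differ.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def parse_endpoints(value):
    """Split a comma-separated ENDPOINTS value into (scheme, host, port)."""
    endpoints = []
    for item in value.split(","):
        parts = urlsplit(item.strip())
        if parts.scheme not in DEFAULT_PORTS or not parts.hostname:
            raise ValueError(f"endpoint needs an http or https scheme: {item!r}")
        # Fall back to the scheme's default port when none is given.
        port = parts.port if parts.port is not None else DEFAULT_PORTS[parts.scheme]
        endpoints.append((parts.scheme, parts.hostname, port))
    return endpoints


for endpoint in parse_endpoints(
    "https://api.server2.com,https://api.server0.com:8000"
):
    print(endpoint)
# ('https', 'api.server2.com', 443)
# ('https', 'api.server0.com', 8000)
```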

Concurrency Peak


ServiceQ can handle and distribute concurrent requests well, and is limited only by the load handling capabilities of the upstream services. It is therefore advisable to load test the services and find the maximum load the cluster can take. Determining this at cluster level is important because the bottleneck is usually a central database, message queue or throttled third-party service that is accessed from all services. Both the active connections and the buffered queue are governed by this limit.

CONCURRENCY_PEAK=2048

Deferred Requests


If n out of n nodes are down, requests are queued up. They are forwarded in FIFO order as soon as any one node becomes available again. Although the system places no restriction, unless asked to, on the kind of requests that can be queued up and forwarded, it is important to note the implications. ServiceQ sends 503 Service Unavailable to the client if all nodes are down. Deferring a request is therefore desirable when the request uses an HTTP method that changes the state of the system and the client's workflow does not depend on the response. For example, a fire-and-forget PUT request sent while all services were down will still update the destination system, albeit at a later point in time. On the other hand, if the client has exited after firing a GET request and ServiceQ fetches the response on next availability, the result of the GET is lost and the request is pure overhead to the system. This should be avoided. The user controls whether queueing is enabled and which kinds of requests are considered for queueing (for example, we might want only POST/PUT/PATCH/DELETE on specific routes to be buffered).

Enable/Disable deferred queue

ENABLE_DEFERRED_Q=true

Format of the requests to enable the deferred queue for. The first token should always be the method, followed by an optional route and an optional ! (add the ! suffix to disable queueing for a single method+route combination). This option only works if ENABLE_DEFERRED_Q=true. A few examples -

DEFERRED_Q_REQUEST_FORMATS=ALL (buffer all)
DEFERRED_Q_REQUEST_FORMATS=POST,PUT,PATCH,DELETE (buffer these methods)
DEFERRED_Q_REQUEST_FORMATS=POST /orders !,POST,PUT /orders,PATCH,DELETE (buffer POST except POST /orders, buffer PUT only for /orders, buffer PATCH and DELETE)
DEFERRED_Q_REQUEST_FORMATS=POST /orders,PATCH (buffer POST /orders, PATCH)
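The examples above can be read as a small rule table. The sketch below, in Python, shows one possible interpretation. It assumes that a route-specific entry takes precedence over a method-wide entry, and that routes are compared by exact match. Neither detail is confirmed by this page, and the names parse_formats and should_buffer are hypothetical.

```python
def parse_formats(value):
    """Turn a DEFERRED_Q_REQUEST_FORMATS value into (method, route, enabled)."""
    rules = []
    for entry in value.split(","):
        tokens = entry.split()
        enabled = True
        if tokens and tokens[-1] == "!":
            enabled = False      # trailing '!' disables this combination
            tokens = tokens[:-1]
        if not tokens:
            continue
        method = tokens[0].upper()
        route = tokens[1] if len(tokens) > 1 else None
        rules.append((method, route, enabled))
    return rules


def should_buffer(rules, method, path):
    """Decide whether a request is eligible for the deferred queue."""
    method = method.upper()
    # Assumption: a route-specific rule wins over a method-wide rule.
    for rule_method, route, enabled in rules:
        if rule_method == method and route == path:
            return enabled
    for rule_method, route, enabled in rules:
        if rule_method in (method, "ALL") and route is None:
            return enabled
    return False


rules = parse_formats("POST /orders !,POST,PUT /orders,PATCH,DELETE")
print(should_buffer(rules, "POST", "/orders"))  # False
print(should_buffer(rules, "POST", "/users"))   # True
print(should_buffer(rules, "PUT", "/orders"))   # True
print(should_buffer(rules, "PUT", "/users"))    # False
print(should_buffer(rules, "GET", "/orders"))   # False
```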

Cluster State and Behaviour


n-node cluster healthy, all nodes up

Active connections are forwarded to one of the nodes in the cluster. The node is chosen by the routing algorithm, which is described later in the documentation. The maximum number of active connections is governed by the CONCURRENCY_PEAK setting.

n-node cluster unhealthy, [1:n-1] nodes down

The process is the same as above, except that the error rate, which is stored in a hashset against the service address, increases.

n-node cluster unhealthy, n nodes down

The error rate shoots to 100%, and requests are buffered in a FIFO queue. Buffered requests remain in the queue until at least one service is available again. If active connections are being accepted at the same time, they are forwarded concurrently with the buffered requests. There is no precedence logic here.
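The buffering behaviour can be pictured as a bounded FIFO queue that is drained once a node is reachable again. The Python sketch below is illustrative only. The class name DeferredQueue is hypothetical, and rejecting new requests once the limit is reached is an assumption, since this page does not say what happens when the queue is full.

```python
from collections import deque


class DeferredQueue:
    """Bounded FIFO buffer for requests that could not be forwarded."""

    def __init__(self, limit):
        self.limit = limit  # e.g. the CONCURRENCY_PEAK value
        self._items = deque()

    def defer(self, request):
        """Buffer a request; return False if the queue is full (assumption)."""
        if len(self._items) >= self.limit:
            return False
        self._items.append(request)
        return True

    def drain(self, forward):
        """Forward buffered requests oldest first; stop at the first failure."""
        sent = 0
        while self._items:
            try:
                forward(self._items[0])
            except ConnectionError:
                break  # still down; keep the remaining requests queued
            self._items.popleft()
            sent += 1
        return sent


queue = DeferredQueue(limit=2)
accepted = [queue.defer(r) for r in ("PUT /orders/1", "PUT /orders/2", "PUT /orders/3")]
print(accepted)                        # [True, True, False]

delivered = []
print(queue.drain(delivered.append))   # 2
print(delivered)                       # ['PUT /orders/1', 'PUT /orders/2']
```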

Routing Algorithm


ServiceQ uses a combination of randomized and round robin approaches to routing. The algorithm first selects a random node and tries to forward the request. If this request fails, the rest of the nodes are selected in a round robin manner for a total of 2*n+1 retries, after which an error is raised and the request is queued, if eligible. This, in my opinion, gives a fair chance to all nodes and distributes load within 10% deviation (if we use 'number of servers' as the only metric).
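The selection order described above can be sketched in Python as follows. This is a simplified illustration, not ServiceQ's implementation. It treats 2*n+1 as the total number of attempts, which is one reading of the text, and the names route_request, only_c_up and always_down are hypothetical.

```python
import random


def route_request(nodes, send, rng=random):
    """Try a random node first, then continue round robin.

    Returns the response of the first node that answers, or None when
    every attempt failed (the caller may then queue the request).
    """
    n = len(nodes)
    if n == 0:
        return None
    start = rng.randrange(n)      # random first choice
    max_attempts = 2 * n + 1      # assumption: total attempt budget
    for attempt in range(max_attempts):
        node = nodes[(start + attempt) % n]  # round robin from the start node
        try:
            return send(node)
        except ConnectionError:
            continue
    return None


def only_c_up(node):
    if node != "c":
        raise ConnectionError(node)
    return "ok from " + node


def always_down(node):
    raise ConnectionError(node)


print(route_request(["a", "b", "c"], only_c_up))    # ok from c
print(route_request(["a", "b", "c"], always_down))  # None
```

Because the walk visits consecutive nodes, any healthy node is reached within the first n attempts regardless of the random starting point.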

Client Responses


If the upstream and the client are both alive, ServiceQ simply tunnels the response from upstream to client. In case of failures, specific responses are provided. For example -

Upstream Connected    - Tunneled Response
All nodes are down    - 503 Service Unavailable
Request Timed Out     - 504 Gateway Timeout
Request Malformed     - 400 Bad Request
Undetermined Error    - 502 Bad Gateway
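The failure rows of the table can be expressed as a simple lookup. In the Python sketch below, the outcome keys such as all_nodes_down are made-up labels for illustration and do not come from ServiceQ. Any outcome not listed falls back to 502 Bad Gateway, mirroring the last row.

```python
from http import HTTPStatus

# Hypothetical outcome labels mapped to the statuses listed in the table.
FAILURE_STATUS = {
    "all_nodes_down": HTTPStatus.SERVICE_UNAVAILABLE,
    "request_timed_out": HTTPStatus.GATEWAY_TIMEOUT,
    "request_malformed": HTTPStatus.BAD_REQUEST,
}


def status_line(outcome):
    """Return 'code phrase' for a failure outcome; unknown ones map to 502."""
    status = FAILURE_STATUS.get(outcome, HTTPStatus.BAD_GATEWAY)
    return f"{status.value} {status.phrase}"


print(status_line("all_nodes_down"))     # 503 Service Unavailable
print(status_line("request_timed_out"))  # 504 Gateway Timeout
print(status_line("something_else"))     # 502 Bad Gateway
```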