Proportional HTTP request rejection based on system load.
This library provides options to loadshed either a service by using a http middleware or calls to a dependent service using http transport.
This package exports a middleware via the middleware.New()
method that returns
a func(http.Handler) http.Handler
which should be compatible with virtually
any mux implementation or middleware chain management tool. By default, the
generated middleware is a passthrough. Load shedding based on system metrics
is enabled by passing to the constructor a load shedder with the options.
This package also exports a transport via the transport.New()
method that
returns a func(http.RoundTripper) http.RoundTripper
which should be compatible
with any RoundTripper management tool. By default, the generated RoundTripper is
a pass through. Load shedding based on system metrics is enabled by passing to
the constructor a load shedder with the options.
Both the middleware and transport have middleware and transport options you can pass in for callbacks and error codes apart from loadshed options. They have been incorporated in the examples below.
The CPU
option enables rejection of new requests based on CPU usage of the
host. The example below is for a middleware, with a callback option and cpu
loadshedder:
import (
loadshedmiddleware "github.com/asecurityteam/loadshed/wrappers/middleware"
)
var lowerThreshold = 0.6
var upperThreshold = 0.8
var pollingInterval = time.Second
var windowSize = 10
var middleware = loadshedmiddleware.New(
loadshed.New(
loadshed.CPU(lowerThreshold, upperThreshold, pollingInterval, windowSize)),
loadshedmiddleware.Callback(
http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
}),
)
The above configures the load shedding middleware to record a 10 second, rolling
window of CPU usage data. As long as the average CPU usage within the window
is below the lowerThreshold
value then all new requests pass through to the
wrapped handler. Once the rolling window exceed the lowerThreshold
then the
middleware will begin rejecting requests with a 503
response at a rate
proportional to the distance of the average between the two thresholds. Once the
value exceed the upper threshold then all new requests are rejected until it
lowers again.
The Concurrency
option enables rejections of new requests when there are too
many requests currently in flight.
import (
loadshedmiddleware "github.com/asecurityteam/loadshed/wrappers/middleware"
)
var lowerThreshold = 2500
var upperThreshold = 5000
var wg = loadshed.NewWaitGroup()
var middleware = loadshedmiddleware.New(
loadshed.New(loadshed.Concurrency(lowerThreshold, upperThreshold, wg)),
)
The above configures the load shedding middleware to track in-flight requests being handled by the server. The middleware will begin rejecting a proportional number of new requests between the lower and upper thresholds like the CPU option above.
For convenience, this package exposes a wrapper around the
sync.WaitGroup
feature in the standard library that wraps it in an interface
compatible with the metric aggregation system used by the middleware. The
loadshed.WaitGroup.Add()
method will be called on every new request and the
corresponding Done()
call as each request completes. This is intended to
act as a drop-in replacement for graceful shutdown uses of sync.WaitGroup
.
The AverageLatency
option enables rejection of new requests when the average
latency of all requests within a rolling time window is too high.
import (
loadshedmiddleware "github.com/asecurityteam/loadshed/wrappers/middleware"
)
var lowerThreshold = .2
var upperThreshold = 1.0
var bucketSize = time.Second
var buckets = 10
var preallocationHint = 2000
var middleware = loadshedmiddleare.NewMiddleware(
loadshed.New(
loadshed.AverageLatency(lowerThreshold, upperThreshold, bucketSize, buckets, preallocationHint, requiredPoints)),
loadshedmiddleware.Callback(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
..
}),
)
The above configures the load shedding middleware to track the duration of
handling requests. It records the information into bucketSize
segments of
time and keeps a rolling window of buckets
number of segments. The above,
for example, keeps a 10 second rolling window with a granularity of 1 second
intervals.
The upper and lower thresholds are the time, in fractional seconds, that it takes to execute the wrapped handler. As the average latency of all requests in the window grows beyond the lower threshold then the middleware will begin rejecting new requests. If the latency exceeds the upper threshold then all new requests will be rejected until the average drops again. This will happen over time either as outliers expire or until the entire window has rolled.
The requiredPoints
value sets the minimum number of data points recorded in
the window before the filter takes effect. This is to help ensure that a
sufficient number of data points are collected to satisfy the aggregate before
a service begins denying new requests.
The preallocationHint
is an optional optimisation for the internals of the
rolling window. It should be set to the projected number of data points that
will be contained within each bucket of the window. For example, the above
service expects to receive approximately 2,000 requests per second. This value
is only an optimisation and can be left as 0
if the projected rate is not
known.
The PercentileLatency
option works exactly the same as the AverageLatency
option except that it is based on a rolling percentile calculation rather than
an average.
import (
loadshedmiddleware "github.com/asecurityteam/loadshed/wrappers/middleware"
)
var lowerThreshold = .2
var upperThreshold = 1.0
var bucketSize = time.Second
var buckets = 10
var preallocationHint = 2000
var percentile = 95.0
var middleware = loadshedmiddleware.New(
loadshed.New(
loadshed.PercentileLatency(lowerThreshold, upperThreshold, bucketSize, buckets, preallocationHint, requiredPoints, percentile)),
)
The ErrorRate
option enables rejection of new requests when the error
rate of all requests within a rolling time window is too high. This requires
the transport/middleware errorCodes option to pass in the error codes which
are to be accounted for in the error rate.
import (
loadshedtransport "github.com/asecurityteam/loadshed/wrappers/transport"
)
var lowerThreshold = .2
var upperThreshold = 1.0
var bucketSize = time.Second
var buckets = 10
var preallocationHint = 2000
var transport = loadshedtransport.New(
loadshed.New(
loadshed.ErrorRate(lowerThreshold, upperThreshold, bucketSize, buckets, preallocationHint, requiredPoints)),
loadshedtransport.Callback(func(r *http.Request) {}),
loadshedtransport.ErrorCodes([]int{400, 404, 500, 501, 502, 503}),
)
The Aggregator enables injection of custom metrics that are not already included in this package. The option relies on the Aggregator interface provided by github.com/asecurityteam/rolling and the given aggregator must return a value that is a percentage of requests to reject between 0.0 and 1.0.
// Inject a random amount of chaos when the system is not under load.
type chaosAggregator struct {
amount float64
}
func (a *chaosAggregator) Aggregate() float64 {
return a.amount
}
// use it with middleware
var middleware = loadshed.New(
loadshed.New(
loadshed.Aggregator(&chaosAggregator{.01}))
)
// or use it with a Transport
var transport = loadshedtransport.New(
loadshed.New(
loadshed.Aggregator(&chaosAggregator{.01})),
)
Pull requests, issues and comments welcome. For pull requests:
- Add tests for new features and bug fixes
- Follow the existing style
- Separate unrelated changes into multiple pull requests
See the existing issues for things to start contributing.
For bigger changes, make sure you start a discussion first by creating an issue and explaining the intended change.
Atlassian requires contributors to sign a Contributor License Agreement, known as a CLA. This serves as a record stating that the contributor is entitled to contribute the code/documentation/translation to the project and is willing to have it used in distributions and derivative works (or is willing to transfer ownership).
Prior to accepting your contributions we ask that you please follow the appropriate link below to digitally sign the CLA. The Corporate CLA is for those who are contributing as a member of an organization and the individual CLA is for those contributing as an individual.
Copyright (c) 2017 Atlassian and others. Apache 2.0 licensed, see LICENSE.txt file.