Adaptive Backpressure Tuning
This guide provides a production-focused tuning workflow for defaults.adaptive_backpressure.
Goals:
- stabilize ingress tail latency (p95/p99) before hard queue saturation
- reduce hard queue_full rejects (503) without over-throttling ingress
Prerequisites
Enable metrics and collect:
- hookaido_queue_total
- hookaido_queue_ready_lag_seconds
- hookaido_queue_oldest_queued_age_seconds
- hookaido_ingress_adaptive_backpressure_applied_total
- hookaido_ingress_adaptive_backpressure_total{reason}
- hookaido_ingress_rejected_by_reason_total{reason,status}
Also capture ingress request latency percentiles from your HTTP telemetry (p95, p99).
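If your HTTP telemetry exports raw request durations rather than precomputed percentiles, p95 and p99 can be derived per observation window. The following is a minimal sketch using the nearest-rank method; the function name and the sample data are hypothetical and not part of Hookaido.

```python
import math


def nearest_rank_percentile(samples, pct):
    """Return the pct-th percentile (0 < pct <= 100) using the nearest-rank method."""
    if not samples:
        raise ValueError("no latency samples in this window")
    ordered = sorted(samples)
    # 1-based rank of the smallest sample that covers pct percent of the data.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


# Hypothetical request durations (milliseconds) for one observation window.
latencies_ms = list(range(1, 101))
p95 = nearest_rank_percentile(latencies_ms, 95)
p99 = nearest_rank_percentile(latencies_ms, 99)
```

Most metrics backends compute percentiles from histograms instead, so treat this only as a way to sanity-check exported values against raw samples.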
v1.5 Decision (2026-02-14)
Project decision for v1.5:
- Runtime default remains defaults.adaptive_backpressure.enabled off.
- This is intentional: adaptive pressure is a workload/SLO policy lever, not a universal safe default.
- Recommended opt-in starting profile for enterprise mixed workloads:
defaults {
  adaptive_backpressure {
    enabled on
    min_total 400
    queued_percent 88
    ready_lag 45s
    oldest_queued_age 90s
    sustained_growth on
  }
}
Rationale:
- Mixed A/B saturation runs show adaptive mode can shift rejects away from hard queue_full and improve lag/age behavior.
- The same runs can also shift tail latency/throughput, so rollout must be SLO-driven.
- Single-host benchmark hardware is suitable for relative same-host A/B comparisons, but not as a universal default reference across environments.
Starting Profiles
Pick one profile and adjust from there.
| Profile | Use when | min_total | queued_percent | ready_lag | oldest_queued_age | sustained_growth |
|---|---|---|---|---|---|---|
| balanced | default for most teams | 200 | 80 | 30s | 60s | on |
| latency_first | strict p95/p99 protection | 100 | 72 | 15s | 30s | on |
| throughput_first | fewer early rejects, more queue elasticity | 400 | 88 | 45s | 90s | on |
Example (latency_first):
defaults {
  adaptive_backpressure {
    enabled on
    min_total 100
    queued_percent 72
    ready_lag 15s
    oldest_queued_age 30s
    sustained_growth on
  }
}
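If you keep the starting profiles in version control and generate the config block from them, a small helper along these lines may be convenient. This is a sketch only: PROFILES and render_profile are hypothetical names, the values are copied from the table above, and the two-space indentation is an assumption about formatting rather than a requirement.

```python
# Values copied from the Starting Profiles table; names are illustrative only.
PROFILES = {
    "balanced": {"min_total": 200, "queued_percent": 80,
                 "ready_lag": "30s", "oldest_queued_age": "60s"},
    "latency_first": {"min_total": 100, "queued_percent": 72,
                      "ready_lag": "15s", "oldest_queued_age": "30s"},
    "throughput_first": {"min_total": 400, "queued_percent": 88,
                         "ready_lag": "45s", "oldest_queued_age": "90s"},
}


def render_profile(name):
    """Render one starting profile as a defaults.adaptive_backpressure block."""
    p = PROFILES[name]
    lines = [
        "defaults {",
        "  adaptive_backpressure {",
        "    enabled on",
        f"    min_total {p['min_total']}",
        f"    queued_percent {p['queued_percent']}",
        f"    ready_lag {p['ready_lag']}",
        f"    oldest_queued_age {p['oldest_queued_age']}",
        "    sustained_growth on",  # all three profiles keep this on
        "  }",
        "}",
    ]
    return "\n".join(lines)


rendered = render_profile("latency_first")
```

Generating the block from one table keeps later one-parameter adjustments easy to diff and review.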
Tuning Loop
Run each step with production-like traffic for at least 15-30 minutes.
- Set one starting profile.
- Observe latency (p95/p99), adaptive rejects, and hard queue rejects.
- Change one parameter at a time.
- Keep the change only if tail latency improves without unacceptable throughput loss.
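The keep-or-revert step of the loop above can be made explicit so that every iteration is judged the same way. The sketch below assumes you summarize each observation window into a p99 latency and an accepted-request rate; WindowStats, keep_change, and the 5% default loss budget are hypothetical and should be replaced by your own SLO terms.

```python
from dataclasses import dataclass


@dataclass
class WindowStats:
    p99_ms: float          # ingress p99 latency over the observation window
    accepted_per_s: float  # accepted ingress requests per second


def keep_change(baseline, candidate, max_throughput_loss=0.05):
    """Keep a one-parameter change only if p99 improved within the loss budget."""
    latency_improved = candidate.p99_ms < baseline.p99_ms
    if baseline.accepted_per_s <= 0:
        # No baseline throughput to compare against; decide on latency alone.
        return latency_improved
    loss = (baseline.accepted_per_s - candidate.accepted_per_s) / baseline.accepted_per_s
    return latency_improved and loss <= max_throughput_loss


before = WindowStats(p99_ms=800.0, accepted_per_s=1000.0)
after = WindowStats(p99_ms=600.0, accepted_per_s=980.0)
decision = keep_change(before, after)
```

What counts as "unacceptable throughput loss" is a policy choice, which is why it is a parameter here rather than a fixed constant.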
Decision Matrix
If p95/p99 rises and queue_full rejects increase before adaptive rejects appear:
- lower queued_percent by 3-5
- lower ready_lag by 5-10s
- lower oldest_queued_age by 10-20s
- optionally lower min_total by 50-100
If adaptive rejects are high but queue lag/age stay low:
- raise queued_percent by 3-5
- raise ready_lag by 5-10s
- raise oldest_queued_age by 10-20s
If reason="sustained_growth" dominates during normal bursts:
- keep sustained_growth on
- raise min_total gradually so short bursts do not trigger adaptive backpressure early
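The matrix above can be encoded as a lookup that proposes single-parameter candidates, which keeps it compatible with changing one parameter at a time. This is an illustrative sketch: the symptom labels, the dictionary keys (durations expressed in seconds), and the function name are hypothetical. Each step is the smallest value from the ranges above, except the +50 step for min_total under sustained growth, which is an assumption because the guide only says "gradually".

```python
def candidate_changes(profile, symptom):
    """Return (parameter, new_value) candidates; apply only one per iteration."""
    if symptom == "queue_full_before_adaptive":
        # Thresholds are too loose: adaptive pressure engages too late.
        return [
            ("queued_percent", profile["queued_percent"] - 3),
            ("ready_lag_s", profile["ready_lag_s"] - 5),
            ("oldest_queued_age_s", profile["oldest_queued_age_s"] - 10),
            ("min_total", profile["min_total"] - 50),  # optional per the matrix
        ]
    if symptom == "adaptive_high_lag_low":
        # Thresholds are too tight: rejecting while the queue is still healthy.
        return [
            ("queued_percent", profile["queued_percent"] + 3),
            ("ready_lag_s", profile["ready_lag_s"] + 5),
            ("oldest_queued_age_s", profile["oldest_queued_age_s"] + 10),
        ]
    if symptom == "sustained_growth_dominates":
        # Keep sustained_growth on; only raise the activation floor.
        return [("min_total", profile["min_total"] + 50)]  # step size is assumed
    raise ValueError(f"unknown symptom: {symptom}")


balanced = {"min_total": 200, "queued_percent": 80,
            "ready_lag_s": 30, "oldest_queued_age_s": 60}
tighten = candidate_changes(balanced, "queue_full_before_adaptive")
```

Pick one candidate from the returned list, run a full observation window, and only then consider the next.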
Guardrails
- Keep hard queue_limits.max_depth as the final safety boundary.
- Avoid large jumps; use small deltas and re-measure.
- In bursty enterprise workloads, keep sustained_growth on unless you have strong evidence to disable it.
- Treat adaptive rejects as an intentional control signal, not only as an error.
Benchmark Alignment
Use Pull benchmarks (make bench-pull*, make bench-pull-mixed*) to compare internal queue/drain optimization slices.
For reproducible saturation evidence comparing adaptive off vs on, use make adaptive-ab. It runs the mixed workload by default, with an optional pull reference, and includes final metrics/health artifacts and comparison tables.
If your host does not reach sustained pressure with defaults, use make adaptive-ab-mixed-saturation.
For mixed contention tracking (#55), include hookaido_pull_ack_conflict_total / hookaido_pull_acked_total trend checks in the same A/B slice.
Use production-like ingress load tests for threshold tuning decisions. The benchmark suite is best for relative runtime regressions/improvements, while threshold tuning depends on workload shape and SLO targets.
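For the ack conflict trend check mentioned above, the ratio can be computed from two samples of each counter taken at the start and end of an A/B slice. This is a minimal sketch; the function name and the sample values are hypothetical, and it assumes both metrics are monotonically increasing counters.

```python
def ack_conflict_ratio(conflicts_start, conflicts_end, acked_start, acked_end):
    """Conflicts per acked message over a window, or None if undefined."""
    conflicts = conflicts_end - conflicts_start
    acked = acked_end - acked_start
    if conflicts < 0 or acked <= 0:
        # Counter reset or no acks in the window: report "no data", not zero.
        return None
    return conflicts / acked


# Hypothetical samples of hookaido_pull_ack_conflict_total and
# hookaido_pull_acked_total at the start and end of one run.
ratio_off = ack_conflict_ratio(10, 30, 1000, 2000)
```

Compare the ratio between the adaptive off and on runs of the same slice rather than reading either value in isolation.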
Dashboard Compatibility
When dashboards span versions, gate adaptive panels/alerts with:
hookaido_metrics_schema_info{schema="1.3.0"} == 1
This avoids interpreting missing pre-1.3.0 series as zero.