Performance Baselines
This page defines the reproducible benchmark workflow used for #39 optimization slices across Pull and Push drain paths.
Scope
Current benchmarks cover:
- dequeue + ack (single)
- dequeue + ack (batch size 15, sustained-drain profile)
- dequeue + ack (batch size 32)
- dequeue + nack (single)
- dequeue + repeated extend on an active lease (single)
- duplicate ack/nack retry path under parallel load (contention profile)
- mixed ingress + pull drain profile with latency percentiles (p95_ms, p99_ms)
- mixed ingress + push drain saturation profile with ingress reject and delivery counts
- mixed ingress + push skewed-target saturation profile (fast + slow target) for cross-target drain fairness checks
- adaptive backpressure A/B runtime harness (off vs on) for mixed-focused saturation analysis (with optional pull reference), final metrics/health artifacts, and side-by-side comparison tables including Pull ACK conflict ratio
Each benchmark runs against both queue backends:
- memory
- sqlite
Benchmarks are implemented in:
- internal/pullapi/bench_test.go
- internal/dispatcher/bench_test.go
- scripts/adaptive-ab.sh (runtime A/B harness, non-go test)
Reproducible Runbook
Run all commands from the repository root.
- Capture a baseline before changes:
This writes ./.bench/pull-baseline.txt.
- Apply code changes.
- Capture current results:
This writes ./.bench/pull.txt.
- Compare baseline vs current:
This writes ./.bench/pull-compare.txt and prints a benchstat diff table.
Isolated Extend Check
When you want to validate only the active-lease extend path:
This uses a longer, higher-count run to reduce variance from unrelated Pull-path benchmarks.
Sustained Drain Check (Batch 15)
For an issue-#39-style pull workload (dequeue + batch ack with batch=15):
This writes:
- ./.bench/pull-drain-baseline.txt
- ./.bench/pull-drain.txt
- ./.bench/pull-drain-compare.txt
The drain profile uses a longer run (-benchtime=5s, -count=10) for lower variance.
ACK/NACK Contention Check
For high-parallel duplicate-retry pressure on Pull ack/nack:
This writes:
- ./.bench/pull-contention-baseline.txt
- ./.bench/pull-contention.txt
- ./.bench/pull-contention-compare.txt
The contention profile runs with GOMAXPROCS=4 and -cpu 1,4 to expose scaling behavior and conflict-path costs.
Mixed Ingress + Drain Tail-Latency Check
For a mixed workload (concurrent ingress writes while pull workers dequeue+ack in the background):
This writes:
- ./.bench/pull-mixed-baseline.txt
- ./.bench/pull-mixed.txt
- ./.bench/pull-mixed-compare.txt
BenchmarkMixedIngressDrain also reports custom metrics per backend:
- p95_ms
- p99_ms
- ingress_rejects
- drain_errors
Push Ingress + Drain Saturation Check
For push-mode saturation behavior (ingress while dispatcher drains a single-target route):
This writes:
- ./.bench/push-mixed-baseline.txt
- ./.bench/push-mixed.txt
- ./.bench/push-mixed-compare.txt
BenchmarkPushIngressDrainSaturation reports:
- ingress_rejects
- ingress_rejects_queue_full
- ingress_rejects_adaptive_backpressure
- ingress_rejects_memory_pressure
- ingress_rejects_other
- p95_ms
- p99_ms
- deliveries
Push Ingress + Skewed-Target Saturation Check
For push-mode cross-target fairness under saturation (single route with one fast and one slow target):
This writes:
- ./.bench/push-skewed-baseline.txt
- ./.bench/push-skewed.txt
- ./.bench/push-skewed-compare.txt
BenchmarkPushIngressDrainSkewedTargets reports:
- ingress_rejects
- ingress_rejects_queue_full
- ingress_rejects_adaptive_backpressure
- ingress_rejects_memory_pressure
- ingress_rejects_other
- p95_ms
- p99_ms
- deliveries_fast
- deliveries_slow
Adaptive Backpressure A/B Runtime Check (Issues #53/#54/#55/#56)
For explicit adaptive_backpressure.enabled=off vs on runs (same load profile):
make adaptive-ab defaults to the currently open validation scope (mixed).
Scenario-specific runs:
Guardrail checks on an existing mixed run:
make adaptive-ab-guardrail-check RUN_ROOT=.artifacts/adaptive-ab/<run-id>
make adaptive-ab-lag-guardrail-check RUN_ROOT=.artifacts/adaptive-ab/<run-id>
One-shot calibrated run + guardrail check:
make adaptive-ab-all executes:
- pull-off, pull-on (reference profile)
- mixed-off, mixed-on (remaining decision profile)
make adaptive-ab-mixed-saturation is a calibrated high-pressure profile for issue validation (#53/#54/#55/#56):
- duration: 30s per mode
- ingress workers: 256
- mixed drain workers: 8
- dequeue batch: 5
- queue max depth: 2000
Use it when baseline make adaptive-ab does not reach sustained pressure on your host.
Decision note: these runs are intended for relative same-host A/B evidence and tuning guidance. They are not a standalone basis for global default policy across heterogeneous production hardware.
Artifacts are written under:
./.artifacts/adaptive-ab/<run-id>/<scenario>-<mode>/
Each run directory includes:
- final-metrics.txt
- final-health.json
- monitor-output.log
- run-meta.json (binary hash/version, git revision, runtime profile)
- summary.env and summary.json
Comparison tables are generated as:
- ./.artifacts/adaptive-ab/<run-id>/comparison-pull.md
- ./.artifacts/adaptive-ab/<run-id>/comparison-mixed.md
- ./.artifacts/adaptive-ab/<run-id>/comparison.md
- ./.artifacts/adaptive-ab/<run-id>/guardrail-mixed.md (when guardrail target/script is used)
- ./.artifacts/adaptive-ab/<run-id>/guardrail-lag-mixed.md (when lag/age guardrail target/script is used)
The comparison table includes:
- hookaido_ingress_adaptive_backpressure_applied_total
- hookaido_ingress_rejected_by_reason_total{reason="adaptive_backpressure",status="503"}
- hookaido_ingress_rejected_by_reason_total{reason="queue_full",status="503"}
- hookaido_queue_ready_lag_seconds
- hookaido_queue_oldest_queued_age_seconds
- ingress p95_ms / p99_ms
- accepted request rate (requests/sec)
- hookaido_pull_acked_total (sum across routes)
- hookaido_pull_ack_conflict_total (sum across routes)
- hookaido_pull_nack_conflict_total (sum across routes)
- pull_ack_conflict_ratio_percent (ack_conflict / acked * 100)
Guardrail defaults for #55:
- aggregate pull_ack_conflict_ratio_percent <= 5.0
- minimum aggregate pull_acked_total >= 100 per mode (mixed-off, mixed-on)
- per-route pull_ack_conflict_ratio_percent <= 5.0 when route pull_acked_total >= 50
Guardrail defaults for #56:
- aggregate queue_ready_lag_seconds <= 30 per mode (mixed-off, mixed-on)
- aggregate queue_oldest_queued_age_seconds <= 30 per mode (mixed-off, mixed-on)
- delta (on-off) queue_ready_lag_seconds <= 10
- delta (on-off) queue_oldest_queued_age_seconds <= 10
- minimum accepted_total >= 100 per mode
Reproducibility Defaults
The Make targets enforce:
- GOMAXPROCS=1
- -cpu 1
- -count=5 and -benchtime=3s for the default pull suite
- -count=10 and -benchtime=5s for isolated extend/drain profiles
- GOMAXPROCS=4 and -cpu 1,4 for the contention profile
- GOMAXPROCS=4 and -cpu 4 for the mixed ingress+drain profile
- GOMAXPROCS=4 and -cpu 4 for the push saturation profile
- GOMAXPROCS=4 and -cpu 4 for the push skewed-target profile
- adaptive A/B harness defaults: duration=120s, ingress_workers=16, mixed_drain_workers=8, dequeue_batch=15, queue_max_depth=50000
This reduces host variance and gives stable median trends across runs.
Interpreting Results
- Focus first on sec/op deltas for the same benchmark/backend pair.
- Use B/op and allocs/op to catch regressions hidden by throughput changes.
- For SQLite, compare both single and batch paths; batch wins should show up most clearly in AckBatch32.
- For mixed profile runs, track p95_ms / p99_ms first, then check ingress_rejects and drain_errors to interpret latency shifts.
- For push saturation runs, track p95_ms / p99_ms and ingress_rejects_queue_full first, then compare deliveries.
- For push skewed-target runs, track p95_ms / p99_ms, deliveries_slow, and ingress_rejects_queue_full; improving slow-target drain without growing queue-full rejects or tail latency indicates better cross-target fairness.
- For adaptive A/B runs, first confirm adaptive_applied_total=0 in off, then compare queue_full delta and latency/rate trade-offs in on.
- For mixed A/B (#55), track pull_ack_conflict_ratio_percent alongside ingress metrics; large conflict-ratio regressions can hide behind stable ingress acceptance.
- For #55 regression acceptance, use guardrail-mixed.md as the pass/fail artifact and inspect the per-route drill-down section to localize conflict spikes.
- For lag/age regression acceptance (#56), use guardrail-lag-mixed.md and investigate sustained queue lag/age when absolute or delta thresholds fail.
- Keep policy decisions tied to workload SLOs: same-host gains do not imply cross-environment default changes.
Notes
- Benchmark artifacts are written under ./.bench/ and ignored by git.
- Adaptive A/B artifacts are written under ./.artifacts/adaptive-ab/ and ignored by git.
- bench-pull-compare uses a pinned benchstat module version in Makefile to avoid tool drift.
- For production threshold tuning of adaptive ingress pressure, use Adaptive Backpressure Tuning.