UpSkillZone

Incident postmortem · INC-2026-004

JWKS endpoint cold cache

SEV-3 · monitoring

The first request after deploy paid the full key-load cost, and verifier clients saw a one-time 2.1s tail latency. A pre-warm step has been added; the incident is in monitoring.

Started

Apr 9, 2026, 05:18 AM UTC

Resolved

Apr 9, 2026, 05:46 AM UTC

Duration

28m

Root cause

After the deploy, the first /.well-known/jwks.json request loaded the keystore from a cold cache; downstream verifier clients with strict timeouts saw a one-time 2.1s p99 latency spike.
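The cold-cache behavior described here can be illustrated with a minimal sketch. It assumes a lazily populated in-memory cache; the class name LazyJwksCache and the load_keystore callable are hypothetical stand-ins, not the service's actual code.

```python
from typing import Callable, Optional


class LazyJwksCache:
    """Loads the key set on first access and reuses it afterwards."""

    def __init__(self, load_keystore: Callable[[], dict]) -> None:
        # load_keystore is a hypothetical stand-in for the expensive keystore read.
        self._load_keystore = load_keystore
        self._jwks: Optional[dict] = None

    def get(self) -> dict:
        if self._jwks is None:
            # Cold path: whichever caller arrives first waits for the full load.
            self._jwks = self._load_keystore()
        # Warm path: every later caller gets the cached key set.
        return self._jwks
```

With this shape, the load cost lands on the first caller after a restart unless something calls get() before traffic arrives.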

Customer impact

One downstream verifier (an HR-tech integration) reported a single failed verification batch (N=1); the batch succeeded on retry.

Remediation

  • Added a JWKS pre-warm step to the deploy script, so the first request after rollout is served from a primed cache.
  • Documented the 1-second target for cold p99 in the platform SLO sheet.
  • Monitoring window held open for 30 days post-deploy.
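
A pre-warm step of the kind listed above might look like the following sketch. It is not the actual deploy script: the function names, the retry count, and the injectable fetch and clock parameters are assumptions made for illustration. Only the /.well-known/jwks.json path and the 1-second target come from this report.

```python
import json
import time
import urllib.request
from typing import Callable, Optional

COLD_P99_TARGET_S = 1.0  # the 1-second target recorded in the SLO sheet


def fetch_jwks(url: str, timeout_s: float = 5.0) -> dict:
    """Fetch and parse the JWKS document over HTTP."""
    with urllib.request.urlopen(url, timeout=timeout_s) as resp:
        return json.loads(resp.read().decode("utf-8"))


def prewarm_jwks(
    url: str,
    fetch: Callable[[str], dict] = fetch_jwks,
    attempts: int = 3,  # assumed retry count, not taken from the report
    clock: Callable[[], float] = time.monotonic,
) -> float:
    """Prime the JWKS cache, then return the latency of one follow-up request.

    Raises RuntimeError if no attempt returns a non-empty key set.
    """
    last_error: Optional[Exception] = None
    for _ in range(attempts):
        try:
            jwks = fetch(url)
            if jwks.get("keys"):
                break  # cache is primed with at least one key
            last_error = ValueError("JWKS response has no keys")
        except (OSError, ValueError) as exc:
            # URLError subclasses OSError; JSON decode errors subclass ValueError.
            last_error = exc
    else:
        raise RuntimeError(f"JWKS pre-warm failed: {last_error}")

    # Time a second request, which should now be served from the primed cache.
    started = clock()
    fetch(url)
    return clock() - started
```

A deploy script could call prewarm_jwks with the service's /.well-known/jwks.json URL after rollout and flag the deploy if the returned latency exceeds COLD_P99_TARGET_S. Passing fetch and clock as parameters keeps the step testable without a live endpoint.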