Job Twin brief — UpSkillZone AI

Job Twin 4 — Live incident response

live incident · 90 min live + 4h post-mortem · pass 65%

Scenario

A customer-facing model is hallucinating regulated content. Diagnose the failure mode, deploy a hotfix, document the post-mortem.

Live window: 90 minutes. Post-mortem due within 4 hours of the live window.

Deliverables

Diagnosis — minimal repro of the hallucination class.
Hotfix — code change plus a regression test that fails before, passes after.
Post-mortem — timeline, root cause, blast radius, follow-ups.
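To illustrate the "fails before, passes after" requirement on the hotfix, here is a minimal sketch in Python. Everything in it is hypothetical: the `REGULATED_PATTERN` regex, the canned `fake_model` output, and the two `answer_*` functions stand in for whatever the incident sandbox actually contains. The point is only the shape: one check, run against the pre-fix and post-fix code paths.

```python
import re

# Hypothetical marker for the regulated content class (here: dosage advice).
REGULATED_PATTERN = re.compile(r"\b\d+\s?mg\b", re.IGNORECASE)
REFUSAL = "I can't provide that. Please consult a licensed professional."


def fake_model(prompt: str) -> str:
    """Stand-in for the real model call; returns a canned hallucination."""
    return "Take 500 mg twice a day."


def answer_before_fix(prompt: str) -> str:
    # Pre-hotfix behaviour: model output goes straight to the customer.
    return fake_model(prompt)


def answer_after_fix(prompt: str) -> str:
    # Hotfix (assumed): block any draft that matches the regulated class.
    draft = fake_model(prompt)
    return REFUSAL if REGULATED_PATTERN.search(draft) else draft


def regression_check(answer_fn) -> bool:
    """True when no regulated content leaks for the minimal repro prompt."""
    reply = answer_fn("What dose should I take?")
    return REGULATED_PATTERN.search(reply) is None


if __name__ == "__main__":
    print(regression_check(answer_before_fix))  # False: fails before the fix
    print(regression_check(answer_after_fix))   # True: passes after the fix
```

A check like this guards a failure class only as well as its pattern describes that class, so the minimal repro from the diagnosis step is a natural source for the test prompt.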

Materials

Incident sandbox repo (repo)
Post-mortem template (doc)

Time-box

90 min live + 4h post-mortem

Server-authoritative clock. The deadline is hard; auto-save does not extend it.
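A rough sketch of how the two deadlines could be derived from a single server-side start time. It assumes the 4-hour post-mortem window begins when the live window closes, and that a submission received exactly at the deadline still counts; the brief does not spell out either point. The function names are illustrative only.

```python
from datetime import datetime, timedelta, timezone

LIVE_WINDOW = timedelta(minutes=90)
POSTMORTEM_WINDOW = timedelta(hours=4)


def deadlines(started_at: datetime) -> tuple[datetime, datetime]:
    """Return (live window end, post-mortem due) for a server start time."""
    live_end = started_at + LIVE_WINDOW
    # Assumption: the post-mortem clock starts when the live window closes.
    postmortem_due = live_end + POSTMORTEM_WINDOW
    return live_end, postmortem_due


def is_on_time(server_received_at: datetime, deadline: datetime) -> bool:
    # Only the server's receive time matters; client clocks and
    # auto-save timestamps are ignored.
    return server_received_at <= deadline


if __name__ == "__main__":
    start = datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc)
    live_end, pm_due = deadlines(start)
    print(live_end.isoformat())  # 2024-01-01T10:30:00+00:00
    print(pm_due.isoformat())    # 2024-01-01T14:30:00+00:00
```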

Submission modes

repo_url
file_upload

The first mode listed is the default on the submit screen.

Rubric

Each dimension is scored on [0.0, 1.0] in 0.05 increments. The overall score is the weighted average of the dimension scores; the pass threshold is 65%.

Dimension	Key	Weight	What it tests
Diagnosis speed	diagnosis_speed	20%	Time from incident open to confirmed root cause.
Hotfix correctness	hotfix_correctness	25%	Fix actually addresses the failure class.
Regression safety	regression_safety	20%	Regression test guards the failure class going forward.
Post-mortem quality	postmortem_quality	20%	Blameless, specific, with concrete follow-ups.
Communication	communication	15%	Status updates during the live window were clear.
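The weighted average can be worked through in a few lines of Python. This is a sketch, not the platform's scoring code: the weights and keys come from the rubric, while the function names and the choice to treat exactly 65% as a pass are assumptions. Scores are converted to whole 0.05 steps so the threshold comparison is not affected by floating-point rounding.

```python
WEIGHTS_PCT = {
    "diagnosis_speed": 20,
    "hotfix_correctness": 25,
    "regression_safety": 20,
    "postmortem_quality": 20,
    "communication": 15,
}
PASS_PCT = 65  # assumption: a score of exactly 65% passes


def to_steps(score: float) -> int:
    """Convert a score in [0.0, 1.0] to a whole number of 0.05 steps."""
    steps = round(score * 20)
    if not 0 <= steps <= 20 or abs(score * 20 - steps) > 1e-9:
        raise ValueError(f"score {score} is not a 0.05 increment in [0, 1]")
    return steps


def overall_pct(scores: dict[str, float]) -> float:
    """Weighted-average overall score, as a percentage."""
    if scores.keys() != WEIGHTS_PCT.keys():
        raise ValueError("scores must cover exactly the rubric dimensions")
    # Each term is weight (%) times steps; 20 steps make a full score.
    total = sum(WEIGHTS_PCT[k] * to_steps(scores[k]) for k in WEIGHTS_PCT)
    return total / 20


def passed(scores: dict[str, float]) -> bool:
    return overall_pct(scores) >= PASS_PCT


if __name__ == "__main__":
    scores = {
        "diagnosis_speed": 0.70,
        "hotfix_correctness": 0.80,
        "regression_safety": 0.60,
        "postmortem_quality": 0.55,
        "communication": 0.50,
    }
    print(overall_pct(scores), passed(scores))  # 64.5 False
```

In this example, raising postmortem_quality by one increment to 0.60 lifts the overall score to 65.5% and turns the result into a pass.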

Failure modes

Self-checks the learner answers before submitting. Critical checks block submission unless explicitly forced; the force flag is then surfaced to the mentor.

F1 (critical): Did your hotfix include a regression test?
F2 (reflective): Did you publish at least one status update during the live window?
F3 (reflective): Does the post-mortem name follow-ups with owners?
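The blocking rule for self-checks could be sketched as follows. The check IDs, questions, and severities are taken from the list above; the `SelfCheck` type, the `gate` function, and the shape of its result are hypothetical. It also assumes that reflective checks never block and that an unanswered check counts as "no".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SelfCheck:
    check_id: str
    question: str
    critical: bool


CHECKS = [
    SelfCheck("F1", "Did your hotfix include a regression test?", True),
    SelfCheck("F2", "Did you publish at least one status update "
                    "during the live window?", False),
    SelfCheck("F3", "Does the post-mortem name follow-ups with owners?", False),
]


def gate(answers: dict[str, bool], force: bool = False) -> dict:
    """Decide whether a submission goes through, given self-check answers."""
    failed_critical = [
        c.check_id for c in CHECKS
        if c.critical and not answers.get(c.check_id, False)
    ]
    blocked = bool(failed_critical) and not force
    return {
        "accepted": not blocked,
        "failed_critical": failed_critical,
        # Surfaced to the mentor only when force actually overrode a block.
        "force_flag_for_mentor": bool(failed_critical) and force,
    }


if __name__ == "__main__":
    answers = {"F1": False, "F2": True, "F3": True}
    print(gate(answers)["accepted"])              # False
    print(gate(answers, force=True)["accepted"])  # True
```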

Skill assertions on offer

On a passing review the mentor selects a subset of these to assert, with an asserted weight bounded by the per-skill ceiling shown below.

llm.safety.hallucination-mitigation
LLM safety — hallucination mitigation
max weight 1.00
llm.ops.incident-response
LLM ops — incident response
max weight 1.00
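A small sketch of the ceiling rule for asserted weights. The skill IDs and ceilings are the ones listed above; the `validate_assertions` function is hypothetical, and it assumes that out-of-range weights are rejected rather than silently clamped and that an asserted weight must be greater than zero.

```python
CEILINGS = {
    "llm.safety.hallucination-mitigation": 1.00,
    "llm.ops.incident-response": 1.00,
}


def validate_assertions(asserted: dict[str, float]) -> dict[str, float]:
    """Check a mentor's selected subset against the per-skill ceilings."""
    for skill, weight in asserted.items():
        if skill not in CEILINGS:
            raise ValueError(f"{skill} is not on offer for this twin")
        if not 0.0 < weight <= CEILINGS[skill]:
            raise ValueError(
                f"{skill}: weight {weight} outside (0, {CEILINGS[skill]}]"
            )
    return dict(asserted)


if __name__ == "__main__":
    # A subset is fine: the mentor need not assert every skill.
    print(validate_assertions({"llm.ops.incident-response": 0.8}))
```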

Mentor SLA

48h

From mentor claim to signoff.

Pass threshold

65%

Weighted-average overall score.

Re-attempts

The higher of the two scores flows to the credential.

Start this twin

The clock starts when you press start. Read the brief above first. You will be asked to sign in if you have not already.

Open jt-incident-4 in dashboard →