UpSkillZone AI

Job Twin brief — UpSkillZone AI

Job Twin 4 — Live incident response

live incident · 90 min live + 4h post-mortem · pass 65%

Scenario

A customer-facing model is hallucinating regulated content. Diagnose the failure mode, deploy a hotfix, document the post-mortem.

Live window: 90 minutes. Post-mortem due within 4 hours of the live window.

Deliverables

  1. Diagnosis — minimal repro of the hallucination class.
  2. Hotfix — code change plus a regression test that fails before, passes after.
  3. Post-mortem — timeline, root cause, blast radius, follow-ups.
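As an illustration of deliverable 2, here is a minimal sketch of a hotfix paired with a regression test. Everything in it is hypothetical: the failure class (a reply citing a dosage figure that the approved source text does not contain), the names guard_reply, DOSAGE and FALLBACK, and the fallback wording are assumptions made for the example, not part of the twin's materials.

```python
import re
import unittest

# Hypothetical failure class: the reply states a numeric dosage that does not
# appear in the approved source text it was supposed to be grounded in.
DOSAGE = re.compile(r"\b\d+(?:\.\d+)?\s?mg\b", re.IGNORECASE)
FALLBACK = "I can't confirm that. Please check the approved documentation."


def _normalise(match: str) -> str:
    # Treat "200 mg" and "200mg" as the same figure.
    return match.lower().replace(" ", "")


def guard_reply(reply: str, approved_source: str) -> str:
    """Hotfix sketch: swap in a fallback if the reply cites a dosage the source lacks."""
    allowed = {_normalise(m) for m in DOSAGE.findall(approved_source)}
    cited = {_normalise(m) for m in DOSAGE.findall(reply)}
    return reply if cited <= allowed else FALLBACK


class HallucinatedDosageRegression(unittest.TestCase):
    SOURCE = "Adults: 200 mg once daily."

    def test_ungrounded_dosage_is_blocked(self):
        # Minimal repro from the diagnosis step. With no guard in place the
        # reply passes through unchanged and this assertion fails.
        reply = "Take 500 mg twice daily."
        self.assertEqual(guard_reply(reply, self.SOURCE), FALLBACK)

    def test_grounded_dosage_passes_through(self):
        # The fix must not block replies that stay within the source.
        reply = "The approved dose is 200mg once daily."
        self.assertEqual(guard_reply(reply, self.SOURCE), reply)
```

Saved as a test module, this can be run with python -m unittest. The first test is the one that should fail before the hotfix and pass after it. The second guards against a fix that over-blocks.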

Materials

Time-box

90 min live + 4h post-mortem

Server-authoritative clock. The deadline is hard; auto-save does not extend it.
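A rough sketch of how the two deadlines might be derived from a server-side start time. It assumes the 4 hours run from the end of the live window and that a submission landing exactly on the deadline is still accepted; the brief confirms neither, and the function names are illustrative only.

```python
from datetime import datetime, timedelta, timezone

LIVE_WINDOW = timedelta(minutes=90)
POSTMORTEM_WINDOW = timedelta(hours=4)


def deadlines(started_at: datetime) -> tuple[datetime, datetime]:
    """Return (live_deadline, postmortem_deadline) for a server-recorded start."""
    if started_at.tzinfo is None:
        raise ValueError("started_at must be timezone-aware")
    live_deadline = started_at + LIVE_WINDOW
    # Assumption: the post-mortem window starts when the live window ends.
    return live_deadline, live_deadline + POSTMORTEM_WINDOW


def accepts(server_now: datetime, deadline: datetime) -> bool:
    # Only the server's clock is consulted; auto-save timestamps play no part.
    # Assumption: the boundary instant itself is still on time.
    return server_now <= deadline


start = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
live_deadline, postmortem_deadline = deadlines(start)
```

Under these assumptions a start at 12:00 UTC gives a live deadline of 13:30 UTC and a post-mortem deadline of 17:30 UTC.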

Submission modes

  • repo_url
  • file_upload

The first mode listed is the default on the submit screen.

Rubric

Each dimension is scored on [0.0, 1.0] in 0.05 increments. The overall score is the weighted average of the dimension scores; the pass threshold is 65%.

Dimension                                  Weight  What it tests
Diagnosis speed (diagnosis_speed)          20%     Time from incident open to confirmed root cause.
Hotfix correctness (hotfix_correctness)    25%     Fix actually addresses the failure class.
Regression safety (regression_safety)      20%     Regression test guards the failure class going forward.
Post-mortem quality (postmortem_quality)   20%     Blameless, specific, with concrete follow-ups.
Communication (communication)              15%     Status updates during the live window were clear.
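The scoring rule can be sketched as follows. The weights and the 0.05 grid come from the rubric; the function names, the use of Decimal to avoid float rounding on the grid check, and treating a score of exactly 65% as a pass are assumptions of this sketch.

```python
from decimal import Decimal

# Weights copied from the rubric; they sum to 1.00.
WEIGHTS = {
    "diagnosis_speed": Decimal("0.20"),
    "hotfix_correctness": Decimal("0.25"),
    "regression_safety": Decimal("0.20"),
    "postmortem_quality": Decimal("0.20"),
    "communication": Decimal("0.15"),
}
PASS_THRESHOLD = Decimal("0.65")
STEP = Decimal("0.05")


def overall_score(scores: dict[str, str]) -> Decimal:
    """Weighted average of per-dimension scores given as decimal strings."""
    if set(scores) != set(WEIGHTS):
        raise ValueError("scores must cover exactly the rubric dimensions")
    total = Decimal("0")
    for key, raw in scores.items():
        value = Decimal(raw)
        if not (Decimal("0") <= value <= Decimal("1")) or value % STEP != 0:
            raise ValueError(f"{key}: {raw} is not on the 0.05 grid in [0.0, 1.0]")
        total += WEIGHTS[key] * value
    return total  # the weights sum to 1, so no division is needed


def passes(scores: dict[str, str]) -> bool:
    # Assumption: a score exactly at the threshold counts as a pass.
    return overall_score(scores) >= PASS_THRESHOLD


example = {
    "diagnosis_speed": "0.70",
    "hotfix_correctness": "0.80",
    "regression_safety": "0.60",
    "postmortem_quality": "0.50",
    "communication": "0.65",
}
```

For the example scores the weighted sum is 0.14 + 0.20 + 0.12 + 0.10 + 0.0975 = 0.6575, which clears the 65% threshold even though one dimension sits at 0.50.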

Failure modes

Self-checks the learner answers before submitting. Critical checks block submission unless explicitly forced; the force flag is then surfaced to the mentor.

  • F1

    Did your hotfix include a regression test?

    critical

  • F2

    Did you publish at least one status update during the live window?

    reflective

  • F3

    Does the post-mortem name follow-ups with owners?

    reflective
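One way the gating rule above could work, as a sketch only. The SelfCheck and GateResult types and the submission_gate function are hypothetical names, and the choice to record the force flag only when it was actually needed is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SelfCheck:
    check_id: str       # e.g. "F1"
    severity: str       # "critical" or "reflective"
    answered_yes: bool


@dataclass(frozen=True)
class GateResult:
    allowed: bool
    forced: bool                      # surfaced to the mentor when True
    failed_critical: tuple[str, ...]


def submission_gate(checks: list[SelfCheck], force: bool = False) -> GateResult:
    """Block on unmet critical checks unless the learner forces the submit."""
    failed = tuple(
        c.check_id for c in checks
        if c.severity == "critical" and not c.answered_yes
    )
    if not failed:
        # Reflective checks never block, whatever the answer.
        return GateResult(allowed=True, forced=False, failed_critical=())
    # Assumption: the force flag is recorded only when it was needed.
    return GateResult(allowed=force, forced=force, failed_critical=failed)


answers = [
    SelfCheck("F1", "critical", answered_yes=False),
    SelfCheck("F2", "reflective", answered_yes=False),
    SelfCheck("F3", "reflective", answered_yes=True),
]
```

With these answers, submission_gate(answers) blocks on F1, while submission_gate(answers, force=True) lets the submission through and marks it as forced.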

Skill assertions on offer

On a passing review the mentor selects a subset of these to assert, with an asserted weight bounded by the per-skill ceiling shown below.

  • llm.safety.hallucination-mitigation

    LLM safety — hallucination mitigation

    max weight 1.00

  • llm.ops.incident-response

    LLM ops — incident response

    max weight 1.00
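The ceiling rule might be enforced along these lines. The skill ids and ceilings are taken from the list above; the function name, the requirement that an asserted weight be greater than zero, and rejecting an over-ceiling weight instead of clamping it are assumptions of this sketch.

```python
# Per-skill ceilings as listed in the brief.
CEILINGS = {
    "llm.safety.hallucination-mitigation": 1.00,
    "llm.ops.incident-response": 1.00,
}


def validate_assertions(asserted: dict[str, float]) -> dict[str, float]:
    """Check a mentor's chosen subset of skills against the per-skill ceilings."""
    for skill, weight in asserted.items():
        if skill not in CEILINGS:
            raise KeyError(f"{skill} is not on offer for this twin")
        # Assumption: a weight must be positive; the brief only states the ceiling.
        if not 0.0 < weight <= CEILINGS[skill]:
            raise ValueError(
                f"{skill}: weight {weight} is outside (0, {CEILINGS[skill]}]"
            )
    return dict(asserted)


# The mentor may assert any subset, here only one of the two skills.
accepted = validate_assertions({"llm.ops.incident-response": 0.8})
```

Rejecting is used here so that an out-of-range weight is noticed by the mentor; silently clamping would be an equally plausible design.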

Mentor SLA

48h

From mentor claim to signoff.

Pass threshold

65%

Weighted-average overall score.

Re-attempts

1

The higher of the two scores flows to the credential.

Start this twin

The clock starts when you press start. Read the brief above first. You will be asked to sign in if you have not already.

Open jt-incident-4 in dashboard →