Credential recovery infrastructure

Keep the run. Replace the credential.

Reconnect identity to the exact checkpoint, then continue the original run without repeating committed work.

RECOVERY INCIDENT RECORD

Nightly executive briefing

CASE RV-0248

EXECUTION

run_7f2

Checkpoint: 05 / prepare brief
Action key: send_exec_05
Duplicate effects: 0 observed
LIVE TRANSITION 01 / 04

IDENTITY

Grant rejected

AADSTS700082

SAME LOGICAL RUN
CREDENTIAL GENERATION 01
The provider rejects a credential, the account owner restores access, Revive advances the lease, and the original run resumes at its checkpoint.
CREDENTIAL SYSTEM
Grant rejected
Entra / Nango / Auth0
CONTINUITY LAYER
Run correlated
lease / checkpoint / action
DURABLE RUNTIME
Execution resumed
LangGraph / Temporal

What actually broke

A credential failure is not a workflow failure.

The missing record connects the broken identity to the exact work that stopped.

IDENTITY

Classify the failure

Separate a dead grant from a refreshable token or a transient provider error.

REVIVE

Bind the affected run

Join the credential lease, logical run, checkpoint and failed action.

ACCOUNT OWNER

Reauthorize the right account

Issue a short-lived recovery capability without exposing credentials to the worker.

RUNTIME

Resume the original execution

Rotate the lease, reconcile side effects and continue from the saved checkpoint.
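
The first two steps above, classifying the failure and binding the affected run, might look roughly like the sketch below. Every type and function name here (`FailureClass`, `ProviderError`, `classifyFailure`, `bindRecoveryCase`) is hypothetical and not taken from the Revive SDK. The mapping is also an assumption: an OAuth `invalid_grant` response is treated as a dead grant, HTTP 429 and 5xx as transient, and a bare 401 as a refreshable token. Sample values come from the incident record above.

```typescript
// Illustrative sketch only: hypothetical names, not the Revive SDK.
type FailureClass = "dead_grant" | "refreshable" | "transient" | "unclassified";

interface ProviderError {
  httpStatus: number;
  oauthError?: string;   // OAuth 2.0 error code, e.g. "invalid_grant"
  providerCode?: string; // provider-specific code, e.g. "AADSTS700082"
}

// Assumed mapping; tune it to the providers you actually call.
function classifyFailure(err: ProviderError): FailureClass {
  // The grant itself is unusable: the account owner must reauthorize.
  if (err.oauthError === "invalid_grant") return "dead_grant";
  // Throttling or provider-side trouble: the same credential may work later.
  if (err.httpStatus === 429 || err.httpStatus >= 500) return "transient";
  // Rejected access token without a grant error: try a refresh first.
  if (err.httpStatus === 401) return "refreshable";
  return "unclassified";
}

interface FailedAction {
  runId: string;
  checkpointId: string;
  actionKey: string;
  connectionId: string;
  credentialGeneration: number;
}

// The record that joins lease, logical run, checkpoint and failed action.
interface RecoveryCase {
  runId: string;
  checkpointId: string;
  actionKey: string;
  connectionId: string;
  rejectedGeneration: number;
  failure: FailureClass;
  providerCode: string | null;
}

// Only a dead grant opens a recovery case; other classes are left to
// ordinary refresh or retry handling.
function bindRecoveryCase(action: FailedAction, err: ProviderError): RecoveryCase | null {
  const failure = classifyFailure(err);
  if (failure !== "dead_grant") return null;
  return {
    runId: action.runId,
    checkpointId: action.checkpointId,
    actionKey: action.actionKey,
    connectionId: action.connectionId,
    rejectedGeneration: action.credentialGeneration,
    failure,
    providerCode: err.providerCode ?? null,
  };
}

const recoveryCase = bindRecoveryCase(
  {
    runId: "run_7f2",
    checkpointId: "05",
    actionKey: "send_exec_05",
    connectionId: "conn_microsoft_ops",
    credentialGeneration: 1,
  },
  { httpStatus: 400, oauthError: "invalid_grant", providerCode: "AADSTS700082" },
);
```

Returning `null` for anything other than a dead grant keeps recovery cases reserved for failures that need the account owner.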

The recovery contract

Same run ID.
New credential generation.

LOGICAL EXECUTION
PROTECTED
run_7f2
GRANT REJECTED
GENERATION 02
CHECKPOINT + ACTION KEY PRESERVED

Same logical run

Recovery advances the existing execution instead of spawning a replacement job.

Generation fencing

Workers holding the rejected credential generation cannot race the resumed run.

Process-independent checkpoint

The original worker can disappear while recovery remains actionable.

Replay evidence

Every mutating action keeps its idempotency key and reconciliation state.
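
Generation fencing and replay evidence can be illustrated together in a minimal sketch. It assumes an in-memory registry, where a real deployment would need durable storage. `LeaseRegistry` and `resumeAction` are hypothetical names, not Revive APIs. The idea is that a resumed action first proves it holds the current credential generation, then checks whether the side effect already committed before executing it again.

```typescript
// Illustrative sketch only: hypothetical names, not the Revive SDK.
interface Lease {
  connectionId: string;
  generation: number;
}

class LeaseRegistry {
  private current = new Map<string, number>();

  // Advance the generation, for example after the account owner reauthorizes.
  rotate(connectionId: string): Lease {
    const next = (this.current.get(connectionId) ?? 0) + 1;
    this.current.set(connectionId, next);
    return { connectionId, generation: next };
  }

  // A lease is valid only while it matches the latest generation.
  isCurrent(lease: Lease): boolean {
    return this.current.get(lease.connectionId) === lease.generation;
  }
}

async function resumeAction<T>(opts: {
  registry: LeaseRegistry;
  lease: Lease;
  idempotencyKey: string;
  reconcile: (key: string) => Promise<T | undefined>;
  execute: (key: string) => Promise<T>;
}): Promise<{ value: T; executed: boolean }> {
  // Fence: a worker still holding a rejected generation must not proceed.
  if (!opts.registry.isCurrent(opts.lease)) {
    throw new Error("stale credential generation");
  }
  // Reconcile: if the effect already committed, reuse it instead of replaying.
  const existing = await opts.reconcile(opts.idempotencyKey);
  if (existing !== undefined) {
    return { value: existing, executed: false };
  }
  // No evidence of a prior commit, so run the action with the same key.
  return { value: await opts.execute(opts.idempotencyKey), executed: true };
}
```

The fence here is checked in the worker, which leaves a gap between the check and the remote call. A stricter design would also have the credential system refuse leases from older generations.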

Built between systems that already do their jobs well.

CREDENTIAL CUSTODY
Microsoft Entra, Nango, Auth0

Tokens stay with the identity layer.

RECOVERY CONTRACT
Revive

Correlates identity, execution and replay evidence.

DURABLE EXECUTION
LangGraph, Temporal

The runtime handles checkpointing and scheduling.

Add recovery at the action boundary.

Use the runtime and credential system already in production. Revive records the contract between them.

TypeScript SDK
Protect one mutating action
const result = await revive.protectAction({
  // Identify the run, checkpoint, connection and action being protected.
  runId: workflow.runId,
  checkpointId: workflow.checkpointId,
  connectionId: "conn_microsoft_ops",
  actionKey: "send-briefing",
  // Lease the credential when the action runs.
  credential: () => vault.lease("conn_microsoft_ops"),
  // The mutating call, tagged with its idempotency key.
  execute: ({ credential, idempotencyKey }) =>
    graph.sendMail(message, { credential, idempotencyKey }),
  // Look up whether the send already committed.
  reconcile: ({ idempotencyKey }) =>
    graph.findMailByIdempotencyKey(idempotencyKey),
});

Clear boundaries.

How Revive fits with the infrastructure already running your workflows.

Is Revive another token vault?

No. Nango, Auth0 and provider vaults keep token custody. Revive coordinates recovery for the affected run.

Why is normal workflow retry insufficient?

Retrying with the rejected credential fails again. Blind replay can also repeat a remote side effect that already committed.

Does recovery survive a worker restart?

Yes. The recovery case and checkpoint are durable, so another worker can resume the same logical run.

Which runtimes can Revive coordinate?

The repository includes LangGraph and Temporal adapters. The contract is designed to support additional durable runtimes.

Break the credential.
Keep the execution.

Run the local fault injection and inspect every recovery transition.

Open recovery lab