How Mejepa works

Five instruments. One signed verdict.

Every AI-written Python patch passes through five stages. The first four narrow the patch down to a calibrated verdict; the fifth makes the verdict portable. Each stage is anchored to the FSV plan with the citation in the margin.

[Figure: a walnut cabinet with five drawers labeled CHUNKING, PANEL, PREDICTOR, DOCKER ORACLE, WITNESS CHAIN. Each drawer holds a small still-life appropriate to its stage.]

Stage 1 · Chunking

The patch is decomposed into chunks.

Mejepa reads the patch and the base file as bytes-on-disk. The same chunk boundaries are used at training time and at inference time, so the predictor never sees a representation the frozen instruments haven't already scored. Chunks carry a SHAKE-256 content hash that propagates into the witness chain at stage 5.
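A minimal sketch of deterministic chunking with a SHAKE-256 content hash. The line-based boundary rule, the 32-byte digest length, and the names Chunk and chunk_bytes are assumptions for illustration; the text above does not say how Mejepa actually chooses chunk boundaries.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    start: int         # byte offset of the chunk within the file
    data: bytes        # raw bytes, exactly as read from disk
    content_hash: str  # SHAKE-256 digest, hex-encoded


def chunk_bytes(raw: bytes, lines_per_chunk: int = 20) -> list[Chunk]:
    """Split raw bytes into chunks of at most `lines_per_chunk` lines.

    The line-based rule is an assumption. What matters is that the rule is
    deterministic, so the same bytes always yield the same boundaries.
    """
    lines = raw.splitlines(keepends=True)
    chunks = []
    offset = 0
    for i in range(0, len(lines), lines_per_chunk):
        data = b"".join(lines[i:i + lines_per_chunk])
        # SHAKE-256 is an extendable-output function, so the digest length
        # must be given explicitly; 32 bytes is an assumed choice.
        digest = hashlib.shake_256(data).hexdigest(32)
        chunks.append(Chunk(start=offset, data=data, content_hash=digest))
        offset += len(data)
    return chunks
```

Because the function depends only on the input bytes, calling it at training time and again at inference time on the same file gives identical chunks and identical hashes.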

Stage 2 · 15-slot panel

An array of per-embedder vectors, not a flat concat.

The panel is a fixed 15-slot array in which each slot holds a distinct per-embedder vector. Slot identity is preserved end-to-end. The 13 frozen instruments author the panel; the total of 15 includes derived/composed slots. Two of the slots are reserved for cross-panel triangulation (issue #405, blocked-P0).
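To illustrate why a slotted array differs from a flat concat, here is a small sketch of a fixed-width panel. The Panel class, its method names, and the use of None for unfilled slots are hypothetical and are not taken from Mejepa's code.

```python
from dataclasses import dataclass, field
from typing import Optional

NUM_SLOTS = 15  # fixed panel width, per the description above

Vector = list[float]


@dataclass
class Panel:
    # One entry per slot; None marks a slot that has not been filled
    # (for example, a slot reserved for a feature that is still planned).
    slots: list[Optional[Vector]] = field(
        default_factory=lambda: [None] * NUM_SLOTS
    )

    def set_slot(self, index: int, vector: Vector) -> None:
        if not 0 <= index < NUM_SLOTS:
            raise IndexError(f"slot {index} is outside the {NUM_SLOTS}-slot panel")
        self.slots[index] = list(vector)

    def filled(self) -> list[int]:
        """Indices of slots that currently hold a vector."""
        return [i for i, v in enumerate(self.slots) if v is not None]
```

Each vector stays addressable by its slot index even when embedders produce different dimensions, which a single concatenated vector would not allow without separate offset bookkeeping.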

Stage 3 · Conformal predictor

Three binary predicates, one conformal split.

Mejepa's predictor evaluates three binary predicates per patch:

  • Q1 · claim_exists — does the patch make a recognizable claim about behavior?
  • Q2 · oracle_passes — would the Docker oracle accept the patch?
  • Q5 · predicted_shift_event_occurred — will the patch trigger a downstream test/state shift the agent did not predict?

The split-conformal head emits Pass, Fail, or Abstain. OOD patches are flagged with named reasons. Cold-cell patches get the same treatment. Q4 (perf / cost / reasoning class) was formally retired as wontfix-ambiguity-boundary — Mejepa does not predict subjective surfaces.
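The three-way output can be sketched with textbook split-conformal prediction for a single binary predicate. This is a generic illustration, not Mejepa's predictor: the function names, the nonconformity score (one minus the probability of the candidate label), and the rule that maps prediction sets to verdicts are all assumptions, and OOD flagging is not modeled.

```python
import math
from enum import Enum


class Verdict(Enum):
    PASS = "Pass"
    FAIL = "Fail"
    ABSTAIN = "Abstain"


def conformal_threshold(cal_scores: list[float], alpha: float) -> float:
    """Finite-sample-corrected (1 - alpha) quantile of calibration scores.

    Each score is 1 - p(true label) on a held-out calibration example.
    """
    n = len(cal_scores)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return float("inf")  # too few calibration points: every label is kept
    return sorted(cal_scores)[rank - 1]


def decide(p_pass: float, threshold: float) -> Verdict:
    """Map the conformal prediction set for one predicate to a verdict."""
    keep_pass = (1.0 - p_pass) <= threshold  # nonconformity score of "pass"
    keep_fail = p_pass <= threshold          # nonconformity score of "fail"
    if keep_pass and not keep_fail:
        return Verdict.PASS
    if keep_fail and not keep_pass:
        return Verdict.FAIL
    return Verdict.ABSTAIN  # set is ambiguous (both) or empty (neither)
```

Under this rule, the head abstains whenever the prediction set does not single out exactly one label, instead of forcing a guess.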

Stage 4 · Docker oracle

Ground truth is the Docker container, not a model.

For every patch, Mejepa runs python -m swebench.harness.run_evaluation in a sealed Docker container. The per-instance report.json is parsed into an OracleVerdict. The verdict carried in the signed packet always includes the oracle report.json SHA-256, so any auditor can re-run the same container and confirm the result independently. This is the falsifiable end of the system.
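A hedged sketch of the parse-and-hash step. It assumes the per-instance report.json is a JSON object keyed by instance id with a boolean "resolved" field; the OracleVerdict fields and the parse_report helper shown here are hypothetical and may differ from Mejepa's own types.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class OracleVerdict:
    instance_id: str
    resolved: bool
    report_sha256: str  # digest of the exact report.json bytes


def parse_report(report_bytes: bytes, instance_id: str) -> OracleVerdict:
    """Build a verdict from the raw bytes of a per-instance report.json.

    Assumes the report is a JSON object keyed by instance id whose entry
    has a boolean "resolved" field.
    """
    # Hash the bytes as stored, before parsing, so an auditor who re-runs
    # the container can compare digests without re-serializing the JSON.
    digest = hashlib.sha256(report_bytes).hexdigest()
    report = json.loads(report_bytes)
    entry = report[instance_id]  # KeyError if the instance is missing
    return OracleVerdict(
        instance_id=instance_id,
        resolved=bool(entry["resolved"]),
        report_sha256=digest,
    )
```

Hashing the file bytes instead of the parsed object keeps the digest independent of key order or whitespace choices made by any later serializer.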

Stage 5 · Witness chain

ed25519 + SHAKE-256, replay offline in ~30s.

Every verdict (and every panel state, every training certificate, every feedback event) is signed with ed25519 and appended to a SHAKE-256-linked chain. The public key is published. The chain is verifiable offline with no Mejepa server reachable — an auditor, an underwriter, or opposing counsel can confirm the verdict in roughly 30 seconds.
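The hash-linking half of the chain can be sketched with the standard library alone. The entry layout, the genesis value, and the canonical JSON encoding are assumptions for illustration. The ed25519 signature step is omitted because Python's standard library has no ed25519 implementation; a real verifier would also check each entry's signature against the published public key.

```python
import hashlib
import json

GENESIS = "00" * 32  # assumed placeholder for "no previous entry"


def entry_hash(prev_hash: str, payload: dict) -> str:
    """SHAKE-256 over the previous hash plus a canonical payload encoding."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.shake_256(bytes.fromhex(prev_hash) + body).hexdigest(32)


def append(chain: list[dict], payload: dict) -> None:
    """Link a new payload to the current tail of the chain."""
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"prev": prev, "payload": payload,
                  "hash": entry_hash(prev, payload)})


def verify(chain: list[dict]) -> bool:
    """Recompute every link; needs only the chain itself, no network."""
    prev = GENESIS
    for entry in chain:
        if entry["prev"] != prev:
            return False
        if entry["hash"] != entry_hash(prev, entry["payload"]):
            return False
        prev = entry["hash"]
    return True
```

Changing any payload after the fact changes its recomputed hash, so verification fails at the tampered entry without contacting any server.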


The MCP surface · 57 mejepa_* tools

Any MCP-aware agent harness — Claude Code, Cursor, Windsurf, Cline, Continue, Replit — can call these tools directly. The capture infrastructure is multi-language (pytest, cargo-test, unittest, jest, vitest); the predictor calibration is Python-only today.

| Tool group | Tools | Status | FSV ref |
|---|---|---|---|
| Mistake-driven loop | mejepa_record_mistake · mejepa_mistake_history · mejepa_mistake_loop_status | SHIPPED 2026-05-20 | §1.5 · 04 §3.2 |
| Skill↔code linkage | mejepa_skill_to_code · mejepa_code_to_skill · mejepa_skill_set_query · mejepa_skill_coverage_audit | SHIPPED 2026-05-19 | §1.6 · 04 §3.3 |
| Live capture / observe | mejepa_observe_shift · mejepa_record_agent_feedback · mejepa_pause_predictions · mejepa_subscriber_status · mejepa_capture_audit | SHIPPED | §1.6 · 04 §3.5 |
| Heal / operator override | mejepa_heal_status · mejepa_daemon_status · mejepa_operator_override_prediction · mejepa_promote_approval · mejepa_rollback_to | SHIPPED | §1.6 · 04 §3.6 |
| Eval / ship-gate | mejepa_eval_run · mejepa_eval_build_graph · mejepa_ship_gate_status · mejepa_weekly_eval_dashboard · mejepa_compression_progress · mejepa_bootstrap_status | SHIPPED | §1.6 · 04 §3.8 |
| Cross-panel | mejepa_cross_panel_score · mejepa_cross_panel_dashboard | PLANNED (#405, BLOCKED P0) | §1.6 · 04 §9.2 |
| Failure-mode wrappers | mejepa_list_failure_modes · mejepa_label_failure_cluster · … | PLANNED (#417 remaining) | §1.6 · 04 §9.1 |
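For harness authors, a tool call over MCP is a JSON-RPC 2.0 request with the method tools/call. The sketch below only builds that message. The helper name is hypothetical, and the empty arguments object is a placeholder, since the input schema of each mejepa_* tool is not given here.

```python
import json


def tools_call_request(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })


# Empty arguments are a placeholder; the real input schema for this tool
# is not shown here and would come from the server's `tools/list` reply.
message = tools_call_request(1, "mejepa_ship_gate_status", {})
```

In practice an MCP client library handles transport and framing; the point is that each tool in the table is addressed by its name in params.name.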

Want to wire Mejepa into your agent loop today? Run the in-browser demo or request a pilot.