Mejepa Glossary — JEPA, DDA, TCT, Conformal Prediction, Witness Chain

JEPA — Joint Embedding Predictive Architecture

A self-supervised learning architecture proposed by Yann LeCun in 2022. JEPA learns the latent geometry of a substrate by predicting masked regions in embedding space — not by reconstructing pixels or tokens.

Mejepa applies a JEPA-class predictor to code. 12 trained embedders project the patch into a 5,120-dimensional joint space; 15 frozen instruments produce the supervisory signal; the predictor learns the mapping. JEPA is preferred over generative models for verification because it produces falsifiable predictions against a fixed substrate rather than plausible text.

Reference: LeCun, A Path Towards Autonomous Machine Intelligence (2022); Assran et al., I-JEPA (2023).

DDA — Derived Data Abundance

The technique of turning a single labeled input into many supervisory signals by running it through every frozen instrument.

One patch produces hundreds of independent labels — oracle outcome, AST distance, type-graph delta, data-flow change, lint score, dependency edge — instead of one. DDA is how Mejepa trains a high-fidelity predictor with limited labeled patches: the labels are derived, not collected.
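As a rough illustration of the idea, the sketch below runs one patched source file through several deterministic functions and collects one label per function. The three functions and their names are hypothetical stand-ins chosen for brevity, not Mejepa's actual instruments.

```python
import ast
from typing import Callable, Dict

# Hypothetical stand-ins for frozen instruments: deterministic, no trainable parameters.
def ast_node_count(source: str) -> float:
    return float(sum(1 for _ in ast.walk(ast.parse(source))))

def import_count(source: str) -> float:
    nodes = ast.walk(ast.parse(source))
    return float(sum(isinstance(n, (ast.Import, ast.ImportFrom)) for n in nodes))

def branch_count(source: str) -> float:
    nodes = ast.walk(ast.parse(source))
    return float(sum(isinstance(n, (ast.If, ast.For, ast.While)) for n in nodes))

INSTRUMENTS: Dict[str, Callable[[str], float]] = {
    "ast_node_count": ast_node_count,
    "import_count": import_count,
    "branch_count": branch_count,
}

def derive_labels(patched_source: str) -> Dict[str, float]:
    """Turn one input into one supervisory label per instrument."""
    return {name: fn(patched_source) for name, fn in INSTRUMENTS.items()}

labels = derive_labels("import os\nif os.sep == '/':\n    print('posix')\n")
print(labels)
```

Adding an instrument adds a label to every patch already in the corpus, which is the sense in which the labels are derived rather than collected.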

TCT — Teleological Constellation Training

Mejepa's constraint mechanism that anchors the predictor to a fixed geometry of failure-mode centroids.

Each cluster of historical failure patterns becomes a "constellation" the predictor must respect. Predictions that drift toward un-anchored regions of embedding space are rejected. TCT is the architectural reason Mejepa resists representational collapse during training — a documented failure mode for self-supervised systems.
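The glossary does not say how drift is detected. One plausible reading, sketched here purely as an assumption, is a nearest-centroid check with a distance cutoff. The centroid names, the two-dimensional embeddings, and the `max_distance` parameter are all hypothetical.

```python
import math
from typing import Dict, Optional, Sequence, Tuple

def nearest_constellation(
    embedding: Sequence[float],
    centroids: Dict[str, Sequence[float]],
) -> Tuple[str, float]:
    """Return the closest failure-mode centroid and its Euclidean distance."""
    name = min(centroids, key=lambda k: math.dist(embedding, centroids[k]))
    return name, math.dist(embedding, centroids[name])

def anchor_or_reject(
    embedding: Sequence[float],
    centroids: Dict[str, Sequence[float]],
    max_distance: float,
) -> Optional[str]:
    """Accept a prediction only if it lies near some constellation (assumed rule)."""
    name, distance = nearest_constellation(embedding, centroids)
    return name if distance <= max_distance else None

centroids = {"null_deref": (0.0, 0.0), "off_by_one": (10.0, 0.0)}
print(anchor_or_reject((1.0, 0.0), centroids, max_distance=2.0))  # near a centroid
print(anchor_or_reject((5.0, 5.0), centroids, max_distance=2.0))  # un-anchored region
```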

Conformal Prediction

A statistical method that wraps any predictor's output in prediction intervals (or prediction sets) with a finite-sample coverage guarantee. The guarantee is marginal and assumes the calibration and test examples are exchangeable.

Mejepa uses conformal intervals to emit a Pass, Fail, or Abstain verdict rather than a single confidence number; a fourth value, OutOfDistribution, covers patches outside the training distribution (see Verdict). Abstain is the critical move: the verifier refuses to grade what it cannot grade reliably.
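A minimal sketch of split conformal prediction, assuming absolute residuals on a held-out calibration set as the nonconformity score. The 0.5 decision threshold and the rule that maps an interval to a verdict are illustrative assumptions, not Mejepa's published procedure.

```python
import math
from typing import List, Tuple

def conformal_quantile(residuals: List[float], alpha: float) -> float:
    """Finite-sample-corrected (1 - alpha) quantile of calibration residuals."""
    n = len(residuals)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return math.inf  # too few calibration points for this alpha
    return sorted(residuals)[rank - 1]

def interval(prediction: float, q: float) -> Tuple[float, float]:
    """Symmetric interval around the prediction, clipped to [0, 1]."""
    return max(0.0, prediction - q), min(1.0, prediction + q)

def verdict(lo: float, hi: float, threshold: float = 0.5) -> str:
    """Assumed rule: grade only when the whole interval sits on one side."""
    if lo > threshold:
        return "Pass"
    if hi < threshold:
        return "Fail"
    return "Abstain"

# |actual outcome - predicted probability| on a hypothetical calibration set.
calibration = [0.05, 0.10, 0.02, 0.20, 0.08, 0.12, 0.03, 0.15, 0.07]
q = conformal_quantile(calibration, alpha=0.2)
for p in (0.9, 0.55, 0.1):
    print(p, verdict(*interval(p, q)))
```

An interval that straddles the threshold yields Abstain, which is how a wide interval turns into a refusal to grade.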

Reference: Vovk, Gammerman, Shafer, Algorithmic Learning in a Random World (Springer, 2005).

Witness Chain

Mejepa's append-only Merkle chain of signed verdicts.

Each verdict record is hashed with SHAKE-256 and signed with ed25519; the signature input includes the previous record's hash, so the records form an append-only chain. The public verification key is published at mejepa.com/keys. Anyone (auditors, customers, counsel, regulators) can replay the chain offline and confirm what was verified, by what method, and at what time. See trust.html for the full replay procedure.
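The linking and replay steps can be sketched with the standard library alone. The record layout, the JSON serialization, the 32-byte digest length, and the all-zero genesis value are assumptions made for illustration. Signing is omitted because ed25519 is not in the Python standard library; the authoritative procedure is the one on trust.html.

```python
import hashlib
import json
from typing import Dict, List

GENESIS = "00" * 32  # hypothetical marker meaning "no previous record"

def record_hash(payload: Dict, prev_hash: str) -> str:
    """Hash the payload together with the previous record's hash."""
    body = json.dumps({"payload": payload, "prev": prev_hash}, sort_keys=True)
    return hashlib.shake_256(body.encode("utf-8")).hexdigest(32)

def append(chain: List[Dict], payload: Dict) -> None:
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"payload": payload, "prev": prev,
                  "hash": record_hash(payload, prev)})

def replay(chain: List[Dict]) -> bool:
    """Recompute every hash in order; any edit or reordering breaks a link."""
    prev = GENESIS
    for record in chain:
        expected = record_hash(record["payload"], prev)
        if record["prev"] != prev or record["hash"] != expected:
            return False
        prev = record["hash"]
    return True

chain: List[Dict] = []
append(chain, {"patch": "a1", "verdict": "Pass"})
append(chain, {"patch": "b2", "verdict": "Abstain"})
print(replay(chain))
```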

MCP — Model Context Protocol

The open protocol Anthropic released in 2024 that lets AI coding agents call external tools through a standard interface.

MCP-aware clients include Claude Code, Cursor, Windsurf, Cline, Continue, and Replit. Mejepa ships as an MCP server with 28 tools over stdio JSON-RPC. Any MCP-aware client reads inline verdicts before the agent writes the patch.
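For orientation, this is roughly what a tool invocation looks like on the wire. MCP uses JSON-RPC 2.0, tools are invoked with the `tools/call` method, and the stdio transport sends one JSON message per line. The tool name `verify_patch` and its arguments are hypothetical; the glossary does not list Mejepa's 28 tools.

```python
import json
from typing import Any, Dict

def tools_call_request(request_id: int, tool: str, arguments: Dict[str, Any]) -> str:
    """Serialize an MCP tools/call request as one newline-terminated line."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    # json.dumps escapes newlines inside strings, so the message stays on one line.
    return json.dumps(message) + "\n"

line = tools_call_request(
    1,
    "verify_patch",  # hypothetical tool name
    {"diff": "--- a/app.py\n+++ b/app.py\n"},
)
print(line, end="")
```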

Reference: modelcontextprotocol.io.

Frozen Instruments

15 deterministic functions with zero trainable parameters that read durable bytes-on-disk.

The instruments: Docker oracle test outcomes, AST node kind and depth, AST diff against base, static data-flow features, type-graph features, import delta, test intent classifier, witness-chain hash, mutation category classifier, lineage edges, compile/build success, lint conformity, cyclomatic complexity, dependency change, file-system mutation log. Together they produce a 5,120-dimensional target panel that is provably immune to collapse: with zero trainable parameters, the targets cannot drift toward a degenerate constant during training.
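To make "deterministic function with zero trainable parameters" concrete, here is a sketch of one instrument from the list, AST node kind and depth, written against Python's `ast` module. The output shape (a counter of node kinds plus a maximum depth) is an assumption; the glossary does not specify how Mejepa encodes this instrument.

```python
import ast
from collections import Counter
from typing import Tuple

def ast_kind_and_depth(source: str) -> Tuple[Counter, int]:
    """Count AST node kinds and find the maximum nesting depth of the tree."""
    tree = ast.parse(source)
    kinds: Counter = Counter()
    max_depth = 0
    stack = [(tree, 0)]  # iterative walk that tracks each node's depth
    while stack:
        node, depth = stack.pop()
        kinds[type(node).__name__] += 1
        max_depth = max(max_depth, depth)
        stack.extend((child, depth + 1) for child in ast.iter_child_nodes(node))
    return kinds, max_depth

kinds, depth = ast_kind_and_depth("def f(x):\n    return x + 1\n")
print(kinds["FunctionDef"], kinds["BinOp"], depth)
```

The same bytes always produce the same output, so the label can be recomputed at audit time.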

Ship Gate

The single convergence metric that determines when Mejepa exits free-only availability.

Defined as: prediction-oracle Pearson correlation ρ ≥ 0.95, stable across four consecutive rolling windows, stratified per (mutation category × language) cell. Current measurement: ρ = 0.866667. Until the gate fires green, Pro and Team CI remain invite-only for design partners.
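One way to read the gate condition is sketched below. It assumes "stable" means that every one of the four most recent windows meets the threshold in every cell; the methodology page may define stability differently. The cell names are hypothetical.

```python
from typing import Dict, List, Tuple

Cell = Tuple[str, str]  # (mutation category, language)

def gate_is_green(
    history: Dict[Cell, List[float]],
    threshold: float = 0.95,
    windows: int = 4,
) -> bool:
    """True only if every cell has `windows` consecutive recent values >= threshold."""
    if not history:
        return False
    for rhos in history.values():
        recent = rhos[-windows:]
        if len(recent) < windows or min(recent) < threshold:
            return False
    return True

history = {
    ("off_by_one", "python"): [0.90, 0.96, 0.95, 0.97, 0.98],
    ("null_deref", "go"): [0.96, 0.94, 0.97, 0.98],  # one window below 0.95
}
print(gate_is_green(history))
```

Because the check is stratified, a single weak cell keeps the gate closed even when the other cells clear the threshold.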

ρ (Rho) — Prediction-Oracle Correlation

Pearson correlation coefficient between Mejepa's predicted oracle pass probability and the actual SWE-bench Lite oracle outcome.

The ship gate fires when ρ ≥ 0.95 stable across four consecutive rolling windows, stratified per cell. Current ρ = 0.866667. The live number is published on the ship-gate dashboard.
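For reference, the Pearson coefficient itself can be computed as below. The sample predictions and outcomes are invented for illustration. When the outcome is binary (pass or fail), this quantity is also known as the point-biserial correlation.

```python
import math
from typing import Sequence

def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation: covariance divided by the product of standard deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)

predicted = [0.9, 0.8, 0.2, 0.1]   # predicted oracle pass probability
outcome = [1.0, 1.0, 0.0, 0.0]     # actual oracle outcome (1 = pass)
print(round(pearson(predicted, outcome), 4))
```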

Compression Progress (CPΦ)

The rate at which the predictor's description length over frozen code reality decreases over time.

A Schmidhuber-style measure of "getting better at predicting the substrate" — the path from verification toward measured understanding. CPΦ is on the Phase I roadmap and will be published as a public telemetry stream alongside the ship-gate ρ.
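As a sketch of the quantity involved, description length can be measured as the number of bits the predictor needs to encode the observed outcomes, and progress as the drop in that number between checkpoints. This coding-cost formulation is an assumption; the glossary does not give Mejepa's formula for CPΦ.

```python
import math
from typing import List, Sequence

def description_length_bits(probs_of_observed: Sequence[float]) -> float:
    """Bits needed to encode outcomes given the probability assigned to each (0 < p <= 1)."""
    return -sum(math.log2(p) for p in probs_of_observed)

def compression_progress(lengths: Sequence[float]) -> List[float]:
    """Decrease in description length between consecutive checkpoints."""
    return [prev - curr for prev, curr in zip(lengths, lengths[1:])]

# Probabilities a predictor assigned to what actually happened, at three checkpoints.
checkpoints = [
    [0.5, 0.5, 0.5, 0.5],
    [0.5, 1.0, 0.5, 1.0],
    [1.0, 1.0, 0.5, 1.0],
]
lengths = [description_length_bits(c) for c in checkpoints]
print(lengths, compression_progress(lengths))
```

Positive values mean the predictor is encoding the same substrate in fewer bits than before.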

Reality Compiler

Mejepa's per-repository specialization mechanism.

A reality compiler ingests a repo's history and trains its own predictor head against that repo's actual failure modes, with operator feedback weighted 6×. Each compiled cell is enterprise-scoped: its training never crosses customer boundaries. Reality compilers are how Mejepa gets sharper per-repo over time without compromising tenant isolation.

ed25519

An instance of the Edwards-curve Digital Signature Algorithm (EdDSA) on the edwards25519 curve, specified in RFC 8032.

Mejepa signs every verdict with ed25519, producing a 64-byte signature that is verifiable offline using the public key at mejepa.com/keys. ed25519 is fast, deterministic, has no known cryptanalytic weaknesses, and is approved for digital signatures in NIST FIPS 186-5 (2023).

SHAKE-256

An extendable-output function (XOF) from the SHA-3 family, standardized in NIST FIPS 202, that produces output of any requested length.

Mejepa uses SHAKE-256 to hash each verdict record. Combined with the previous record's hash in the signature input, the chain becomes an append-only Merkle structure that any auditor can replay offline.
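The variable-length property is easy to see with Python's `hashlib`. The record bytes and the two output lengths are illustrative; the glossary does not state which digest length Mejepa uses.

```python
import hashlib

record = b'{"patch": "a1", "verdict": "Pass"}'  # hypothetical verdict record

# The caller chooses the output length (in bytes) at digest time.
digest_32 = hashlib.shake_256(record).hexdigest(32)
digest_64 = hashlib.shake_256(record).hexdigest(64)

print(len(digest_32), len(digest_64))
# A shorter SHAKE output is a prefix of a longer one over the same input.
print(digest_64.startswith(digest_32))
```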

SWE-bench Lite

A 300-task subset of the SWE-bench benchmark — a public test suite of real Python GitHub issues with reference patches.

Mejepa's ship gate measures Pearson correlation between its predicted oracle pass probability and the actual SWE-bench Lite oracle outcome. Mejepa uses SWE-bench Lite rather than SWE-bench Verified because it has wider failure-mode coverage and a published baseline. See the methodology page for the full computation.

Reference: Jimenez et al., SWE-bench, Princeton NLP / ICLR 2024.

Verdict

The output of a Mejepa prediction. Four possible values:

Pass — predicted to satisfy the oracle
Fail — predicted to violate the oracle, with predicted exception class
Abstain — conformal interval too wide to grade reliably
OutOfDistribution — patch sits outside the training distribution

Each verdict includes a conformal interval inside [0, 1], the closest exemplars from the corpus, the predicted failure mode if Fail, and an ed25519 signature appended to the witness chain.
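A verdict record along these lines could be modelled as follows. The field names and validation rules are assumptions based on the description above, and the signature field is left out because signing needs an ed25519 library outside the Python standard library.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple

class Verdict(Enum):
    PASS = "Pass"
    FAIL = "Fail"
    ABSTAIN = "Abstain"
    OUT_OF_DISTRIBUTION = "OutOfDistribution"

@dataclass(frozen=True)
class VerdictRecord:
    verdict: Verdict
    interval: Tuple[float, float]              # conformal interval inside [0, 1]
    exemplars: Tuple[str, ...]                 # ids of the closest corpus exemplars
    predicted_failure: Optional[str] = None    # only meaningful for Fail

    def __post_init__(self) -> None:
        lo, hi = self.interval
        if not (0.0 <= lo <= hi <= 1.0):
            raise ValueError("interval must satisfy 0 <= lo <= hi <= 1")
        if self.predicted_failure is not None and self.verdict is not Verdict.FAIL:
            raise ValueError("predicted_failure is only valid for a Fail verdict")

record = VerdictRecord(Verdict.FAIL, (0.05, 0.20), ("ex-17", "ex-42"), "TypeError")
print(record.verdict.value, record.interval, record.predicted_failure)
```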

Verification Gap

The structural failure mode where an AI coding agent self-reports a task as complete but the work has not been independently verified against deterministic instruments.

The term comes from Nate Jones's The Verification Gap (January 2026): "'Done' isn't a capability problem. It's an accountability problem." Mejepa closes the gap by running 15 frozen instruments against the agent's claimed change before merge.

Reference: Jones, The Verification Gap, Natesnewsletter, 2026-01-07.

Definitions for code reality verification.