Certification Pipeline
The certification pipeline is the core workflow that transforms raw MCP server source code into a certified, scored, and published artifact. It operates as an asynchronous, event-driven system using AMQP (LavinMQ) for job distribution between services.
Pipeline Overview
The pipeline consists of five phases, each handled by a dedicated job type:
```
1. Detection     Scheduler detects new commit via git ls-remote
       |
       v
2. Ingestion     Hub-worker clones repo, creates tarball, uploads to S3
       |
       v
3. Analysis      Scan-worker runs security analysis (MCP-Scan + optional Trivy)
       |
       v
4. Scoring       Hub-worker maps findings to controls, computes scores, creates snapshot
       |
       v
5. Publishing    Hub-worker publishes certified artifact to mcp-registry
```
Phase 1: Detection
The scheduler service runs continuously and polls registered MCP repositories for new commits. It executes `git ls-remote` on each active MCP's configured branch and compares the result against the last known commit hash in the database.
- Polling interval: Daily by default; configurable per MCP for Enterprise organizations.
- Jitter: A small random delay is added to each poll to avoid thundering herd problems.
- Backoff: If a poll fails (e.g., repository is temporarily unavailable), the scheduler applies exponential backoff before retrying.
When a new commit is detected:
- A new `MCPVersion` record is created with status `PENDING_ANALYSIS`.
- An `INGEST_COMMIT` message is published to the AMQP exchange.
Alternative triggers include Git webhooks (for real-time detection) and direct uploads via the CLI.
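For illustration, here is a minimal Python sketch of the detection loop, assuming a scheduler that shells out to git; `mcp`, its fields, and `publish_ingest_job` are hypothetical stand-ins for the platform's actual models and job publisher:

```python
import random
import subprocess
import time

def poll_remote_head(repo_url: str, branch: str) -> str:
    """Return the commit hash at the tip of `branch` without cloning."""
    result = subprocess.run(
        ["git", "ls-remote", repo_url, f"refs/heads/{branch}"],
        capture_output=True, text=True, check=True, timeout=30,
    )
    return result.stdout.split()[0]  # output is "<hash>\trefs/heads/<branch>"

def check_for_new_commit(mcp, publish_ingest_job, jitter_seconds=5):
    # Jitter: small random delay to avoid a thundering herd of polls.
    time.sleep(random.uniform(0, jitter_seconds))
    for attempt in range(5):
        try:
            head = poll_remote_head(mcp.repo_url, mcp.branch)
            break
        except subprocess.SubprocessError:
            # Backoff: exponential delay while the remote is unavailable.
            time.sleep(2 ** attempt)
    else:
        return  # give up until the next scheduled poll
    if head != mcp.last_known_commit:
        # Creates the MCPVersion record and publishes INGEST_COMMIT.
        publish_ingest_job(mcp.id, head)
```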
Phase 2: Ingestion (INGEST_COMMIT)
The hub-worker consumes the `INGEST_COMMIT` message and performs the following steps:
- Clone: Clone the Git repository at the specific commit hash.
- Verify: Confirm the commit hash matches the expected value.
- Package: Create a compressed tarball (`.tar.gz`) of the source code.
- Upload: Upload the tarball to S3 storage with a deterministic key: `sources/mcp-{uuid}/{commit_hash}.tar.gz`.
- Update: Set the version status to `INGESTED` and record the S3 key and file size.
- Dispatch: Publish an `ANALYZE_VERSION` message to AMQP for the next phase.
Size limits and timeouts are enforced to protect against zip bombs and excessively large repositories. The ingestion timeout is 10-30 minutes depending on repository size.
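A condensed sketch of these ingestion steps in Python, assuming boto3 for the S3 upload; the function and bucket names are illustrative, and the real worker additionally enforces the size limits and timeouts described above:

```python
import subprocess
import tarfile

import boto3  # assumed S3 client; the actual worker may use a different SDK

def ingest_commit(repo_url: str, mcp_uuid: str, commit_hash: str, bucket: str) -> str:
    workdir = f"/tmp/{mcp_uuid}-{commit_hash}"
    # Clone and pin the working tree to the exact commit from the job payload.
    subprocess.run(["git", "clone", repo_url, workdir], check=True, timeout=600)
    subprocess.run(["git", "-C", workdir, "checkout", commit_hash], check=True)

    # Verify: the checked-out HEAD must match the expected hash.
    head = subprocess.run(
        ["git", "-C", workdir, "rev-parse", "HEAD"],
        capture_output=True, text=True, check=True,
    ).stdout.strip()
    assert head == commit_hash, "commit hash mismatch"

    # Package as .tar.gz (a real worker would likely exclude .git) and
    # upload under the deterministic key.
    tarball = f"{workdir}.tar.gz"
    with tarfile.open(tarball, "w:gz") as tf:
        tf.add(workdir, arcname=".")
    key = f"sources/mcp-{mcp_uuid}/{commit_hash}.tar.gz"
    boto3.client("s3").upload_file(tarball, bucket, key)
    return key
```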
Phase 3: Analysis (ANALYZE_VERSION)
Analysis is delegated to the scan-worker via AMQP. The hub-worker publishes an `ANALYZE_VERSION` job, and the scan-worker processes it independently:
- The hub-worker uploads the source tarball to S3 and publishes an `ANALYZE_VERSION` message.
- The scan-worker (running mcp-scan) downloads the tarball, extracts it, and runs the security analysis toolchain.
- Analysis results are uploaded to S3 as structured JSON.
- The scan-worker publishes an `ANALYZE_COMPLETE` message back to AMQP.
The analysis toolchain includes:
- MCP-Scan: Specialized static analysis engine that detects 14 vulnerability classes (A through N) using pattern matching, taint analysis, and optional AI-assisted detection. Supports Python, TypeScript, JavaScript, and Go.
- Trivy (optional): For SBOM (Software Bill of Materials) generation and dependency vulnerability scanning.
The scan-worker is a separate container with all analysis tools pre-installed. The hub-worker itself does not execute any analysis tools locally.
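A sketch of the scan-worker's side of this exchange, assuming a Python worker with boto3; the payload field names, the `publish` helper, and the mcp-scan command-line flags shown here are illustrative, not the documented CLI:

```python
import os
import subprocess

def handle_analyze_version(msg: dict, s3, bucket: str, publish):
    """Scan-worker handler for an ANALYZE_VERSION job (sketch; paths/flags assumed)."""
    src_key = msg["payload"]["source_key"]  # hypothetical payload field
    local_tar = "/tmp/source.tar.gz"
    s3.download_file(bucket, src_key, local_tar)
    os.makedirs("/tmp/src", exist_ok=True)
    subprocess.run(["tar", "-xzf", local_tar, "-C", "/tmp/src"], check=True)

    # Invocation of the analysis toolchain; the actual mcp-scan CLI flags
    # are not documented here, so this command line is illustrative only.
    scan = subprocess.run(
        ["mcp-scan", "/tmp/src", "--output", "json"],
        capture_output=True, text=True,
    )
    results_key = f"analysis/{msg['version_id']}.json"
    s3.put_object(Bucket=bucket, Key=results_key, Body=scan.stdout.encode())

    # Hand control back to the hub-worker for scoring and snapshotting.
    publish("analyze_complete", {**msg, "payload": {"results_key": results_key}})
```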
Phase 4: Scoring and Snapshot Creation
When the hub-worker receives the `ANALYZE_COMPLETE` message, it downloads the analysis results from S3 and performs post-analysis processing:
Findings Normalization
Raw tool outputs are normalized into a standard findings format with consistent fields for severity, location, description, and remediation guidance.
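As a sketch, the normalized shape might look like the following dataclass; the field names are illustrative rather than the exact schema:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Finding:
    """One normalized finding; fields are illustrative, not the exact schema."""
    rule_id: str       # tool-specific rule that fired, e.g. "mcp-scan:..."
    severity: str      # "critical" | "high" | "medium" | "low"
    location: str      # e.g. "src/server.py:42"
    description: str   # human-readable summary of the issue
    remediation: str   # suggested fix or mitigation
    source_tool: str   # "mcp-scan", "trivy", ...
```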
Controls Mapping
Normalized findings are mapped to the platform’s controls catalog – a set of stable, tool-agnostic security control identifiers:
```
Raw Findings (tool-specific)
  --> Normalized Findings
  --> Controls (tool-agnostic)
  --> Scores (deterministic)
```
Examples of controls: `SECRET_EXPOSED`, `KNOWN_VULNERABLE_DEPENDENCY_CRITICAL`, `NO_LOCKFILE`, `UNPINNED_DEPENDENCIES`.
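A minimal sketch of how such a mapping could work; the tool-specific rule IDs on the left are hypothetical, while the control IDs match the examples above:

```python
# Illustrative mapping from tool-specific rule IDs to stable control IDs.
RULE_TO_CONTROL = {
    "mcp-scan:hardcoded-secret": "SECRET_EXPOSED",
    "trivy:CVE-critical": "KNOWN_VULNERABLE_DEPENDENCY_CRITICAL",
    "mcp-scan:missing-lockfile": "NO_LOCKFILE",
    "mcp-scan:loose-version-range": "UNPINNED_DEPENDENCIES",
}

def map_to_controls(findings):
    """Group normalized findings by the control they violate."""
    controls = {}
    for f in findings:
        control_id = RULE_TO_CONTROL.get(f.rule_id)
        if control_id:
            controls.setdefault(control_id, []).append(f)
    return controls
```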
Score Computation
Scores are computed deterministically across four dimensions:
| Dimension | Weight | What It Measures |
|---|---|---|
| Security Score | 50% | Vulnerabilities, secrets, insecure patterns |
| Supply Chain Score | 30% | Dependencies, licenses, lockfiles |
| Maturity Score | 20% | Documentation, tests, CI/CD configuration |
| Global Score | – | Weighted composite of the above three |
Penalty values are applied based on finding severity: Critical (-40), High (-20), Medium (-10), Low (-5). The minimum score is 0.
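Putting those numbers together, here is a sketch of the deterministic computation, assuming each dimension starts from a baseline of 100 before penalties are applied (the baseline is an assumption, not documented above):

```python
PENALTIES = {"critical": 40, "high": 20, "medium": 10, "low": 5}
WEIGHTS = {"security": 0.5, "supply_chain": 0.3, "maturity": 0.2}

def dimension_score(findings) -> int:
    """Apply per-finding severity penalties; 100 is an assumed starting score."""
    score = 100
    for f in findings:
        score -= PENALTIES.get(f.severity, 0)
    return max(score, 0)  # the minimum score is 0

def global_score(findings_by_dimension) -> int:
    """Weighted composite: 50% security, 30% supply chain, 20% maturity."""
    return round(sum(
        WEIGHTS[dim] * dimension_score(findings)
        for dim, findings in findings_by_dimension.items()
    ))
```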
Snapshot Creation
An immutable security snapshot is created containing all scores, control results, findings counts, and versioning metadata (toolchain version, scoring version, controls catalog version). The version status is updated to `CERTIFIED`.
Phase 5: Publishing (PUBLISH_TO_REGISTRY)
After certification, the hub-worker publishes the certified artifact to mcp-registry:
- The global score is mapped to a certification level (0-3).
- A manifest is generated from the snapshot data.
- The hub-worker calls the registry’s publish API with a service token.
- The version record is updated with the registry URL and tarball SHA-256 hash.
The certified MCP is now available for download from the registry by end users via mcp-client.
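A sketch of the score-to-level mapping; the thresholds below are purely illustrative, since the actual cut-offs for levels 0-3 are not specified here:

```python
def certification_level(global_score: int) -> int:
    """Map the 0-100 global score to a certification level 0-3.

    The cut-offs below are illustrative assumptions; the platform's
    actual thresholds are not documented in this section.
    """
    if global_score >= 90:
        return 3
    if global_score >= 70:
        return 2
    if global_score >= 50:
        return 1
    return 0
```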
AMQP Job Flow
The pipeline uses a direct exchange (`mcp.jobs`) with dedicated queues per job type:
| Queue | Routing Key | Purpose |
|---|---|---|
| `mcp.jobs.ingest_commit` | `ingest_commit` | Clone and upload source |
| `mcp.jobs.analyze_version` | `analyze_version` | Run security analysis |
| `mcp.jobs.publish_to_registry` | `publish_to_registry` | Publish to registry |
| `mcp.jobs.generate_report` | `generate_report` | Generate PDF report |
| `mcp.jobs.reevaluate_version` | `reevaluate_version` | Re-analyze existing version |
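Declaring this topology takes only a few calls, sketched here with the pika client (assuming Python workers; the host name and client choice are illustrative):

```python
import pika  # assuming a Python worker using the pika AMQP client

ROUTING_KEYS = [
    "ingest_commit", "analyze_version", "publish_to_registry",
    "generate_report", "reevaluate_version",
]

conn = pika.BlockingConnection(pika.ConnectionParameters("lavinmq"))
ch = conn.channel()

# Direct exchange: each routing key targets exactly one job queue.
ch.exchange_declare(exchange="mcp.jobs", exchange_type="direct", durable=True)
for rk in ROUTING_KEYS:
    queue = f"mcp.jobs.{rk}"
    ch.queue_declare(queue=queue, durable=True)
    ch.queue_bind(queue=queue, exchange="mcp.jobs", routing_key=rk)
```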
Retry and Error Handling
Failed jobs are retried with exponential backoff using dedicated retry queues:
| Attempt | Retry Queue | Delay |
|---|---|---|
| 1 | `mcp.jobs.retry.1m` | 1 minute |
| 2 | `mcp.jobs.retry.2m` | 2 minutes |
| 3 | `mcp.jobs.retry.5m` | 5 minutes |
| 4 | `mcp.jobs.retry.15m` | 15 minutes |
| 5 | `mcp.jobs.retry.1h` | 1 hour |
After all retry attempts are exhausted, the job is moved to a dead letter queue (`mcp.jobs.dead_letter`) for manual inspection.
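One common way to implement these delayed retries on LavinMQ/RabbitMQ is a per-delay queue whose message TTL dead-letters expired messages back to the job exchange; a sketch, where the fanout retry exchanges are an implementation assumption:

```python
import pika

conn = pika.BlockingConnection(pika.ConnectionParameters("lavinmq"))
ch = conn.channel()

RETRY_TIERS = {"1m": 60_000, "2m": 120_000, "5m": 300_000,
               "15m": 900_000, "1h": 3_600_000}

for tier, ttl_ms in RETRY_TIERS.items():
    name = f"mcp.jobs.retry.{tier}"
    # Fanout exchange so the failed job keeps its original routing key:
    # the worker republishes to this exchange, the message sits in the
    # queue until the TTL expires, then it is dead-lettered back to the
    # mcp.jobs exchange and re-routed to its original job queue.
    ch.exchange_declare(exchange=name, exchange_type="fanout", durable=True)
    ch.queue_declare(queue=name, durable=True, arguments={
        "x-message-ttl": ttl_ms,
        "x-dead-letter-exchange": "mcp.jobs",
    })
    ch.queue_bind(queue=name, exchange=name)

# Jobs that exhaust all attempts land here for manual inspection.
ch.queue_declare(queue="mcp.jobs.dead_letter", durable=True)
```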
Message Format
Each AMQP message contains:
```json
{
  "job_id": "uuid-v4",
  "type": "INGEST_COMMIT",
  "idempotency_key": "mcp-uuid:commit-hash",
  "mcp_id": "uuid-v4",
  "version_id": "uuid-v4",
  "org_id": "public",
  "payload": { ... },
  "attempts": 0,
  "max_attempts": 5,
  "created_at": "2026-01-22T10:00:00Z"
}
```
Messages are persistent (`delivery_mode: 2`) and use idempotency keys to prevent duplicate processing.
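A sketch of publishing such a message with pika; the helper below and the commit-hash-based construction of the idempotency key are illustrative:

```python
import json
import uuid
from datetime import datetime, timezone

import pika

def publish_job(ch, job_type: str, routing_key: str, mcp_id: str,
                version_id: str, payload: dict, org_id: str = "public"):
    message = {
        "job_id": str(uuid.uuid4()),
        "type": job_type,
        # Assumed construction, matching the "mcp-uuid:commit-hash" example.
        "idempotency_key": f"{mcp_id}:{payload['commit_hash']}",
        "mcp_id": mcp_id,
        "version_id": version_id,
        "org_id": org_id,
        "payload": payload,
        "attempts": 0,
        "max_attempts": 5,
        "created_at": datetime.now(timezone.utc).isoformat(),
    }
    ch.basic_publish(
        exchange="mcp.jobs",
        routing_key=routing_key,
        body=json.dumps(message),
        # delivery_mode=2 marks the message persistent so it survives a
        # broker restart (the queues are declared durable).
        properties=pika.BasicProperties(delivery_mode=2,
                                        content_type="application/json"),
    )
```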
Delivery Guarantees
- At-least-once delivery: If a worker fails to acknowledge a message, it is redelivered.
- No concurrent duplicates: Each message is delivered to exactly one worker at a time (standard AMQP queue semantics); consumer prefetch bounds the number of unacknowledged messages a worker holds.
- Horizontal scaling: Multiple workers can consume from the same queues; AMQP distributes messages via round-robin.
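A consumer sketch showing prefetch and manual acknowledgement with pika; `handle_job` is a hypothetical handler:

```python
import pika

def handle_job(body: bytes) -> None:
    ...  # hypothetical: dispatch on the message "type" field

conn = pika.BlockingConnection(pika.ConnectionParameters("lavinmq"))
ch = conn.channel()

# prefetch_count=1: the broker sends at most one unacknowledged message
# per worker, so slow jobs don't pile up on a single consumer and any
# unfinished work is redelivered if the worker dies.
ch.basic_qos(prefetch_count=1)

def on_message(channel, method, properties, body):
    try:
        handle_job(body)
        channel.basic_ack(delivery_tag=method.delivery_tag)
    except Exception:
        # Negative-ack without requeue; the retry topology decides what
        # happens next (retry queue or dead letter).
        channel.basic_nack(delivery_tag=method.delivery_tag, requeue=False)

ch.basic_consume(queue="mcp.jobs.ingest_commit", on_message_callback=on_message)
ch.start_consuming()
```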
Idempotency
The pipeline is designed for at-least-once processing, so all operations must be idempotent:
- Version uniqueness: `UNIQUE (mcp_id, commit_hash)` prevents duplicate versions.
- Job deduplication: `UNIQUE (type, idempotency_key)` prevents duplicate jobs.
- S3 key determinism: Storage keys are derived deterministically from version and snapshot identifiers.
- Snapshot uniqueness: `UNIQUE (version_id, toolchain_version, scoring_version, controls_catalog_version)` prevents duplicate snapshots.
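As a sketch, the job deduplication constraint can be exercised with an idempotent insert; the table and column names below are illustrative, assuming a PostgreSQL-backed job store:

```python
def enqueue_job_once(conn, job_type: str, idempotency_key: str, payload: str) -> bool:
    """Insert a job row unless one with the same (type, idempotency_key) exists.

    Relies on the UNIQUE (type, idempotency_key) constraint; the `jobs`
    table and its columns are illustrative, not the actual schema.
    """
    with conn.cursor() as cur:
        cur.execute(
            """
            INSERT INTO jobs (type, idempotency_key, payload)
            VALUES (%s, %s, %s)
            ON CONFLICT (type, idempotency_key) DO NOTHING
            """,
            (job_type, idempotency_key, payload),
        )
        inserted = cur.rowcount == 1
    conn.commit()
    return inserted  # False means a duplicate delivery was safely ignored
```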