Interpreting Results

How to read mcp-scan output, understand findings, taint traces, and MSSS scores, and work with baselines for tracking known issues.

MCP Scanner produces structured output in three formats: JSON, SARIF, and Evidence Bundle. This guide explains how to read each format, interpret the key fields, and use baselines to manage findings over time.

JSON Output

JSON is the default output format. It contains the complete scan results in a single structured document.

mcp-scan scan . --output json

Top-Level Structure

{
  "version": "1.0.0",
  "scan_id": "20240115-143022-abc123",
  "timestamp": "2024-01-15T14:30:22Z",
  "config": { "..." : "..." },
  "manifest": { "..." : "..." },
  "mcp_surface": { "..." : "..." },
  "msss_score": { "..." : "..." },
  "findings": [ "..." ],
  "summary": { "..." : "..." },
  "duration": "2.345s"
}
| Field | Description |
| --- | --- |
| version | Output schema version |
| scan_id | Unique identifier for this scan run |
| timestamp | When the scan was performed (UTC) |
| config | Scan configuration used, including a SHA-256 hash for reproducibility |
| manifest | All scanned files with SHA-256 hashes and sizes |
| mcp_surface | Detected MCP tools, transport, and auth signals |
| msss_score | Security score breakdown and compliance level |
| findings | Array of all detected vulnerabilities |
| summary | Aggregate counts by severity, class, and language |
| duration | Total scan time |

Understanding Findings

Each finding represents a detected vulnerability. Here is a complete finding with all fields:

{
  "id": "abc123def456",
  "rule_id": "MCP-A003",
  "severity": "critical",
  "confidence": "high",
  "language": "python",
  "location": {
    "file": "src/handler.py",
    "start_line": 42,
    "start_col": 5,
    "end_line": 42,
    "end_col": 35
  },
  "mcp_context": {
    "tool_name": "run_command",
    "handler_name": "handle_run",
    "transport": "stdio"
  },
  "trace": {
    "source": {
      "file": "src/handler.py",
      "start_line": 30,
      "start_col": 10
    },
    "sink": {
      "file": "src/handler.py",
      "start_line": 42,
      "start_col": 5
    },
    "steps": [
      {
        "location": { "file": "src/handler.py", "start_line": 30 },
        "action": "source",
        "variable": "user_input"
      },
      {
        "location": { "file": "src/handler.py", "start_line": 35 },
        "action": "assign",
        "variable": "cmd"
      },
      {
        "location": { "file": "src/handler.py", "start_line": 42 },
        "action": "sink",
        "variable": "cmd"
      }
    ]
  },
  "evidence": {
    "snippet": "subprocess.run(cmd, shell=True)",
    "snippet_hash": "sha256:789abc..."
  },
  "description": "Direct shell command execution detected",
  "remediation": "Use subprocess with shell=False and explicit command list"
}

Key Finding Fields

Identification:

| Field | Description |
| --- | --- |
| id | Unique hash identifying this specific finding (used for baseline matching) |
| rule_id | The detection rule identifier (e.g., MCP-A003). The prefix letter indicates the vulnerability class |

Severity and confidence:

| Field | Values | Description |
| --- | --- | --- |
| severity | critical, high, medium, low, info | How dangerous the vulnerability is |
| confidence | high, medium, low | How certain the scanner is that this is a real vulnerability |

A critical severity, high confidence finding is the most urgent: a dangerous vulnerability that the scanner is very sure about. A low severity, low confidence finding may be a false positive or a minor issue worth reviewing later.
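That prioritization can be scripted. A minimal Python sketch follows: the severity and confidence values come from the finding schema above, while the triage helper itself is this guide's illustration, not a scanner feature.

```python
# Numeric ranks for sorting; the value sets come from the documented
# finding schema (severity and confidence fields).
SEVERITY_RANK = {"critical": 0, "high": 1, "medium": 2, "low": 3, "info": 4}
CONFIDENCE_RANK = {"high": 0, "medium": 1, "low": 2}

def triage(findings):
    """Sort findings so critical, high-confidence items come first."""
    return sorted(
        findings,
        key=lambda f: (SEVERITY_RANK[f["severity"]], CONFIDENCE_RANK[f["confidence"]]),
    )

findings = [
    {"id": "a", "severity": "low", "confidence": "low"},
    {"id": "b", "severity": "critical", "confidence": "high"},
    {"id": "c", "severity": "critical", "confidence": "medium"},
]
print([f["id"] for f in triage(findings)])  # ['b', 'c', 'a']
```

Feed it the findings array from the JSON output to get a review queue ordered by urgency.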

Location:

The location object pinpoints exactly where the vulnerability was found:

  • file – Relative path to the source file
  • start_line / end_line – Line range
  • start_col / end_col – Column range

MCP context:

The mcp_context object tells you which MCP tool is affected:

  • tool_name – The MCP tool name exposed to clients
  • handler_name – The function implementing the tool
  • transport – The transport type (e.g., stdio, sse)

Findings inside MCP tool handlers receive a 1.3x penalty multiplier because they are directly reachable from external clients.


Understanding Taint Traces

The trace object shows the complete data flow from source (where tainted data enters) to sink (where it reaches a dangerous operation).

Trace Structure

{
  "trace": {
    "source": { "file": "src/handler.py", "start_line": 30 },
    "sink": { "file": "src/handler.py", "start_line": 42 },
    "steps": [
      { "action": "source", "variable": "user_input", "location": { "..." : "..." } },
      { "action": "assign", "variable": "cmd", "location": { "..." : "..." } },
      { "action": "sink", "variable": "cmd", "location": { "..." : "..." } }
    ]
  }
}

Step Actions

| Action | Meaning |
| --- | --- |
| source | Where tainted data originates (tool parameter, user input, file read) |
| assign | Variable assignment propagating taint |
| call | Function call passing tainted data (deep mode only) |
| return | Return value carrying taint (deep mode only) |
| concat | String concatenation incorporating tainted data |
| sink | Dangerous operation receiving tainted data |

Reading a Trace

Follow the steps from top to bottom to understand the data flow:

  1. Source – “Where does the dangerous data come from?” (line 30: user_input parameter)
  2. Propagation – “How does it move through the code?” (line 35: assigned to cmd)
  3. Sink – “Where does it reach a dangerous operation?” (line 42: passed to shell execution)

In deep mode, traces can span multiple files and functions. Each step includes the file path so you can follow cross-file flows.
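Walking a trace this way is easy to script. A short Python sketch using the step fields shown above; format_trace is an illustrative helper, not part of mcp-scan:

```python
def format_trace(trace):
    """Render each taint-trace step as 'action variable file:line'."""
    lines = []
    for i, step in enumerate(trace["steps"], start=1):
        loc = step["location"]
        lines.append(
            f"{i}. {step['action']:<7} {step['variable']:<11} "
            f"{loc['file']}:{loc['start_line']}"
        )
    return lines

# The trace from the finding example earlier in this guide.
trace = {
    "steps": [
        {"location": {"file": "src/handler.py", "start_line": 30},
         "action": "source", "variable": "user_input"},
        {"location": {"file": "src/handler.py", "start_line": 35},
         "action": "assign", "variable": "cmd"},
        {"location": {"file": "src/handler.py", "start_line": 42},
         "action": "sink", "variable": "cmd"},
    ]
}
print("\n".join(format_trace(trace)))
```

Because each step carries its own file path, the same loop handles cross-file flows produced by deep mode.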


MSSS Score Breakdown

The MSSS (MCP Server Security Standard) score is a 0-100 numeric score that maps to certification compliance levels 0-3. Higher scores mean better security.

Score in JSON Output

{
  "msss_score": {
    "total": 49.0,
    "level": 0,
    "compliant": false,
    "version": "2.0",
    "categories": {
      "A": { "score": 7.0, "max_score": 22.0, "findings": 1, "penalties": 15.0 },
      "B": { "score": 0.0, "max_score": 13.0, "findings": 1, "penalties": 15.0 }
    },
    "score_breakdown": {
      "base_score": 100.0,
      "total_penalties": 30.0,
      "severity_multiplier": 0.70,
      "critical_count": 0,
      "high_count": 2,
      "formula": "(100 - 30.0) x 0.70 = 49.0"
    }
  }
}

Score Fields

| Field | Description |
| --- | --- |
| total | Final score (0-100) |
| level | Compliance level (0-3) |
| compliant | Whether the server meets minimum compliance |
| version | Scoring algorithm version |
| categories | Per-class breakdown showing score, maximum possible, findings count, and penalties |
| score_breakdown | Detailed calculation showing base score, penalties, multiplier, and formula |

Compliance Levels

| Level | Name | Score Required | Findings Allowed |
| --- | --- | --- | --- |
| 0 | Not Compliant | < 60, or any critical, or > 3 high | Any |
| 1 | Basic Compliance | >= 60 | Up to 3 high, 0 critical |
| 2 | Enterprise Ready | >= 80 | 0 high, 0 critical |
| 3 | Certified | >= 90 | 0 high, 0 critical |

How the Score Is Calculated

The v2.0 scoring formula is:

FinalScore = max(0, 100 - TotalPenalties) x SeverityMultiplier

Penalty per finding:

penalty = base_penalty x confidence_multiplier x mcp_multiplier

| Severity | Base Penalty |
| --- | --- |
| Critical | 25.0 |
| High | 15.0 |
| Medium | 5.0 |
| Low | 1.0 |
| Info | 0.2 |

| Confidence | Multiplier |
| --- | --- |
| High | 1.0 |
| Medium | 0.7 |
| Low | 0.4 |

| MCP Context | Multiplier |
| --- | --- |
| Inside tool handler | 1.3 |
| Not in tool handler | 1.0 |

Severity multiplier (compounding effect):

| Condition | Multiplier |
| --- | --- |
| No high or critical | 1.0 |
| 1 high (0 critical) | 0.85 |
| 2 high (0 critical) | 0.70 |
| 3 high (0 critical) | 0.55 |
| 4+ high (0 critical) | 0.45 |
| 1 critical | 0.50 |
| 2 critical | 0.35 |
| 3+ critical | 0.25 |
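Combining the penalty tables and the severity multiplier, the v2.0 formula can be sketched in Python. This is a reading aid, not the scanner's implementation: how the multiplier resolves when critical and high findings are mixed is an assumption here (the critical rows are taken to win), and the in_tool_handler flag stands in for inspecting mcp_context.

```python
# Values copied from the penalty tables above.
BASE_PENALTY = {"critical": 25.0, "high": 15.0, "medium": 5.0, "low": 1.0, "info": 0.2}
CONF_MULT = {"high": 1.0, "medium": 0.7, "low": 0.4}

def severity_multiplier(critical, high):
    # ASSUMPTION: critical rows take precedence when both counts are nonzero.
    if critical >= 3:
        return 0.25
    if critical == 2:
        return 0.35
    if critical == 1:
        return 0.50
    if high >= 4:
        return 0.45
    return {0: 1.0, 1: 0.85, 2: 0.70, 3: 0.55}[high]

def msss_score(findings):
    """FinalScore = max(0, 100 - TotalPenalties) x SeverityMultiplier."""
    penalties = sum(
        BASE_PENALTY[f["severity"]] * CONF_MULT[f["confidence"]]
        * (1.3 if f.get("in_tool_handler") else 1.0)
        for f in findings
    )
    crit = sum(1 for f in findings if f["severity"] == "critical")
    high = sum(1 for f in findings if f["severity"] == "high")
    return max(0.0, 100.0 - penalties) * severity_multiplier(crit, high)

two_high = [{"severity": "high", "confidence": "high"}] * 2
print(round(msss_score(two_high), 2))  # 49.0
```

The printed value reproduces the two-HIGH-findings case: (100 - 30) x 0.70 = 49.0.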

Score Calculation Examples

Clean server (no findings):

Penalties = 0, Multiplier = 1.0
Score = (100 - 0) x 1.0 = 100 --> Level 3

One HIGH finding (high confidence):

Penalty = 15.0 x 1.0 x 1.0 = 15.0
Multiplier = 0.85
Score = (100 - 15) x 0.85 = 72.25 --> Level 1

Two HIGH findings (high confidence):

Penalties = 15.0 + 15.0 = 30.0
Multiplier = 0.70
Score = (100 - 30) x 0.70 = 49.0 --> Level 0

One CRITICAL finding (high confidence):

Penalty = 25.0 x 1.0 x 1.0 = 25.0
Multiplier = 0.50
Score = (100 - 25) x 0.50 = 37.5 --> Level 0

Five MEDIUM findings (high confidence):

Penalties = 5.0 x 5 = 25.0
Multiplier = 1.0
Score = (100 - 25) x 1.0 = 75.0 --> Level 1

Improving Your Score

  1. Fix critical findings first – Each critical finding applies a 0.50x multiplier (or worse)
  2. Reduce high findings – Each additional high finding compounds the penalty
  3. Focus on high-weight classes – Class A (RCE, weight 22.0) has the most impact
  4. Fix findings in MCP tool handlers – They receive a 1.3x penalty boost

Level Progression

To reach Level 1 (>= 60):

  • Fix all critical findings
  • Reduce high findings to 3 or fewer
  • Fix the highest-weight vulnerabilities first

To reach Level 2 (>= 80):

  • Fix all high and critical findings
  • Address medium findings with the highest penalties

To reach Level 3 (>= 90):

  • Fix all high and critical findings
  • Minimize medium findings
  • Run deep mode analysis (required for full certification)

SARIF Output

SARIF (Static Analysis Results Interchange Format) version 2.1.0 is the standard format for integrating with CI/CD tools.

mcp-scan scan . --output sarif > results.sarif

Severity Mapping

MCP Scanner severities are mapped to SARIF levels:

| MCP-Scan Severity | SARIF Level |
| --- | --- |
| critical | error |
| high | error |
| medium | warning |
| low | note |
| info | note |
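The mapping is a direct table lookup, which is handy to mirror in scripts that post-process both formats (to_sarif_level is an illustrative helper, not a scanner API):

```python
# Direct encoding of the severity-to-SARIF mapping table above.
SARIF_LEVEL = {
    "critical": "error",
    "high": "error",
    "medium": "warning",
    "low": "note",
    "info": "note",
}

def to_sarif_level(severity):
    """Map an mcp-scan severity to its SARIF reporting level."""
    return SARIF_LEVEL[severity]

print(to_sarif_level("medium"))  # warning
```

Note that the mapping is lossy: critical and high both become error, so keep the JSON output when you need the original severity.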

Taint Traces in SARIF

Taint traces are represented as codeFlows with threadFlows in the SARIF output. Each step in the trace becomes a location in the thread flow, preserving the full data flow path for review in compatible tools.

Compatible Tools

SARIF output can be consumed by:

  • GitHub Code Scanning – Upload via github/codeql-action/upload-sarif
  • Azure DevOps – Native SARIF support in pipelines
  • VS Code – SARIF Viewer extension
  • GitLab – Security Dashboard (via SAST report)

Evidence Bundle

The evidence bundle is a directory-based format designed for security audits and compliance documentation.

mcp-scan scan . --mode deep --output evidence

Directory Structure

evidence-bundle/
  manifest.json         # All scanned files with SHA-256 hashes
  results.json          # Summary and aggregate statistics
  config.json           # Exact scan configuration used
  surface.json          # MCP surface analysis
  msss-score.json       # MSSS compliance breakdown
  evidences/
    finding-001.json    # Individual evidence per finding
    finding-002.json
    ...

Individual Evidence Files

Each finding gets its own evidence file with full context:

{
  "finding_id": "abc123def456",
  "rule_id": "MCP-A003",
  "severity": "critical",
  "confidence": "high",
  "location": {
    "file": "src/handler.py",
    "start_line": 42,
    "end_line": 42
  },
  "evidence": {
    "full_snippet": "def handle_run(cmd):\n    subprocess.run(cmd, shell=True)",
    "context_before": ["def handle_run(cmd):"],
    "context_after": ["    return result"],
    "snippet_hash": "sha256:789abc..."
  },
  "trace": { "..." : "..." },
  "mcp_context": { "..." : "..." },
  "analysis_metadata": {
    "mode": "deep",
    "scan_duration": "2.345s",
    "analyzer_version": "1.0.0"
  }
}

Evidence bundles are ideal for:

  • Formal security audits
  • Compliance documentation
  • Detailed code review where each finding needs individual assessment
  • Archiving scan results for regulatory requirements

Choosing a Format

| Feature | JSON | SARIF | Evidence Bundle |
| --- | --- | --- | --- |
| CI/CD integration | Good | Best | Limited |
| Human readability | Good | Poor | Best |
| GitHub Code Scanning | No | Yes | No |
| Audit trail | Good | Good | Best |
| MSSS score included | Yes | No | Yes |
| Individual evidence files | No | No | Yes |
| Taint traces | Yes | Yes (codeFlows) | Yes |
| File size | Medium | Large | Largest |

Recommendations:

  • JSON – General purpose, scripting, custom processing, daily scans
  • SARIF – CI/CD pipelines, GitHub/GitLab integration, IDE plugins
  • Evidence – Security audits, compliance documentation, formal review

Working with Baselines

Baselines let you track findings that have been reviewed and accepted, separating them from new findings in subsequent scans.

Creating a Baseline

# Generate baseline from current findings
mcp-scan baseline generate ./my-project \
  --reason "Initial security review" \
  --accepted-by "[email protected]"

Using a Baseline in Scans

# Scan with baseline applied -- accepted findings are excluded
mcp-scan scan . --baseline .mcp-scan-baseline.json

When a baseline is applied, the output includes a count of how many findings were filtered:

Total: 1 new finding (1 high)
Baselined: 3 findings filtered

In JSON output, the summary includes a baselined count:

{
  "summary": {
    "total": 1,
    "baselined": 3,
    "by_severity": { "high": 1 }
  }
}

Baseline Matching

Findings are matched to baseline entries using a combination of rule_id and location_hash (a SHA-256 hash of the file path and line number). If the code moves to a different line, the baseline entry will no longer match, and the finding will appear as new.
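A hedged sketch of why line movement invalidates a baseline entry. The exact canonical string mcp-scan feeds into SHA-256 is not documented here, so the path:line form below is an illustrative assumption:

```python
import hashlib

def location_hash(file_path, start_line):
    # HYPOTHETICAL: the real canonical form mcp-scan hashes is not
    # documented here; "path:line" only illustrates the mechanism.
    return hashlib.sha256(f"{file_path}:{start_line}".encode()).hexdigest()

before = location_hash("src/handler.py", 42)
after = location_hash("src/handler.py", 43)  # code shifted down one line
print(before != after)  # True: the old baseline entry no longer matches
```

The practical consequence: refactors that move code will resurface baselined findings as new, which is a prompt to re-review them rather than a bug.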

Viewing Baseline Contents

mcp-scan baseline show .mcp-scan-baseline.json

Merging Baselines

When multiple teams maintain separate baselines, merge them into one:

mcp-scan baseline merge team1.json team2.json --output combined.json

Baseline Best Practices

  1. Always document reasons – Include --reason and --accepted-by when generating baselines
  2. Version control your baselines – Commit .mcp-scan-baseline.json alongside your source code
  3. Review periodically – Quarterly review of all baseline entries to remove stale entries
  4. Separate by type – Maintain separate baselines for false positives, accepted risks, and legacy code, then merge for scanning
  5. Automate validation – In CI, check that the baseline does not grow beyond a threshold

Summary Section

The summary object provides aggregate statistics useful for dashboards and reporting:

{
  "summary": {
    "total": 15,
    "by_severity": {
      "critical": 3,
      "high": 5,
      "medium": 7
    },
    "by_class": {
      "A": 3,
      "E": 2,
      "L": 1
    },
    "by_language": {
      "python": 10,
      "typescript": 5
    }
  }
}

Use by_severity to understand the overall risk profile. Use by_class to identify which vulnerability categories need the most attention. Use by_language to see which parts of a multi-language project have the most issues.
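As an illustration, a small helper that condenses the summary object into headline numbers for a dashboard (risk_overview is this guide's sketch, not a scanner feature):

```python
def risk_overview(summary):
    """Condense a scan summary into headline numbers for reporting."""
    by_severity = summary.get("by_severity", {})
    by_class = summary.get("by_class", {})
    return {
        "total": summary["total"],
        # Critical and high findings are the ones that gate compliance levels.
        "urgent": by_severity.get("critical", 0) + by_severity.get("high", 0),
        # The vulnerability class with the most findings.
        "top_class": max(by_class, key=by_class.get, default=None),
    }

# The summary example shown above.
summary = {
    "total": 15,
    "by_severity": {"critical": 3, "high": 5, "medium": 7},
    "by_class": {"A": 3, "E": 2, "L": 1},
}
print(risk_overview(summary))  # {'total': 15, 'urgent': 8, 'top_class': 'A'}
```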