Interpreting Results

How to read mcp-scan output, understand findings, taint traces, and MSSS scores, and work with baselines for tracking known issues.

MCP Scanner produces structured output in three formats: JSON, SARIF, and Evidence Bundle. This guide explains how to read each format, interpret the key fields, and use baselines to manage findings over time.

JSON Output

JSON is the default output format. It contains the complete scan results in a single structured document.

mcp-scan scan . --output json

Top-Level Structure

{
  "version": "1.0.0",
  "scan_id": "20240115-143022-abc123",
  "timestamp": "2024-01-15T14:30:22Z",
  "config": { "..." : "..." },
  "manifest": { "..." : "..." },
  "mcp_surface": { "..." : "..." },
  "msss_score": { "..." : "..." },
  "findings": [ "..." ],
  "summary": { "..." : "..." },
  "duration": "2.345s"
}
| Field | Description |
| --- | --- |
| version | Output schema version |
| scan_id | Unique identifier for this scan run |
| timestamp | When the scan was performed (UTC) |
| config | Scan configuration used, including a SHA-256 hash for reproducibility |
| manifest | All scanned files with SHA-256 hashes and sizes |
| mcp_surface | Detected MCP tools, transport, and auth signals |
| msss_score | Security score breakdown and compliance level |
| findings | Array of all detected vulnerabilities |
| summary | Aggregate counts by severity, class, and language |
| duration | Total scan time |

Understanding Findings

Each finding represents a detected vulnerability. Here is a complete finding with all fields:

{
  "id": "abc123def456",
  "rule_id": "MCP-A003",
  "severity": "critical",
  "confidence": "high",
  "language": "python",
  "location": {
    "file": "src/handler.py",
    "start_line": 42,
    "start_col": 5,
    "end_line": 42,
    "end_col": 35
  },
  "mcp_context": {
    "tool_name": "run_command",
    "handler_name": "handle_run",
    "transport": "stdio"
  },
  "trace": {
    "source": {
      "file": "src/handler.py",
      "start_line": 30,
      "start_col": 10
    },
    "sink": {
      "file": "src/handler.py",
      "start_line": 42,
      "start_col": 5
    },
    "steps": [
      {
        "location": { "file": "src/handler.py", "start_line": 30 },
        "action": "source",
        "variable": "user_input"
      },
      {
        "location": { "file": "src/handler.py", "start_line": 35 },
        "action": "assign",
        "variable": "cmd"
      },
      {
        "location": { "file": "src/handler.py", "start_line": 42 },
        "action": "sink",
        "variable": "cmd"
      }
    ]
  },
  "evidence": {
    "snippet": "subprocess.run(cmd, shell=True)",
    "snippet_hash": "sha256:789abc..."
  },
  "description": "Direct shell command execution detected",
  "remediation": "Use subprocess with shell=False and explicit command list"
}

Key Finding Fields

Identification:

| Field | Description |
| --- | --- |
| id | Unique hash identifying this specific finding (used for baseline matching) |
| rule_id | The detection rule identifier (e.g., MCP-A003). The prefix letter indicates the vulnerability class |

Severity and confidence:

| Field | Values | Description |
| --- | --- | --- |
| severity | critical, high, medium, low, info | How dangerous the vulnerability is |
| confidence | high, medium, low | How certain the scanner is that this is a real vulnerability |

A critical severity, high confidence finding is the most urgent: a dangerous vulnerability that the scanner is very sure about. A low severity, low confidence finding may be a false positive or a minor issue worth reviewing later.
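That prioritization can be scripted. A minimal Python sketch follows: the severity and confidence values come from the finding schema above, while the triage helper itself is this guide's illustration, not a scanner feature.

```python
# Numeric ranks for sorting; the value sets come from the documented
# finding schema (severity and confidence fields).
SEVERITY_RANK = {"critical": 0, "high": 1, "medium": 2, "low": 3, "info": 4}
CONFIDENCE_RANK = {"high": 0, "medium": 1, "low": 2}

def triage(findings):
    """Sort findings so critical, high-confidence items come first."""
    return sorted(
        findings,
        key=lambda f: (SEVERITY_RANK[f["severity"]], CONFIDENCE_RANK[f["confidence"]]),
    )

findings = [
    {"id": "a", "severity": "low", "confidence": "low"},
    {"id": "b", "severity": "critical", "confidence": "high"},
    {"id": "c", "severity": "critical", "confidence": "medium"},
]
print([f["id"] for f in triage(findings)])  # ['b', 'c', 'a']
```

Feed it the findings array from the JSON output to get a review queue ordered by urgency.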

Location:

The location object pinpoints exactly where the vulnerability was found:

  • file – Relative path to the source file
  • start_line / end_line – Line range
  • start_col / end_col – Column range

MCP context:

The mcp_context object tells you which MCP tool is affected:

  • tool_name – The MCP tool name exposed to clients
  • handler_name – The function implementing the tool
  • transport – The transport type (e.g., stdio, sse)

Findings inside MCP tool handlers receive a 1.3x penalty multiplier because they are directly reachable from external clients.


Understanding Taint Traces

The trace object shows the complete data flow from source (where tainted data enters) to sink (where it reaches a dangerous operation).

Trace Structure

{
  "trace": {
    "source": { "file": "src/handler.py", "start_line": 30 },
    "sink": { "file": "src/handler.py", "start_line": 42 },
    "steps": [
      { "action": "source", "variable": "user_input", "location": { "..." : "..." } },
      { "action": "assign", "variable": "cmd", "location": { "..." : "..." } },
      { "action": "sink", "variable": "cmd", "location": { "..." : "..." } }
    ]
  }
}

Step Actions

| Action | Meaning |
| --- | --- |
| source | Where tainted data originates (tool parameter, user input, file read) |
| assign | Variable assignment propagating taint |
| call | Function call passing tainted data (deep mode only) |
| return | Return value carrying taint (deep mode only) |
| concat | String concatenation incorporating tainted data |
| sink | Dangerous operation receiving tainted data |

Reading a Trace

Follow the steps from top to bottom to understand the data flow:

  1. Source – “Where does the dangerous data come from?” (line 30: user_input parameter)
  2. Propagation – “How does it move through the code?” (line 35: assigned to cmd)
  3. Sink – “Where does it reach a dangerous operation?” (line 42: passed to shell execution)

In deep mode, traces can span multiple files and functions. Each step includes the file path so you can follow cross-file flows.
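Walking a trace this way is easy to script. A short Python sketch using the step fields shown above; format_trace is an illustrative helper, not part of mcp-scan:

```python
def format_trace(trace):
    """Render each taint-trace step as 'action variable file:line'."""
    lines = []
    for i, step in enumerate(trace["steps"], start=1):
        loc = step["location"]
        lines.append(
            f"{i}. {step['action']:<7} {step['variable']:<11} "
            f"{loc['file']}:{loc['start_line']}"
        )
    return lines

# The trace from the finding example earlier in this guide.
trace = {
    "steps": [
        {"location": {"file": "src/handler.py", "start_line": 30},
         "action": "source", "variable": "user_input"},
        {"location": {"file": "src/handler.py", "start_line": 35},
         "action": "assign", "variable": "cmd"},
        {"location": {"file": "src/handler.py", "start_line": 42},
         "action": "sink", "variable": "cmd"},
    ]
}
print("\n".join(format_trace(trace)))
```

Because each step carries its own file path, the same loop handles cross-file flows produced by deep mode.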


MSSS Score Breakdown

The MSSS (MCP Server Security Standard) score is a 0-100 numeric score that maps to certification compliance levels 0-3. Higher scores mean better security.

Score in JSON Output

{
  "msss_score": {
    "total": 49.0,
    "level": 0,
    "compliant": false,
    "version": "2.0",
    "categories": {
      "A": { "score": 7.0, "max_score": 22.0, "findings": 1, "penalties": 15.0 },
      "B": { "score": 0.0, "max_score": 13.0, "findings": 1, "penalties": 15.0 }
    },
    "score_breakdown": {
      "base_score": 100.0,
      "total_penalties": 30.0,
      "severity_multiplier": 0.70,
      "critical_count": 0,
      "high_count": 2,
      "formula": "(100 - 30.0) x 0.70 = 49.0"
    }
  }
}

Score Fields

| Field | Description |
| --- | --- |
| total | Final score (0-100) |
| level | Compliance level (0-3) |
| compliant | Whether the server meets minimum compliance |
| version | Scoring algorithm version |
| categories | Per-class breakdown showing score, maximum possible, findings count, and penalties |
| score_breakdown | Detailed calculation showing base score, penalties, multiplier, and formula |

Compliance Levels

| Level | Name | Score Required | Findings Allowed |
| --- | --- | --- | --- |
| 0 | Not Compliant | < 60, or any critical, or > 3 high | Any |
| 1 | Basic Compliance | >= 60 | Up to 3 high, 0 critical |
| 2 | Enterprise Ready | >= 80 | 0 high, 0 critical |
| 3 | Certified | >= 90 | 0 high, 0 critical |

How the Score Is Calculated

The v2.0 scoring formula is:

FinalScore = max(0, 100 - TotalPenalties) x SeverityMultiplier

Penalty per finding:

penalty = base_penalty x confidence_multiplier x mcp_multiplier

| Severity | Base Penalty |
| --- | --- |
| Critical | 25.0 |
| High | 15.0 |
| Medium | 5.0 |
| Low | 1.0 |
| Info | 0.2 |

| Confidence | Multiplier |
| --- | --- |
| High | 1.0 |
| Medium | 0.7 |
| Low | 0.4 |

| MCP Context | Multiplier |
| --- | --- |
| Inside tool handler | 1.3 |
| Not in tool handler | 1.0 |

Severity multiplier (compounding effect):

| Condition | Multiplier |
| --- | --- |
| No high or critical | 1.0 |
| 1 high (0 critical) | 0.85 |
| 2 high (0 critical) | 0.70 |
| 3 high (0 critical) | 0.55 |
| 4+ high (0 critical) | 0.45 |
| 1 critical | 0.50 |
| 2 critical | 0.35 |
| 3+ critical | 0.25 |
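Combining the penalty tables and the severity multiplier, the v2.0 formula can be sketched in Python. This is a reading aid, not the scanner's implementation: how the multiplier resolves when critical and high findings are mixed is an assumption here (the critical rows are taken to win), and the in_tool_handler flag stands in for inspecting mcp_context.

```python
# Values copied from the penalty tables above.
BASE_PENALTY = {"critical": 25.0, "high": 15.0, "medium": 5.0, "low": 1.0, "info": 0.2}
CONF_MULT = {"high": 1.0, "medium": 0.7, "low": 0.4}

def severity_multiplier(critical, high):
    # ASSUMPTION: critical rows take precedence when both counts are nonzero.
    if critical >= 3:
        return 0.25
    if critical == 2:
        return 0.35
    if critical == 1:
        return 0.50
    if high >= 4:
        return 0.45
    return {0: 1.0, 1: 0.85, 2: 0.70, 3: 0.55}[high]

def msss_score(findings):
    """FinalScore = max(0, 100 - TotalPenalties) x SeverityMultiplier."""
    penalties = sum(
        BASE_PENALTY[f["severity"]] * CONF_MULT[f["confidence"]]
        * (1.3 if f.get("in_tool_handler") else 1.0)
        for f in findings
    )
    crit = sum(1 for f in findings if f["severity"] == "critical")
    high = sum(1 for f in findings if f["severity"] == "high")
    return max(0.0, 100.0 - penalties) * severity_multiplier(crit, high)

two_high = [{"severity": "high", "confidence": "high"}] * 2
print(round(msss_score(two_high), 2))  # 49.0
```

The printed value reproduces the two-HIGH-findings case: (100 - 30) x 0.70 = 49.0.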

Score Calculation Examples

Clean server (no findings):

Penalties = 0, Multiplier = 1.0
Score = (100 - 0) x 1.0 = 100 --> Level 3

One HIGH finding (high confidence):

Penalty = 15.0 x 1.0 x 1.0 = 15.0
Multiplier = 0.85
Score = (100 - 15) x 0.85 = 72.25 --> Level 1

Two HIGH findings (high confidence):

Penalties = 15.0 + 15.0 = 30.0
Multiplier = 0.70
Score = (100 - 30) x 0.70 = 49.0 --> Level 0

One CRITICAL finding (high confidence):

Penalty = 25.0 x 1.0 x 1.0 = 25.0
Multiplier = 0.50
Score = (100 - 25) x 0.50 = 37.5 --> Level 0

Five MEDIUM findings (high confidence):

Penalties = 5.0 x 5 = 25.0
Multiplier = 1.0
Score = (100 - 25) x 1.0 = 75.0 --> Level 1

Improving Your Score

  1. Fix critical findings first – Each critical finding applies a 0.50x multiplier (or worse)
  2. Reduce high findings – Each additional high finding compounds the penalty
  3. Focus on high-weight classes – Class A (RCE, weight 22.0) has the most impact
  4. Fix findings in MCP tool handlers – They receive a 1.3x penalty boost

Level Progression

To reach Level 1 (>= 60):

  • Fix all critical findings
  • Reduce high findings to 3 or fewer
  • Fix the highest-weight vulnerabilities first

To reach Level 2 (>= 80):

  • Fix all high and critical findings
  • Address medium findings with the highest penalties

To reach Level 3 (>= 90):

  • Fix all high and critical findings
  • Minimize medium findings
  • Run deep mode analysis (required for full certification)

SARIF Output

SARIF (Static Analysis Results Interchange Format) version 2.1.0 is the standard format for integrating with CI/CD tools.

mcp-scan scan . --output sarif > results.sarif

Severity Mapping

MCP Scanner severities are mapped to SARIF levels:

| MCP-Scan Severity | SARIF Level |
| --- | --- |
| critical | error |
| high | error |
| medium | warning |
| low | note |
| info | note |
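The mapping is a direct table lookup, which is handy to mirror in scripts that post-process both formats (to_sarif_level is an illustrative helper, not a scanner API):

```python
# Direct encoding of the severity-to-SARIF mapping table above.
SARIF_LEVEL = {
    "critical": "error",
    "high": "error",
    "medium": "warning",
    "low": "note",
    "info": "note",
}

def to_sarif_level(severity):
    """Map an mcp-scan severity to its SARIF reporting level."""
    return SARIF_LEVEL[severity]

print(to_sarif_level("medium"))  # warning
```

Note that the mapping is lossy: critical and high both become error, so keep the JSON output when you need the original severity.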

Taint Traces in SARIF

Taint traces are represented as codeFlows with threadFlows in the SARIF output. Each step in the trace becomes a location in the thread flow, preserving the full data flow path for review in compatible tools.

Compatible Tools

SARIF output can be consumed by:

  • GitHub Code Scanning – Upload via github/codeql-action/upload-sarif
  • Azure DevOps – Native SARIF support in pipelines
  • VS Code – SARIF Viewer extension
  • GitLab – Security Dashboard (via SAST report)

Evidence Bundle

The evidence bundle is a directory-based format designed for security audits and compliance documentation.

mcp-scan scan . --mode deep --output evidence

Directory Structure

evidence-bundle/
  manifest.json         # All scanned files with SHA-256 hashes
  results.json          # Summary and aggregate statistics
  config.json           # Exact scan configuration used
  surface.json          # MCP surface analysis
  msss-score.json       # MSSS compliance breakdown
  evidences/
    finding-001.json    # Individual evidence per finding
    finding-002.json
    ...

Individual Evidence Files

Each finding gets its own evidence file with full context:

{
  "finding_id": "abc123def456",
  "rule_id": "MCP-A003",
  "severity": "critical",
  "confidence": "high",
  "location": {
    "file": "src/handler.py",
    "start_line": 42,
    "end_line": 42
  },
  "evidence": {
    "full_snippet": "def handle_run(cmd):\n    subprocess.run(cmd, shell=True)",
    "context_before": ["def handle_run(cmd):"],
    "context_after": ["    return result"],
    "snippet_hash": "sha256:789abc..."
  },
  "trace": { "..." : "..." },
  "mcp_context": { "..." : "..." },
  "analysis_metadata": {
    "mode": "deep",
    "scan_duration": "2.345s",
    "analyzer_version": "1.0.0"
  }
}

Evidence bundles are ideal for:

  • Formal security audits
  • Compliance documentation
  • Detailed code review where each finding needs individual assessment
  • Archiving scan results for regulatory requirements

Choosing a Format

| Feature | JSON | SARIF | Evidence Bundle |
| --- | --- | --- | --- |
| CI/CD integration | Good | Best | Limited |
| Human readability | Good | Poor | Best |
| GitHub Code Scanning | No | Yes | No |
| Audit trail | Good | Good | Best |
| MSSS score included | Yes | No | Yes |
| Individual evidence files | No | No | Yes |
| Taint traces | Yes | Yes (codeFlows) | Yes |
| File size | Medium | Large | Largest |

Recommendations:

  • JSON – General purpose, scripting, custom processing, daily scans
  • SARIF – CI/CD pipelines, GitHub/GitLab integration, IDE plugins
  • Evidence – Security audits, compliance documentation, formal review

Working with Baselines

Baselines let you track findings that have been reviewed and accepted, separating them from new findings in subsequent scans.

Creating a Baseline

# Generate baseline from current findings
mcp-scan baseline generate ./my-project \
  --reason "Initial security review" \
  --accepted-by "[email protected]"

Using a Baseline in Scans

# Scan with baseline applied -- accepted findings are excluded
mcp-scan scan . --baseline .mcp-scan-baseline.json

When a baseline is applied, the output includes a count of how many findings were filtered:

Total: 1 new finding (1 high)
Baselined: 3 findings filtered

In JSON output, the summary includes a baselined count:

{
  "summary": {
    "total": 1,
    "baselined": 3,
    "by_severity": { "high": 1 }
  }
}

Baseline Matching

Findings are matched to baseline entries using a combination of rule_id and location_hash (a SHA-256 hash of the file path and line number). If the code moves to a different line, the baseline entry will no longer match, and the finding will appear as new.
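A hedged sketch of why line movement invalidates a baseline entry. The exact canonical string mcp-scan feeds into SHA-256 is not documented here, so the path:line form below is an illustrative assumption:

```python
import hashlib

def location_hash(file_path, start_line):
    # HYPOTHETICAL: the real canonical form mcp-scan hashes is not
    # documented here; "path:line" only illustrates the mechanism.
    return hashlib.sha256(f"{file_path}:{start_line}".encode()).hexdigest()

before = location_hash("src/handler.py", 42)
after = location_hash("src/handler.py", 43)  # code shifted down one line
print(before != after)  # True: the old baseline entry no longer matches
```

The practical consequence: refactors that move code will resurface baselined findings as new, which is a prompt to re-review them rather than a bug.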

Viewing Baseline Contents

mcp-scan baseline show .mcp-scan-baseline.json

Merging Baselines

When multiple teams maintain separate baselines, merge them into one:

mcp-scan baseline merge team1.json team2.json --output combined.json

Baseline Best Practices

  1. Always document reasons – Include --reason and --accepted-by when generating baselines
  2. Version control your baselines – Commit .mcp-scan-baseline.json alongside your source code
  3. Review periodically – Quarterly review of all baseline entries to remove stale entries
  4. Separate by type – Maintain separate baselines for false positives, accepted risks, and legacy code, then merge for scanning
  5. Automate validation – In CI, check that the baseline does not grow beyond a threshold

Summary Section

The summary object provides aggregate statistics useful for dashboards and reporting:

{
  "summary": {
    "total": 15,
    "by_severity": {
      "critical": 3,
      "high": 5,
      "medium": 7
    },
    "by_class": {
      "A": 3,
      "E": 2,
      "L": 1
    },
    "by_language": {
      "python": 10,
      "typescript": 5
    }
  }
}

Use by_severity to understand the overall risk profile. Use by_class to identify which vulnerability categories need the most attention. Use by_language to see which parts of a multi-language project have the most issues.
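As an illustration, a small helper that condenses the summary object into headline numbers for a dashboard (risk_overview is this guide's sketch, not a scanner feature):

```python
def risk_overview(summary):
    """Condense a scan summary into headline numbers for reporting."""
    by_severity = summary.get("by_severity", {})
    by_class = summary.get("by_class", {})
    return {
        "total": summary["total"],
        # Critical and high findings are the ones that gate compliance levels.
        "urgent": by_severity.get("critical", 0) + by_severity.get("high", 0),
        # The vulnerability class with the most findings.
        "top_class": max(by_class, key=by_class.get, default=None),
    }

# The summary example shown above.
summary = {
    "total": 15,
    "by_severity": {"critical": 3, "high": 5, "medium": 7},
    "by_class": {"A": 3, "E": 2, "L": 1},
}
print(risk_overview(summary))  # {'total': 15, 'urgent': 8, 'top_class': 'A'}
```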