Security Standards - CraftedTrust Methodology

Score model

100 points across 12 categories

The score buyers see is a 0-100 point model. Each category has a fixed weight, and the weights always add up to 100.

Underlying checks

63 checks across 9 research domains

Touchstone runs the deeper checks. Those checks feed the 12 public categories so the registry stays readable without hiding the underlying work.

Scan depth

Coverage changes confidence

A score carries more confidence when scan coverage is deeper. Metadata-only evidence is weaker than package verification, live endpoint scans, or manual review.

Authentication & Access

Identity & Auth

10 points. Auth requirements, documented auth flow, and obvious credential-handling risk.

Authentication & Access

Permission Scope

8 points. Whether the server asks for more power than it appears to need.

Server Security

Transport Security

8 points. HTTPS, TLS posture, and basic transport-layer safety signals.

Server Security

Network Behavior

10 points. Observed outbound behavior, undeclared connections, and suspicious network activity.

Server Security

Protocol Compliance

8 points. MCP compatibility, capability negotiation, and basic protocol correctness.

Tool Safety

Declaration Accuracy

8 points. Whether declared tools and resources match what is actually exposed.

Tool Safety

Tool Integrity

10 points. Prompt-injection risk, tool tampering patterns, and risky hidden behavior.

Tool Safety

Input Validation

8 points. Input constraints, schema quality, and common injection resistance signals.

Supply Chain

Supply Chain

8 points. Dependency risk, package provenance, and known vulnerability exposure.

Supply Chain

Code Transparency

6 points. Source availability, repository health, and basic documentation quality.

Supply Chain

Publisher Trust

8 points. Verified publisher signals, review history, and public accountability.

Data Handling

Data Protection

8 points. Exposure of credentials, sensitive data, and avoidable data-handling risk.
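
The twelve weights listed above can be written as a small lookup table, which makes the sum-to-100 property easy to check. This is an illustrative sketch only. The names `CATEGORY_WEIGHTS` and `total_score` are hypothetical, and the idea that each category earns a fraction of its weight is an assumption, since the methodology does not state how points inside a category are awarded.

```python
# Category weights as listed in the methodology (points per category).
CATEGORY_WEIGHTS = {
    "Identity & Auth": 10,
    "Permission Scope": 8,
    "Transport Security": 8,
    "Network Behavior": 10,
    "Protocol Compliance": 8,
    "Declaration Accuracy": 8,
    "Tool Integrity": 10,
    "Input Validation": 8,
    "Supply Chain": 8,
    "Code Transparency": 6,
    "Publisher Trust": 8,
    "Data Protection": 8,
}

# The weights are fixed and add up to 100 across 12 categories.
assert len(CATEGORY_WEIGHTS) == 12
assert sum(CATEGORY_WEIGHTS.values()) == 100


def total_score(fractions):
    """Combine per-category results into a 0-100 score.

    ASSUMPTION: `fractions` maps a category name to the share of its
    weight that was earned (0.0 to 1.0). Categories that are missing
    from the mapping are treated as earning nothing.
    """
    total = 0.0
    for category, weight in CATEGORY_WEIGHTS.items():
        share = fractions.get(category, 0.0)
        if not 0.0 <= share <= 1.0:
            raise ValueError(f"share for {category!r} must be in [0, 1]")
        total += weight * share
    return round(total)
```

Under these assumptions, a server that earns full marks everywhere scores 100, and one that earns half of Identity & Auth and nothing else scores 5.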

Research domain	Checks	Feeds these public categories
Authentication	9	Identity & Auth, Permission Scope
Tool Security	10	Declaration Accuracy, Tool Integrity
Input Validation	9	Input Validation, Data Protection
Data Security	6	Data Protection, Network Behavior
Supply Chain	8	Supply Chain, Code Transparency, Publisher Trust
Infrastructure	8	Transport Security, Network Behavior, Protocol Compliance
Runtime Behavior	5	Tool Integrity, Network Behavior, Protocol Compliance
A2A Agent Cards	5	Declaration Accuracy, Protocol Compliance
Fairness & Bias	3	Data Protection, Publisher Trust
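
The table above can be cross-checked with a few lines of code: the check counts should total 63, and every one of the 12 public categories should be fed by at least one research domain. The structure below is a sketch for illustration. The names `RESEARCH_DOMAINS` and `feeds` are hypothetical and do not describe how Touchstone stores this mapping.

```python
# Research domains from the table: name -> (check count, categories fed).
RESEARCH_DOMAINS = {
    "Authentication": (9, ["Identity & Auth", "Permission Scope"]),
    "Tool Security": (10, ["Declaration Accuracy", "Tool Integrity"]),
    "Input Validation": (9, ["Input Validation", "Data Protection"]),
    "Data Security": (6, ["Data Protection", "Network Behavior"]),
    "Supply Chain": (8, ["Supply Chain", "Code Transparency", "Publisher Trust"]),
    "Infrastructure": (8, ["Transport Security", "Network Behavior", "Protocol Compliance"]),
    "Runtime Behavior": (5, ["Tool Integrity", "Network Behavior", "Protocol Compliance"]),
    "A2A Agent Cards": (5, ["Declaration Accuracy", "Protocol Compliance"]),
    "Fairness & Bias": (3, ["Data Protection", "Publisher Trust"]),
}

# Total number of underlying checks across all domains.
total_checks = sum(count for count, _ in RESEARCH_DOMAINS.values())

# Invert the table: public category -> research domains that feed it.
feeds = {}
for domain, (_, categories) in RESEARCH_DOMAINS.items():
    for category in categories:
        feeds.setdefault(category, []).append(domain)

assert total_checks == 63
assert len(RESEARCH_DOMAINS) == 9
assert len(feeds) == 12
```

Inverting the table shows that Network Behavior, Protocol Compliance, and Data Protection each draw on three research domains, while the remaining categories draw on one or two.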

Not every check carries the same weight, and not every scan runs every check. That is why CraftedTrust separates the public score categories from the underlying research domains and then shows scan depth and confidence separately.

Depth 1

Metadata only

Basic listing information is present, but package or live behavior has not been fully verified yet. Lowest confidence.

Depth 2

Package verified

Package and source metadata were reviewed. Useful for supply-chain evidence, but still lighter than a live scan.

Depth 3

Live endpoint reached

CraftedTrust successfully contacted the live server and recorded behavior. This is stronger evidence for buyer review.

Depth 4

Manual review performed

A deeper publisher review or certification pass exists. This adds the strongest public confidence signal.
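
Because the four depths are ordered, they can be modeled as an ordered enumeration that is reported next to the score rather than folded into it. The sketch below is an assumption about representation only. `ScanDepth`, `DEPTH_LABELS`, and `describe` are hypothetical names, and the methodology does not define a numeric confidence formula, so none is invented here.

```python
from enum import IntEnum


class ScanDepth(IntEnum):
    """Scan depth levels; a higher value means deeper coverage."""
    METADATA_ONLY = 1
    PACKAGE_VERIFIED = 2
    LIVE_ENDPOINT_REACHED = 3
    MANUAL_REVIEW_PERFORMED = 4


# Human-readable labels, taken from the depth headings above.
DEPTH_LABELS = {
    ScanDepth.METADATA_ONLY: "Metadata only",
    ScanDepth.PACKAGE_VERIFIED: "Package verified",
    ScanDepth.LIVE_ENDPOINT_REACHED: "Live endpoint reached",
    ScanDepth.MANUAL_REVIEW_PERFORMED: "Manual review performed",
}


def describe(score, depth):
    """Format a score with its scan depth, keeping the two facts separate."""
    depth = ScanDepth(depth)  # raises ValueError for an unknown depth
    return f"{score}/100 (Depth {depth.value}: {DEPTH_LABELS[depth]})"
```

For example, `describe(82, 3)` returns `"82/100 (Depth 3: Live endpoint reached)"`, and comparisons such as `ScanDepth.PACKAGE_VERIFIED < ScanDepth.LIVE_ENDPOINT_REACHED` follow the ordering described above.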

Grades

Letter grades stay fixed

A: 90-100
B: 75-89
C: 60-74
D: 40-59
F: 0-39
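
The fixed bands above translate directly into a threshold lookup. This is a minimal sketch. `GRADE_BANDS` and `letter_grade` are hypothetical names, and the handling of non-integer scores (each falls into the band whose lower bound it reaches) is an assumption, since the published bands are given as whole numbers.

```python
# Lower bound of each grade band, highest first (from the list above).
GRADE_BANDS = [(90, "A"), (75, "B"), (60, "C"), (40, "D"), (0, "F")]


def letter_grade(score):
    """Map a 0-100 score to its letter grade.

    ASSUMPTION: a fractional score such as 89.5 is graded by the highest
    lower bound it reaches, so it stays in the B band.
    """
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    for lower_bound, grade in GRADE_BANDS:
        if score >= lower_bound:
            return grade
```

For instance, `letter_grade(90)` returns `"A"`, `letter_grade(74)` returns `"C"`, and `letter_grade(39)` returns `"F"`.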

Framework mapping

Support material, not a second score

Mappings to CoSAI, to OWASP guidance on MCP and agentic AI, and to selected buyer diligence frameworks help translate findings. They do not replace the core 12-category score model.


How CraftedTrust scores MCP servers.
