SKILL PLAYGROUND · SAFETY SCANNER

Skillcheck

Skillcheck verifies third-party skills before you deploy them. Run tasks in a controlled sandbox, capture evidence, and scan for safety issues before anything touches production.

Request Skillcheck preview · See the flow

Skillcheck v0 is a static concept page. The runtime + scanner are still being built.

Example Skill Report

Skill

Supply Chain Alert Bot

Task

Monitor API drift + notify Slack

Risk Scan

2 warnings · 0 blocks

Evidence

Trace log + artifacts bundle

Every run ships with a reproducible trace, inputs, outputs, and a safety summary.

Bring any skill

Point to a SKILL.md or bundle. Skillcheck pulls inputs, prompts, and constraints into a single run sheet.
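
As a rough illustration of what a run sheet might hold, the Python sketch below collects a skill's inputs, prompts, and constraints into one record. The `RunSheet` class, its field names, the `build_run_sheet` helper, and the example path are all assumptions made for this example; Skillcheck has not published a schema.

```python
from dataclasses import dataclass, field


@dataclass
class RunSheet:
    """Hypothetical run sheet: everything one run needs, in one place."""
    skill_source: str                                  # path or URL to a SKILL.md or bundle
    task: str                                          # what the skill is asked to do
    inputs: dict = field(default_factory=dict)
    prompts: list = field(default_factory=list)
    constraints: list = field(default_factory=list)


def build_run_sheet(skill_source: str, task: str, bundle: dict) -> RunSheet:
    """Copy the relevant parts of an already-parsed bundle into a run sheet.

    `bundle` is assumed to be a plain dict; missing keys fall back to empty values.
    """
    return RunSheet(
        skill_source=skill_source,
        task=task,
        inputs=dict(bundle.get("inputs", {})),
        prompts=list(bundle.get("prompts", [])),
        constraints=list(bundle.get("constraints", [])),
    )


sheet = build_run_sheet(
    "skills/supply-chain-alert/SKILL.md",              # illustrative path, not a real one
    "Monitor API drift + notify Slack",
    {
        "prompts": ["Compare today's API schema to the baseline."],
        "constraints": ["read-only APIs"],
    },
)
print(sheet.task)
```

Keeping the run sheet as a single plain record means the same object can be logged alongside the trace, so a reviewer sees exactly what the run was given.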

Run with guardrails

Execute in a controlled environment with explicit tool policies and logged side effects.
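
One way to picture an explicit tool policy with logged side effects is an allow-list wrapper that records every tool call attempt before deciding whether to run it. This is a minimal sketch, not the Skillcheck runtime: `ToolPolicy`, `PolicyViolation`, and the stand-in tools `fake_http_get` and `fake_slack_post` are hypothetical names.

```python
from dataclasses import dataclass, field


class PolicyViolation(Exception):
    """Raised when a skill requests a tool the policy does not allow."""


@dataclass
class ToolPolicy:
    allowed_tools: frozenset
    side_effects: list = field(default_factory=list)   # log of every attempted call

    def call(self, tool_name, tool_fn, *args):
        allowed = tool_name in self.allowed_tools
        # Log the attempt before acting, so denied calls are recorded too.
        self.side_effects.append(
            {"tool": tool_name, "args": list(args), "allowed": allowed}
        )
        if not allowed:
            raise PolicyViolation(f"tool not permitted: {tool_name}")
        return tool_fn(*args)


def fake_http_get(url):
    return {"url": url, "status": 200}                 # stand-in for a read-only call


def fake_slack_post(message):
    return "sent"                                      # stand-in for a write action


policy = ToolPolicy(allowed_tools=frozenset({"http_get"}))
result = policy.call("http_get", fake_http_get, "https://example.com/api/schema")

try:
    policy.call("slack_post", fake_slack_post, "drift detected")
    blocked = False
except PolicyViolation:
    blocked = True

print(blocked, len(policy.side_effects))
```

Logging before the permission check is a deliberate choice in this sketch: the denied attempt is often the most useful line in the audit trail.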

Ship evidence

Each result includes the trace, outputs, and a structured safety report you can audit.
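
A structured safety report could be as simple as a list of typed findings that rolls up into the one-line summary shown in the example report. The sketch below assumes two severity levels, "warning" and "block"; the `Finding` class, the rule names, and the sample findings are invented for illustration and do not come from a real scan.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    severity: str   # "warning" or "block" in this sketch
    rule: str       # hypothetical rule identifier
    detail: str


def summarize(findings):
    """Roll findings up into a 'N warnings · M blocks' summary line."""
    counts = Counter(f.severity for f in findings)
    return f"{counts['warning']} warnings · {counts['block']} blocks"


def is_deployable(findings):
    """In this sketch, any 'block' finding stops the deploy; warnings do not."""
    return all(f.severity != "block" for f in findings)


findings = [
    Finding("warning", "network-egress", "skill posts to an external webhook"),
    Finding("warning", "unpinned-dependency", "config references a floating version"),
]

print(summarize(findings))       # 2 warnings · 0 blocks
print(is_deployable(findings))   # True
```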

Why now

Five signals making Skillcheck urgent right now.

Model choices are multiplying; teams need cross-vendor evaluation to pick the right stack.
Safety expectations are rising; audit-ready evidence is becoming a baseline requirement.
LLM features are shipping weekly; repeatable skill tests are a dependable way to catch regressions before release.
Safety tooling is fragmented; a unified skill + safety score reduces decision friction.
Cost pressure is real; measurable skill performance is required to justify spend.

Example Skillcheck

A concrete snapshot of what a skill review looks like before it ships.

Scenario

Vendor risk monitor

Scan new vendors weekly, flag anomalies, and notify the risk queue.

Inputs

SKILL.md + config bundle
Fixture dataset (100 vendors)
Tool policy: read-only APIs

Outputs

Trace log + JSON bundle
Safety scan summary
Risk alerts with evidence

How Skillcheck works

Three steps, one truth: prove the skill before you deploy it.

Ingest

Import a skill definition, configuration, and target task.

Execute

Run the skill against a safe fixture with bounded tools and explicit permissions.

Verify

Review the evidence pack: trace, outputs, and safety annotations.
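
For an evidence pack to be auditable, a reviewer needs some way to tell that it has not changed since the run. One common approach, sketched here under assumptions, is to serialize the pack deterministically and store a SHA-256 digest next to it. The function names and the pack layout are hypothetical; only `json.dumps` and `hashlib.sha256` are real library calls.

```python
import hashlib
import json


def _canonical(payload):
    # sort_keys plus fixed separators give a stable string for the same content
    return json.dumps(payload, sort_keys=True, separators=(",", ":"))


def build_evidence_pack(trace, outputs, annotations):
    """Bundle trace, outputs, and safety annotations with a content digest."""
    payload = {
        "trace": trace,
        "outputs": outputs,
        "safety_annotations": annotations,
    }
    digest = hashlib.sha256(_canonical(payload).encode("utf-8")).hexdigest()
    return {"payload": payload, "sha256": digest}


def verify_evidence_pack(pack):
    """Recompute the digest and compare it with the stored one."""
    digest = hashlib.sha256(_canonical(pack["payload"]).encode("utf-8")).hexdigest()
    return digest == pack["sha256"]


pack = build_evidence_pack(
    trace=[{"step": 1, "tool": "http_get", "allowed": True}],
    outputs={"alerts": []},
    annotations=[],
)
ok_before = verify_evidence_pack(pack)

# Simulate someone editing the outputs after the run finished.
pack["payload"]["outputs"]["alerts"].append("edited after the run")
ok_after = verify_evidence_pack(pack)

print(ok_before, ok_after)   # True False
```

A digest only detects changes; it does not say who made them. Signing the digest would be a separate step outside this sketch.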

Preview console (static)

Sketch what you want to test. The runtime button is wired to a placeholder for now.

Skill URL · Task prompt

Coming soon.

Now

Visual + narrative v0

Clear story, honest scope, and a preview layout ready for runtime wiring.

Next

Execution harness

Run skills in a sandbox with tool policies, evidence capture, and trace bundling.

Later

Safety scanner

Automatic policy checks, diff reports, and deploy-ready confidence scores.