feat: initial commit — BMAD tooling, Claude memories, firmware scaffold

Adds the complete project foundation:
- BMAD BMM workflow tooling (_bmad/)
- Claude slash commands, skills, and project memories (.claude/)
- ESP32 firmware scaffold (PlatformIO + Waveshare e-ink driver)
- .gitignore excluding _bmad-output/ and .pio/ build artifacts

Planning artifacts (PRD, architecture, epics) are intentionally not
tracked — they live in _bmad-output/ per project convention.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
2026-04-27 15:38:46 -04:00
parent f40d902b7c
commit a536baabd6
2010 changed files with 352821 additions and 0 deletions
@@ -0,0 +1,46 @@
# Quality Dimensions — Quick Reference
Six dimensions to keep in mind when building agent skills. The quality scanners check these automatically during optimization — this is a mental checklist for the build phase.
## 1. Informed Autonomy
The executing agent needs enough context to make judgment calls when situations don't match the script. The Overview section establishes this: domain framing, theory of mind, design rationale.
- Simple agents with 1-2 capabilities need minimal context
- Agents with memory, autonomous mode, or complex capabilities need domain understanding, user perspective, and rationale for non-obvious choices
- When in doubt, explain *why* — an agent that understands the mission improvises better than one following blind steps
## 2. Intelligence Placement
Scripts handle plumbing (fetch, transform, validate). Prompts handle judgment (interpret, classify, decide).
**Test:** If a script contains an `if` that decides what content *means*, intelligence has leaked.
**Reverse test:** If a prompt validates structure, counts items, parses known formats, compares against schemas, or checks file existence — determinism has leaked into the LLM. That work belongs in a script. Scripts have access to full bash, Python with standard library plus PEP 723 dependencies, and system tools — think broadly about what can be offloaded.
## 3. Progressive Disclosure
SKILL.md stays focused. Detail goes where it belongs.
- Capability instructions → `./references/`
- Reference data, schemas, large tables → `./references/`
- Templates, starter files → `./assets/`
- Memory discipline → `./references/memory-system.md`
- Multi-capability SKILL.md under ~250 lines: fine as-is
- Single-purpose up to ~500 lines: acceptable if focused
## 4. Description Format
Two parts: `[5-8 word summary]. [Use when user says 'X' or 'Y'.]`
Default to conservative triggering. See `./references/standard-fields.md` for full format and examples.
## 5. Path Construction
Only use `{project-root}` for `_bmad` paths. Config variables used directly — they already contain `{project-root}`.
See `./references/standard-fields.md` for correct/incorrect patterns.
## 6. Token Efficiency
Remove genuine waste (repetition, defensive padding, meta-explanation). Preserve context that enables judgment (domain framing, theory of mind, design rationale). These are different things — the prompt-craft scanner distinguishes between them.
@@ -0,0 +1,343 @@
# Quality Scan Script Opportunities — Reference Guide
**Reference: `references/script-standards.md` for script creation guidelines.**
This document identifies deterministic operations that should be offloaded from the LLM into scripts for quality validation of BMad agents.
---
## Core Principle
Scripts validate structure and syntax (deterministic). Prompts evaluate semantics and meaning (judgment). Create scripts for checks that have clear pass/fail criteria.
---
## How to Spot Script Opportunities
During build, walk through every capability/operation and apply these tests:
### The Determinism Test
For each operation the agent performs, ask:
- Given identical input, will this ALWAYS produce identical output? → Script
- Does this require interpreting meaning, tone, context, or ambiguity? → Prompt
- Could you write a unit test with expected output for every input? → Script
### The Judgment Boundary
Scripts handle: fetch, transform, validate, count, parse, compare, extract, format, check structure
Prompts handle: interpret, classify with ambiguity, create, decide with incomplete info, evaluate quality, synthesize meaning
### Pattern Recognition Checklist
Table of signal verbs/patterns mapping to script types:
| Signal Verb/Pattern | Script Type |
|---------------------|-------------|
| "validate", "check", "verify" | Validation script |
| "count", "tally", "aggregate", "sum" | Metric/counting script |
| "extract", "parse", "pull from" | Data extraction script |
| "convert", "transform", "format" | Transformation script |
| "compare", "diff", "match against" | Comparison script |
| "scan for", "find all", "list all" | Pattern scanning script |
| "check structure", "verify exists" | File structure checker |
| "against schema", "conforms to" | Schema validation script |
| "graph", "map dependencies" | Dependency analysis script |
### The Outside-the-Box Test
Beyond obvious validation, consider:
- Could any data gathering step be a script that returns structured JSON for the LLM to interpret?
- Could pre-processing reduce what the LLM needs to read?
- Could post-processing validate what the LLM produced?
- Could metric collection feed into LLM decision-making without the LLM doing the counting?
### Your Toolbox
Scripts have access to full capabilities — think broadly:
- **Bash**: Full shell — `jq`, `grep`, `awk`, `sed`, `find`, `diff`, `wc`, `sort`, `uniq`, `curl`, plus piping and composition
- **Python**: Standard library (`json`, `yaml`, `pathlib`, `re`, `argparse`, `collections`, `difflib`, `ast`, `csv`, `xml`, etc.) plus PEP 723 inline-declared dependencies (`tiktoken`, `jsonschema`, `pyyaml`, etc.)
- **System tools**: `git` commands for history/diff/blame, filesystem operations, process execution
If you can express the logic as deterministic code, it's a script candidate.
### The --help Pattern
All scripts use PEP 723 and `--help`. When a skill's prompt needs to invoke a script, it can say "Run `scripts/foo.py --help` to understand inputs/outputs, then invoke appropriately" instead of inlining the script's interface. This saves tokens in prompts and keeps a single source of truth for the script's API.
---
## Priority 1: High-Value Validation Scripts
### 1. Frontmatter Validator
**What:** Validate SKILL.md frontmatter structure and content
**Why:** Frontmatter is the #1 factor in skill triggering. Catch errors early.
**Checks:**
```python
# checks:
- name exists and is kebab-case
- description exists and follows pattern "Use when..."
- No forbidden fields (XML, reserved prefixes)
- Optional fields have valid values if present
```
**Output:** JSON with pass/fail per field, line numbers for errors
**Implementation:** Python with argparse, no external deps needed
---
### 2. Template Artifact Scanner
**What:** Scan for orphaned template substitution artifacts
**Why:** Build process may leave `{if-autonomous}`, `{displayName}`, etc.
**Output:** JSON with file path, line number, artifact type
**Implementation:** Bash script with JSON output via jq
---
### 3. Access Boundaries Extractor
**What:** Extract and validate access boundaries from memory-system.md
**Why:** Security critical — must be defined before file operations
**Checks:**
```python
# Parse memory-system.md for:
- ## Read Access section exists
- ## Write Access section exists
- ## Deny Zones section exists (can be empty)
- Paths use placeholders correctly ({project-root} for _bmad paths, relative for skill-internal)
```
**Output:** Structured JSON of read/write/deny zones
**Implementation:** Python with markdown parsing
---
---
## Priority 2: Analysis Scripts
### 4. Token Counter
**What:** Count tokens in each file of an agent
**Why:** Identify verbose files that need optimization
**Checks:**
```python
# For each .md file:
- Total tokens (approximate: chars / 4)
- Code block tokens
- Token density (tokens / meaningful content)
```
**Output:** JSON with file path, token count, density score
**Implementation:** Python with tiktoken for accurate counting, or char approximation
---
### 5. Dependency Graph Generator
**What:** Map skill → external skill dependencies
**Why:** Understand agent's dependency surface
**Checks:**
```python
# Parse SKILL.md for skill invocation patterns
# Parse prompt files for external skill references
# Build dependency graph
```
**Output:** DOT format (GraphViz) or JSON adjacency list
**Implementation:** Python, JSON parsing only
---
### 6. Activation Flow Analyzer
**What:** Parse SKILL.md On Activation section for sequence
**Why:** Validate activation order matches best practices
**Checks:**
Validate that the activation sequence is logically ordered (e.g., config loads before config is used, memory loads before memory is referenced).
**Output:** JSON with detected steps, missing steps, out-of-order warnings
**Implementation:** Python with regex pattern matching
---
### 7. Memory Structure Validator
**What:** Validate memory-system.md structure
**Why:** Memory files have specific requirements
**Checks:**
```python
# Required sections:
- ## Core Principle
- ## File Structure
- ## Write Discipline
- ## Memory Maintenance
```
**Output:** JSON with missing sections, validation errors
**Implementation:** Python with markdown parsing
---
### 8. Subagent Pattern Detector
**What:** Detect if agent uses BMAD Advanced Context Pattern
**Why:** Agents processing 5+ sources MUST use subagents
**Checks:**
```python
# Pattern detection in SKILL.md:
- "DO NOT read sources yourself"
- "delegate to sub-agents"
- "/tmp/analysis-" temp file pattern
- Sub-agent output template (50-100 token summary)
```
**Output:** JSON with pattern found/missing, recommendations
**Implementation:** Python with keyword search and context extraction
---
## Priority 3: Composite Scripts
### 9. Agent Health Check
**What:** Run all validation scripts and aggregate results
**Why:** One-stop shop for agent quality assessment
**Composition:** Runs Priority 1 scripts, aggregates JSON outputs
**Output:** Structured health report with severity levels
**Implementation:** Bash script orchestrating Python scripts, jq for aggregation
---
### 10. Comparison Validator
**What:** Compare two versions of an agent for differences
**Why:** Validate changes during iteration
**Checks:**
```bash
# Git diff with structure awareness:
- Frontmatter changes
- Capability additions/removals
- New prompt files
- Token count changes
```
**Output:** JSON with categorized changes
**Implementation:** Bash with git, jq, python for analysis
---
## Script Output Standard
All scripts MUST output structured JSON for agent consumption:
```json
{
"script": "script-name",
"version": "1.0.0",
"agent_path": "/path/to/agent",
"timestamp": "2025-03-08T10:30:00Z",
"status": "pass|fail|warning",
"findings": [
{
"severity": "critical|high|medium|low|info",
"category": "structure|security|performance|consistency",
"location": {"file": "SKILL.md", "line": 42},
"issue": "Clear description",
"fix": "Specific action to resolve"
}
],
"summary": {
"total": 10,
"critical": 1,
"high": 2,
"medium": 3,
"low": 4
}
}
```
---
## Implementation Checklist
When creating validation scripts:
- [ ] Uses `--help` for documentation
- [ ] Accepts `--agent-path` for target agent
- [ ] Outputs JSON to stdout
- [ ] Writes diagnostics to stderr
- [ ] Returns meaningful exit codes (0=pass, 1=fail, 2=error)
- [ ] Includes `--verbose` flag for debugging
- [ ] Has tests in `scripts/tests/` subfolder
- [ ] Self-contained (PEP 723 for Python)
- [ ] No interactive prompts
---
## Integration with Quality Optimizer
The Quality Optimizer should:
1. **First**: Run available scripts for fast, deterministic checks
2. **Then**: Use sub-agents for semantic analysis (requires judgment)
3. **Finally**: Synthesize both sources into report
**Example flow:**
```bash
# Run all validation scripts
python scripts/validate-frontmatter.py --agent-path {path}
bash scripts/scan-template-artifacts.sh --agent-path {path}
# Collect JSON outputs
# Spawn sub-agents only for semantic checks
# Synthesize complete report
```
---
## Script Creation Priorities
**Phase 1 (Immediate value):**
1. Template Artifact Scanner (Bash + jq)
2. Access Boundaries Extractor (Python)
**Phase 2 (Enhanced validation):**
4. Token Counter (Python)
5. Subagent Pattern Detector (Python)
6. Activation Flow Analyzer (Python)
**Phase 3 (Advanced features):**
7. Dependency Graph Generator (Python)
8. Memory Structure Validator (Python)
9. Agent Health Check orchestrator (Bash)
**Phase 4 (Comparison tools):**
10. Comparison Validator (Bash + Python)
@@ -0,0 +1,109 @@
# Skill Authoring Best Practices
For field definitions and description format, see `./standard-fields.md`. For quality dimensions, see `./quality-dimensions.md`.
## Core Philosophy: Outcome-Based Authoring
Skills should describe **what to achieve**, not **how to achieve it**. The LLM is capable of figuring out the approach — it needs to know the goal, the constraints, and the why.
**The test for every instruction:** Would removing this cause the LLM to produce a worse outcome? If the LLM would do it anyway — or if it's just spelling out mechanical steps — cut it.
### Outcome vs Prescriptive
| Prescriptive (avoid) | Outcome-based (prefer) |
|---|---|
| "Step 1: Ask about goals. Step 2: Ask about constraints. Step 3: Summarize and confirm." | "Ensure the user's vision is fully captured — goals, constraints, and edge cases — before proceeding." |
| "Load config. Read user_name. Read communication_language. Greet the user by name in their language." | "Load available config and greet the user appropriately." |
| "Create a file. Write the header. Write section 1. Write section 2. Save." | "Produce a report covering X, Y, and Z." |
The prescriptive versions miss requirements the author didn't think of. The outcome-based versions let the LLM adapt to the actual situation.
### Why This Works
- **Why over what** — When you explain why something matters, the LLM adapts to novel situations. When you just say what to do, it follows blindly even when it shouldn't.
- **Context enables judgment** — Give domain knowledge, constraints, and goals. The LLM figures out the approach. It's better at adapting to messy reality than any script you could write.
- **Prescriptive steps create brittleness** — When reality doesn't match the script, the LLM either follows the wrong script or gets confused. Outcomes let it adapt.
- **Every instruction should carry its weight** — If the LLM would do it anyway, the instruction is noise. If the LLM wouldn't know to do it without being told, that's signal.
### When Prescriptive Is Right
Reserve exact steps for **fragile operations** where getting it wrong has consequences — script invocations, exact file paths, specific CLI commands, API calls with precise parameters. These need low freedom because there's one right way to do them.
| Freedom | When | Example |
|---------|------|---------|
| **High** (outcomes) | Multiple valid approaches, LLM judgment adds value | "Ensure the user's requirements are complete" |
| **Medium** (guided) | Preferred approach exists, some variation OK | "Present findings in a structured report with an executive summary" |
| **Low** (exact) | Fragile, one right way, consequences for deviation | `python3 scripts/scan-path-standards.py {skill-path}` |
## Patterns
These are patterns that naturally emerge from outcome-based thinking. Apply them when they fit — they're not a checklist.
### Soft Gate Elicitation
At natural transitions, invite contribution without demanding it: "Anything else, or shall we move on?" Users almost always remember one more thing when given a graceful exit ramp. This produces richer artifacts than rigid section-by-section questioning.
### Intent-Before-Ingestion
Understand why the user is here before scanning documents or project context. Intent gives you the relevance filter — without it, scanning is noise.
### Capture-Don't-Interrupt
When users provide information beyond the current scope, capture it for later rather than redirecting. Users in creative flow share their best insights unprompted — interrupting loses them.
### Dual-Output: Human Artifact + LLM Distillate
Artifact-producing skills can output both a polished human-facing document and a token-efficient distillate for downstream LLM consumption. The distillate captures overflow, rejected ideas, and detail that doesn't belong in the human doc but has value for the next workflow. Always optional.
### Parallel Review Lenses
Before finalizing significant artifacts, fan out reviewers with different perspectives — skeptic, opportunity spotter, domain-specific lens. If subagents aren't available, do a single critical self-review pass. Multiple perspectives catch blind spots no single reviewer would.
### Three-Mode Architecture (Guided / Yolo / Headless)
Consider whether the skill benefits from multiple execution modes:
| Mode | When | Behavior |
|------|------|----------|
| **Guided** | Default | Conversational discovery with soft gates |
| **Yolo** | "just draft it" | Ingest everything, draft complete artifact, then refine |
| **Headless** | `--headless` / `-H` | Complete the task without user input, using sensible defaults |
Not all skills need all three. But considering them during design prevents locking into a single interaction model.
### Graceful Degradation
Every subagent-dependent feature should have a fallback path. A skill that hard-fails without subagents is fragile — one that falls back to sequential processing works everywhere.
### Verifiable Intermediate Outputs
For complex tasks with consequences: plan → validate → execute → verify. Create a verifiable plan before executing, validate with scripts where possible. Catches errors early and makes the work reversible.
## Writing Guidelines
- **Consistent terminology** — one term per concept, stick to it
- **Third person** in descriptions — "Processes files" not "I help process files"
- **Descriptive file names** — `form_validation_rules.md` not `doc2.md`
- **Forward slashes** in all paths — cross-platform
- **One level deep** for reference files — SKILL.md → reference.md, never chains
- **TOC for long files** — >100 lines
## Anti-Patterns
| Anti-Pattern | Fix |
|---|---|
| Numbered steps for things the LLM would figure out | Describe the outcome and why it matters |
| Explaining how to load config (the mechanic) | List the config keys and their defaults (the outcome) |
| Prescribing exact greeting/menu format | "Greet the user and present capabilities" |
| Spelling out headless mode in detail | "If headless, complete without user input" |
| Too many options upfront | One default with escape hatch |
| Deep reference nesting (A→B→C) | Keep references 1 level from SKILL.md |
| Inconsistent terminology | Choose one term per concept |
| Scripts that classify meaning via regex | Intelligence belongs in prompts, not scripts |
## Scripts in Skills
- **Execute vs reference** — "Run `analyze.py`" (execute) vs "See `analyze.py` for the algorithm" (read)
- **Document constants** — explain why `TIMEOUT = 30`, not just what
- **PEP 723 for Python** — self-contained with inline dependency declarations
- **MCP tools** — use fully qualified names: `ServerName:tool_name`
@@ -0,0 +1,79 @@
# Standard Agent Fields
## Frontmatter Fields
Only these fields go in the YAML frontmatter block:
| Field | Description | Example |
|-------|-------------|---------|
| `name` | Full skill name (kebab-case, same as folder name) | `bmad-agent-tech-writer`, `bmad-cis-agent-lila` |
| `description` | [What it does]. [Use when user says 'X' or 'Y'.] | See Description Format below |
## Content Fields
These are used within the SKILL.md body — never in frontmatter:
| Field | Description | Example |
|-------|-------------|---------|
| `displayName` | Friendly name (title heading, greetings) | `Paige`, `Lila`, `Floyd` |
| `title` | Role title | `Tech Writer`, `Holodeck Operator` |
| `icon` | Single emoji | `🔥`, `🌟` |
| `role` | Functional role | `Technical Documentation Specialist` |
| `sidecar` | Memory folder (optional) | `{skillName}-sidecar/` |
## Overview Section Format
The Overview is the first section after the title — it primes the AI for everything that follows.
**3-part formula:**
1. **What** — What this agent does
2. **How** — How it works (role, approach, modes)
3. **Why/Outcome** — Value delivered, quality standard
**Templates by agent type:**
**Companion agents:**
```markdown
This skill provides a {role} who helps users {primary outcome}. Act as {displayName} — {key quality}. With {key features}, {displayName} {primary value proposition}.
```
**Workflow agents:**
```markdown
This skill helps you {outcome} through {approach}. Act as {role}, guiding users through {key stages/phases}. Your output is {deliverable}.
```
**Utility agents:**
```markdown
This skill {what it does}. Use when {when to use}. Returns {output format} with {key feature}.
```
## SKILL.md Description Format
```
{description of what the agent does}. Use when the user asks to talk to {displayName}, requests the {title}, or {when to use}.
```
## Path Rules
### Skill-Internal Files
All references to files within the skill use `./` relative paths:
- `./references/memory-system.md`
- `./references/some-guide.md`
- `./scripts/calculate-metrics.py`
This distinguishes skill-internal files from `{project-root}` paths — without the `./` prefix the LLM may confuse them.
### Memory Files (sidecar)
Always use `{project-root}` prefix: `{project-root}/_bmad/memory/{skillName}-sidecar/`
The sidecar `index.md` is the single entry point to the agent's memory system — it tells the agent what else to load (boundaries, logs, references, etc.). Load it once on activation; don't duplicate load instructions for individual memory files.
### Config Variables
Use directly — they already contain `{project-root}` in their resolved values:
- `{output_folder}/file.md`
- Correct: `{bmad_builder_output_folder}/agent.md`
- Wrong: `{project-root}/{bmad_builder_output_folder}/agent.md` (double-prefix)
@@ -0,0 +1,44 @@
# Template Substitution Rules
The SKILL-template provides a minimal skeleton: frontmatter, overview, agent identity sections, sidecar, and activation with config loading. Everything beyond that is crafted by the builder based on what was learned during discovery and requirements phases.
## Frontmatter
- `{module-code-or-empty}` → Module code prefix with hyphen (e.g., `cis-`) or empty for standalone
- `{agent-name}` → Agent functional name (kebab-case)
- `{skill-description}` → Two parts: [4-6 word summary]. [trigger phrases]
- `{displayName}` → Friendly display name
- `{skillName}` → Full skill name with module prefix
## Module Conditionals
### For Module-Based Agents
- `{if-module}` ... `{/if-module}` → Keep the content inside
- `{if-standalone}` ... `{/if-standalone}` → Remove the entire block including markers
- `{module-code}` → Module code without trailing hyphen (e.g., `cis`)
- `{module-setup-skill}` → Name of the module's setup skill (e.g., `bmad-cis-setup`)
### For Standalone Agents
- `{if-module}` ... `{/if-module}` → Remove the entire block including markers
- `{if-standalone}` ... `{/if-standalone}` → Keep the content inside
## Sidecar Conditionals
- `{if-sidecar}` ... `{/if-sidecar}` → Keep if agent has persistent memory, otherwise remove
- `{if-no-sidecar}` ... `{/if-no-sidecar}` → Inverse of above
## Headless Conditional
- `{if-headless}` ... `{/if-headless}` → Keep if agent supports headless mode, otherwise remove
## Beyond the Template
The builder determines the rest of the agent structure — capabilities, activation flow, sidecar initialization, capability routing, external skills, scripts — based on the agent's requirements. The template intentionally does not prescribe these.
## Path References
All generated agents use `./` prefix for skill-internal paths:
- `./references/init.md` — First-run onboarding (if sidecar)
- `./references/{capability}.md` — Individual capability prompts
- `./references/memory-system.md` — Memory discipline (if sidecar)
- `./scripts/` — Python/shell scripts for deterministic operations
@@ -0,0 +1,267 @@
# Universal Scanner Output Schema
All quality scanners — both LLM-based and deterministic lint scripts — MUST produce output conforming to this schema. No exceptions.
## Top-Level Structure
```json
{
"scanner": "scanner-name",
"skill_path": "{path}",
"findings": [],
"assessments": {},
"summary": {
"total_findings": 0,
"by_severity": {},
"assessment": "1-2 sentence overall assessment"
}
}
```
| Key | Type | Required | Description |
|-----|------|----------|-------------|
| `scanner` | string | yes | Scanner identifier (e.g., `"workflow-integrity"`, `"prompt-craft"`) |
| `skill_path` | string | yes | Absolute path to the skill being scanned |
| `findings` | array | yes | ALL items — issues, strengths, suggestions, opportunities. Always an array, never an object |
| `assessments` | object | yes | Scanner-specific structured analysis (cohesion tables, health metrics, user journeys, etc.). Free-form per scanner |
| `summary` | object | yes | Aggregate counts and brief overall assessment |
## Finding Schema (7 fields)
Every item in `findings[]` has exactly these 7 fields:
```json
{
"file": "SKILL.md",
"line": 42,
"severity": "high",
"category": "frontmatter",
"title": "Brief headline of the finding",
"detail": "Full context — rationale, what was observed, why it matters",
"action": "What to do about it — fix, suggestion, or script to create"
}
```
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `file` | string | yes | Relative path to the affected file (e.g., `"SKILL.md"`, `"scripts/build.py"`). Empty string if not file-specific |
| `line` | int\|null | no | Line number (1-based). `null` or `0` if not line-specific |
| `severity` | string | yes | One of the severity values below |
| `category` | string | yes | Scanner-specific category (e.g., `"frontmatter"`, `"token-waste"`, `"lint"`) |
| `title` | string | yes | Brief headline (1 sentence). This is the primary display text |
| `detail` | string | yes | Full context — fold rationale, observation, impact, nuance into one narrative. Empty string if title is self-explanatory |
| `action` | string | yes | What to do — fix instruction, suggestion, or script to create. Empty string for strengths/notes |
## Severity Values (complete enum)
```
critical | high | medium | low | high-opportunity | medium-opportunity | low-opportunity | suggestion | strength | note
```
**Routing rules:**
- `critical`, `high` → "Truly Broken" section in report
- `medium`, `low` → category-specific findings sections
- `high-opportunity`, `medium-opportunity`, `low-opportunity` → enhancement/creative sections
- `suggestion` → creative suggestions section
- `strength` → strengths section (positive observations worth preserving)
- `note` → informational observations, also routed to strengths
## Assessment Sub-Structure Contracts
The `assessments` object is free-form per scanner, but the HTML report renderer expects specific shapes for specific keys. These are the canonical formats.
### user_journeys (enhancement-opportunities scanner)
**Always an array of objects. Never an object keyed by persona.**
```json
"user_journeys": [
{
"archetype": "first-timer",
"summary": "Brief narrative of this user's experience",
"friction_points": ["moment 1", "moment 2"],
"bright_spots": ["what works well"]
}
]
```
### autonomous_assessment (enhancement-opportunities scanner)
```json
"autonomous_assessment": {
"potential": "headless-ready|easily-adaptable|partially-adaptable|fundamentally-interactive",
"hitl_points": 3,
"auto_resolvable": 2,
"needs_input": 1,
"notes": "Brief assessment"
}
```
### top_insights (enhancement-opportunities scanner)
**Always an array of objects with title/detail/action (same shape as findings but without file/line/severity/category).**
```json
"top_insights": [
{
"title": "The key observation",
"detail": "Why it matters",
"action": "What to do about it"
}
]
```
### cohesion_analysis (skill-cohesion / agent-cohesion scanner)
```json
"cohesion_analysis": {
"dimension_name": { "score": "strong|moderate|weak", "notes": "explanation" }
}
```
Dimension names are scanner-specific (e.g., `stage_flow_coherence`, `persona_alignment`). The report renderer iterates all keys and renders a table row per dimension.
### skill_identity / agent_identity (cohesion scanners)
```json
"skill_identity": {
"name": "skill-name",
"purpose_summary": "Brief characterization",
"primary_outcome": "What this skill produces"
}
```
### skillmd_assessment (prompt-craft scanner)
```json
"skillmd_assessment": {
"overview_quality": "appropriate|excessive|missing",
"progressive_disclosure": "good|needs-extraction|monolithic",
"notes": "brief assessment"
}
```
Agent variant adds `"persona_context": "appropriate|excessive|missing"`.
### prompt_health (prompt-craft scanner)
```json
"prompt_health": {
"total_prompts": 3,
"with_config_header": 2,
"with_progression": 1,
"self_contained": 3
}
```
### skill_understanding (enhancement-opportunities scanner)
```json
"skill_understanding": {
"purpose": "what this skill does",
"primary_user": "who it's for",
"assumptions": ["assumption 1", "assumption 2"]
}
```
### stage_summary (workflow-integrity scanner)
```json
"stage_summary": {
"total_stages": 0,
"missing_stages": [],
"orphaned_stages": [],
"stages_without_progression": [],
"stages_without_config_header": []
}
```
### metadata (structure scanner)
Free-form key-value pairs. Rendered as a metadata block.
### script_summary (scripts lint)
```json
"script_summary": {
"total_scripts": 5,
"by_type": {"python": 3, "shell": 2},
"missing_tests": ["script1.py"]
}
```
### existing_scripts (script-opportunities scanner)
Array of strings (script paths that already exist).
## Complete Example
```json
{
"scanner": "workflow-integrity",
"skill_path": "/path/to/skill",
"findings": [
{
"file": "SKILL.md",
"line": 12,
"severity": "high",
"category": "frontmatter",
"title": "Missing 'description' field in frontmatter",
"detail": "The SKILL.md frontmatter is missing the description field. Without a description, the skill cannot be triggered reliably by the help system.",
"action": "Add a description with trigger phrases to the YAML frontmatter block"
},
{
"file": "build-process.md",
"line": null,
"severity": "strength",
"category": "design",
"title": "Excellent progressive disclosure pattern in build stages",
"detail": "Each stage provides exactly the context needed without front-loading information. This reduces token waste and improves LLM comprehension.",
"action": ""
},
{
"file": "SKILL.md",
"line": 45,
"severity": "medium-opportunity",
"category": "experience-gap",
"title": "No guidance for first-time users unfamiliar with build workflows",
"detail": "A user encountering this skill for the first time has no onboarding path. The skill assumes familiarity with stage-based workflows, which creates friction for newcomers.",
"action": "Add a 'Getting Started' section or link to onboarding documentation"
}
],
"assessments": {
"stage_summary": {
"total_stages": 7,
"missing_stages": [],
"orphaned_stages": ["cleanup"]
}
},
"summary": {
"total_findings": 3,
"by_severity": {"high": 1, "medium-opportunity": 1, "strength": 1},
"assessment": "Well-structured skill with one critical frontmatter gap. Progressive disclosure is a notable strength."
}
}
```
## DO NOT
- **DO NOT** rename fields. Use exactly: `file`, `line`, `severity`, `category`, `title`, `detail`, `action`
- **DO NOT** use `issues` instead of `findings` — the array is always called `findings`
- **DO NOT** add fields to findings beyond the 7 defined above. Put scanner-specific structured data in `assessments`
- **DO NOT** use separate arrays for strengths, suggestions, or opportunities — they go in `findings` with appropriate severity values
- **DO NOT** change `user_journeys` from an array to an object keyed by persona name
- **DO NOT** restructure assessment sub-objects — use the shapes defined above
- **DO NOT** put free-form narrative data into `assessments` — that belongs in `detail` fields of findings or in `summary.assessment`
## Self-Check Before Output
Before writing your JSON output, verify:
1. Is your array called `findings` (not `issues`, not `opportunities`)?
2. Does every item in `findings` have all 7 fields: `file`, `line`, `severity`, `category`, `title`, `detail`, `action`?
3. Are strengths in `findings` with `severity: "strength"` (not in a separate `strengths` array)?
4. Are suggestions in `findings` with `severity: "suggestion"` (not in a separate `creative_suggestions` array)?
5. Is `assessments` an object containing structured analysis data (not items that belong in findings)?
6. Is `user_journeys` an array of objects (not an object keyed by persona)?
7. Do `top_insights` items use `title`/`detail`/`action` (not `insight`/`suggestion`/`why_it_matters`)?