Skill Forge: Build Claude Skills

Skill Forge — build a Claude Skill that actually triggers
Skill Forge a checklist that won't let you ship a thin skill

What should it do

One concrete job, named the way you'd describe it to a colleague — not a category. "Helps write blog posts" is a category. "Turns a rough outline into a publish-ready Substack draft in my voice" is a job.

Lowercase, hyphens, no spaces. This becomes the folder name and the YAML name field. e.g. radiology-report-structurer

Not "helps with X" — name the concrete output and the person receiving it.

Devil's advocate If you can't write this in one sentence without the word "and," you probably have two skills, not one. Splitting is usually right — it makes each one's trigger description sharper, which is most of what makes a skill actually fire when needed.

When it should trigger most failures live here

Claude decides whether to use a skill almost entirely from its description. A vague description means the skill quietly never fires. This is the single highest-leverage section in the whole form.

Aim for 5+. Include indirect phrasings, not just the obvious one — people rarely ask for things the way you'd expect.

Aim for 2+. This is the section almost everyone skips, and it's why skills mis-fire on adjacent requests. What's close but different?

Devil's advocate A description tuned to fire on everything will also fire when it shouldn't — over-triggering is as real a failure as under-triggering, just less visible to you, because you only notice the times it misses, not the times it wrongly volunteers.

How it should behave

The steps Claude follows once the skill is active. Write this as instructions to a careful colleague, not as marketing copy about the skill.

3-8 steps is typical. Be concrete about decisions Claude needs to make along the way, not just "do the task."

Standards, templates, terminology, citation style — anything Claude should check against rather than improvise. If this is a clinical or governance skill, name the actual standard, not "best practices."

What it produces

Concrete shape of the output, plus one worked example. The example does more to anchor quality than another paragraph of description.

A doc, a table, a specific section structure, a file type — be literal.

Guardrails where unsafe skills get caught

What the skill must never do, and the edge cases that break naive versions of it. Skip this section and the skill will look fine in your one test and fail on the case you didn't think of.

Especially relevant for clinical, legal, financial, or anything with a real-world consequence if wrong.

What input would break this, or produce a confidently wrong answer?

Devil's advocate A guardrail list you wrote in thirty seconds is a guardrail list that will miss the actual failure mode. The honest version of this section names the specific way this exact skill goes wrong — not a generic "be accurate" disclaimer.

Review & export

Check the quality panel on the right before downloading. A skill scoring under 70 will likely under-trigger or mis-behave in ways you won't catch until someone else uses it.

Anatomy follows Anthropic's documented skill structure (YAML frontmatter + progressive disclosure). No data leaves your browser — nothing here is saved or transmitted. draft · not yet downloaded
copied

Comments