feat: Enhanced skill matching with 8 new features #850
base: dev
Features Implemented:
1. Skill Caching & Pre-indexing
- Builds skill index at startup for O(1) lookups
- Caches 160+ skills in memory
- Pre-indexes keywords, domains, and file extensions
2. Learning from Usage
- Tracks skill feedback in ~/.config/opencode/skill-feedback.json
- Records useful/useless ratings per skill
- Adjusts matching scores based on historical performance
3. Context-Aware Matching
- Scans project files to detect language/framework
- Boosts relevant skills based on .py, .ts, .go files
- Detects Docker, Kubernetes, database files
4. Explicit Skill Mentions
- Detects "use python-pro", "need postgres expertise"
- Auto-loads mentioned skills with +0.5 score boost
- Supports skill names, aliases, and natural language
5. Required Skill Bundles
- Auto-loads skill combinations for common tasks:
- webapp: python-pro + frontend + sql-pro + testing
- backend-api: backend-developer + api-designer + database-optimizer
- migration: database-optimizer + legacy-modernizer + testing
- debugging: debugger + error-detective + code-reviewer
- devops: devops-engineer + kubernetes-specialist + terraform-engineer
- data-science: python-pro + data-scientist + ml-engineer
6. LLM-Based Smart Matching
- Hybrid approach with keyword + description scoring
- Fallback to lightweight LLM matching when method=llm
- Configurable confidence threshold
7. Automatic Parallel Agent Spawning
- Detects complex queries needing background agents
- Suggests explore/librarian/oracle for relevant subtasks
- Triggers on "and also", "search for", "how does"
8. Skill Conflict Detection
- Warns when loading overlapping skills
- Conflict groups: debugger+error-detective, frontend skills, data skills
- Helps prevent redundant skill loading
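A minimal sketch of features 4 and 8 as described above, not the code in this PR. The function names, the `KNOWN_SKILLS` list, and the shape of `CONFLICT_GROUPS` are assumptions for illustration. Only the +0.5 boost and the debugger/error-detective pair come from the description.

```typescript
// Hypothetical subset of skill names; the description says the real index holds 160+.
const KNOWN_SKILLS: string[] = ["python-pro", "sql-pro", "debugger", "error-detective", "code-reviewer"]

// Only the debugger/error-detective pair is named in the description.
const CONFLICT_GROUPS: string[][] = [["debugger", "error-detective"]]

// Score boost for explicitly mentioned skills, as stated in the description.
const EXPLICIT_BOOST = 0.5

// Escape regex metacharacters so skill names are matched literally.
function escapeRegExp(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&")
}

// Return the known skills that the query names outright, e.g. "use python-pro".
// The surrounding character checks stop "python-pro" matching inside "python-production".
function findExplicitMentions(query: string, skills: string[] = KNOWN_SKILLS): string[] {
  return skills.filter((name) => {
    const pattern = new RegExp(`(^|[^a-z0-9-])${escapeRegExp(name)}($|[^a-z0-9-])`, "i")
    return pattern.test(query)
  })
}

// Add the explicit-mention boost on top of whatever score each skill already has.
function applyExplicitBoost(scores: Map<string, number>, mentions: string[]): Map<string, number> {
  const boosted = new Map(scores)
  for (const name of mentions) {
    boosted.set(name, (boosted.get(name) ?? 0) + EXPLICIT_BOOST)
  }
  return boosted
}

// Emit one warning per conflict group that has two or more members selected.
function detectConflicts(selected: string[]): string[] {
  const warnings: string[] = []
  for (const group of CONFLICT_GROUPS) {
    const hits = group.filter((name) => selected.indexOf(name) !== -1)
    if (hits.length > 1) warnings.push(`Overlapping skills: ${hits.join(", ")}`)
  }
  return warnings
}
```

With these definitions, `findExplicitMentions("Use python-pro to build web API")` returns only `python-pro`, and selecting both `debugger` and `error-detective` yields a single warning.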
Configuration (oh-my-opencode.json):
{
"auto_skill_matching": {
"enabled": true,
"threshold": 0.3,
"maxSkills": 5,
"method": "hybrid",
"enableCaching": true,
"enableContextAwareness": true,
"enableExplicitMentions": true,
"enableSkillBundles": true,
"enableConflictDetection": true,
"enableLearning": true
}
}
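For reference, the block above could be typed and merged over defaults roughly as follows. This is a sketch under assumptions: `AutoSkillMatchingConfig`, `EXAMPLE_DEFAULTS`, and `resolveConfig` are hypothetical names, the default values simply mirror the example, and the clamping of `threshold` and `maxSkills` is an illustrative choice rather than documented behavior.

```typescript
// Only "hybrid" and "llm" appear in this description; other methods may exist.
type MatchMethod = "hybrid" | "llm"

interface AutoSkillMatchingConfig {
  enabled: boolean
  threshold: number
  maxSkills: number
  method: MatchMethod
  enableCaching: boolean
  enableContextAwareness: boolean
  enableExplicitMentions: boolean
  enableSkillBundles: boolean
  enableConflictDetection: boolean
  enableLearning: boolean
}

// Values copied from the example configuration, not from the shipped defaults.
const EXAMPLE_DEFAULTS: AutoSkillMatchingConfig = {
  enabled: true,
  threshold: 0.3,
  maxSkills: 5,
  method: "hybrid",
  enableCaching: true,
  enableContextAwareness: true,
  enableExplicitMentions: true,
  enableSkillBundles: true,
  enableConflictDetection: true,
  enableLearning: true,
}

// Merge user overrides over the defaults, then keep numeric fields in a sane range.
function resolveConfig(
  user: Partial<AutoSkillMatchingConfig> = {},
  defaults: AutoSkillMatchingConfig = EXAMPLE_DEFAULTS,
): AutoSkillMatchingConfig {
  const merged: AutoSkillMatchingConfig = { ...defaults, ...user }
  return {
    ...merged,
    threshold: Math.min(1, Math.max(0, merged.threshold)), // assumed range [0, 1]
    maxSkills: Math.max(1, Math.floor(merged.maxSkills)), // assumed minimum of 1
  }
}
```

Building a new object instead of mutating the defaults keeps one call's overrides from leaking into the next.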
Test Results:
- Query: "Use python-pro to build web API"
Matched: python-pro, backend-developer, database-optimizer
Bundles: backend-api | Explicit: python-pro
- Query: "Deploy Docker to Kubernetes"
Matched: kubernetes-specialist, devops-engineer, terraform-engineer
Bundles: devops
- Query: "Fix bug and search for issues"
Should spawn parallel agents: true
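The third result can be reproduced with a simple phrase check. This is a sketch assuming plain substring matching on the three trigger phrases listed under feature 7. The function name is hypothetical and the actual detection logic may be more involved.

```typescript
// Trigger phrases listed in the PR description for parallel agent spawning.
const PARALLEL_TRIGGERS: string[] = ["and also", "search for", "how does"]

// Return true when the query contains any trigger phrase.
// Lowercasing and collapsing whitespace lets "Search  For" match as well.
function shouldSpawnParallelAgents(query: string): boolean {
  const normalized = query.toLowerCase().replace(/\s+/g, " ")
  return PARALLEL_TRIGGERS.some((phrase) => normalized.includes(phrase))
}
```

In this sketch, "Fix bug and search for issues" matches on "search for", consistent with the result reported above.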
Thank you for your contribution! Before we can merge this PR, we need you to sign our Contributor License Agreement (CLA).
To sign the CLA, please comment on this PR with:
I have read the CLA Document and I hereby sign the CLA
This is a one-time requirement. Once signed, all your future contributions will be automatically accepted.
kidwiz404 seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.
7 issues found across 10 files
Confidence score: 3/5
- Skill ingestion can silently drop entire skills when `skill.yaml` lacks frontmatter because `src/features/skill-matcher/indexer.ts` never falls back to `SKILL.md`, so new content could disappear from matching results.
- The keyword extractor in `src/features/skill-matcher/keyword-extractor.ts` is still case-sensitive and lacks word boundaries, leading to missed matches for capitalized tech names and spurious substring hits, which can degrade matching quality for users.
- Both the learning feedback logic and the LLM path in `src/features/skill-matcher/enhanced-matcher.ts` remain ineffective: the learning boost never penalizes bad feedback and the LLM configuration is ignored, so the advertised adaptive behavior will not work as expected.
- Pay close attention to `src/features/skill-matcher/indexer.ts`, `src/features/skill-matcher/keyword-extractor.ts`, `src/features/skill-matcher/enhanced-matcher.ts`, and `src/tools/sisyphus-task/tools.ts`: multiple matching paths still exhibit functionality gaps.
Prompt for AI agents (all issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name="src/features/skill-matcher/keyword-extractor.ts">
<violation number="1" location="src/features/skill-matcher/keyword-extractor.ts:23">
P2: Domain term extraction is case-sensitive and lacks word boundaries, causing missed matches for capitalized tech names and false positives on substrings.</violation>
</file>
<file name="src/features/skill-matcher/enhanced-matcher.ts">
<violation number="1" location="src/features/skill-matcher/enhanced-matcher.ts:273">
P2: LLM matching path is unimplemented—`llm` method ignores `llmModel/llmThreshold` and always falls back to heuristics</violation>
<violation number="2" location="src/features/skill-matcher/enhanced-matcher.ts:286">
P2: Learning feedback is misapplied: passing baseScore=0 clamps out negative feedback and yields only a ~+2% boost, so useless feedback never penalizes and learning barely affects scores.</violation>
</file>
<file name="src/features/skill-matcher/indexer.ts">
<violation number="1" location="src/features/skill-matcher/indexer.ts:166">
P1: Skills are skipped when `skill.yaml` lacks frontmatter because parsing returns null and the code never falls back to `SKILL.md`.</violation>
<violation number="2" location="src/features/skill-matcher/indexer.ts:248">
P2: Basename extraction is platform-specific: splitting on "/" breaks on Windows paths, making context detection unreliable.</violation>
<violation number="3" location="src/features/skill-matcher/indexer.ts:258">
P2: Database context detection is unreachable: basename-only check for `db/` directory can never match</violation>
</file>
<file name="src/tools/sisyphus-task/tools.ts">
<violation number="1" location="src/tools/sisyphus-task/tools.ts:165">
P2: Auto skill matching now runs by default (enabled=true) even when users did not opt in; getDefaultConfig sets enabled true and matchSkills is called unconditionally when skills are empty.</violation>
</file>
P1: Skills are skipped when skill.yaml lacks frontmatter because parsing returns null and the code never falls back to SKILL.md.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/features/skill-matcher/indexer.ts, line 166:
<comment>Skills are skipped when `skill.yaml` lacks frontmatter because parsing returns null and the code never falls back to `SKILL.md`.</comment>
<file context>
@@ -0,0 +1,317 @@
+ const skillPath = join(userSkillsDir, entry.name, "skill.yaml")
+ const mdPath = join(userSkillsDir, entry.name, "SKILL.md")
+
+ const skillInfo = existsSync(skillPath)
+ ? loadSkillContent(skillPath)
+ : existsSync(mdPath)
</file context>
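One possible shape for the fallback is sketched below. It is an illustration, not the PR's code: the `SkillInfo` fields, the injected `exists` and `load` callbacks (standing in for `existsSync` and `loadSkillContent`), and the function name are assumptions. The only behavior taken from the comment is that the loader returns null when frontmatter is missing.

```typescript
// Hypothetical shape; the real skill record may carry more fields.
interface SkillInfo {
  name: string
  description: string
}

// Stand-ins for existsSync and loadSkillContent, injected so the sketch has no I/O.
type ExistsFn = (path: string) => boolean
type LoadFn = (path: string) => SkillInfo | null // null when frontmatter is missing

// Try skill.yaml first, then SKILL.md. A file that exists but fails to parse
// no longer ends the search; the next candidate is tried instead.
function loadSkillWithFallback(
  skillPath: string,
  mdPath: string,
  exists: ExistsFn,
  load: LoadFn,
): SkillInfo | null {
  for (const candidate of [skillPath, mdPath]) {
    if (!exists(candidate)) continue
    const info = load(candidate)
    if (info !== null) return info
  }
  return null
}
```

The key difference from a nested ternary on `existsSync` is that the decision depends on the parse result, not only on whether the file exists.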
P2: Domain term extraction is case-sensitive and lacks word boundaries, causing missed matches for capitalized tech names and false positives on substrings.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/features/skill-matcher/keyword-extractor.ts, line 23:
<comment>Domain term extraction is case-sensitive and lacks word boundaries, causing missed matches for capitalized tech names and false positives on substrings.</comment>
<file context>
@@ -0,0 +1,44 @@
+}
+
+export function extractDomainTerms(text: string): string[] {
+ const domainPatterns = [
+ /react|vue|angular|svelte|nextjs|nuxt/g,
+ /node|deno|bun|express|fastify/g,
</file context>
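A sketch of one way to address this, using only the two patterns visible in the file context. The `i` flag handles capitalized names and `\b` anchors prevent substring hits. The function name `extractDomainTermsSketch` is hypothetical, and the remaining patterns in the real file are not shown here.

```typescript
// The two patterns from the file context, wrapped in word boundaries and made case-insensitive.
const DOMAIN_PATTERNS: RegExp[] = [
  /\b(?:react|vue|angular|svelte|nextjs|nuxt)\b/gi,
  /\b(?:node|deno|bun|express|fastify)\b/gi,
]

// Collect distinct domain terms, lowercased so "React" and "react" count once.
function extractDomainTermsSketch(text: string): string[] {
  const found = new Set<string>()
  for (const pattern of DOMAIN_PATTERNS) {
    // With the g flag, String.prototype.match returns every match, or null if there are none.
    for (const match of text.match(pattern) ?? []) {
      found.add(match.toLowerCase())
    }
  }
  return Array.from(found)
}
```

Word boundaries are strict: "bundle" no longer matches `bun`, but plural forms such as "nodes" also stop matching `node`, so aliases may need to be listed explicitly.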
P2: Learning feedback is misapplied: passing baseScore=0 clamps out negative feedback and yields only a ~+2% boost, so useless feedback never penalizes and learning barely affects scores.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/features/skill-matcher/enhanced-matcher.ts, line 286:
<comment>Learning feedback is misapplied: passing baseScore=0 clamps out negative feedback and yields only a ~+2% boost, so useless feedback never penalizes and learning barely affects scores.</comment>
<file context>
@@ -0,0 +1,440 @@
+ const descScore = calculateDescriptionScore(query, skill.description)
+ const contextBoost = context && config.enableContextAwareness ? calculateContextBoost(context, skillName, skill.description) : 0
+
+ const learningAdjustment = config.enableLearning ? getLearningAdjustedScore(skillName, 0) : 0
+
+ switch (config.method) {
</file context>
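To illustrate the intended behavior, here is a sketch in which feedback produces a signed delta that is added to the real base score before clamping. The `SkillFeedback` shape, the `MAX_LEARNING_DELTA` cap, and both function names are assumptions. The actual `getLearningAdjustedScore` implementation is not shown in this review.

```typescript
// Hypothetical per-skill record, loosely modeled on the useful/useless ratings in the PR description.
interface SkillFeedback {
  useful: number
  useless: number
}

// Assumed cap on how far feedback can move a score in either direction.
const MAX_LEARNING_DELTA = 0.2

// Signed adjustment: positive when ratings skew useful, negative when they skew useless.
function learningDelta(feedback: SkillFeedback | undefined): number {
  if (!feedback) return 0
  const total = feedback.useful + feedback.useless
  if (total === 0) return 0
  const ratio = (feedback.useful - feedback.useless) / total // in [-1, 1]
  return ratio * MAX_LEARNING_DELTA
}

// Clamp only after the delta has been added to the real base score,
// so a penalty is not discarded by clamping against a base of zero.
function applyLearning(baseScore: number, feedback: SkillFeedback | undefined): number {
  return Math.min(1, Math.max(0, baseScore + learningDelta(feedback)))
}
```

With a base score of 0.5, four "useless" ratings lower the result and four "useful" ratings raise it, whereas a base of 0 would clamp any penalty away.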
P2: LLM matching path is unimplemented—llm method ignores llmModel/llmThreshold and always falls back to heuristics
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/features/skill-matcher/enhanced-matcher.ts, line 273:
<comment>LLM matching path is unimplemented—`llm` method ignores `llmModel/llmThreshold` and always falls back to heuristics</comment>
<file context>
@@ -0,0 +1,440 @@
+ const bundleResult = config.enableSkillBundles ? detectBundles(query) : { bundles: [], skills: [] }
+
+ const scoredSkills: ScoredSkill[] = []
+ const skillScores = config.method === "llm" ? new Map<string, number>() : null
+
+ for (const [skillName, skill] of index.skills) {
</file context>
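A rough sketch of how the `llm` method could actually consult `llmModel` and `llmThreshold`. The field names come from the comment above. The `LlmScorer` type, the `selectSkills` function, and the fallback rule are hypothetical, and the scorer is synchronous only to keep the example short (a real model call would be asynchronous).

```typescript
interface LlmMatcherConfig {
  method: "hybrid" | "llm"
  threshold: number
  llmModel?: string
  llmThreshold?: number
}

// Hypothetical scorer: returns a relevance score per candidate skill for the query.
type LlmScorer = (query: string, candidates: string[], model: string) => Map<string, number>

// Use the LLM scores when method is "llm" and both a scorer and a model are configured;
// otherwise fall back to the heuristic scores and the regular threshold.
function selectSkills(
  query: string,
  heuristicScores: Map<string, number>,
  config: LlmMatcherConfig,
  llmScorer?: LlmScorer,
): string[] {
  let scores = heuristicScores
  let cutoff = config.threshold
  if (config.method === "llm" && llmScorer && config.llmModel) {
    scores = llmScorer(query, Array.from(heuristicScores.keys()), config.llmModel)
    cutoff = config.llmThreshold ?? config.threshold
  }
  const selected: string[] = []
  scores.forEach((score, skill) => {
    if (score >= cutoff) selected.push(skill)
  })
  return selected
}
```

Keeping the fallback explicit makes it visible when the heuristic path is used because the LLM path is not configured, rather than because it was never wired up.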
P2: Database context detection is unreachable: basename-only check for db/ directory can never match
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/features/skill-matcher/indexer.ts, line 258:
<comment>Database context detection is unreachable: basename-only check for `db/` directory can never match</comment>
<file context>
@@ -0,0 +1,317 @@
+ if (basename.includes("k8s") || basename.includes("kubernetes") || basename.endsWith(".yaml") || basename.endsWith(".yml")) {
+ hasK8s = true
+ }
+ if (basename.includes("schema") || basename.includes("migration") || basename.includes("db/")) {
+ hasDatabase = true
+ }
</file context>
Suggested change:
- if (basename.includes("schema") || basename.includes("migration") || basename.includes("db/")) {
+ if (basename.includes("schema") || basename.includes("migration") || file.toLowerCase().replace(/\\/g, "/").includes("/db/")) {
P2: Basename extraction is platform-specific: splitting on "/" breaks on Windows paths, making context detection unreliable.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/features/skill-matcher/indexer.ts, line 248:
<comment>Basename extraction is platform-specific: splitting on "/" breaks on Windows paths, making context detection unreliable.</comment>
<file context>
@@ -0,0 +1,317 @@
+ const ext = extname(file).toLowerCase().slice(1)
+ if (ext) extensions.add(ext)
+
+ const basename = file.split("/").pop()?.toLowerCase() || ""
+ if (basename.includes("test") || basename.includes(".spec.") || basename.includes(".test.")) {
+ hasTests = true
</file context>
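A sketch of separator-agnostic path handling, assuming it is acceptable to normalize backslashes to forward slashes before inspecting the path. The helper names are hypothetical. The directory check also shows how a `db` directory can be detected from the full path rather than the basename.

```typescript
// Convert Windows separators to "/" and lowercase for case-insensitive checks.
function normalizePath(file: string): string {
  return file.replace(/\\/g, "/").toLowerCase()
}

// Last path segment, regardless of which separator the platform used.
function basenameOf(file: string): string {
  return normalizePath(file).split("/").pop() ?? ""
}

// Database hint: a schema/migration file name, or any parent directory named "db".
function looksLikeDatabaseFile(file: string): boolean {
  const segments = normalizePath(file).split("/")
  const base = segments[segments.length - 1] ?? ""
  const inDbDir = segments.slice(0, -1).some((segment) => segment === "db")
  return base.includes("schema") || base.includes("migration") || inDbDir
}
```

Where Node is available, `path.win32.basename` should also accept both separators and may be a simpler choice for the basename part.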
P2: Auto skill matching now runs by default (enabled=true) even when users did not opt in; getDefaultConfig sets enabled true and matchSkills is called unconditionally when skills are empty.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/tools/sisyphus-task/tools.ts, line 165:
<comment>Auto skill matching now runs by default (enabled=true) even when users did not opt in; getDefaultConfig sets enabled true and matchSkills is called unconditionally when skills are empty.</comment>
<file context>
@@ -147,8 +155,43 @@ export function createSisyphusTask(options: SisyphusTaskToolOptions): ToolDefini
+ let parallelAgentSuggestion: ReturnType<typeof suggestParallelAgents> | null = null
+
+ if (skillsToUse.length === 0 && args.prompt.trim()) {
+ const matcherConfig = getDefaultConfig()
+ try {
+ const openCodeConfig = await client.config.get()
</file context>
Suggested change:
- const matcherConfig = getDefaultConfig()
- try {
-   const openCodeConfig = await client.config.get()
-   const autoSkillMatch = (openCodeConfig as { auto_skill_matching?: Partial<typeof matcherConfig> })?.auto_skill_matching
-   if (autoSkillMatch) {
-     Object.assign(matcherConfig, autoSkillMatch)
-   }
- } catch {}
- const matchResult = matchSkills(args.prompt, matcherConfig, [], directory)
- if (matchResult.matchedSkills.length > 0) {
-   skillsToUse = matchResult.matchedSkills
-   skillWarnings = matchResult.warnings
-   skillBundles = matchResult.usedBundles
-   explicitMentions = matchResult.explicitMentions
-   log("[sisyphus_task] Auto-matched skills", {
-     skills: skillsToUse,
-     count: skillsToUse.length,
-     bundles: skillBundles,
-     explicit: explicitMentions,
-   })
+ const matcherConfig = { ...getDefaultConfig(), enabled: false }
+ try {
+   const openCodeConfig = await client.config.get()
+   const autoSkillMatch = (openCodeConfig as { auto_skill_matching?: Partial<typeof matcherConfig> })?.auto_skill_matching
+   if (autoSkillMatch) {
+     Object.assign(matcherConfig, autoSkillMatch)
+   }
+ } catch {}
+ if (matcherConfig.enabled) {
+   const matchResult = matchSkills(args.prompt, matcherConfig, [], directory)
+   if (matchResult.matchedSkills.length > 0) {
+     skillsToUse = matchResult.matchedSkills
+     skillWarnings = matchResult.warnings
+     skillBundles = matchResult.usedBundles
+     explicitMentions = matchResult.explicitMentions
+     log("[sisyphus_task] Auto-matched skills", {
+       skills: skillsToUse,
+       count: skillsToUse.length,
+       bundles: skillBundles,
+       explicit: explicitMentions,
+     })
+   }
+   if (matcherConfig.method === "llm" || matcherConfig.enableCaching) {
+     parallelAgentSuggestion = suggestParallelAgents(args.prompt)
+   }
+ }
I have read the CLA Document and I hereby sign the CLA
@404kidwiz Thanks for the PR. This definitely seems like a cool feature. And don't get discouraged. Cubic may continue to find issues. It's a damn good code reviewer. You just have to keep on fixing. 👍
Summary
This PR introduces an enhanced skill matching system with 8 major features that make skill selection smarter, context-aware, and adaptive.
Features Implemented
1. Skill Caching & Pre-indexing
2. Learning from Usage
- ~/.config/opencode/skill-feedback.json
3. Context-Aware Matching
- .py, .ts, .go files
4. Explicit Skill Mentions
5. Required Skill Bundles
Auto-loads skill combinations for common tasks:
- webapp: python-pro + frontend + sql-pro + testing
- backend-api: backend-developer + api-designer + database-optimizer
- migration: database-optimizer + legacy-modernizer + testing
- debugging: debugger + error-detective + code-reviewer
- devops: devops-engineer + kubernetes-specialist + terraform-engineer
- data-science: python-pro + data-scientist + ml-engineer
6. LLM-Based Smart Matching
- method=llm
7. Automatic Parallel Agent Spawning
8. Skill Conflict Detection
Configuration
{
  "auto_skill_matching": {
    "enabled": true,
    "threshold": 0.3,
    "maxSkills": 5,
    "method": "hybrid",
    "enableCaching": true,
    "enableContextAwareness": true,
    "enableExplicitMentions": true,
    "enableSkillBundles": true,
    "enableConflictDetection": true,
    "enableLearning": true
  }
}
Test Results
Query: "Use python-pro to build web API"
Query: "Deploy Docker to Kubernetes"
Query: "Fix bug and search for issues"
Files Changed
- src/features/skill-matcher/types.ts - Type definitions
- src/features/skill-matcher/indexer.ts - Skill indexing and context detection
- src/features/skill-matcher/enhanced-matcher.ts - Core matching logic
- src/features/skill-matcher/index.ts - Module exports
- src/tools/sisyphus-task/tools.ts - Integration with sisyphus_task tool
- src/index.ts - Startup index building
- assets/oh-my-opencode.schema.json - Updated schema
Breaking Changes
None. All features are opt-in via the new auto_skill_matching configuration.
Backwards Compatibility
Existing skills continue to work unchanged. The auto-matching only triggers when skills=[] is passed to sisyphus_task.
Summary by cubic
Adds enhanced skill matching that auto-selects relevant skills from the query and project context, reducing manual selection and improving task accuracy. Improves performance with a cached index built at startup.
Written for commit dfae8d3. Summary will update on new commits.