fix(skills): scan grouped skill directories

* fix(skills): scan nested subdirectories for grouped skill layouts Previously, skill discovery only checked immediate children of the skills root for SKILL.md files. Skills organized in subdirectories (e.g. ~/.openclaw/skills/coze/koze-retrieval/SKILL.md) were silently ignored. Now, when an immediate child directory does not contain a SKILL.md, its own children are checked one level deeper. This supports grouped skill layouts while keeping the scan depth bounded (max 2 levels) to avoid unbounded filesystem traversal. The existing per-source skill count limits and containment checks still apply to nested discoveries. Fixes #56915 * test(skills): cover nested grouped skill discovery * fix(skills): cache contained-path checks and cap nested scans - Reuse skillDirRealPath captured during the collection phase so the load loop no longer re-runs resolveContainedSkillPath on the same directory. - Apply the per-root candidate cap (and the matching warning log) when descending into nested grouped skill directories, matching the outer scan's behavior. Addresses Greptile P2 feedback on PR #72534. * fix(skills): load grouped skill directories under skills roots * fix(clownfish): address review for ghcrawl-156697-autonomous-smoke (1) --------- Co-authored-by: Otto Deng <otto@ottodeng.com> Co-authored-by: vincentkoc <25068+vincentkoc@users.noreply.github.com> Co-authored-by: Otto Deng <ottodeng2@github.local>
2026-05-06 12:50:42 +00:00 · 2026-04-30 14:56:19 +08:00
parent d18fdecd53
commit 8ca1f6d590
4 changed files with 223 additions and 61 deletions
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -296,6 +296,7 @@ Docs: https://docs.openclaw.ai
 - Pairing/doctor: bootstrap `commands.ownerAllowFrom` from the first approved DM pairing when no command owner exists, and have doctor explain missing owners so privileged slash commands are not accidentally unusable after onboarding. Thanks @pashpashpash.
 - Telegram/exec: infer native exec approvers from `commands.ownerAllowFrom` and auto-enable the Telegram approval client when an owner is resolvable, so owner-only commands such as `/diagnostics` can be approved in Telegram without duplicate per-channel approver config. Thanks @pashpashpash.
 - Auto-reply/session: carry the tail of user/assistant turns into the freshly-rotated transcript on silent in-reply session resets (compaction failure, role-ordering conflict) so direct-chat continuity survives the rebind. Fixes #70853. (#70898) Thanks @neeravmakwana.
+- Skills: load grouped skill directories such as `skills/<group>/<skill>/SKILL.md` from configured skill roots while keeping grouped discovery capped for large directories. Fixes #56915. (#72534) Thanks @ottodeng, @MoerAI, and @i010542.
 - Config: skip malformed non-string `env.vars` entries before env-reference checks, so config loading no longer crashes on JSON values like numbers or booleans. (#42402) Thanks @MiltonHeYan.
 - Docker Compose: default missing config and workspace bind mounts to `${HOME:-/tmp}/.openclaw` so manual compose runs do not create invalid empty-source volume specs. (#64485) Thanks @jlapenna.
 - Agents/context engines: preserve the child agent's configured `agentDir` when subagent cleanup re-resolves a context engine, so `onSubagentEnded` hooks keep operating on the correct per-agent state. (#67243) Thanks @jarimustonen.
--- a/docs/tools/skills.md
+++ b/docs/tools/skills.md
@@ -130,6 +130,9 @@ Native `openclaw skills install` installs into the active workspace
 `./skills` under your current working directory (or falls back to the
 configured OpenClaw workspace). OpenClaw picks that up as
 `<workspace>/skills` on the next session.
+Configured skill roots also support one grouping level, such as
+`skills/<group>/<skill>/SKILL.md`, so related third-party skills can be
+kept under a shared folder without broad recursive scanning.

 ClawHub skill pages expose the latest security scan state before install,
 with scanner detail pages for VirusTotal, ClawScan, and static analysis.
--- a/src/agents/skills.loadworkspaceskillentries.test.ts
+++ b/src/agents/skills.loadworkspaceskillentries.test.ts
@@ -412,4 +412,121 @@ describe("loadWorkspaceSkillEntries", () => {
      });
    },
  );
+
+  describe("nested skill subdirectories", () => {
+    it("discovers SKILL.md two levels deep under a grouping subfolder", async () => {
+      const workspaceDir = await createTempWorkspaceDir();
+      // Grouped layout: skills/group/skill/SKILL.md (no SKILL.md at skills/group/).
+      await writeSkill({
+        dir: path.join(workspaceDir, "skills", "group", "nested-skill"),
+        name: "nested-skill",
+        description: "Nested under a group folder",
+      });
+
+      const entries = loadTestWorkspaceSkillEntries(workspaceDir);
+      const names = entries.map((entry) => entry.skill.name);
+      expect(names).toContain("nested-skill");
+    });
+
+    it("keeps loading direct skills (skills/skill/SKILL.md) unchanged", async () => {
+      const workspaceDir = await createTempWorkspaceDir();
+      await writeSkill({
+        dir: path.join(workspaceDir, "skills", "direct-skill"),
+        name: "direct-skill",
+        description: "Direct skill at first level",
+      });
+      // Sibling group with a deeper skill.
+      await writeSkill({
+        dir: path.join(workspaceDir, "skills", "group", "grouped-skill"),
+        name: "grouped-skill",
+        description: "Skill nested under a group",
+      });
+
+      const names = loadTestWorkspaceSkillEntries(workspaceDir).map((entry) => entry.skill.name);
+      expect(names).toEqual(expect.arrayContaining(["direct-skill", "grouped-skill"]));
+    });
+
+    it("does not descend more than two levels (skills/a/b/c/SKILL.md is ignored)", async () => {
+      const workspaceDir = await createTempWorkspaceDir();
+      await writeSkill({
+        dir: path.join(workspaceDir, "skills", "a", "b", "c"),
+        name: "too-deep",
+        description: "Should not be discovered (depth 3)",
+      });
+
+      const names = loadTestWorkspaceSkillEntries(workspaceDir).map((entry) => entry.skill.name);
+      expect(names).not.toContain("too-deep");
+    });
+
+    it("does not fall through to child skills when an immediate SKILL.md is invalid", async () => {
+      const workspaceDir = await createTempWorkspaceDir();
+      const parentDir = path.join(workspaceDir, "skills", "group", "parent");
+      await fs.mkdir(parentDir, { recursive: true });
+      await fs.writeFile(path.join(parentDir, "SKILL.md"), "---\nname: parent\n---\n", "utf-8");
+      await writeSkill({
+        dir: path.join(parentDir, "child"),
+        name: "too-deep",
+        description: "Should not be discovered through invalid parent fallback",
+      });
+
+      const names = loadTestWorkspaceSkillEntries(workspaceDir).map((entry) => entry.skill.name);
+      expect(names).not.toContain("too-deep");
+    });
+
+    it("prefers the immediate SKILL.md and does not descend when one is present", async () => {
+      const workspaceDir = await createTempWorkspaceDir();
+      // skills/group/SKILL.md exists -> treat group as the skill itself.
+      await writeSkill({
+        dir: path.join(workspaceDir, "skills", "group"),
+        name: "group",
+        description: "Direct skill at the group level",
+      });
+      // skills/group/inner/SKILL.md should NOT be loaded as a separate skill.
+      await writeSkill({
+        dir: path.join(workspaceDir, "skills", "group", "inner"),
+        name: "inner",
+        description: "Should be ignored when parent is itself a skill",
+      });
+
+      const names = loadTestWorkspaceSkillEntries(workspaceDir).map((entry) => entry.skill.name);
+      expect(names).toContain("group");
+      expect(names).not.toContain("inner");
+    });
+
+    it("warns and caps discovery in large grouping subfolders", async () => {
+      const workspaceDir = await createTempWorkspaceDir();
+      for (let i = 0; i < 3; i += 1) {
+        const name = `nested-skill-${i}`;
+        await writeSkill({
+          dir: path.join(workspaceDir, "skills", "group", name),
+          name,
+          description: `Nested skill ${i}`,
+        });
+      }
+      const warn = captureWarningLogger();
+
+      const names = loadTestWorkspaceSkillEntries(workspaceDir, {
+        config: {
+          skills: {
+            limits: {
+              maxCandidatesPerRoot: 2,
+              maxSkillsLoadedPerSource: 10,
+            },
+          },
+        },
+      }).map((entry) => entry.skill.name);
+
+      expect(names).toEqual(expect.arrayContaining(["nested-skill-0", "nested-skill-1"]));
+      expect(names).not.toContain("nested-skill-2");
+      expect(
+        warn.mock.calls
+          .map(([line]) => String(line))
+          .some((line) =>
+            line.includes(
+              "Nested skills directory looks suspiciously large, truncating discovery.",
+            ),
+          ),
+      ).toBe(true);
+    });
+  });
 });
--- a/src/agents/skills/workspace.ts
+++ b/src/agents/skills/workspace.ts
@@ -140,6 +140,12 @@ type LoadedSkillRecord = {
  frontmatter?: ParsedSkillFrontmatter;
 };

+type CandidateSkillDir = {
+  skillDir: string;
+  name: string;
+  skillMdRealPath: string;
+};
+
 function resolveSkillsLimits(config?: OpenClawConfig, agentId?: string): ResolvedSkillsLimits {
  const limits = config?.skills?.limits;
  const agentSkillsLimits = resolveEffectiveAgentSkillsLimits(config, agentId);
@@ -288,32 +294,6 @@ function resolveContainedSkillPath(params: {
  return null;
 }

-function filterLoadedSkillRecordsInsideRoot(params: {
-  records: LoadedSkillRecord[];
-  source: string;
-  rootDir: string;
-  rootRealPath: string;
-}): LoadedSkillRecord[] {
-  return params.records.filter(({ skill }) => {
-    const baseDirRealPath = resolveContainedSkillPath({
-      source: params.source,
-      rootDir: params.rootDir,
-      rootRealPath: params.rootRealPath,
-      candidatePath: skill.baseDir,
-    });
-    if (!baseDirRealPath) {
-      return false;
-    }
-    const skillFileRealPath = resolveContainedSkillPath({
-      source: params.source,
-      rootDir: params.rootDir,
-      rootRealPath: params.rootRealPath,
-      candidatePath: skill.filePath,
-    });
-    return Boolean(skillFileRealPath);
-  });
-}
-
 function resolveNestedSkillsRoot(
  dir: string,
  opts?: {
@@ -365,6 +345,22 @@ function unwrapLoadedSkillRecords(loaded: unknown): LoadedSkillRecord[] {
  return [];
 }

+function loadContainedSkillRecords(params: {
+  skillDir: string;
+  source: string;
+  maxSkillFileBytes: number;
+}): LoadedSkillRecord[] {
+  const expectedBaseDir = path.resolve(params.skillDir);
+  const loaded = loadSkillsFromDirSafe({
+    dir: params.skillDir,
+    source: params.source,
+    maxBytes: params.maxSkillFileBytes,
+  });
+  return unwrapLoadedSkillRecords(loaded).filter(
+    (record) => path.resolve(record.skill.baseDir) === expectedBaseDir,
+  );
+}
+
 function loadSkillEntries(
  workspaceDir: string,
  opts?: {
@@ -423,23 +419,19 @@ function loadSkillEntries(
        return [];
      }

-      const loaded = loadSkillsFromDirSafe({
-        dir: baseDir,
+      return loadContainedSkillRecords({
+        skillDir: baseDir,
        source: params.source,
-        maxBytes: limits.maxSkillFileBytes,
-      });
-      return filterLoadedSkillRecordsInsideRoot({
-        records: unwrapLoadedSkillRecords(loaded),
-        source: params.source,
-        rootDir,
-        rootRealPath: baseDirRealPath,
+        maxSkillFileBytes: limits.maxSkillFileBytes,
      });
    }

    const childDirs = listChildDirectories(baseDir);
-    const suspicious = childDirs.length > limits.maxCandidatesPerRoot;
+    const maxCandidatesPerRoot = Math.max(0, limits.maxCandidatesPerRoot);
+    const maxSkillsLoadedPerSource = Math.max(0, limits.maxSkillsLoadedPerSource);
+    const suspicious = childDirs.length > maxCandidatesPerRoot;

-    const maxCandidates = Math.max(0, limits.maxSkillsLoadedPerSource);
+    const maxCandidates = Math.min(maxCandidatesPerRoot, maxSkillsLoadedPerSource);
    const limitedChildren = childDirs.toSorted().slice(0, maxCandidates);

    if (suspicious) {
@@ -461,7 +453,10 @@ function loadSkillEntries(

    const loadedSkills: LoadedSkillRecord[] = [];

-    // Only consider immediate subfolders that look like skills (have SKILL.md) and are under size cap.
+    // Consider immediate subfolders that look like skills (have SKILL.md) and are under size cap.
+    // When an immediate subfolder does NOT have a SKILL.md, check one level deeper for grouped
+    // skill directories (e.g. ~/.openclaw/skills/coze/koze-retrieval/SKILL.md).
+    const candidateDirs: CandidateSkillDir[] = [];
    for (const name of limitedChildren) {
      const skillDir = path.join(baseDir, name);
      const skillDirRealPath = resolveContainedSkillPath({
@@ -474,24 +469,76 @@ function loadSkillEntries(
        continue;
      }
      const skillMd = path.join(skillDir, "SKILL.md");
-      if (!fs.existsSync(skillMd)) {
-        continue;
+      if (fs.existsSync(skillMd)) {
+        const skillMdRealPath = resolveContainedSkillPath({
+          source: params.source,
+          rootDir,
+          rootRealPath: baseDirRealPath,
+          candidatePath: skillMd,
+        });
+        if (skillMdRealPath) {
+          candidateDirs.push({ skillDir, name, skillMdRealPath });
+        }
+      } else {
+        // No SKILL.md here — check one level deeper for grouped skill directories.
+        // Apply the same per-root cap as the outer scan to avoid scanning huge nested trees.
+        const nestedChildren = listChildDirectories(skillDir);
+        const nestedSuspicious = nestedChildren.length > maxCandidatesPerRoot;
+        if (nestedSuspicious) {
+          skillsLogger.warn(
+            "Nested skills directory looks suspiciously large, truncating discovery.",
+            {
+              dir: params.dir,
+              baseDir,
+              nestedDir: skillDir,
+              nestedChildDirCount: nestedChildren.length,
+              maxCandidatesPerRoot: limits.maxCandidatesPerRoot,
+              maxSkillsLoadedPerSource: limits.maxSkillsLoadedPerSource,
+            },
+          );
+        }
+        const limitedNested = nestedChildren.toSorted().slice(0, maxCandidatesPerRoot);
+        for (const nestedName of limitedNested) {
+          const nestedDir = path.join(skillDir, nestedName);
+          const nestedSkillMd = path.join(nestedDir, "SKILL.md");
+          if (fs.existsSync(nestedSkillMd)) {
+            const nestedDirRealPath = resolveContainedSkillPath({
+              source: params.source,
+              rootDir,
+              rootRealPath: baseDirRealPath,
+              candidatePath: nestedDir,
+            });
+            const nestedSkillMdRealPath = resolveContainedSkillPath({
+              source: params.source,
+              rootDir,
+              rootRealPath: baseDirRealPath,
+              candidatePath: nestedSkillMd,
+            });
+            if (nestedDirRealPath && nestedSkillMdRealPath) {
+              candidateDirs.push({
+                skillDir: nestedDir,
+                name: `${name}/${nestedName}`,
+                skillMdRealPath: nestedSkillMdRealPath,
+              });
+            }
+          }
+          if (candidateDirs.length >= maxSkillsLoadedPerSource) {
+            break;
+          }
+        }
      }
-      const skillMdRealPath = resolveContainedSkillPath({
-        source: params.source,
-        rootDir,
-        rootRealPath: baseDirRealPath,
-        candidatePath: skillMd,
-      });
-      if (!skillMdRealPath) {
-        continue;
+      if (candidateDirs.length >= maxSkillsLoadedPerSource) {
+        break;
      }
+    }
+
+    for (const { skillDir, name, skillMdRealPath } of candidateDirs) {
      try {
        const size = fs.statSync(skillMdRealPath).size;
        if (size > limits.maxSkillFileBytes) {
          skillsLogger.warn("Skipping skill due to oversized SKILL.md.", {
            skill: name,
-            filePath: skillMd,
+            filePath: path.join(skillDir, "SKILL.md"),
            size,
            maxSkillFileBytes: limits.maxSkillFileBytes,
          });
@@ -501,30 +548,24 @@ function loadSkillEntries(
        continue;
      }

-      const loaded = loadSkillsFromDirSafe({
-        dir: skillDir,
-        source: params.source,
-        maxBytes: limits.maxSkillFileBytes,
-      });
      loadedSkills.push(
-        ...filterLoadedSkillRecordsInsideRoot({
-          records: unwrapLoadedSkillRecords(loaded),
+        ...loadContainedSkillRecords({
+          skillDir,
          source: params.source,
-          rootDir,
-          rootRealPath: baseDirRealPath,
+          maxSkillFileBytes: limits.maxSkillFileBytes,
        }),
      );

-      if (loadedSkills.length >= limits.maxSkillsLoadedPerSource) {
+      if (loadedSkills.length >= maxSkillsLoadedPerSource) {
        break;
      }
    }

-    if (loadedSkills.length > limits.maxSkillsLoadedPerSource) {
+    if (loadedSkills.length > maxSkillsLoadedPerSource) {
      return loadedSkills
        .slice()
        .sort((a, b) => a.skill.name.localeCompare(b.skill.name, "en"))
-        .slice(0, limits.maxSkillsLoadedPerSource);
+        .slice(0, maxSkillsLoadedPerSource);
    }

    return loadedSkills;