Compare commits
4 Commits: 0d511d9a03...master

- ed945abdf1
- 8efefcc230
- a02ac14481
- 93624efdab
@@ -1,7 +1,19 @@
---
name: developing-go-gin-gorm
description: Generates and reviews Go backend code using GIN and GORM frameworks. Enforces layered architecture (handler→service→dao), unified API response format, POST+RequestBody API design, DTO naming conventions, Chinese comments, Asia/Shanghai timezone, structured logging, and framework best practices. Trigger on Go API development, GIN handler creation, GORM repository implementation, or code review requests.
argument-hint: "<action> <target>" e.g., "create user-handler", "review service/order.go", "scaffold api/v1/product"
name: backend-go-gin-gorm

description: >
  使用 Gin + GORM 生成、编写、修改、评审 production-ready 的 Go 后端代码(Generate & Review Go backend code with Gin/GORM)。
  强制分层架构 handler → service → dao/repository(避免业务逻辑堆在 handler;DAO/Repo 只做数据访问与查询组装),并统一 API 响应包装
  consistent response envelope(code/message/data + request_id/trace_id 等可观测字段)。接口风格默认推荐 POST + JSON RequestBody
  as default(必要时遵循 REST 语义与幂等约定),规范 DTO/VO/DO 命名与字段映射 conventions(入参 DTO、出参 VO、持久化 DO/Model)。
  代码注释使用中文(Chinese comments for maintainability),时间处理默认 Asia/Shanghai(time zone aware time handling),
  采用结构化日志 structured logging(携带 request_id/trace_id/user_id/path/latency 等上下文),并遵循 Gin/GORM 工程化最佳实践
  (transactions, context propagation, error wrapping, pagination, soft delete, optimistic locking when needed)。
  触发场景 Trigger: Go 后端开发 / Gin Handler 创建 / GORM DAO/Repository 实现 / 代码走查与 Review(refactor suggestions, bug fixes, performance tips)。

argument-hint: "<动作 action> <目标 target>" 例如 / e.g.: "create user-handler", "review service/order.go", "scaffold api/v1/product", "add repo for table/users", "optimize gorm query"

allowed-tools:
- Read
- Write
@@ -9,12 +21,13 @@ allowed-tools:
- Glob
- Grep
- Bash

---

# Go GIN/GORM 开发规范 Skill

## 触发条件
- 用户请求创建/修改 Go 后端代码(GIN handler、GORM dao、service)
- 用户请求创建/修改 Go 后端代码
- 用户请求代码审查
- 用户提及 API 开发、数据库操作、统一响应、日志、时间处理
- 用户请求设计 API 接口、DTO 结构
@@ -150,6 +150,8 @@ type ListRequest struct {

| 项目管理 | `/api/projects/` |
| 用户 | `/api/users/` |
| 权限 | `/api/permissions/` |
| 权限-Jenkins | `/api/permissions/jenkins/` |
| 权限-项目 | `/api/permissions/projects/` |
| 审计 | `/api/audit/` |
| Exchange-Hub | `/api/exchange-hub/` |
| DCU | `/api/dcu/` |
.agents/skills/dds-to-skill/SKILL.md (new file, 391 lines)
@@ -0,0 +1,391 @@
---
name: dds-to-skill
description: >
  将 DDS(详细设计说明书)/ PRD / 架构文档转换为一套可落地的 Claude Code Agent Skills(Converts DDS/PRD/Architecture docs into production-ready Agent Skills)。
  包含系统级 Skill、模块级 Skills、横切 Skills 的完整生成流程,涵盖设计细节抽取、reference 分层、frontmatter 规范、质量自检。
  触发场景 Trigger: 当用户需要将 DDS 文档转为 Skills / 需要从架构设计文档生成开发指导 Skill / 需要批量创建模块级 Skill 套件。
  关键词 Keywords: DDS, PRD, 架构说明, 设计文档, skill 生成, skill 套件, agent skill, 模块拆分, reference 抽取, 契约, API, 状态机, 事件, Schema。
argument-hint: "<dds-file-path> [--output-dir <skills-output-dir>] [--project-name <name>]"
allowed-tools:
- Read
- Write
- Edit
- Glob
- Grep
- Bash
---

# DDS-to-Skill:从设计文档生成 Agent Skills

本 Skill 指导你将一份 DDS(Detailed Design Specification)或 PRD / 架构说明文档,转换为一套**可落地、含设计细节**的 Claude Code Agent Skills 套件。

> **核心理念**:生成的不是"空洞的工作流提示词",而是**绑定了 DDS 设计细节**、能指导真实开发/审查的 Skill 套件。

---

## Phase 0:读取与理解 DDS

### 0.1 动态注入读取(必须执行)

```bash
# 动态注入:查看源文档目录上下文
!`ls -la $(dirname "$ARGUMENTS")`

# 动态注入:读取 DDS 正文(至少 3 段,覆盖全文)
!`sed -n '1,150p' "$ARGUMENTS"`
!`sed -n '151,300p' "$ARGUMENTS"`
!`sed -n '301,500p' "$ARGUMENTS"`

# 动态注入:抽取章节标题(构建 TOC)
!`grep -nE '^(#{1,6}\s+|[0-9]+(\.[0-9]+){0,3}\s+|第[一二三四五六七八九十]+章|第[0-9]+章)' "$ARGUMENTS" | head -n 80`
```

### 0.2 设计要素定向扫描(至少执行 3 项)

```bash
# API/接口
!`grep -nE "API|接口|路径|路由|request|response|错误码|error|handler" "$ARGUMENTS" | head -n 60`

# 事件/消息/Topic
!`grep -nE "事件|event|MQTT|topic|outbox|消息|payload|幂等|retry|publish|subscribe" "$ARGUMENTS" | head -n 60`

# 数据库/Schema
!`grep -nE "表|schema|字段|索引|unique|constraint|migration|DDL|PostgreSQL|MySQL|GORM" "$ARGUMENTS" | head -n 60`

# 状态机/流程
!`grep -nE "状态机|state|transition|流转|工单|workflow|回调|补偿|lifecycle" "$ARGUMENTS" | head -n 60`

# 安全/授权
!`grep -nE "RBAC|DAC|鉴权|JWT|claim|授权|TOTP|权限|auth|token|session" "$ARGUMENTS" | head -n 60`

# 模块/服务/依赖
!`grep -nE "模块|module|service|微服务|依赖|dependency|import|gateway" "$ARGUMENTS" | head -n 60`
```

### 0.3 无法读取时的降级

若无法读取文件,**必须停止**,输出"继续所需的最小信息清单":

1. 系统模块列表(名称 + 职责 + 关键技术)
2. 每个模块的接口/API 列表
3. 事件/Topic 定义
4. 数据库表结构
5. 状态机/流程定义
6. 授权模型
7. 模块间依赖关系

**禁止在缺少源文档的情况下臆造设计细节。**

---

## Phase 1:分析与规划

### 1.1 模块识别

从 DDS 中识别所有业务模块,生成模块清单表:

| 模块名 | 职责概述 | 关键技术 | Skill 类型 |
|--------|---------|---------|-----------|
| *从 DDS 抽取* | *从 DDS 抽取* | *从 DDS 抽取* | 系统级/模块级/横切 |

### 1.2 Skill 三层架构规划

必须生成 3 类 Skills:

**A) 系统级 Skill(1 个)**
- 跨模块一致性、依赖规则、全局变更流程
- 命名:`developing-<system-name>-system`

**B) 模块级 Skills(N 个,每模块 1 个)**
- 高频开发指导:实现步骤 + 依赖影响检查
- 命名:`developing-<module-name>`

**C) 横切 Skills(≥ 3 个)**
- 基于 DDS 内容选择,常见横切关注点:

| 横切主题 | 适用场景 | 参考命名 |
|---------|---------|---------|
| API/事件/Schema 契约 | 有跨模块接口定义 | `designing-contracts` |
| 数据库迁移 | 有 DB Schema 定义 | `managing-db-migrations` |
| 可观测性/审计 | 有日志/监控/审计需求 | `managing-observability` |
| 安全/认证 | 有 RBAC/JWT/授权体系 | `implementing-auth` |
| 前端开发规范 | 有前端架构设计 | `frontend-<framework>` |
| 后端编码规范 | 有后端技术栈规范 | `backend-<framework>` |
| 部署/运维 | 有 K8S/Docker/CI 设计 | `deploying-<target>` |

> 实际横切 Skills 必须根据 DDS 内容动态决定,不可少于 3 个。

### 1.3 Name 候选与确认

为每个 Skill 提供 2~3 个命名候选,从中选择 1 个并说明理由。命名规则:
- 动名词形式(如 `developing-*`、`managing-*`、`implementing-*`)
- 小写字母 + 数字 + 连字符
- ≤ 64 字符
- 包含模块名或领域名
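上述命名规则可以直接落成一个可执行的校验(示意脚本:`validate_skill_name` 为本文假设的函数名,动名词前缀按本 Skill 出现过的几个常用前缀枚举,并非穷尽):

```shell
# 校验 skill name:小写字母/数字/连字符、动名词前缀、≤ 64 字符
# (validate_skill_name 为示意用的假设函数名,并非固定接口)
validate_skill_name() {
  name="$1"
  # 字符集:全小写字母数字,连字符只能出现在中间
  printf '%s' "$name" | grep -qE '^[a-z0-9]+(-[a-z0-9]+)*$' \
    || { echo "FAIL: 字符集或连字符位置不合法: $name"; return 1; }
  # 长度上限 64 字符
  [ "${#name}" -le 64 ] \
    || { echo "FAIL: 超过 64 字符: $name"; return 1; }
  # 动名词前缀(常用前缀之一)
  printf '%s' "$name" | grep -qE '^(developing|managing|implementing|designing|writing)-' \
    || { echo "FAIL: 缺少动名词前缀: $name"; return 1; }
  echo "PASS: $name"
}

validate_skill_name "developing-bill-import"
validate_skill_name "WorkProcedure" || true   # 大写,预期 FAIL
```

可以在规划阶段对候选名逐个调用,也可以并入后文的 verify.sh。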

---

## Phase 2:DDS 设计细节抽取

### 2.1 章节提取与 reference 目录构建

> **详细规则见** `reference/dds-extraction-guide.md`

从 DDS 章节标题构建 `reference/` 分层目录:

```
<skill-name>/reference/
├── 01-<section-slug>/
│   ├── apis.md
│   ├── db-schema.md
│   └── events-topics.md
├── 02-<section-slug>/
│   └── state-machine.md
└── 03-<section-slug>/
    └── security-model.md
```

**目录命名规范**:
- 有序前缀 `01-`、`02-`... + slug
- slug:全小写,非字母数字字符替换为 `-`,连续 `-` 合并,≤ 48 字符

### 2.2 六类设计要素抽取(必须覆盖)

每个模块级 Skill 的 reference/ 必须覆盖**至少 3 类**:

| 要素类型 | 抽取内容 | reference 文件名 |
|---------|---------|-----------------|
| **API/接口** | 路径、方法、请求/响应字段、错误码 | `apis.md` |
| **事件/Topic** | 字段、版本、幂等键、重试语义 | `events-topics.md` |
| **DB Schema** | 字段、索引、约束、迁移策略 | `db-schema.md` |
| **状态机/流程** | 状态、转移、守卫条件、回调、补偿 | `state-machine.md` |
| **授权模型** | JWT claims、RBAC/DAC、权限层级 | `security-model.md` |
| **依赖关系** | 跨模块调用链路、协议、集成点 | `dependencies.md` |

### 2.3 reference 条目格式(强制)

每条 reference 必须包含溯源信息:

```markdown
## <设计要素名称>

- **DDS-Section**: <章节标题原文>
- **DDS-Lines**: L120-L168(或近似行号)

### Extract

<结构化内容:表格/列表/代码块>
```

### 2.4 TBD 标注

如果 DDS 中某个设计要素写得不清楚或缺失:
- **必须标注 `[TBD]`**
- 输出"最小补充信息清单"
- **禁止脑补细节**

---

## Phase 3:逐个生成 SKILL.md

### 3.1 SKILL.md 结构模板

> **详细模板见** `reference/skill-templates.md`

每个 SKILL.md 必须包含以下结构:

```markdown
---
name: <skill-name>
description: <单行,< 1024 字符,中英文混合,第三人称,含功能+触发场景+关键词>
argument-hint: "<参数格式说明>"
allowed-tools:
  - Read
  - Write   # 按需
  - Edit    # 按需
  - Glob
  - Grep
  - Bash    # 按需
---

# <Skill 标题>

<一段话概述本 Skill 的用途和适用范围>

## Quick Context
<动态注入命令,至少 2 处 !`command`>

## Plan
### 产物清单
### 决策点

## Verify
<按类别组织的 Checklist,可勾选>

## Execute
<分步骤的可操作指令>

## Pitfalls
<3~8 条与该模块/主题强相关的常见坑,至少 2 条引用 reference>

## Related References
<指向 reference/ 的链接列表,说明何时查阅>
```

### 3.2 Frontmatter 编写规则

> **详细规范见** `reference/frontmatter-spec.md`

**关键要点**:
- `description` **必须单行**,否则 skill 触发会失败
- 必须中英文混合,确保中文和英文查询都能命中
- 必须包含:功能说明 + 触发场景 + 关键词(含模块名)
- `allowed-tools` 遵循最小授权原则

### 3.3 内容编写原则

1. **删除常识**:只保留 DDS 特有设计与可操作步骤
2. **解释 Why**:对重要约束解释原因,不要堆砌 MUST/ALWAYS
3. **可执行动作**:禁止空话(如"检查 API 兼容"),必须写成具体审查动作
4. **设计细节绑定**:Pitfalls 和 Verify 中至少 2 处引用 `reference/` 的具体内容
5. **行数限制**:SKILL.md 主体 < 500 行

**示例 — 空话 vs 可执行动作**:

```
❌ "检查事件一致性"
✅ "在 reference/events-topics.md 找到 topic 列表,对照仓库 grep 出 publish/subscribe 点"

❌ "验证 JWT 安全"
✅ "校验 JWT claims 是否包含 tenant_id/project_id/role(来自 reference/security-model.md)"

❌ "检查 migration 可回滚"
✅ "migration 必须包含 down SQL;verify.sh grep 检查 `-- +migrate Down` 或回滚段落存在"
```

---

## Phase 4:生成 Supporting Files

### 4.1 目录结构

每个 Skill 遵循标准目录模板:

```
<skill-name>/
├── SKILL.md              # 主文件(< 500 行)
├── reference/            # 设计细节(按章节分层)
│   ├── 01-<section>/
│   │   ├── apis.md
│   │   ├── db-schema.md
│   │   └── ...
│   └── 02-<section>/
│       └── ...
├── examples/             # 骨架代码示例
│   └── ...
└── scripts/              # 验证脚本
    └── verify.sh         # 必须提供
```

### 4.2 verify.sh 编写要求

每个 Skill 必须至少包含 1 个 `verify.sh`:

```bash
#!/bin/bash
# verify.sh - <skill-name> Skill 结构与内容验证
# 注意:不使用 set -e —— 算术自增 ((PASS++)) 在值为 0 时返回非零,
# 配合 set -e 会导致脚本在第一次 PASS 时提前退出

PASS=0; FAIL=0
check() {
  if eval "$2"; then
    echo "✅ PASS: $1"; PASS=$((PASS+1))
  else
    echo "❌ FAIL: $1"; FAIL=$((FAIL+1))
  fi
}

SKILL_DIR="$(cd "$(dirname "$0")/.." && pwd)"

# 结构检查
check "SKILL.md 存在" "test -f '$SKILL_DIR/SKILL.md'"
check "reference/ 目录存在" "test -d '$SKILL_DIR/reference'"
check "SKILL.md < 500 行" "[ $(wc -l < '$SKILL_DIR/SKILL.md') -lt 500 ]"

# 内容检查
check "frontmatter 包含 name" "head -20 '$SKILL_DIR/SKILL.md' | grep -q '^name:'"
check "frontmatter 包含 description" "head -20 '$SKILL_DIR/SKILL.md' | grep -q '^description:'"
check "包含 Plan 章节" "grep -q '## Plan' '$SKILL_DIR/SKILL.md'"
check "包含 Verify 章节" "grep -q '## Verify' '$SKILL_DIR/SKILL.md'"
check "包含 Execute 章节" "grep -q '## Execute' '$SKILL_DIR/SKILL.md'"
check "包含 Pitfalls 章节" "grep -q '## Pitfalls' '$SKILL_DIR/SKILL.md'"

# reference 检查
check "reference 有章节子目录" "find '$SKILL_DIR/reference' -maxdepth 1 -type d -name '0*' | grep -q ."
check "reference 文件含 DDS-Section" "grep -rq 'DDS-Section:' '$SKILL_DIR/reference/' 2>/dev/null"

echo ""
echo "=== 结果: $PASS PASS / $FAIL FAIL ==="
[ $FAIL -eq 0 ] && exit 0 || exit 1
```

### 4.3 examples/ 编写要求

- 只放**骨架与关键接口签名**,不放完整实现
- 与模块职责强相关
- 注释说明关键设计决策

---

## Phase 5:全局自检

### 5.1 输出顺序(必须遵守)

1. **Skills 清单表**:系统级 / 模块级 / 横切,含最终 name 与理由
2. **总目录树**:Unix 路径风格
3. **每个 SKILL.md**:完整内容
4. **Supporting files**:按 `文件路径 → 文件内容` 逐个输出
5. **全局自检结果**:逐条 PASS/FAIL + 修复建议

### 5.2 自检 Checklist

按以下维度逐条检查:

**结构完整性**
- [ ] 系统级 Skill 存在(1 个)
- [ ] 模块级 Skills 数量 = 模块数
- [ ] 横切 Skills ≥ 3 个
- [ ] 每个 Skill 都有 SKILL.md + reference/ + scripts/verify.sh

**Frontmatter 规范**
- [ ] description 为单行
- [ ] description < 1024 字符
- [ ] 中英文混合
- [ ] 包含触发场景和关键词
- [ ] allowed-tools 最小授权

**内容质量**
- [ ] SKILL.md < 500 行
- [ ] 包含 Plan/Verify/Execute/Pitfalls 四个章节
- [ ] ≥ 2 处 !`command` 动态注入
- [ ] Pitfalls ≥ 2 条引用 reference
- [ ] 无空话("检查 XX 一致性"这类无具体动作的描述)

**Reference 质量**
- [ ] 每个模块 Skill 覆盖 ≥ 3 类设计要素
- [ ] reference 有章节分层目录(非扁平)
- [ ] 每条 reference 含 DDS-Section + DDS-Lines 溯源
- [ ] DDS 缺失内容标注 [TBD]
- [ ] 无脑补设计细节
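Reference 质量相关的几条可以用一个抽查函数机械化落地(示意实现,`check_reference_quality` 为假设的函数名,目录布局沿用前文 reference/ 分层约定):

```shell
# 抽查 <skill-dir>/reference/ 下每个条目文件的 DDS 溯源字段与 [TBD] 标注
check_reference_quality() {
  dir="$1"; missing=0
  for f in "$dir"/reference/*/*.md; do
    [ -f "$f" ] || continue
    grep -q 'DDS-Section' "$f" || { echo "缺少 DDS-Section: $f"; missing=$((missing+1)); }
    grep -q 'DDS-Lines' "$f"   || { echo "缺少 DDS-Lines: $f"; missing=$((missing+1)); }
  done
  # 列出 [TBD] 标注,便于人工核对是否附带"最小补充信息清单"
  grep -rn '\[TBD' "$dir/reference/" 2>/dev/null | head -n 20
  [ "$missing" -eq 0 ]
}
```

返回非零即存在缺溯源的 reference 文件;`[TBD]` 本身不算 FAIL,只是提示人工复核。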

---

## Quick Reference

| 需要了解... | 查阅... |
|------------|--------|
| DDS 抽取的详细方法 | `reference/dds-extraction-guide.md` |
| SKILL.md 模板(系统/模块/横切) | `reference/skill-templates.md` |
| Frontmatter 详细规范 | `reference/frontmatter-spec.md` |
| 质量自检的完整清单 | `reference/quality-checklist.md` |
| 成功案例的目录结构 | `examples/` |
@@ -0,0 +1,137 @@
# DDS-to-Skill 转换实例:完整系统(多模块)

本文展示将一个包含多模块的 DDS 文档转换为完整 Skill 套件的过程。

---

## 1. 系统概述(模拟 DDS)

```
系统名称:ProjectMoneyX(个人财务管理系统)
技术栈:Go + Gin + GORM / Vue3 + TypeScript + Vuetify3
模块列表:
- 账单导入模块(bill-import)
- 多维分析模块(analysis)
- 预算管理模块(budget)
- 账户管理模块(account)
- 规则引擎模块(rules)
```

---

## 2. Skill 套件规划

### 2.1 Skills 清单

| 类型 | Skill Name | 职责 |
|------|-----------|------|
| 系统级 | `developing-moneyx-system` | 跨模块架构、技术栈规范、依赖管理 |
| 模块级 | `developing-bill-import` | 账单导入 ETL 流水线 |
| 模块级 | `developing-analysis` | 多维财务分析与图表 |
| 模块级 | `developing-budget` | 预算创建与跟踪 |
| 模块级 | `developing-account` | 账户 CRUD 与余额同步 |
| 模块级 | `developing-rules` | 分类规则引擎 |
| 横切 | `designing-contracts` | API/DTO 契约规范 |
| 横切 | `managing-db-migrations` | 数据库迁移策略 |
| 横切 | `managing-observability` | 日志、错误追踪 |

### 2.2 总目录树

```
1-AgentSkills/
├── developing-moneyx-system/
│   ├── SKILL.md
│   ├── reference/
│   │   ├── 01-architecture/
│   │   │   └── dependencies.md
│   │   └── 02-tech-stack/
│   │       └── conventions.md
│   └── scripts/
│       └── verify.sh
├── developing-bill-import/
│   ├── SKILL.md
│   ├── reference/
│   │   ├── 01-etl-pipeline/
│   │   │   └── pipeline-design.md
│   │   ├── 02-api-design/
│   │   │   └── apis.md
│   │   └── 03-data-model/
│   │       └── db-schema.md
│   ├── examples/
│   │   └── etl-processor.go
│   └── scripts/
│       └── verify.sh
├── developing-analysis/
│   └── ...(同上结构)
├── developing-budget/
│   └── ...
├── developing-account/
│   └── ...
├── developing-rules/
│   └── ...
├── designing-contracts/
│   ├── SKILL.md
│   ├── reference/
│   │   └── api-response-spec.md
│   └── scripts/
│       └── verify.sh
├── managing-db-migrations/
│   ├── SKILL.md
│   ├── reference/
│   │   └── migration-conventions.md
│   └── scripts/
│       └── verify.sh
└── managing-observability/
    ├── SKILL.md
    ├── reference/
    │   └── logging-standards.md
    └── scripts/
        └── verify.sh
```

---

## 3. 关键转换决策

### 3.1 模块边界划分

> **决策依据**:DDS 中每个"章节"对应一个业务域,每个业务域生成一个模块级 Skill。

### 3.2 横切关注点识别

从 DDS 全文 grep 识别跨模块使用的技术点:

```bash
# 发现所有模块都用了统一响应格式 → designing-contracts
grep -c "ResponseError\|ResponseSuccess" *.go

# 发现多个模块有 migration 文件 → managing-db-migrations
find . -name "*migration*" -o -name "*migrate*"

# 发现多个模块有日志调用 → managing-observability
grep -rn "log\.\(Info\|Error\|Debug\)" --include="*.go" | wc -l
```

### 3.3 Reference 深度决策

| 要素 | 模块 | DDS 覆盖度 | reference 策略 |
|------|------|-----------|---------------|
| API | bill-import | 完整 | 全量抽取到 apis.md |
| DB Schema | budget | 部分 | 抽取已有 + [TBD] 标注缺失 |
| 事件 | analysis | 无 | 跳过,无需创建事件 reference |
| 状态机 | bill-import | 完整 | ETL 状态流转到 state-machine.md |

---

## 4. 输出示例:系统级 Skill

```yaml
---
name: developing-moneyx-system
description: >
  指导 ProjectMoneyX 个人财务管理系统的全局架构决策与跨模块一致性(Guides system-level architecture for ProjectMoneyX personal finance system)。
  包含:模块注册、技术栈规范、依赖管理、响应格式统一。
  触发场景 Trigger: 新增模块 / 跨模块变更 / 架构决策 / 技术栈选型。
  关键词 Keywords: moneyx, system, architecture, 架构, 财务, finance, 模块, cross-module。
---
```
@@ -0,0 +1,95 @@
# DDS-to-Skill 转换实例:工单流程模块

本文展示了一个从 DDS 片段到完整 Skill 的转换过程。

---

## 1. DDS 原文片段(模拟)

```markdown
## 5. 工单管理模块(rmdc-work-procedure)

### 5.1 模块职责
负责工单生命周期管理,包括创建、审批、执行、完成等流转。

### 5.2 数据库设计

#### workflows 主表
| 字段 | 类型 | 约束 | 说明 |
|------|------|------|------|
| id | BIGINT | PK, AUTO_INCREMENT | 工单ID |
| type | VARCHAR(50) | NOT NULL | 工单类型 |
| status | VARCHAR(30) | NOT NULL, DEFAULT 'pending' | 当前状态 |
| creator_id | BIGINT | NOT NULL, FK → users.id | 创建人 |
| assignee_id | BIGINT | FK → users.id | 处理人 |
| version | INT | NOT NULL, DEFAULT 1 | 乐观锁版本号 |

### 5.3 状态机
- pending → submitted → under_review → approved/rejected
- submitted → revoked(创建人可撤销)
- 终态:approved, rejected, revoked, closed

### 5.4 API 接口
- POST /api/workflow/create - 创建工单
- POST /api/workflow/transition - 状态转换
- POST /api/workflow/callback - 业务回调
- POST /api/workflow/list - 工单列表
```

---

## 2. 转换步骤演示

### 步骤 1:模块识别

| 模块 | 职责 | 技术 | Skill 类型 |
|------|------|------|-----------|
| rmdc-work-procedure | 工单生命周期管理 | Go + Gin + PostgreSQL | 模块级 |

### 步骤 2:设计要素抽取

从 DDS 中识别到 4 类要素:
- ✅ API/接口 → 4 个 API 端点
- ✅ DB Schema → workflows 主表
- ✅ 状态机 → 5 个状态 + 转换规则
- ❌ 事件/Topic → DDS 未提及 → 标注 [TBD]
- ❌ 授权模型 → DDS 未提及 → 标注 [TBD]
- ✅ 依赖关系 → 业务模块回调

### 步骤 3:reference 文件生成

```
reference/
├── 01-data-model/
│   └── db-schema.md        # workflows 表结构
├── 02-api-design/
│   └── apis.md             # 4 个 API 定义
├── 03-workflow-engine/
│   └── state-machine.md    # 状态机定义
└── 04-integration/
    └── dependencies.md     # 回调接口
```

### 步骤 4:SKILL.md 关键段落

```markdown
## Pitfalls

1. **版本号遗漏**: 更新工单时忘记传递 `version` 字段,导致乐观锁失效
   (参考 `reference/01-data-model/db-schema.md` 中 workflows.version 字段定义)
2. **终态误转换**: 对 approved/rejected/revoked/closed 状态尝试非法转换
   (参考 `reference/03-workflow-engine/state-machine.md` 中的终态定义)
3. **事件推送遗漏**: 状态变更后忘记通知相关方 [TBD - DDS 未定义事件机制]
```

---

## 3. 自检结果

| # | 检查项 | 结果 | 说明 |
|---|-------|------|------|
| S4 | SKILL.md 存在 | ✅ PASS | |
| R1 | 设计要素 ≥ 3 类 | ✅ PASS | API + DB + 状态机 + 依赖 = 4 类 |
| R5 | TBD 标注 | ✅ PASS | 事件和授权标注了 [TBD] |
| C7 | Pitfalls 引用 reference | ✅ PASS | 2 条引用了 reference 路径 |
| R6 | 无脑补 | ✅ PASS | 缺失内容均标注 [TBD] |
.agents/skills/dds-to-skill/reference/dds-extraction-guide.md (new file, 260 lines)
@@ -0,0 +1,260 @@

# DDS 设计细节抽取指南

本文档详细说明如何从 DDS(详细设计说明书)中抽取设计细节,并组织到 reference/ 目录中。

---

## 1. 章节标题提取

### 1.1 标题识别规则(按优先级)

| 优先级 | 格式 | 示例 |
|-------|------|------|
| 1 | Markdown 标题 | `# 系统架构`、`## 接口设计` |
| 2 | 编号标题 | `1 概述`、`2.3 数据库设计` |
| 3 | 中文章标题 | `第一章 总体设计`、`第3章` |
| 4 | 中文小节 | `一、系统概述`、`(二)接口规范` |

### 1.2 提取命令

```bash
# 综合提取(推荐首选)
grep -nE '^(#{1,6}\s+|[0-9]+(\.[0-9]+){0,3}\s+|第[一二三四五六七八九十]+章|第[0-9]+章|[一二三四五六七八九十]+、)' "$DDS_FILE" | head -n 120

# 如果上面匹配不足,尝试更宽松的模式:带行号浏览开头部分,人工识别章节
sed -n '1,200p' "$DDS_FILE" | nl -ba | sed -n '1,120p'
```

### 1.3 降级策略

当标题提取不足(少于 3 个)或 DDS 格式混乱时:

```
reference/00-unknown/
├── 01-apis/
├── 02-events/
├── 03-db/
├── 04-state-machine/
└── 05-security/
```

同时在自检中标记 FAIL:
- **原因**:DDS 标题结构不可识别
- **建议**:提供 Markdown 标题 / 章节目录 / md 格式导出版本

---

## 2. 六类设计要素抽取方法

### 2.1 API/接口

**扫描关键词**:
```bash
grep -nE "API|接口|路径|路由|request|response|错误码|error|handler|endpoint|method" "$DDS_FILE" | head -n 80
```

**抽取内容**:
| 字段 | 说明 |
|------|------|
| 路径 | `/api/v1/users/list` |
| 方法 | POST / GET 等 |
| 请求字段 | 字段名、类型、是否必须、校验规则 |
| 响应字段 | 字段名、类型、说明 |
| 错误码 | code + message + 触发场景 |
| 鉴权要求 | JWT / API Key / 公开 |

**输出格式**:
```markdown
## POST /api/v1/users/list

- **DDS-Section**: 3.2 用户管理接口
- **DDS-Lines**: L120-L168

### Request

| 字段 | 类型 | 必须 | 说明 |
|------|------|------|------|
| page | int | N | 页码,默认 1 |
| page_size | int | N | 每页数量,默认 20 |

### Response

| 字段 | 类型 | 说明 |
|------|------|------|
| list | []User | 用户列表 |
| total | int | 总数 |

### 错误码

| code | message | 触发场景 |
|------|---------|---------|
| 1001 | 参数校验失败 | 字段格式错误 |
```

### 2.2 事件/Topic/消息

**扫描关键词**:
```bash
grep -nE "事件|event|MQTT|topic|outbox|消息|payload|幂等|retry|publish|subscribe|Kafka|RabbitMQ" "$DDS_FILE" | head -n 80
```

**抽取内容**:
| 字段 | 说明 |
|------|------|
| Topic/Queue 名 | `cmii/rmdc/{project_id}/command` |
| 方向 | Publish / Subscribe |
| Payload 字段 | 字段名、类型、说明 |
| QoS / 可靠性 | At-least-once / Exactly-once |
| 幂等键 | 用于去重的唯一标识字段 |
| 重试策略 | 重试间隔、最大次数、死信队列 |

### 2.3 数据库/Schema

**扫描关键词**:
```bash
grep -nE "表|schema|字段|索引|unique|constraint|migration|DDL|PostgreSQL|MySQL|GORM|column|CREATE TABLE" "$DDS_FILE" | head -n 80
```

**抽取内容**:
| 字段 | 说明 |
|------|------|
| 表名 | `users`、`workflows` |
| 字段定义 | 名称、类型、约束、默认值 |
| 索引 | 类型(唯一/普通/组合)、字段 |
| 外键关系 | 引用表、级联策略 |
| 迁移策略 | 向前兼容 / 字段演进方案 |

### 2.4 状态机/流程

**扫描关键词**:
```bash
grep -nE "状态机|state|transition|流转|工单|workflow|回调|补偿|lifecycle|FSM|guard" "$DDS_FILE" | head -n 80
```

**抽取内容**:
| 字段 | 说明 |
|------|------|
| 状态枚举 | 名称、值、描述 |
| 转换规则 | from → to、触发动作、守卫条件 |
| 角色权限 | 谁可以触发哪些转换 |
| 回调/副作用 | 状态变更后执行的操作 |
| 补偿机制 | 转换失败时的回滚策略 |

### 2.5 授权模型

**扫描关键词**:
```bash
grep -nE "RBAC|DAC|鉴权|JWT|claim|授权|TOTP|权限|auth|token|session|role|permission" "$DDS_FILE" | head -n 80
```

**抽取内容**:
| 字段 | 说明 |
|------|------|
| 认证方式 | JWT / Session / OAuth |
| JWT Claims | 包含的字段(tenant_id, role 等) |
| 角色定义 | 角色名、权限描述 |
| 权限矩阵 | 角色 × 资源 × 操作 |
| 层级设计 | 一级授权 / 二级授权 |

### 2.6 依赖关系

**扫描关键词**:
```bash
grep -nE "模块|module|service|依赖|dependency|import|gateway|调用|集成|protocol" "$DDS_FILE" | head -n 80
```

**抽取内容**:
| 字段 | 说明 |
|------|------|
| 源模块 | 调用方 |
| 目标模块 | 被调用方 |
| 协议 | HTTP / gRPC / MQTT / 内部调用 |
| 关键接口 | 跨模块调用的接口清单 |
| 失败处理 | 超时、重试、熔断策略 |

---

## 3. reference 目录组织

### 3.1 命名规范

```
reference/
├── 01-architecture-overview/    # 章节序号 + slug
│   └── dependencies.md
├── 02-api-design/
│   ├── apis.md
│   └── error-codes.md
├── 03-data-model/
│   └── db-schema.md
├── 04-message-system/
│   └── events-topics.md
├── 05-workflow-engine/
│   └── state-machine.md
└── 06-security/
    └── security-model.md
```

**Slug 生成规则**:
1. 全小写
2. 非字母数字字符替换为 `-`
3. 连续 `-` 合并为单个
4. 截断到 48 字符以内
5. 序号来自 DDS 中的章节顺序
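规则 1~4 可以浓缩成一个小的 shell 函数(示意实现,`slugify` 为假设的函数名;规则 5 的序号前缀仍需按 DDS 章节顺序另行拼接):

```shell
# slugify:按上述规则将章节标题转成目录 slug
# 规则 1:tr 全小写;规则 2+3:sed 把非字母数字的连续串替换为单个 -,
# 再去掉首尾多余的 -;规则 4:cut 截断到 48 字符
slugify() {
  printf '%s' "$1" | tr '[:upper:]' '[:lower:]' \
    | sed -E 's/[^a-z0-9]+/-/g; s/^-+//; s/-+$//' \
    | cut -c1-48
}

slugify "API Design & Contracts"   # → api-design-contracts
```

注意:中文标题在此实现下会整段被替换为 `-` 后丢弃,实际使用时可先人工给出英文 slug 再校验。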

### 3.2 SKILL.md 中的引用方式

```markdown
## Pitfalls

1. **MQTT Topic 命名冲突**:新增 topic 前必须检查
   `reference/04-message-system/events-topics.md` 中的 topic 清单

## Related References

| 需要了解... | 查阅... |
|------------|--------|
| API 完整定义 | `reference/02-api-design/apis.md` |
| 数据库表结构 | `reference/03-data-model/db-schema.md` |
```

### 3.3 扁平化兼容

当 DDS 章节结构不明显时,也可以采用扁平 reference,但需在自检中说明:

```
reference/
├── apis.md
├── db-schema.md
├── events-topics.md
├── state-machine.md
├── security-model.md
└── dependencies.md
```

---

## 4. TBD 标注规范

当 DDS 中某个设计要素不完整或不清晰时:

```markdown
## 消息重试策略

- **DDS-Section**: 4.3 消息可靠性
- **DDS-Lines**: L245-L260

### Extract

| 配置项 | 值 |
|-------|---|
| 最大重试次数 | [TBD - DDS 未明确指定] |
| 重试间隔 | [TBD - DDS 未明确指定] |
| 死信队列 | [TBD - DDS 仅提及概念,未给出配置] |

### 最小补充信息清单

1. 重试次数上限(建议 3~5 次)
2. 重试间隔策略(固定 / 指数退避)
3. 死信队列名称与消费策略
```

.agents/skills/dds-to-skill/reference/frontmatter-spec.md (new file, 162 lines)
@@ -0,0 +1,162 @@

# Frontmatter 编写规范

Frontmatter 是 Skill 的"身份证",决定了 Skill 何时被触发、是否被正确识别。编写不当会导致 Skill 永远不会被使用。

---

## 1. 必须字段

### 1.1 name

**规则**:
- 小写字母 + 数字 + 连字符(`-`)
- 动名词形式开头(`developing-`、`managing-`、`implementing-`、`designing-`)
- ≤ 64 字符
- 包含模块名或领域名

**常用前缀**:
| 前缀 | 适用场景 |
|------|---------|
| `developing-` | 模块开发、功能实现 |
| `managing-` | 管理类操作(DB、配置、部署) |
| `implementing-` | 特定技术方案实现 |
| `designing-` | 设计阶段的规范和契约 |
| `writing-` | 编写文档、脚本、测试 |

**示例**:
```yaml
# ✅ 正确
name: developing-work-procedure
name: managing-db-migrations
name: implementing-totp-auth

# ❌ 错误
name: WorkProcedure      # 不能大写
name: work_procedure     # 不能用下划线
name: wp                 # 太短,无法触发
```

### 1.2 description

**这是最关键的字段** —— 决定 Skill 是否能被正确触发。

**硬性规则**:
1. **必须单行**(不换行)—— 换行会导致 YAML 解析出错,Skill 静默失败
2. **< 1024 字符**
3. **第三人称**描述
4. **中英文混合** —— 确保中文和英文查询都能命中
5. **包含触发场景**(Trigger)和**关键词**(Keywords)

**结构模板**:
```
<功能概述(中英文)>。包含:<具体能力列表>。触发场景 Trigger: <场景列表>。关键词 Keywords: <关键词列表>。
```

**示例**:
```yaml
# ✅ 正确(单行,中英文混合,包含触发场景和关键词)
description: 指导 rmdc-work-procedure 工单流程模块的开发(Guides development of rmdc-work-procedure workflow module)。包含:状态机实现、工单 CRUD、并发控制、WebSocket 事件。触发场景 Trigger: 修改工单表 / 添加工单类型 / 变更状态转换 / 实现工单 API。关键词 Keywords: workflow, work-procedure, state-machine, 工单, 状态机, 流转。

# ❌ 错误 - 多行(会静默失败!)
description: |
  指导工单模块开发。
  包含状态机实现。

# ❌ 错误 - 太短,无关键词
description: 工单模块开发指导

# ❌ 错误 - 纯英文,中文查询无法命中
description: Guides the development of workflow module with state machine
```

**推动触发的技巧**:
- Claude 有"不触发"的倾向,所以 description 应该稍微"激进"一些
- 多列出触发场景,覆盖用户可能的表述方式
- 包含同义词(如:工单/workflow/ticket)

### 1.3 argument-hint

**规则**:
- 说明 `$ARGUMENTS` 的期望格式
- 给出 2~3 个具体示例

**示例**:
```yaml
argument-hint: "<action> <target> - e.g., 'create handler user', 'add api /workflow/create', 'update schema workflows'"
```

---

## 2. 可选字段

### 2.1 allowed-tools

**原则**:最小授权 —— 只声明 Skill 真正需要的工具。

| 工具 | 适用场景 |
|------|---------|
| `Read` | 读取文件(几乎总是需要) |
| `Glob` | 搜索文件(几乎总是需要) |
| `Grep` | 搜索文件内容(几乎总是需要) |
| `Bash` | 执行 shell 命令(按需) |
| `Write` | 创建新文件(开发类 Skill 需要) |
| `Edit` | 编辑现有文件(开发类 Skill 需要) |

**示例**:
```yaml
# 只读 Skill(审查/分析类)
allowed-tools:
  - Read
  - Glob
  - Grep

# 开发 Skill(需要写文件)
allowed-tools:
  - Read
  - Write
  - Edit
  - Glob
  - Grep
  - Bash
```

---

## 3. YAML 格式注意事项

### 3.1 多行 description 的安全写法

如果 description 确实很长,使用 `>` 折叠块语法(注意:这仍然会被解析为单行):

```yaml
description: >
  指导 rmdc-work-procedure 工单流程模块的开发。
  包含状态机实现、工单 CRUD、并发控制。
  触发场景 Trigger: 修改工单表、添加工单类型。
```

> ⚠️ 使用 `>` 时,YAML 会将换行替换为空格,最终合并为单行。这是安全的。
> ❌ 绝不要使用 `|`(保留换行块语法),那会导致多行 description。

### 3.2 特殊字符转义

```yaml
# 包含冒号时用引号包裹
argument-hint: "<action>: <target>"

# 包含 # 时用引号
description: "指导 C# 项目开发"
```

---

## 4. 自检清单

- [ ] `name` 为动名词形式、小写连字符、≤ 64 字符
- [ ] `description` 在最终 YAML 中为单行
- [ ] `description` < 1024 字符
- [ ] `description` 包含中文和英文
- [ ] `description` 包含触发场景(Trigger)
- [ ] `description` 包含关键词(Keywords)
- [ ] `argument-hint` 有具体示例
- [ ] `allowed-tools` 遵循最小授权
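其中 description 相关的几条可以机械化抽查(示意脚本,`check_frontmatter` 为假设的函数名;只处理单行写法,`>` 折叠写法需人工确认折叠后的实际长度;字符数按当前 locale 统计,仅作近似):

```shell
# 抽查 SKILL.md frontmatter:存在单行 description、长度 < 1024、含 Trigger/Keywords
check_frontmatter() {
  file="$1"
  # 取 frontmatter(首个 --- 到下一个 --- 之间)里的单行 description
  # [^>|] 排除 "description: >" / "description: |" 这类块语法开头
  desc=$(sed -n '/^---$/,/^---$/p' "$file" | grep -m1 '^description: [^>|]')
  [ -n "$desc" ] || { echo "FAIL: 未找到单行 description"; return 1; }
  [ "${#desc}" -lt 1024 ] || { echo "FAIL: description ≥ 1024 字符"; return 1; }
  case "$desc" in *Trigger*)  ;; *) echo "WARN: 未见触发场景 Trigger" ;; esac
  case "$desc" in *Keywords*) ;; *) echo "WARN: 未见关键词 Keywords" ;; esac
  echo "PASS: $file"
}
```

可并入各 Skill 的 verify.sh,对生成的每个 SKILL.md 逐个调用。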
.agents/skills/dds-to-skill/reference/quality-checklist.md (new file, 114 lines)
@@ -0,0 +1,114 @@
# Global Quality Self-Check Checklist

After a DDS-to-Skill conversion completes, every item below must be checked one by one. Mark each item PASS or FAIL; every FAIL must come with a fix suggestion.

---

## 1. Structural Completeness

| # | Check | PASS condition |
|---|-------|----------------|
| S1 | System-level Skill exists | Exactly 1 `developing-*-system` Skill |
| S2 | Number of module-level Skills | = the number of modules identified in the DDS |
| S3 | Number of cross-cutting Skills | ≥ 3 |
| S4 | Every Skill has a SKILL.md | SKILL.md exists in every Skill directory |
| S5 | Every Skill has a reference/ | reference/ exists in every Skill directory |
| S6 | Every Skill has a verify.sh | verify.sh exists under every Skill's scripts/ |
| S7 | Directory naming convention | All lowercase, hyphenated, gerund form |

---

## 2. Frontmatter Conventions

| # | Check | PASS condition |
|---|-------|----------------|
| F1 | description is single-line | description is a single-line string after YAML parsing |
| F2 | description length | < 1024 characters |
| F3 | description is bilingual | Contains both a Chinese and an English description |
| F4 | description states triggers | Contains the literal "触发场景" or "Trigger" |
| F5 | description states keywords | Contains the literal "关键词" or "Keywords" |
| F6 | name format | Lowercase letters + digits + hyphens, starts with a gerund |
| F7 | argument-hint present | The frontmatter contains an argument-hint field |
| F8 | allowed-tools least privilege | Read-only Skills do not include Write/Edit |

---

## 3. Content Quality

| # | Check | PASS condition |
|---|-------|----------------|
| C1 | SKILL.md line count | < 500 lines |
| C2 | Has a Plan section | `## Plan` found by grep |
| C3 | Has a Verify section | `## Verify` found by grep |
| C4 | Has an Execute section | `## Execute` found by grep |
| C5 | Has a Pitfalls section | `## Pitfalls` found by grep |
| C6 | Dynamic injection | ≥ 2 occurrences of `!` + backtick commands |
| C7 | Pitfalls cite reference | A `reference/` path appears in ≥ 2 Pitfalls |
| C8 | No empty talk | No action-free phrasing such as "check XX for consistency" |
| C9 | No common knowledge | No generic knowledge Claude already has (e.g. HTTP status code definitions) |
| C10 | Consistent terminology | The same concept uses the same term across all Skills |
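Checks C2 through C6 are mechanical. A sketch of how they might be scripted (the report keys are illustrative):

```python
import re

REQUIRED_SECTIONS = ["## Plan", "## Verify", "## Execute", "## Pitfalls"]

def content_quality(skill_md: str) -> dict:
    """Run the mechanical part of the content-quality checks on SKILL.md text."""
    return {
        "C1_under_500_lines": skill_md.count("\n") + 1 < 500,
        # sections that are required but missing (empty list = C2-C5 pass)
        "C2_C5_sections_missing": [s for s in REQUIRED_SECTIONS if s not in skill_md],
        # count of !`...` dynamic-injection commands (C6 wants >= 2)
        "C6_dynamic_injections": len(re.findall(r"!`[^`]+`", skill_md)),
    }

sample = "## Plan\n## Verify\n## Execute\n## Pitfalls\n!`ls`\n!`git status`\n"
report = content_quality(sample)
assert report["C2_C5_sections_missing"] == []   # all required sections found
assert report["C6_dynamic_injections"] == 2
```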
---

## 4. Reference Quality

| # | Check | PASS condition |
|---|-------|----------------|
| R1 | Design-element coverage | Every module Skill covers ≥ 3 categories (API/events/DB/state machine/permissions/dependencies) |
| R2 | Section layering | Numbered subdirectories such as `01-*` exist under reference/ (or flat layout with an explanation) |
| R3 | DDS traceability | Every reference entry has a `DDS-Section:` field |
| R4 | DDS line numbers | Every reference entry has a `DDS-Lines:` field |
| R5 | TBD markers | Content missing from the DDS is marked `[TBD]`, with a minimal supplement list attached |
| R6 | No invention | Every design detail traces back to the DDS source text |
| R7 | Sufficient content | References contain enough structured data (tables/lists/code blocks) |

---

## 5. Cross-Skill Consistency

| # | Check | PASS condition |
|---|-------|----------------|
| X1 | Consistent module names | Module names are spelled identically in all Skills |
| X2 | No error-code conflicts | The same error code means the same thing in different Skills |
| X3 | No API path conflicts | API paths of different modules do not overlap |
| X4 | Consistent event/Topic definitions | The same Topic is defined identically in the publisher's and subscriber's Skills |
| X5 | Consistent authorization model | JWT Claims and role definitions are consistent across all Skills |

---

## 6. Self-Check Output Format

```markdown
# Global Self-Check Result

## Structural Completeness
- ✅ S1 PASS: system-level Skill `developing-xxx-system` exists
- ✅ S2 PASS: module-level Skill count = 5 (matches the 5 modules in the DDS)
- ❌ S3 FAIL: only 2 cross-cutting Skills, fewer than the required 3
  - **Fix**: the DDS has a caching-strategy section; add a `managing-cache` Skill
- ✅ S4 PASS: SKILL.md exists in every Skill directory

## Frontmatter Conventions
- ✅ F1 PASS: every description is single-line
- ❌ F2 FAIL: the description of `developing-core` exceeds 1024 characters (1156 characters)
  - **Fix**: trim the trigger-scenario text and remove duplicated keywords

## Content Quality
- ✅ C1 PASS: every SKILL.md < 500 lines
- ❌ C8 FAIL: the Verify section of `developing-gateway` contains "check API consistency"
  - **Fix**: change it to "compare against the endpoint list in reference/02-api-design/apis.md, grep the handler registration points in the repo, and confirm paths and methods match"

## Total: XX PASS / YY FAIL
```

---

## 7. Common FAILs and Fixes

| FAIL type | Common cause | Fix |
|-----------|--------------|-----|
| Multi-line description | Used the `\|` syntax | Switch to `>` or a single-line string |
| Insufficient references | DDS content was missed | Re-scan the DDS and add the missing elements |
| Empty talk | DDS text copied verbatim | Turn it into an executable review action |
| Invention | Details the DDS never mentions | Mark them [TBD] and list what needs supplying |
| Too few cross-cutting Skills | DDS not analyzed deeply enough | Identify more cross-module concerns in the DDS |
.agents/skills/dds-to-skill/reference/skill-templates.md (new file, 255 lines)
@@ -0,0 +1,255 @@
# SKILL.md Template Library

This document contains the SKILL.md templates for system-level, module-level, and cross-cutting Skills, to be used as references during DDS-to-Skill conversion. The template bodies are kept verbatim (including their Chinese text), since they are the literal content the generated Skills must contain; the outer fences use four backticks so the nested code fences render correctly.

---

## 1. System-Level Skill Template

````markdown
---
name: developing-<system>-system
description: >
  指导 <系统名> 系统级开发决策与跨模块一致性(Guides system-level development for <system>)。
  包含:架构总览、模块注册、依赖规则、全局变更流程、版本兼容策略、技术栈规范。
  触发场景 Trigger: 新增模块 / 跨模块变更 / 全局架构决策 / 技术栈选型。
  关键词 Keywords: <system>, system, architecture, 架构, 模块, 依赖, 兼容, cross-module。
argument-hint: "<module-name|change-type> - 指定涉及的模块名或变更类型"
allowed-tools:
  - Read
  - Glob
  - Grep
  - Bash
---

# Developing <System> System

<一段话描述系统整体架构、技术栈、模块组成>

## Quick Context

```bash
# 动态注入:查看系统模块结构
!`ls -la <project-root>/`

# 动态注入:搜索模块间依赖
!`grep -rnE "import|module|service" <project-root>/ | head -30`
```

## Architecture Overview

<ASCII 架构图或层次说明>

## Module Registry

| 模块 | 职责 | 技术 | Skill |
|------|------|------|-------|
| ... | ... | ... | `developing-<module>` |

## Plan

### 产物清单
- [ ] 确定变更涉及的模块列表
- [ ] 确认是否涉及跨模块通信
- [ ] 确认是否涉及契约变更
- [ ] 确认是否需要数据库迁移

### 决策点
1. 变更是否影响多个模块?
2. 是否需要版本兼容处理?
3. 是否需要全局配置变更?

## Verify

- [ ] 模块间依赖无循环
- [ ] 共享契约版本一致
- [ ] 全局配置项完整
- [ ] 技术栈版本对齐

## Execute

### 添加新模块
1. 在项目根目录创建模块目录...
2. 注册到路由/网关...
3. 更新模块依赖图...

### 跨模块变更
1. 列出所有受影响模块...
2. 按依赖顺序逐个修改...
3. 运行集成测试...

## Pitfalls

1. **循环依赖**: 模块间禁止直接 import,必须通过共享接口定义
2. **版本不一致**: 修改共享结构需同步更新所有消费方
3. ...

## Related References

- [模块依赖关系](reference/dependencies.md)
- [技术栈规范](reference/tech-stack.md)
````

---

## 2. Module-Level Skill Template

````markdown
---
name: developing-<module>
description: >
  指导 <module> 模块的开发(Guides development of <module> module)。
  包含:<模块职责概述>、API 实现、数据库操作、状态管理、安全校验。
  触发场景 Trigger: 开发/修改 <module> 相关功能 / <模块特定场景>。
  关键词 Keywords: <module>, <技术关键词>, <业务关键词>。
argument-hint: "<action> <target> - e.g., 'create handler', 'add api', 'update schema'"
allowed-tools:
  - Read
  - Write
  - Edit
  - Glob
  - Grep
  - Bash
---

# Developing <Module>

<一段话描述模块职责、技术栈、在系统中的位置>

## Quick Context

```bash
# 动态注入:查看模块结构
!`find . -name "*.go" -path "*/<module>/*" | head -20`

# 动态注入:查看现有接口
!`grep -rn "func.*Handler\|func.*Service" ./<module>/ | head -20`
```

## Plan

### 产物清单
- [ ] <根据 DDS 列出具体产物>

### 决策点
1. <从 DDS 抽取的关键决策>
2. ...

## Verify

### <验证类别 1>
- [ ] <具体检查项,引用 reference>

### <验证类别 2>
- [ ] <具体检查项>

## Execute

### 1. <步骤标题>
```bash
# 具体操作命令
```

### 2. <步骤标题>
```go
// 关键代码骨架
```

## Pitfalls

1. **<坑名>**: <描述>(参考 `reference/<file>.md`)
2. ...(至少 3 条,至少 2 条引用 reference)

## Related References

- [API 定义](reference/01-<section>/apis.md)
- [数据库 Schema](reference/02-<section>/db-schema.md)
````

---

## 3. Cross-Cutting Skill Template

````markdown
---
name: <crosscut-skill-name>
description: >
  <横切关注点>的统一规范与实现指导(Guides <crosscut concern> across all modules)。
  包含:<具体内容列表>。
  触发场景 Trigger: <触发场景列表>。
  关键词 Keywords: <关键词列表>。
argument-hint: "<module-name|file-path> - 指定要应用规范的模块或文件"
allowed-tools:
  - Read
  - Glob
  - Grep
  - Bash
---

# <横切 Skill 标题>

<描述这个横切关注点在系统中的重要性和适用范围>

## Quick Context

```bash
# 动态注入
!`<扫描所有模块中与该横切主题相关的文件>`
```

## Plan

### 产物清单
- [ ] <横切维度的产物>

### 决策点
1. <跨模块的统一决策>
2. ...

## Verify

- [ ] <跨模块一致性检查>
- [ ] <规范合规检查>
- [ ] ...

## Execute

### 全局规范
<适用于所有模块的规则>

### 模块适配
<各模块的特殊处理>

## Pitfalls

1. **<跨模块一致性问题>**: <描述>
2. ...

## Related References

- [全局规范定义](reference/<global-spec>.md)
````

---

## 4. Notes on Using the Templates

### 4.1 Parts that must be customized

- Every placeholder in `<angle brackets>`
- Plan's deliverables and decision points must come from the DDS
- Verify's check items must correspond to the module's design details
- Pitfalls must be strongly tied to the module/topic; never pad them with generic advice

### 4.2 Do not copy the templates verbatim

The templates are a structural reference, not a content source. The following will cause the self-check to FAIL:
- Template placeholders left in the deliverables list
- Pitfalls unrelated to the module (e.g. a database Pitfall in a frontend Skill)
- A Verify section that cites no reference at all

### 4.3 Trim and extend according to the DDS

- If the DDS has no state machine, the module Skill may omit state-machine Verify items
- If the DDS has extra concerns (e.g. performance tuning, caching strategy), add corresponding sections
- The number and topics of cross-cutting Skills must be decided by the DDS content
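Rule 4.2's "placeholder left in output" failure is easy to detect mechanically. A sketch (the regex is a heuristic based on the `<angle bracket>` convention above, not a spec'd pattern):

```python
import re

# Matches leftover template placeholders such as <module> or <坑名>,
# while skipping HTML-ish tags by rejecting '/', '=', '<', '>' inside.
PLACEHOLDER = re.compile(r"<[^<>/=\n]+>")

def leftover_placeholders(skill_md: str) -> list[str]:
    """Return template placeholders that leaked into a generated SKILL.md."""
    return PLACEHOLDER.findall(skill_md)

generated = "# Developing Orders\n1. **<坑名>**: <描述>\n"
print(leftover_placeholders(generated))
# → ['<坑名>', '<描述>']
```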
.agents/skills/dds-to-skill/scripts/verify-skill-output.sh (new file, 214 lines)
@@ -0,0 +1,214 @@
#!/bin/bash
# verify-skill-output.sh
# Verifies the completeness and quality of DDS-to-Skill conversion output
#
# Usage:   ./verify-skill-output.sh <skills-output-dir>
# Example: ./verify-skill-output.sh /path/to/1-AgentSkills
#
# Dependencies: bash, grep, sed, find, wc

set -e

SKILLS_DIR="${1:-.}"
PASS=0
FAIL=0
WARN=0

# Colored output
GREEN='\033[0;32m'
RED='\033[0;31m'
YELLOW='\033[0;33m'
NC='\033[0m' # No Color

# POSIX arithmetic is used instead of ((PASS++)) because under `set -e`
# ((PASS++)) returns a non-zero status while the counter is still 0 and
# would abort the script on its very first PASS.
pass() {
  echo -e "${GREEN}✅ PASS${NC}: $1"
  PASS=$((PASS + 1))
}

fail() {
  echo -e "${RED}❌ FAIL${NC}: $1"
  echo -e "   ${RED}Fix${NC}: $2"
  FAIL=$((FAIL + 1))
}

warn() {
  echo -e "${YELLOW}⚠️ WARN${NC}: $1"
  WARN=$((WARN + 1))
}

echo "============================================"
echo " DDS-to-Skill output quality verification"
echo " Target directory: $SKILLS_DIR"
echo "============================================"
echo ""

# ============================================
# 1. Structural completeness
# ============================================
echo "--- 1. Structural completeness ---"

# S1: is there a system-level Skill?
SYSTEM_SKILLS=$(find "$SKILLS_DIR" -maxdepth 1 -type d -name "*-system*" 2>/dev/null | wc -l)
if [ "$SYSTEM_SKILLS" -ge 1 ]; then
  pass "S1: system-level Skill present ($SYSTEM_SKILLS found)"
else
  warn "S1: no system-level Skill found (name containing '-system')"
fi

# S4: every Skill has a SKILL.md (-mindepth 1 excludes the root dir itself)
SKILL_DIRS=$(find "$SKILLS_DIR" -mindepth 1 -maxdepth 1 -type d 2>/dev/null)
MISSING_SKILLMD=0
for dir in $SKILL_DIRS; do
  if [ ! -f "$dir/SKILL.md" ]; then
    fail "S4: $dir is missing SKILL.md" "create SKILL.md in that directory"
    MISSING_SKILLMD=$((MISSING_SKILLMD + 1))
  fi
done
if [ "$MISSING_SKILLMD" -eq 0 ]; then
  pass "S4: every Skill directory has a SKILL.md"
fi

# S5: every Skill has a reference/
MISSING_REF=0
for dir in $SKILL_DIRS; do
  if [ ! -d "$dir/reference" ]; then
    warn "S5: $dir is missing a reference/ directory"
    MISSING_REF=$((MISSING_REF + 1))
  fi
done
if [ "$MISSING_REF" -eq 0 ]; then
  pass "S5: every Skill directory has a reference/"
fi

echo ""

# ============================================
# 2. Frontmatter conventions
# ============================================
echo "--- 2. Frontmatter conventions ---"

for dir in $SKILL_DIRS; do
  SKILL_FILE="$dir/SKILL.md"
  [ ! -f "$SKILL_FILE" ] && continue
  SKILL_NAME=$(basename "$dir")

  # F1: name field
  if head -20 "$SKILL_FILE" | grep -q '^name:'; then
    pass "F1 [$SKILL_NAME]: frontmatter contains name"
  else
    fail "F1 [$SKILL_NAME]: missing name field" "add a name field to the frontmatter"
  fi

  # F2: description field
  if head -20 "$SKILL_FILE" | grep -q '^description:'; then
    pass "F2 [$SKILL_NAME]: frontmatter contains description"
  else
    fail "F2 [$SKILL_NAME]: missing description field" "add a description field to the frontmatter"
  fi

  # C1: line count < 500
  LINE_COUNT=$(wc -l < "$SKILL_FILE")
  if [ "$LINE_COUNT" -lt 500 ]; then
    pass "C1 [$SKILL_NAME]: SKILL.md = $LINE_COUNT lines (< 500)"
  else
    fail "C1 [$SKILL_NAME]: SKILL.md = $LINE_COUNT lines (>= 500)" "move verbose content into reference/"
  fi
done

echo ""

# ============================================
# 3. Content quality
# ============================================
echo "--- 3. Content quality ---"

for dir in $SKILL_DIRS; do
  SKILL_FILE="$dir/SKILL.md"
  [ ! -f "$SKILL_FILE" ] && continue
  SKILL_NAME=$(basename "$dir")

  # C2-C5: required sections
  for section in "Plan" "Verify" "Execute" "Pitfalls"; do
    if grep -q "## $section" "$SKILL_FILE"; then
      pass "C [$SKILL_NAME]: contains ## $section"
    else
      warn "C [$SKILL_NAME]: missing ## $section section"
    fi
  done

  # C6: dynamic injection.
  # grep -c already prints 0 (and exits non-zero) on no match, so the old
  # `|| echo 0` produced "0\n0"; `|| true` keeps a single number.
  INJECT_COUNT=$(grep -c '!`' "$SKILL_FILE" 2>/dev/null || true)
  INJECT_COUNT=${INJECT_COUNT:-0}
  if [ "$INJECT_COUNT" -ge 2 ]; then
    pass "C6 [$SKILL_NAME]: $INJECT_COUNT dynamic injections (>= 2)"
  else
    warn "C6 [$SKILL_NAME]: only $INJECT_COUNT dynamic injections (>= 2 recommended)"
  fi

  # C7: Pitfalls cite reference
  REF_IN_PITFALLS=$(sed -n '/## Pitfalls/,/## /p' "$SKILL_FILE" | grep -c 'reference/' || true)
  REF_IN_PITFALLS=${REF_IN_PITFALLS:-0}
  if [ "$REF_IN_PITFALLS" -ge 2 ]; then
    pass "C7 [$SKILL_NAME]: Pitfalls cite reference $REF_IN_PITFALLS times (>= 2)"
  else
    warn "C7 [$SKILL_NAME]: Pitfalls cite reference only $REF_IN_PITFALLS times (>= 2 recommended)"
  fi
done

echo ""

# ============================================
# 4. Reference quality
# ============================================
echo "--- 4. Reference quality ---"

for dir in $SKILL_DIRS; do
  [ ! -d "$dir/reference" ] && continue
  SKILL_NAME=$(basename "$dir")

  # R2: section layering
  SECTION_DIRS=$(find "$dir/reference" -maxdepth 1 -type d -name '0*' 2>/dev/null | wc -l)
  if [ "$SECTION_DIRS" -ge 1 ]; then
    pass "R2 [$SKILL_NAME]: reference has $SECTION_DIRS section subdirectories"
  else
    warn "R2 [$SKILL_NAME]: reference has no section subdirectories (flat layout)"
  fi

  # R3: DDS traceability
  DDS_SECTION_COUNT=$(grep -r 'DDS-Section:' "$dir/reference/" 2>/dev/null | wc -l)
  if [ "$DDS_SECTION_COUNT" -ge 1 ]; then
    pass "R3 [$SKILL_NAME]: $DDS_SECTION_COUNT DDS-Section trace markers"
  else
    warn "R3 [$SKILL_NAME]: no DDS-Section trace markers"
  fi

  # R5: TBD markers (informational only; always reported with the count)
  TBD_COUNT=$(grep -r '\[TBD' "$dir/reference/" 2>/dev/null | wc -l)
  pass "R5 [$SKILL_NAME]: $TBD_COUNT [TBD] markers"

  # R1: number of design-element files
  REF_FILES=$(find "$dir/reference" -name "*.md" 2>/dev/null | wc -l)
  if [ "$REF_FILES" -ge 3 ]; then
    pass "R1 [$SKILL_NAME]: $REF_FILES reference files (>= 3)"
  else
    warn "R1 [$SKILL_NAME]: only $REF_FILES reference files (>= 3 recommended)"
  fi
done

echo ""

# ============================================
# Summary
# ============================================
echo "============================================"
echo " Verification complete"
echo " ✅ PASS: $PASS"
echo " ❌ FAIL: $FAIL"
echo " ⚠️ WARN: $WARN"
echo "============================================"

if [ "$FAIL" -gt 0 ]; then
  exit 1
else
  exit 0
fi
.agents/skills/dds-to-skill/scripts/verify.sh (new file, 60 lines)
@@ -0,0 +1,60 @@
#!/bin/bash
# verify.sh - structural self-verification for the dds-to-skill Skill itself
#
# Verifies this Skill's file structure and content completeness
# Usage: cd dds-to-skill && ./scripts/verify.sh

set -e

PASS=0; FAIL=0

# POSIX arithmetic instead of ((PASS++)): under `set -e`, ((PASS++)) returns
# a non-zero status while the counter is 0 and would abort the script.
check() {
  if eval "$2"; then
    echo "✅ PASS: $1"; PASS=$((PASS + 1))
  else
    echo "❌ FAIL: $1"; FAIL=$((FAIL + 1))
  fi
}

SKILL_DIR="$(cd "$(dirname "$0")/.." && pwd)"

echo "=== dds-to-skill Skill self-check ==="
echo "Directory: $SKILL_DIR"
echo ""

# Structure checks
check "SKILL.md exists" "test -f '$SKILL_DIR/SKILL.md'"
check "reference/ directory exists" "test -d '$SKILL_DIR/reference'"
check "examples/ directory exists" "test -d '$SKILL_DIR/examples'"
check "scripts/ directory exists" "test -d '$SKILL_DIR/scripts'"

# SKILL.md content checks
check "SKILL.md < 500 lines" "[ \$(wc -l < '$SKILL_DIR/SKILL.md') -lt 500 ]"
check "contains name field" "head -20 '$SKILL_DIR/SKILL.md' | grep -q '^name:'"
check "contains description field" "head -20 '$SKILL_DIR/SKILL.md' | grep -q '^description:'"
check "contains argument-hint" "head -20 '$SKILL_DIR/SKILL.md' | grep -q 'argument-hint:'"

# Phase structure checks
check "contains Phase 0 (read)" "grep -q 'Phase 0' '$SKILL_DIR/SKILL.md'"
check "contains Phase 1 (analyze)" "grep -q 'Phase 1' '$SKILL_DIR/SKILL.md'"
check "contains Phase 2 (extract)" "grep -q 'Phase 2' '$SKILL_DIR/SKILL.md'"
check "contains Phase 3 (generate)" "grep -q 'Phase 3' '$SKILL_DIR/SKILL.md'"
check "contains Phase 4 (support files)" "grep -q 'Phase 4' '$SKILL_DIR/SKILL.md'"
check "contains Phase 5 (self-check)" "grep -q 'Phase 5' '$SKILL_DIR/SKILL.md'"

# Reference file checks
check "dds-extraction-guide.md exists" "test -f '$SKILL_DIR/reference/dds-extraction-guide.md'"
check "skill-templates.md exists" "test -f '$SKILL_DIR/reference/skill-templates.md'"
check "frontmatter-spec.md exists" "test -f '$SKILL_DIR/reference/frontmatter-spec.md'"
check "quality-checklist.md exists" "test -f '$SKILL_DIR/reference/quality-checklist.md'"

# Examples check
check "at least 1 conversion example" "find '$SKILL_DIR/examples' -name '*.md' | grep -q ."

# Dynamic-injection check.
# grep -c prints 0 and exits non-zero on no match, so `|| true` (rather than
# `|| echo 0`, which yielded "0\n0") keeps the output to a single number.
INJECT_COUNT=$(grep -c '!\`' "$SKILL_DIR/SKILL.md" 2>/dev/null || true)
INJECT_COUNT=${INJECT_COUNT:-0}
check "SKILL.md has dynamic injection (>= 2)" "[ $INJECT_COUNT -ge 2 ]"

echo ""
echo "=== Result: $PASS PASS / $FAIL FAIL ==="
[ $FAIL -eq 0 ] && exit 0 || exit 1
.agents/skills/developing-projectmoneyx/SKILL.md (new file, 267 lines)
@@ -0,0 +1,267 @@
---
name: developing-projectmoneyx
description: >
  指导 ProjectMoneyX 多源账单数据治理系统的全栈开发(Guides full-stack development of ProjectMoneyX bill data governance system)。
  包含:ETL Pipeline 编排(Parse → Normalize → Dedup → Link → Rule → Export)、插件化解析器对接、三层去重策略、规则引擎映射、Firefly III 适配、SQLite 数据模型、审计追溯。
  触发场景 Trigger: 开发/修改 ProjectMoneyX 的 Parser / Pipeline / 去重 / 规则 / 导入导出 / 审计 / 前端页面 / API 接口。
  关键词 Keywords: ProjectMoneyX, 账单, bill, ETL, parser, dedup, 去重, 链路合并, transfer link, rule engine, 规则引擎, Firefly III, 导入, import, export, audit, 审计, SQLite, GORM, GIN, Vue3, Vuetify。
argument-hint: >
  "<action> <target>" e.g.:
  "add parser for ccb", "implement dedup scorer", "create rule handler",
  "update transaction schema", "build import preview page"
allowed-tools:
  - Read
  - Write
  - Edit
  - Glob
  - Grep
  - Bash
---

# Developing ProjectMoneyX

ProjectMoneyX is a **local multi-source bill data governance middleware** for the Firefly III ecosystem, built on Go (GIN + GORM) + Vue3 (TypeScript + Vuetify) + SQLite. Its core is one ETL pipeline, `Parse → Normalize → Dedup → Link → Rule → Export`, which standardizes Alipay/WeChat/bank statements and pushes them to Firefly III.

> **Architecture keywords**: DDD layering · pluggable Adapters · three-tier dedup · explainable rules · end-to-end auditing

## Quick Context

```bash
# Dynamic injection: backend project layout
!`find projectmoneyx-server/internal -type f -name "*.go" | head -40`

# Dynamic injection: frontend project layout
!`find projectmoneyx-web/src -type f -name "*.ts" -o -name "*.vue" | head -30`

# Dynamic injection: database table definitions
!`grep -rn "TableName\|func.*TableName" projectmoneyx-server/internal/ | head -20`

# Dynamic injection: API route registration
!`grep -rn "Group\|GET\|POST\|PUT\|DELETE" projectmoneyx-server/internal/handler/ | head -30`
```

## Architecture Overview

```
┌──────────────────────────────────────────────────────────┐
│ Presentation: Vue3 + TypeScript + Vuetify                │
│ import center / cleansing preview / dedup review /       │
│ rule management / import jobs / audit                    │
├──────────────────────────────────────────────────────────┤
│ API layer: GIN RESTful API (/api/v1/*)                   │
│ import / transactions / dedup / rules / export / audit   │
├──────────────────────────────────────────────────────────┤
│ Application services: pipeline orchestration             │
│ ImportBatchService → PipelineService                     │
├──────────────────────────────────────────────────────────┤
│ Business logic (ETL core domain)                         │
│ Parser(plugin) → Normalize → Match → Link → Rule → Export│
├──────────────────────────────────────────────────────────┤
│ Persistence: GORM + SQLite (WAL)                         │
│ 11 core tables, per-stage transactions                   │
└──────────────────────────────────────────────────────────┘
```

**Layering rule**: handler → service → domain (entity/repository) ← dao. Parser/Matcher/Linker/Rule/Exporter are independently testable components.

## Module Registry

| Module | Package path | Responsibility | Priority |
|--------|--------------|----------------|----------|
| Import center | `handler/import` + `service/import_batch` | File upload, batch management | P0 |
| Parser engine | `parser/` | Pluggable per-platform parsers | P0 |
| Normalize engine | `normalize/` | Heterogeneous fields → unified Transaction model | P0 |
| Dedup engine | `matcher/` | Strict dedup + fuzzy dedup (P1) | P0/P1 |
| Link engine | `linker/` | Transfer-loop and order-chain merging | P0 |
| Rule engine | `rule/` | 6 rule types executed in order | P0/P1 |
| Export engine | `exporter/` | Firefly API / CSV export | P0 |
| Audit center | `service/audit` | End-to-end traceability | P0 |
| System settings | `handler/settings` + `config/` | Firefly connection, threshold parameters | P1 |

## Plan

### Deliverables

| Action | Deliverable |
|--------|-------------|
| `add parser` | `parser/<platform>/<platform>_parser.go`, implementing the `BillParser` interface |
| `create handler` | `handler/<resource>_handler.go`, a GIN handler |
| `create service` | `service/<resource>_service.go`, an application service |
| `create dao` | `dao/<resource>_dao.go`, GORM data access |
| `create entity` | `domain/entity/<resource>.go`, a domain entity |
| `add rule type` | `rule/<type>_mapper.go`, a rule mapper |
| `scaffold module` | All of the above + DTO + repository interface |

### Decision Points

1. **Parser selection**: first check the parsers registered in `parser/registry.go` to see whether the target platform already has an implementation
2. **Dedup tier**: strict dedup (P0) vs fuzzy dedup (P1); new features default to strict dedup only
3. **Rule execution order**: the fixed 6-step order must be respected (`reference/04-rule-engine/rule-execution.md`)
4. **Transaction boundaries**: each ETL stage runs in its own transaction; no long transactions spanning stages
5. **SQLite constraints**: single write connection (`MaxOpenConns=1`), WAL mode enabled

---

## Execute

### 1. Adding a platform parser

```go
// 1. Implement the BillParser interface (parser/<platform>/<platform>_parser.go)
type Parser struct{}

func (p *Parser) Platform() string { return "<platform>" }

func (p *Parser) Detect(meta FileMeta, header []string) bool {
	// Decide based on filename / header-row signature
}

func (p *Parser) Parse(ctx context.Context, reader io.Reader) ([]RawBillRecord, error) {
	// Read line by line → populate RawBillRecord.RawFields
	// Must set SourcePlatform, SourceRecordID, RowNo, RowFingerprint
}

// 2. Register it in the Registry (parser/registry.go)
r.Register(&<platform>.Parser{})
```

**Field-mapping requirements** (see `reference/02-parser-engine/field-mappings.md`):
- `trade_time`: normalized to UTC+8, `time.Time`
- `amount`: currency symbols stripped, positive `decimal(18,6)`
- `direction`: `income` / `expense` / `transfer` / `refund` / `fee` / `other`
- `category_raw`: keep the original category; no mapping inside the Parser
- `order_id`: whitespace stripped, used as the unique identifier

### 2. Developing an ETL pipeline stage

Every stage must:
1. Accept a `context.Context` + a slice of data
2. Return the processed slice + an error
3. Persist within its own transaction (`db.Transaction`, `CreateInBatches` of 500 rows per batch)
4. Update the batch status

```go
// Stage signature pattern
func (s *StageService) Execute(ctx context.Context, txns []*Transaction) ([]*Transaction, error)
```
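The per-stage transaction rule above can be modeled language-agnostically. This toy Python orchestration (stage names and statuses taken from this document, everything else assumed) shows each stage committing independently and advancing the batch status:

```python
# A toy orchestration model: each stage runs in its own "transaction"
# and advances the batch status, mirroring the per-stage rule above.
STAGES = [
    ("Parse", "PARSED"),
    ("Normalize", "NORMALIZED"),
    ("Match", "MATCHED"),
    ("Rule", "PREVIEW_READY"),
]

def run_pipeline(records, stage_impls, set_status):
    """stage_impls maps a stage name to a function records -> records."""
    for name, done_status in STAGES:
        # Independent transaction boundary per stage (no cross-stage locks)
        records = stage_impls[name](records)
        set_status(done_status)
    return records

statuses = []
out = run_pipeline(
    [{"amount": "12.00"}],
    {name: (lambda rs: rs) for name, _ in STAGES},  # identity stages
    statuses.append,
)
assert statuses[-1] == "PREVIEW_READY"
```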
### 3. Extending the rule engine

When adding a rule type:
1. Confirm its position in `executionOrder` in `rule/engine.go`
2. Implement the `MatchConditions(txn)` and `ApplyActions(txn)` methods
3. Make sure every hit is logged as a `RuleHit` (including `BeforeValue` / `AfterValue`)
4. Rule conditions are stored as JSON; see `reference/04-rule-engine/rule-conditions.md`

### 4. API development

Follow the unified response format:
```go
type Response struct {
	Code    int         `json:"code"` // 0 = success
	Message string      `json:"message"`
	Data    interface{} `json:"data"`
}
```

Route groups: `/api/v1/import/*`, `/api/v1/transactions/*`, `/api/v1/dedup/*`, `/api/v1/rules/*`, `/api/v1/export/*`, `/api/v1/audit/*`, `/api/v1/settings/*`

### 5. Frontend page development

7 core pages, all Vue3 + Composition API + TypeScript:
- Import center: `FileUploader.vue` + drag-and-drop upload + progress bar
- Cleansing preview: `TransactionTable.vue` + `v-data-table` + row-expansion diff
- Dedup review: `DedupCompare.vue` + side-by-side panes + expandable score factors
- Rule management: `RuleEditor.vue` + condition builder + test preview
- Import jobs: statistics overview + failure list + single/bulk retry
- Audit trace: `AuditTimeline.vue` + `v-timeline` + expandable snapshots
- System settings: Firefly connection config + connection test + dedup parameter config

---

## Verify

### Layering checks
- [ ] The handler layer contains no business logic: parameter binding + service call + response only
- [ ] The service layer never touches `*gorm.DB` directly; data goes through repository interfaces
- [ ] domain/entity does not depend on handler/service
- [ ] No dependency cycles (handler → service → domain ← dao)

### Parser checks
- [ ] A new Parser implements all 3 methods of the `BillParser` interface (`Platform()`, `Detect()`, `Parse()`)
- [ ] It is registered in `parser/registry.go` (`reference/02-parser-engine/parser-interface.md`)
- [ ] The field mapping covers every raw field (check against `reference/02-parser-engine/field-mappings.md`)
- [ ] `amount` is positive; `direction` expresses the income/expense side separately
- [ ] `RowFingerprint` is generated with SHA256 (`reference/03-dedup-engine/fingerprint.md`)

### Dedup and link checks
- [ ] Strict-dedup keys are evaluated in their 3-level priority order (`reference/03-dedup-engine/strict-dedup.md`)
- [ ] Fuzzy dedup scores 6 factors with configurable thresholds (`reference/03-dedup-engine/fuzzy-dedup.md`)
- [ ] A transfer loop matches only when all 5 conditions hold (`reference/03-dedup-engine/transfer-link.md`)
- [ ] Suspected duplicates (score 60-84) enter the `PENDING_REVIEW` manual-confirmation queue
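The five transfer-loop conditions in the checklist above can be sketched as a single predicate. The field names and the time-window value below are illustrative; the authoritative definition is `reference/03-dedup-engine/transfer-link.md`:

```python
from datetime import datetime

WINDOW_SECONDS = 300  # illustrative window; the real value is configurable

def is_transfer_loop(a: dict, b: dict) -> bool:
    """All 5 conditions must hold; dropping any one causes mis-merges."""
    same_amount    = a["amount"] == b["amount"]
    complementary  = {a["direction"], b["direction"]} == {"income", "expense"}
    in_window      = abs((a["time"] - b["time"]).total_seconds()) <= WINDOW_SECONDS
    cross_platform = a["platform"] != b["platform"]
    # refund/fee legs must never join a transfer loop (Pitfall 5)
    clean_kinds    = not {a["direction"], b["direction"]} & {"refund", "fee"}
    return same_amount and complementary and in_window and cross_platform and clean_kinds

out_leg = {"amount": 100, "direction": "expense", "platform": "alipay",
           "time": datetime(2024, 1, 1, 12, 0, 0)}
in_leg  = {"amount": 100, "direction": "income", "platform": "ccb",
           "time": datetime(2024, 1, 1, 12, 1, 30)}
assert is_transfer_loop(out_leg, in_leg)
assert not is_transfer_loop(out_leg, dict(in_leg, direction="refund"))
```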
### Rule-engine checks
- [ ] The 6 rule types execute in the fixed order: counterparty normalization → merchant normalization → category → account → tags → Firefly (`reference/04-rule-engine/rule-execution.md`)
- [ ] Within a type, rules execute in ascending `priority`, stopping at the first hit
- [ ] Every hit records a `RuleHit` with `BeforeValue` / `AfterValue`
- [ ] Rule-condition JSON is well-formed (`reference/04-rule-engine/rule-conditions.md`)

### Database checks
- [ ] All 11 tables are present (`reference/05-database/db-schema.md`)
- [ ] The key indexes are created (`reference/05-database/indexes.md`)
- [ ] SQLite configuration: `MaxOpenConns=1`, WAL mode, `cache_size=-64000`
- [ ] Each ETL stage uses its own transaction; `CreateInBatches` with batches of 500

### API checks
- [ ] Route paths follow `reference/06-api-design/api-catalog.md`
- [ ] The unified `Response` / `PageResponse` structures are used
- [ ] All 6 pre-import validations are in place (`reference/07-export-engine/import-validation.md`)

### Frontend checks
- [ ] Every page uses `<script setup lang="ts">` + Composition API
- [ ] Data tables use `v-data-table` + `fixed-header`
- [ ] All three UI states are handled: loading (skeleton), empty (empty-state), error (snackbar + retry)
- [ ] Route definitions match `reference/08-frontend/routes.md`

---

## Pitfalls

1. **Alipay categories are the global base dictionary**: the system uses Alipay's 22 categories as the unified standard. WeChat/bank platforms must map into this enum; do not invent new category value systems inside a Parser (see the category enum table in `reference/02-parser-engine/field-mappings.md`)

2. **WeChat categories need joint "transaction type + goods" inference**: WeChat's "transaction type" is a payment action (merchant purchase / transfer / red packet), not a spending semantic. Combine it with the "goods" field for keyword inference; anything unmatched falls into "other" and is logged. Never use the WeChat transaction type directly as a category (see the WeChat inference flowchart in `reference/02-parser-engine/field-mappings.md`)

3. **The rule execution order is immutable**: the 6 rule types run in a fixed order (counterparty normalization → merchant normalization → category → account → tags → Firefly); normalizing before categorizing raises the hit rate. Reordering breaks the dependencies between rules (see `reference/04-rule-engine/rule-execution.md`)

4. **SQLite single write connection**: `MaxOpenConns` must be 1, since SQLite does not support concurrent writers. The ETL pipeline already avoids long locks via per-stage transactions. On a `database is locked` error, look for an unclosed transaction (see the SQLite configuration in `reference/05-database/db-schema.md`)

5. **Transfer loops require all 5 conditions**: equal amount + complementary directions + within the time window + different platforms + neither refund nor fee. Dropping any one causes mis-merges. In particular, refund transactions (direction=refund) must never join a transfer loop (see `reference/03-dedup-engine/transfer-link.md`)

6. **The row fingerprint is minute-granular**: `GenerateRowFingerprint` uses the `2006-01-02 15:04` format (minute precision), not seconds, to tolerate second-level time differences between platforms for the same transaction (see `reference/03-dedup-engine/fingerprint.md`)

7. **The batch state machine is strictly controlled**: batch status may only advance along the linear path `CREATED → UPLOADED → PARSING → ... → IMPORT_SUCCESS`, never skipping states. On failure it may only roll back to the previous retryable state (see `reference/01-architecture/batch-state-machine.md`)
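Pitfall 6's minute-granular fingerprint can be sketched as follows. The field order and separator are assumptions; the authoritative algorithm is `reference/03-dedup-engine/fingerprint.md`:

```python
import hashlib
from datetime import datetime

def row_fingerprint(platform: str, amount: str, trade_time: datetime) -> str:
    # Truncate to minute precision (Go layout "2006-01-02 15:04") so that
    # second-level differences between platforms still collide on purpose.
    minute = trade_time.strftime("%Y-%m-%d %H:%M")
    payload = "|".join([platform, amount, minute])  # separator is illustrative
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

a = row_fingerprint("alipay", "12.50", datetime(2024, 5, 1, 9, 30, 12))
b = row_fingerprint("alipay", "12.50", datetime(2024, 5, 1, 9, 30, 47))
assert a == b        # same minute → same fingerprint
assert len(a) == 64  # SHA256 hex digest
```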
---

## Related References

| To learn about... | Read... |
|-------------------|---------|
| System architecture and module dependencies | `reference/01-architecture/system-overview.md` |
| Batch state machine definition | `reference/01-architecture/batch-state-machine.md` |
| Parser interface and registration | `reference/02-parser-engine/parser-interface.md` |
| Platform field-mapping rules | `reference/02-parser-engine/field-mappings.md` |
| Strict-dedup keys | `reference/03-dedup-engine/strict-dedup.md` |
| Fuzzy-dedup scoring model | `reference/03-dedup-engine/fuzzy-dedup.md` |
| Transfer-loop matching rules | `reference/03-dedup-engine/transfer-link.md` |
| Row-fingerprint algorithm | `reference/03-dedup-engine/fingerprint.md` |
| Rule-condition JSON format | `reference/04-rule-engine/rule-conditions.md` |
| Rule execution order and explainability | `reference/04-rule-engine/rule-execution.md` |
| The 11 database tables | `reference/05-database/db-schema.md` |
| Key index design | `reference/05-database/indexes.md` |
| API catalog | `reference/06-api-design/api-catalog.md` |
| Unified responses and error codes | `reference/06-api-design/response-format.md` |
| Firefly export adaptation | `reference/07-export-engine/firefly-mapping.md` |
| Pre-import validation checklist | `reference/07-export-engine/import-validation.md` |
| Frontend routes and pages | `reference/08-frontend/routes.md` |
| Frontend component interaction | `reference/08-frontend/components.md` |
| Non-functional design requirements | `reference/09-nonfunctional/performance.md` |
| Deployment and security | `reference/09-nonfunctional/deployment.md` |
@@ -0,0 +1,57 @@
## Batch State Machine

- **DDS-Section**: 5.2 批次状态机
- **DDS-Lines**: L362-L387

### Extract

#### Status enum

| Status | Meaning | Notes |
|--------|---------|-------|
| `CREATED` | Batch created | Initial state |
| `UPLOADED` | File upload complete | |
| `PARSING` | Parsing | |
| `PARSED` | Parsing complete | |
| `NORMALIZING` | Normalizing | |
| `NORMALIZED` | Normalization complete | |
| `MATCHING` | Dedup / link merging | |
| `MATCHED` | Dedup / link complete | |
| `RULE_APPLYING` | Applying rules | |
| `PREVIEW_READY` | Rule mapping complete, previewable | The user can inspect the preview and manually confirm dedup here |
| `IMPORTING` | Importing | Triggered after user confirmation |
| `IMPORT_SUCCESS` | Fully imported | Terminal state |
| `PARTIAL_FAILED` | Partially failed | Retryable |
| `IMPORT_FAILED` | Fully failed | Retryable |
| `RETRYING` | Retrying | |

#### Transition rules

```
[*] → CREATED: create batch
CREATED → UPLOADED: file upload complete
UPLOADED → PARSING: parsing triggered
PARSING → PARSED: parsing complete
PARSING → UPLOADED: parsing failed (rollback)
PARSED → NORMALIZING: normalization triggered
NORMALIZING → NORMALIZED: normalization complete
NORMALIZED → MATCHING: dedup/link triggered
MATCHING → MATCHED: dedup/link complete
MATCHED → RULE_APPLYING: rule mapping triggered
RULE_APPLYING → PREVIEW_READY: rule mapping complete
PREVIEW_READY → IMPORTING: user confirms import
IMPORTING → IMPORT_SUCCESS: all succeeded
IMPORTING → PARTIAL_FAILED: some failed
IMPORTING → IMPORT_FAILED: all failed
PARTIAL_FAILED → RETRYING: user retries
IMPORT_FAILED → RETRYING: user retries
RETRYING → IMPORT_SUCCESS: retry succeeded
RETRYING → PARTIAL_FAILED: failures remain
```

#### Implementation notes

- On parsing failure the status rolls back to `UPLOADED` and the error is recorded
- `PREVIEW_READY` is the user-interaction point: preview inspection and manual dedup confirmation happen here
- Failed states support retry without redoing the whole batch
- Every status change must be written to the `audit_logs` table
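The transition rules above translate directly into a lookup table. A sketch of guarded transitions (the table transcribes the rules; the guard function itself is illustrative):

```python
# Allowed transitions, transcribed from the rules above.
TRANSITIONS = {
    "CREATED": {"UPLOADED"},
    "UPLOADED": {"PARSING"},
    "PARSING": {"PARSED", "UPLOADED"},        # rollback on parse failure
    "PARSED": {"NORMALIZING"},
    "NORMALIZING": {"NORMALIZED"},
    "NORMALIZED": {"MATCHING"},
    "MATCHING": {"MATCHED"},
    "MATCHED": {"RULE_APPLYING"},
    "RULE_APPLYING": {"PREVIEW_READY"},
    "PREVIEW_READY": {"IMPORTING"},
    "IMPORTING": {"IMPORT_SUCCESS", "PARTIAL_FAILED", "IMPORT_FAILED"},
    "PARTIAL_FAILED": {"RETRYING"},
    "IMPORT_FAILED": {"RETRYING"},
    "RETRYING": {"IMPORT_SUCCESS", "PARTIAL_FAILED"},
}

def advance(current: str, target: str) -> str:
    """Reject any transition not in the table (no state skipping)."""
    if target not in TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal transition {current} -> {target}")
    # An audit_logs write would happen here in the real implementation.
    return target

assert advance("PREVIEW_READY", "IMPORTING") == "IMPORTING"
try:
    advance("CREATED", "IMPORTING")   # skipping states is forbidden
except ValueError:
    pass
```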
|
||||
@@ -0,0 +1,113 @@
|
||||
## 系统架构总览
|
||||
|
||||
- **DDS-Section**: 3. 总体架构设计
|
||||
- **DDS-Lines**: L77-L164
|
||||
|
||||
### Extract
|
||||
|
||||
#### 分层架构
|
||||
|
||||
| 层级 | 技术选型 | 核心职责 |
|------|----------|----------|
| **展现层** | Vue3 + TS + Vuetify | 上传文件、批次管理、预览确认、规则配置、人工确认、导入结果展示 |
| **接入层** | GIN Framework | RESTful 接口,统一参数校验、错误处理、响应封装 |
| **应用服务层** | Go Service | 编排完整业务流程(ETL Pipeline),不承载具体解析规则 |
| **Adapter 层** | Go Plugin Interface | 按平台解析原始文件,输出平台原始记录 DTO |
| **Normalize 层** | Go Service | 统一字段、金额、方向、时间、分类原始值 |
| **Match 层** | Go Service | 严格去重、模糊去重(多因子评分) |
| **Link 层** | Go Service | 转账闭环、订单链路聚合 |
| **Rule 层** | Go Service | 分类、账户、对手方、标签映射 |
| **Export 层** | Go Service | 适配 Firefly III / Data Importer API 或 CSV/JSON |
| **Repository 层** | GORM | 隔离数据库访问,面向领域对象持久化 |
| **数据持久层** | SQLite | 本地数据库,存储全量数据与审计链路 |

#### 数据流转拓扑

```
多源账单导入 → 解析器(Parser) → 标准化入库 → 去重与链路合并 → 规则映射 → 导出/推送
 (文件上传)    (Adapter层)      (SQLite)     (Dedup+Link)   (Rule Engine)  (API/CSV)
```

#### 后端包结构

```
projectmoneyx-server/
├── cmd/server/main.go        # 程序入口
├── internal/
│   ├── config/               # 配置管理
│   ├── handler/              # GIN Handler(接入层)
│   ├── middleware/           # 中间件
│   ├── service/              # 应用服务层
│   ├── domain/               # 领域层
│   │   ├── entity/           # 领域实体
│   │   ├── valueobject/      # 值对象
│   │   └── repository/       # 仓储接口
│   ├── parser/               # Adapter 解析层(插件化)
│   ├── normalize/            # 标准化引擎
│   ├── matcher/              # 去重引擎
│   ├── linker/               # 链路合并引擎
│   ├── rule/                 # 规则引擎
│   ├── exporter/             # 导出引擎
│   ├── dao/                  # 数据访问层(GORM 实现)
│   └── dto/                  # 数据传输对象
├── migrations/               # 数据库迁移
├── web/                      # 前端打包产物
└── go.mod
```

#### 前端项目结构

```
projectmoneyx-web/
├── src/
│   ├── api/                  # API 调用封装
│   ├── views/                # 页面视图 (8 个核心页面)
│   ├── components/           # 可复用组件
│   ├── stores/               # Pinia 状态管理
│   ├── types/                # TypeScript 类型定义
│   ├── router/               # Vue Router
│   ├── plugins/              # Vuetify 等插件
│   ├── App.vue
│   └── main.ts
├── package.json
├── tsconfig.json
└── vite.config.ts
```

#### 模块清单

| 模块 | 包名 | 职责 | 优先级 |
|------|------|------|--------|
| 导入中心 | `import-center` | 文件上传、批次管理、来源识别 | P0 |
| 解析引擎 | `parser-engine` | 平台解析器注册、装载与执行 | P0 |
| 标准化引擎 | `normalize-engine` | 统一模型转换 | P0 |
| 去重引擎 | `dedup-engine` | 严格去重与模糊去重 | P0/P1 |
| 链路引擎 | `link-engine` | 转账闭环与订单链路合并 | P0 |
| 规则引擎 | `rule-engine` | 分类/账户/标签/商户归一化 | P0/P1 |
| 导入编排 | `import-orchestrator` | 导入预览、执行、重试 | P0 |
| 审计中心 | `audit-center` | 审计日志、处理链追溯 | P0 |
| 系统设置 | `settings-center` | Firefly 配置、阈值参数 | P1 |

#### 核心 Service 清单

```go
type ImportBatchService struct{ ... }          // 批次管理
type PipelineService struct{ ... }             // ETL 流水线编排
type ParserRegistry struct{ ... }              // 解析器注册中心
type TransactionNormalizeService struct{ ... } // 标准化服务
type DedupMatchService struct{ ... }           // 去重匹配服务
type TransferLinkService struct{ ... }         // 转账链路合并服务
type RuleApplyService struct{ ... }            // 规则应用服务
type FireflyExportService struct{ ... }        // Firefly 导出服务
type AuditTraceService struct{ ... }           // 审计追溯服务
```

#### 关键设计约束

| # | 约束 | 说明 |
|---|------|------|
| 1 | 本地优先 | 财务数据敏感,必须本地部署 |
| 2 | 插件化解析器 | 平台格式变化频繁,适配逻辑隔离在 Adapter 层 |
| 3 | 统一交易模型稳定 | 避免下游 Firefly 或上游平台格式污染核心域模型 |
| 4 | 支付宝分类为标准 | 支付宝 22 种分类最丰富,其他平台映射到此体系 |
| 5 | 微信分类需推断 | 微信"交易类型"粗粒度,结合"商品"字段推断分类 |

## 平台字段映射规则

- **DDS-Section**: 6.3 支付宝解析规则 + 6.4 微信解析规则 + 6.5 统一交易模型
- **DDS-Lines**: L500-L668

### Extract

#### 支付宝原始字段

```
交易时间 | 交易分类 | 交易对方 | 对方账号 | 商品说明 | 收/支 | 金额 | 收/付款方式 | 交易状态 | 交易订单号 | 商家订单号 | 备注
```

#### 支付宝字段映射表

| 原字段 | 目标字段 | 映射说明 |
|--------|----------|----------|
| 交易时间 | `trade_time` | 解析为 `time.Time`,统一 UTC+8 |
| 交易分类 | `category_raw` | 直接作为原始分类(22 种标准分类) |
| 交易对方 | `counterparty` | 交易对手名称 |
| 对方账号 | `counterparty_account` | 存入扩展字段 |
| 商品说明 | `merchant_name` / `note` | 优先作为商户名,辅助作为备注 |
| 收/支 | `direction` | "收入" → income, "支出" → expense, "其他" → other |
| 金额 | `amount` | 去除 ¥ 符号,解析为 Decimal 正数 |
| 收/付款方式 | `payment_method` | 存入扩展字段,可用于账户映射 |
| 交易状态 | `trade_status` | "交易成功" / "退款成功" 等 |
| 交易订单号 | `source_record_id` / `order_id` | 去除空格,作为唯一标识 |
| 商家订单号 | `merchant_order_id` / `parent_order_id` | 可用于链路关联 |
| 备注 | `note` | 补充备注 |

#### 支付宝标准分类枚举(22 类)— 全局统一基准

| 编号 | 分类名称 | 编号 | 分类名称 |
|------|----------|------|----------|
| 1 | 餐饮美食 | 12 | 退款 |
| 2 | 投资理财 | 13 | 教育培训 |
| 3 | 日用百货 | 14 | 住房物业 |
| 4 | 数码电器 | 15 | 酒店旅游 |
| 5 | 交通出行 | 16 | 文化休闲 |
| 6 | 充值缴费 | 17 | 运动户外 |
| 7 | 信用借还 | 18 | 爱车养车 |
| 8 | 转账红包 | 19 | 商业服务 |
| 9 | 生活服务 | 20 | 母婴亲子 |
| 10 | 家居家装 | 21 | 收入 |
| 11 | 医疗健康 | 22 | 其他 |

> **关键设计决策**:支付宝拥有最丰富的 22 种交易分类,系统将其作为**全局统一分类基准字典**。所有其他平台的交易分类最终都应映射到此套分类枚举。

#### 微信原始字段

```
交易时间 | 交易类型 | 交易对方 | 商品 | 收/支 | 金额(元) | 支付方式 | 当前状态 | 交易单号 | 商户单号 | 备注
```

#### 微信字段映射表

| 原字段 | 目标字段 | 映射说明 |
|--------|----------|----------|
| 交易时间 | `trade_time` | 解析为 `time.Time`,统一 UTC+8 |
| 交易类型 | `category_raw` | 存储原始类型,需通过推断映射到标准分类 |
| 交易对方 | `counterparty` | 交易对手名称 |
| 商品 | `merchant_name` / `product_desc` | **关键字段**,用于推断实际消费分类 |
| 收/支 | `direction` | "收入" → income, "支出" → expense |
| 金额(元) | `amount` | 去除 ¥ 符号,解析为 Decimal 正数 |
| 支付方式 | `payment_method` | 存入扩展字段 |
| 当前状态 | `trade_status` | "支付成功" / "已退款" 等 |
| 交易单号 | `source_record_id` / `order_id` | 去除空格 |
| 商户单号 | `merchant_order_id` / `parent_order_id` | 链路关联 |
| 备注 | `note` | 补充备注 |

#### 微信分类推断规则

微信"交易类型"多为支付动作(商户消费、扫二维码付款、转账、红包等),无实际消费语义。推断流程:

1. **交易类型 = 转账** → 转账红包
2. **交易类型 = 微信红包** → 转账红包
3. **交易类型含 "退款"** → 退款
4. **交易类型 = 商户消费/扫二维码付款** → 基于"商品"字段关键词推断:

| 微信交易类型 | 商品关键词 | 推断标准分类 |
|-------------|-----------|-------------|
| 商户消费 | 美团 / 外卖 / 餐厅 / 咖啡 / 面包 | 餐饮美食 |
| 商户消费 | 滴滴 / 打车 / 地铁 / 高铁 / 加油 | 交通出行 |
| 商户消费 | 京东 / 超市 / 便利店 / 百货 | 日用百货 |
| 商户消费 | 电费 / 水费 / 话费 / 燃气 | 充值缴费 |
| 商户消费 | 医院 / 药店 / 体检 | 医疗健康 |
| 商户消费 | 电影 / 游戏 / 书籍 | 文化休闲 |
| 商户消费 | 酒店 / 景点 / 旅行 | 酒店旅游 |
| 商户消费 | 无法识别 | 其他(待人工补充) |

> **设计原则**:若商品内容无法识别关键词,先落入"其他"分类,并允许用户通过规则管理补充映射。系统应记录未命中规则的记录,便于后续规则完善。

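上述推断流程可以整理成一张有序的"关键词 → 标准分类"规则表,按序匹配、首条命中即返回。以下为一个可独立运行的示意(`InferWechatCategory` 为示例函数名,关键词取自上表):

```go
package main

import (
	"fmt"
	"strings"
)

// keywordCategoryRules 按上文推断表整理的关键词 → 标准分类映射(顺序即匹配优先级)
var keywordCategoryRules = []struct {
	Keywords []string
	Category string
}{
	{[]string{"美团", "外卖", "餐厅", "咖啡", "面包"}, "餐饮美食"},
	{[]string{"滴滴", "打车", "地铁", "高铁", "加油"}, "交通出行"},
	{[]string{"京东", "超市", "便利店", "百货"}, "日用百货"},
	{[]string{"电费", "水费", "话费", "燃气"}, "充值缴费"},
	{[]string{"医院", "药店", "体检"}, "医疗健康"},
	{[]string{"电影", "游戏", "书籍"}, "文化休闲"},
	{[]string{"酒店", "景点", "旅行"}, "酒店旅游"},
}

// InferWechatCategory 依据微信"交易类型"与"商品"字段推断标准分类
func InferWechatCategory(tradeType, product string) string {
	switch {
	case tradeType == "转账", tradeType == "微信红包":
		return "转账红包"
	case strings.Contains(tradeType, "退款"):
		return "退款"
	}
	for _, rule := range keywordCategoryRules {
		for _, kw := range rule.Keywords {
			if strings.Contains(product, kw) {
				return rule.Category
			}
		}
	}
	return "其他" // 未命中关键词,留待人工补充规则
}

func main() {
	fmt.Println(InferWechatCategory("商户消费", "美团外卖订单"))
}
```

未命中时返回"其他",同时应按上文设计原则把该记录写入"未命中规则"清单。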
#### 统一交易模型 (Transaction)

```go
type Transaction struct {
    ID             string
    TransactionID  string          // 业务唯一 ID
    BatchID        string
    SourcePlatform string
    SourceRecordID string
    TradeTime      time.Time
    Amount         decimal.Decimal // 正数,方向独立表达
    Currency       string          // 默认 CNY
    Direction      string          // income/expense/transfer/refund/fee/other
    Counterparty   string
    MerchantName   string
    CategoryRaw    string          // 原始分类(来自平台)
    CategoryMapped string          // 映射后分类(规则引擎填充)
    AccountMapped  string          // 映射账户
    Tags           string          // 逗号分隔
    OrderID        string
    ParentOrderID  string
    PaymentMethod  string
    Note           string
    RawPayload     string          // 原始记录完整 JSON 快照
    RowFingerprint string
    Status         string          // PENDING_CLEAN → CLEANED → ... → IMPORTED
    FireflyTxnID   string
}
```

#### 标准化规则

| 规则项 | 说明 |
|--------|------|
| 时间 | 统一存储为 `Asia/Shanghai (UTC+8)` |
| 金额 | 统一使用正数(`decimal(18,6)`),方向独立用 `direction` 表达 |
| 币种 | 默认 `CNY` |
| 状态 | 初始为 `PENDING_CLEAN` |
| 原始快照 | 完整写入 `raw_payload`(JSON),确保审计可追溯 |
| 指纹 | 对关键字段做 SHA256,用于严格去重 |

#### Direction 枚举

| 值 | 说明 |
|----|------|
| `INCOME` | 收入 |
| `EXPENSE` | 支出 |
| `TRANSFER` | 内部转账 |
| `REFUND` | 退款 |
| `FEE` | 手续费 |
| `OTHER` | 其他 |

#### TransactionStatus 枚举

| 值 | 说明 |
|----|------|
| `PENDING_CLEAN` | 待清洗(标准化完成) |
| `CLEANED` | 已清洗 |
| `PENDING_REVIEW` | 待人工确认(模糊去重疑似) |
| `READY_TO_IMPORT` | 可导入 |
| `IMPORTING` | 导入中 |
| `IMPORTED` | 已导入 |
| `FAILED` | 导入失败 |
| `DUPLICATE` | 重复记录 |

## 解析器接口设计

- **DDS-Section**: 6.1 解析器接口设计 + 6.2 解析器注册中心
- **DDS-Lines**: L438-L498

### Extract

#### BillParser 接口

```go
// BillParser 是所有平台解析器必须实现的接口
type BillParser interface {
    // Platform 返回平台标识符,如 "alipay", "wechat", "ccb"
    Platform() string

    // Detect 根据文件元信息和表头判断是否为本平台文件
    Detect(fileMeta FileMeta, header []string) bool

    // Parse 解析指定文件,返回原始记录列表
    Parse(ctx context.Context, reader io.Reader) ([]RawBillRecord, error)
}
```

#### FileMeta 文件元信息

```go
type FileMeta struct {
    FileName string
    FileType string // csv, xlsx, txt
    FileHash string
    FileSize int64
}
```

#### RawBillRecord 原始账单记录

```go
type RawBillRecord struct {
    SourcePlatform string
    SourceRecordID string
    RawFields      map[string]string // 原始 K-V 字段
    RowNo          int
    RowFingerprint string
}
```

#### Registry 解析器注册中心

```go
type Registry struct {
    parsers []BillParser
}

// Register 追加解析器(按注册顺序参与 Detect)
func (r *Registry) Register(p BillParser) {
    r.parsers = append(r.parsers, p)
}

func NewRegistry() *Registry {
    r := &Registry{}
    // 注册所有解析器
    r.Register(&alipay.Parser{})
    r.Register(&wechat.Parser{})
    r.Register(&ccb.Parser{})
    r.Register(&icbc.Parser{})
    return r
}

// Detect 自动检测文件对应的解析器
func (r *Registry) Detect(meta FileMeta, header []string) (BillParser, error) {
    for _, p := range r.parsers {
        if p.Detect(meta, header) {
            return p, nil
        }
    }
    return nil, ErrUnknownPlatform
}
```

#### 新增解析器步骤

1. 在 `parser/<platform>/` 目录下创建 `<platform>_parser.go`
2. 实现 `BillParser` 接口的 3 个方法
3. 在 `parser/registry.go` 的 `NewRegistry()` 中调用 `r.Register()`
4. `Detect()` 方法基于文件名特征或 CSV 表头关键词判定
5. `Parse()` 逐行读取文件,填充 `RawBillRecord.RawFields`

#### V1.0 支持的平台

| 平台 | 标识符 | 文件格式 | 优先级 |
|------|--------|----------|--------|
| 支付宝 | `alipay` | CSV | P0(优先) |
| 微信支付 | `wechat` | CSV | P0(优先) |
| 建设银行 | `ccb` | CSV/Excel | P1(次优先) |
| 工商银行 | `icbc` | CSV/Excel | P1(次优先) |

## 行指纹生成算法

- **DDS-Section**: 7.2 严格去重 — 行指纹算法
- **DDS-Lines**: L709-L726

### Extract

#### 指纹生成函数

```go
func GenerateRowFingerprint(t *Transaction) string {
    raw := fmt.Sprintf("%s|%s|%s|%s|%s|%s",
        t.TradeTime.Format("2006-01-02 15:04"), // 分钟粒度(非秒级)
        t.Amount.String(),
        t.Direction,
        normalizeString(t.Counterparty),
        normalizeString(t.MerchantName),
        t.OrderID,
    )
    hash := sha256.Sum256([]byte(raw))
    return hex.EncodeToString(hash[:])
}
```

#### 关键设计决策

- **分钟粒度**:使用 `2006-01-02 15:04` 格式(精确到分钟),不是 `15:04:05`(秒级)
- **原因**:不同平台对同一交易的记录时间可能有秒级差异(如支付宝记录 10:30:15,微信记录 10:30:22),分钟粒度可以容忍这种差异
- **normalizeString** 处理:去除前后空格、全角转半角、统一大小写
- **SHA256 输出**:64 字符的十六进制字符串

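`normalizeString` 在 DDS 正文中只有文字描述,下面给出一个符合该描述(去首尾空格、全角转半角、统一小写)的参考实现,仅为示意:

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeString 指纹参与字段的标准化:去首尾空格、全角转半角、统一小写
func normalizeString(s string) string {
	var b strings.Builder
	for _, r := range strings.TrimSpace(s) {
		switch {
		case r == '\u3000': // 全角空格 → 半角空格
			r = ' '
		case r >= '\uFF01' && r <= '\uFF5E': // 全角 ASCII 区间 → 半角
			r -= 0xFEE0
		}
		b.WriteRune(r)
	}
	return strings.ToLower(b.String())
}

func main() {
	fmt.Println(normalizeString(" ABC123 ")) // 全角转半角并小写
}
```

标准化必须在指纹生成前完成,否则同一对手方的全角/半角写法会产生不同指纹,导致漏判重复。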
#### 参与指纹的字段

| 字段 | 处理方式 |
|------|----------|
| `TradeTime` | `Format("2006-01-02 15:04")` — 分钟粒度 |
| `Amount` | `Decimal.String()` — 标准化数字字符串 |
| `Direction` | 原值:income/expense/... |
| `Counterparty` | `normalizeString()` — 去空格、标准化 |
| `MerchantName` | `normalizeString()` — 去空格、标准化 |
| `OrderID` | 原值(已在 Parser 中去除空格) |

## 模糊去重(多因子评分 — P1 阶段)

- **DDS-Section**: 7.3 模糊去重(多因子评分 — P1 阶段)
- **DDS-Lines**: L755-L818

### Extract

#### 多因子评分模型

| 因子 | 分值 | 评分说明 |
|------|------|----------|
| 时间在 ±5 分钟内 | 30 | 时间差越小得分越高,超出窗口直接 0 分 |
| 金额精确一致 | 30 | 金额一致得满分,差额在手续费容差内得部分分(20分) |
| 交易方向一致 | 10 | 方向相同得满分 |
| 订单号相同/相近 | 15 | 完全一致 15 分,包含关系 10 分 |
| 对手方相似 | 10 | Levenshtein 相似度 + contains 判定 |
| 来源关联规则命中 | 5 | 预配置的平台关联规则 |

**总分 = 100 分**

#### 判定阈值(可配置)

| 分值范围 | 判定结果 | 处理方式 |
|----------|----------|----------|
| ≥ 85 | 自动判定重复 | 自动标记 DUPLICATE |
| 60 ~ 84 | 疑似重复 | 标记 PENDING_REVIEW,进入人工确认队列 |
| < 60 | 不判定重复 | 保留独立交易 |

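阈值判定本身是一个纯函数,建议与评分逻辑分离,便于单测和配置化。以下为可独立运行的示意(函数名 `DedupDecision` 为示例假设,阈值对应配置项 `fuzzy_threshold_high` / `fuzzy_threshold_low`):

```go
package main

import "fmt"

// DedupDecision 将模糊评分映射为处理决策;high/low 阈值来自配置,可热更新
func DedupDecision(score, high, low int) string {
	switch {
	case score >= high:
		return "DUPLICATE" // 自动判定重复
	case score >= low:
		return "PENDING_REVIEW" // 疑似重复,进入人工确认队列
	default:
		return "KEEP" // 保留独立交易
	}
}

func main() {
	fmt.Println(DedupDecision(90, 85, 60)) // DUPLICATE
	fmt.Println(DedupDecision(70, 85, 60)) // PENDING_REVIEW
	fmt.Println(DedupDecision(40, 85, 60)) // KEEP
}
```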
#### 评分算法骨架

```go
type FuzzyScorer struct {
    TimeWindow    time.Duration // 默认 5 分钟
    AmountEpsilon float64       // 金额容差(手续费)
}

func (s *FuzzyScorer) Score(a, b *Transaction) int {
    score := 0

    // 时间因子 (30分) — 线性衰减
    timeDiff := math.Abs(a.TradeTime.Sub(b.TradeTime).Minutes())
    if timeDiff <= s.TimeWindow.Minutes() {
        score += int(30 * (1 - timeDiff/s.TimeWindow.Minutes()))
    }

    // 金额因子 (30分)
    if a.Amount.Equal(b.Amount) {
        score += 30
    } else if a.Amount.Sub(b.Amount).Abs().LessThan(decimal.NewFromFloat(s.AmountEpsilon)) {
        score += 20
    }

    // 方向因子 (10分)
    if a.Direction == b.Direction {
        score += 10
    }

    // 订单号因子 (15分)
    if a.OrderID != "" && a.OrderID == b.OrderID {
        score += 15
    } else if a.OrderID != "" && b.OrderID != "" &&
        (strings.Contains(a.OrderID, b.OrderID) || strings.Contains(b.OrderID, a.OrderID)) {
        score += 10 // 包含关系;要求双方非空,避免空串 Contains 恒真导致误加分
    }

    // 对手方因子 (10分) — Levenshtein 相似度
    score += int(10 * counterpartySimilarity(a.Counterparty, b.Counterparty))

    // 来源规则因子 (5分)
    if s.platformLinked(a.SourcePlatform, b.SourcePlatform) {
        score += 5
    }

    return score
}
```

#### 配置项

| 配置键 | 默认值 | 说明 |
|--------|--------|------|
| `fuzzy_time_window` | 5 | 模糊匹配时间窗口(分钟) |
| `fuzzy_threshold_high` | 85 | 自动判定重复阈值 |
| `fuzzy_threshold_low` | 60 | 疑似重复阈值 |
| `amount_epsilon` | 0.01 | 金额容差(手续费) |

#### 性能优化

- 按 `trade_time` 时间分桶,避免全表 O(N²) 扫描
- 使用索引 `idx_txn_trade_time` 加速时间范围查询

## 严格去重(精确匹配)

- **DDS-Section**: 7.2 严格去重(基础去重 — 精确匹配)
- **DDS-Lines**: L697-L753

### Extract

#### 三级唯一性判定键(按优先级)

| 优先级 | 判定键 | 适用场景 |
|--------|--------|----------|
| 1 | `source_platform` + `source_record_id` | 同一平台重复导入 |
| 2 | `source_file_hash` + `row_fingerprint` | 同一文件重复上传 |
| 3 | `order_id`(若可信) | 跨批次订单号匹配 |

#### 执行流程

```go
func (s *StrictDedup) Execute(ctx context.Context, txns []*Transaction) ([]*Transaction, error) {
    var result []*Transaction
    for _, txn := range txns {
        // 判定键 1: platform + record_id
        exists, existingID := s.repo.FindByPlatformAndRecordID(
            ctx, txn.SourcePlatform, txn.SourceRecordID)
        if exists {
            s.createDedupRelation(ctx, txn.ID, existingID, "strict", 100)
            txn.Status = "DUPLICATE"
            continue
        }
        // 判定键 2: file_hash + fingerprint
        exists, existingID = s.repo.FindByFingerprint(ctx, txn.RowFingerprint)
        if exists {
            s.createDedupRelation(ctx, txn.ID, existingID, "strict", 100)
            txn.Status = "DUPLICATE"
            continue
        }
        result = append(result, txn)
    }
    return result, nil
}
```

#### 实现要点

- 命中时创建 `dedup_relation`(relation_type=`strict`, confidence=100)
- 被判重的交易 status 设为 `DUPLICATE`
- 只返回未命中的交易进入下一阶段
- 使用数据库索引 `idx_txn_platform_record` 和 `idx_txn_fingerprint` 加速查询

## 转账闭环识别

- **DDS-Section**: 7.4 链路合并(转账闭环 + 订单链路)
- **DDS-Lines**: L820-L871

### Extract

#### 典型场景

银行卡支出 1000 元(流向支付宝),支付宝收入 1000 元 → 合并为一笔内部转账。

#### 转账闭环识别规则(5 项全部满足)

| # | 条件 | 说明 |
|---|------|------|
| 1 | 金额一致 | `a.Amount.Equal(b.Amount)` |
| 2 | 方向互补 | 一条 expense + 一条 income |
| 3 | 时间窗口内 | 默认 ±30 分钟(`transfer_time_window` 可配置) |
| 4 | 不同平台 | `a.SourcePlatform != b.SourcePlatform` |
| 5 | 非退款/手续费 | `direction` 不是 `refund` 或 `fee` |

#### 实现骨架

```go
func (l *TransferLinker) Detect(a, b *Transaction) *LinkResult {
    // 条件 1: 金额一致
    if !a.Amount.Equal(b.Amount) {
        return nil
    }
    // 条件 2: 方向互补
    if !((a.Direction == "expense" && b.Direction == "income") ||
        (a.Direction == "income" && b.Direction == "expense")) {
        return nil
    }
    // 条件 3: 时间窗口
    if math.Abs(a.TradeTime.Sub(b.TradeTime).Minutes()) > l.TimeWindow {
        return nil
    }
    // 条件 4: 不同平台
    if a.SourcePlatform == b.SourcePlatform {
        return nil
    }
    // 条件 5: 非退款/手续费
    if a.Direction == "refund" || b.Direction == "refund" {
        return nil
    }
    if a.Direction == "fee" || b.Direction == "fee" {
        return nil
    }

    return &LinkResult{
        ParentTransactionID: selectPrimary(a, b).ID,
        ChildTransactionID:  selectSecondary(a, b).ID,
        LinkType:            "transfer",
        FromAccount:         mapToAccount(getExpenseSide(a, b)),
        ToAccount:           mapToAccount(getIncomeSide(a, b)),
    }
}
```

#### 订单链路合并

**典型场景**:京东订单 + 微信支付,一笔真实消费产生多条流水。

**合并策略**:
- 保留更完整的业务记录为主交易(优先保留有商品详情的记录)
- 其他记录挂为关联来源
- 形成 `parent_order_id` 聚合链路

#### LinkResult 数据结构

```go
type LinkResult struct {
    ParentTransactionID string
    ChildTransactionID  string
    LinkType            string // transfer / order / refund / fee
    FromAccount         string
    ToAccount           string
}
```

#### 配置项

| 配置键 | 默认值 | 说明 |
|--------|--------|------|
| `transfer_time_window` | 30 | 转账闭环时间窗口(分钟) |

## 规则条件 JSON 格式

- **DDS-Section**: 8.2 规则匹配条件
- **DDS-Lines**: L938-L967

### Extract

#### 条件匹配维度

| 条件类型 | 字段 | 说明 | 示例 |
|----------|------|------|------|
| 平台过滤 | `platform` | 指定生效平台 | `"alipay"` |
| 原始分类 | `category_raw` | 原始分类匹配 | `"餐饮美食"` |
| 关键词 | `keywords` | 商品/商户名关键词 | `["美团", "外卖"]` |
| 正则 | `regex` | 正则表达式匹配 | `"^滴滴.*出行$"` |
| 金额范围 | `amount_range` | [min, max] | `[0, 50]` |
| 方向 | `direction` | 收支方向 | `"expense"` |
| 对手方 | `counterparty` | 对手方包含 | `"支付宝"` |

#### JSON 结构示例

```json
{
  "platform": "wechat",
  "conditions": {
    "category_raw": "商户消费",
    "keywords": ["美团", "外卖", "饿了么"],
    "direction": "expense"
  },
  "actions": {
    "category_mapped": "餐饮美食",
    "merchant_normalized": "外卖平台"
  }
}
```

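conditions 的匹配语义(各维度之间 AND、keywords 内部 OR)可以用一个可独立运行的小实现说明。结构体与方法名为示例假设,字段取自上表的精简子集:

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// RuleConditions 对应 conditions JSON 反序列化后的结构(示例精简版)
type RuleConditions struct {
	CategoryRaw string    `json:"category_raw"`
	Keywords    []string  `json:"keywords"`
	Regex       string    `json:"regex"`
	AmountRange []float64 `json:"amount_range"`
	Direction   string    `json:"direction"`
}

// TxnView 参与规则匹配的交易字段(示例精简版)
type TxnView struct {
	CategoryRaw  string
	MerchantName string
	Direction    string
	Amount       float64
}

// Match 所有非空条件均须满足(AND 语义);keywords 内部为 OR 语义
func (c *RuleConditions) Match(t TxnView) bool {
	if c.CategoryRaw != "" && c.CategoryRaw != t.CategoryRaw {
		return false
	}
	if c.Direction != "" && c.Direction != t.Direction {
		return false
	}
	if len(c.AmountRange) == 2 && (t.Amount < c.AmountRange[0] || t.Amount > c.AmountRange[1]) {
		return false
	}
	if c.Regex != "" {
		if ok, _ := regexp.MatchString(c.Regex, t.MerchantName); !ok {
			return false
		}
	}
	if len(c.Keywords) > 0 {
		hit := false
		for _, kw := range c.Keywords {
			if strings.Contains(t.MerchantName, kw) {
				hit = true
				break
			}
		}
		if !hit {
			return false
		}
	}
	return true
}

func main() {
	c := RuleConditions{CategoryRaw: "商户消费", Keywords: []string{"美团", "外卖"}, Direction: "expense"}
	fmt.Println(c.Match(TxnView{CategoryRaw: "商户消费", MerchantName: "美团外卖", Direction: "expense", Amount: 25}))
}
```

生产实现还应在规则加载时预编译正则,避免每条交易重复 `MatchString` 的编译开销。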
#### Rule 数据模型

```go
type Rule struct {
    ID             string    // UUID
    RuleType       string    // 规则类型枚举
    Priority       int       // 优先级(越小越高)
    PlatformScope  string    // 平台范围:all / alipay / wechat
    ConditionsJSON string    // 条件 JSON
    ActionsJSON    string    // 动作 JSON
    Enabled        bool      // 是否启用
    Description    string    // 规则描述
    CreatedAt      time.Time
    UpdatedAt      time.Time
}
```

#### RuleType 枚举

| 类型 | 说明 |
|------|------|
| `COUNTERPARTY_NORMALIZE` | 对手方归一化 |
| `MERCHANT_NORMALIZE` | 商户名归一化 |
| `CATEGORY_MAPPING` | 分类映射 |
| `ACCOUNT_MAPPING` | 账户映射 |
| `TAG_MAPPING` | 标签映射 |
| `FIREFLY_FIELD_MAPPING` | Firefly 字段映射 |

## 规则执行顺序与可解释性

- **DDS-Section**: 8.3 规则执行顺序 + 8.4 规则引擎核心实现 + 8.5 可解释性设计
- **DDS-Lines**: L969-L1045

### Extract

#### 固定执行顺序(不可变)

```
1. 对手方归一化 (COUNTERPARTY_NORMALIZE)
        ↓
2. 商户归一化 (MERCHANT_NORMALIZE)
        ↓
3. 分类映射 (CATEGORY_MAPPING)
        ↓
4. 账户映射 (ACCOUNT_MAPPING)
        ↓
5. 标签映射 (TAG_MAPPING)
        ↓
6. Firefly 字段映射 (FIREFLY_FIELD_MAPPING)
```

**设计原因**:先做归一再做分类,可提升规则命中率与稳定性。例如先将"美团外卖-北京"归一为"美团外卖",再匹配分类规则"美团 → 餐饮美食"。

#### 执行原则

- 同一类型内按 `priority` 升序执行(数字越小优先级越高)
- **首条命中即停止**(同类型中第一个匹配的规则生效,后续不再匹配)
- 每条交易记录所有命中的规则 ID 和前后字段对比

#### 规则引擎核心实现

```go
type Engine struct {
    ruleRepo repository.RuleRepo
    hitRepo  repository.RuleHitRepo
}

func (e *Engine) Apply(ctx context.Context, txns []*Transaction) error {
    ruleGroups := e.loadRulesGroupByType(ctx)

    executionOrder := []string{
        "COUNTERPARTY_NORMALIZE",
        "MERCHANT_NORMALIZE",
        "CATEGORY_MAPPING",
        "ACCOUNT_MAPPING",
        "TAG_MAPPING",
        "FIREFLY_FIELD_MAPPING",
    }

    for _, txn := range txns {
        for _, ruleType := range executionOrder {
            rules := ruleGroups[ruleType]
            for _, rule := range rules {
                if !rule.MatchPlatform(txn.SourcePlatform) {
                    continue
                }
                if rule.MatchConditions(txn) {
                    before := txn.Snapshot()
                    rule.ApplyActions(txn)
                    after := txn.Snapshot()
                    e.hitRepo.Save(ctx, &RuleHit{
                        TransactionID:    txn.ID,
                        RuleID:           rule.ID,
                        MatchedCondition: rule.ConditionsJSON,
                        BeforeValue:      before,
                        AfterValue:       after,
                    })
                    break // 同类型首条命中即停止
                }
            }
        }
    }
    return nil
}
```

#### 可解释性设计 — RuleHit 审计

每条交易保留完整的规则命中记录:

| 信息项 | 说明 |
|--------|------|
| 命中规则 ID | 关联 rules 表 |
| 命中条件摘要 | 匹配的具体关键词/正则 |
| 变更前值 | 规则执行前的字段值 |
| 变更后值 | 规则执行后的字段值 |
| 命中时间 | 规则执行时间戳 |

用于前端"为何被分到餐饮/交通"的解释展示。

## 数据库表结构设计

- **DDS-Section**: 10. 数据库详细设计(SQLite + GORM)
- **DDS-Lines**: L1135-L1347

### Extract

#### ER 关系总览

```
IMPORT_BATCHES ──1:N──> SOURCE_FILES ──1:N──> RAW_RECORDS
IMPORT_BATCHES ──1:N──> TRANSACTIONS
RAW_RECORDS    ──1:1──> TRANSACTIONS (normalizes_to)
TRANSACTIONS   ──1:N──> DEDUP_RELATIONS
TRANSACTIONS   ──1:N──> LINK_RELATIONS
TRANSACTIONS   ──1:N──> RULE_HITS
RULES          ──1:N──> RULE_HITS
IMPORT_BATCHES ──1:N──> IMPORT_TASKS ──1:N──> IMPORT_RESULTS
TRANSACTIONS   ──1:N──> AUDIT_LOGS
```

共 **11 张核心表**。

#### 表结构定义

##### 1. IMPORT_BATCHES — 导入批次

| 字段 | 类型 | 约束 | 说明 |
|------|------|------|------|
| id | varchar(36) | PK | UUID 主键 |
| status | varchar(32) | | 批次状态 |
| total_files | int | | 文件总数 |
| total_records | int | | 记录总数 |
| success_count | int | | 成功数 |
| failed_count | int | | 失败数 |
| duplicate_count | int | | 重复数 |
| created_at | datetime | | 创建时间 |
| updated_at | datetime | | 更新时间 |

##### 2. SOURCE_FILES — 源文件

| 字段 | 类型 | 约束 | 说明 |
|------|------|------|------|
| id | varchar(36) | PK | UUID 主键 |
| batch_id | varchar(36) | FK | 批次 ID |
| file_name | varchar(255) | | 原始文件名 |
| file_hash | varchar(64) | INDEX | 文件 SHA256 哈希 |
| source_platform | varchar(32) | | 来源平台 |
| file_type | varchar(16) | | csv/xlsx/txt |
| file_size | int | | 文件大小(bytes) |
| uploaded_at | datetime | | 上传时间 |

##### 3. RAW_RECORDS — 原始记录

| 字段 | 类型 | 约束 | 说明 |
|------|------|------|------|
| id | varchar(36) | PK | UUID 主键 |
| source_file_id | varchar(36) | FK | 来源文件 ID |
| row_no | int | | 行号 |
| source_platform | varchar(32) | | 平台 |
| source_record_id | varchar(128) | | 原始流水号 |
| row_fingerprint | varchar(64) | INDEX | 行指纹 SHA256 |
| raw_payload | text | | 原始 JSON 快照 |
| parse_status | varchar(32) | | 解析状态 |
| parse_error | text | | 错误信息 |

##### 4. TRANSACTIONS — 统一交易记录(核心表)

| 字段 | 类型 | 约束 | 说明 |
|------|------|------|------|
| id | varchar(36) | PK | UUID 主键 |
| transaction_id | varchar(64) | UNIQUE | 业务唯一 ID |
| batch_id | varchar(36) | INDEX | 导入批次 |
| raw_record_id | varchar(36) | | 原始记录 ID |
| source_platform | varchar(32) | INDEX(组合) | 来源平台 |
| source_record_id | varchar(128) | INDEX(组合) | 原始记录号 |
| trade_time | datetime | INDEX, NOT NULL | 交易时间 |
| amount | decimal(18,6) | NOT NULL | 金额 |
| currency | varchar(16) | DEFAULT 'CNY' | 币种 |
| direction | varchar(16) | NOT NULL | 方向 |
| counterparty | varchar(255) | | 对手方 |
| merchant_name | varchar(255) | | 商户名 |
| category_raw | varchar(128) | | 原始分类 |
| category_mapped | varchar(128) | | 映射分类 |
| account_mapped | varchar(128) | | 映射账户 |
| tags | varchar(512) | | 标签(逗号分隔) |
| order_id | varchar(128) | INDEX | 订单号 |
| parent_order_id | varchar(128) | | 父链路号 |
| payment_method | varchar(128) | | 支付方式 |
| note | text | | 备注 |
| raw_payload | text | | 原始记录 JSON |
| row_fingerprint | varchar(64) | INDEX | 行指纹 |
| status | varchar(32) | INDEX, DEFAULT 'PENDING_CLEAN' | 状态 |
| firefly_txn_id | varchar(128) | | Firefly 交易 ID |
| imported_at | datetime | | 导入时间 |
| created_at | datetime | | 创建时间 |
| updated_at | datetime | | 更新时间 |

##### 5. DEDUP_RELATIONS — 去重关系

| 字段 | 类型 | 约束 | 说明 |
|------|------|------|------|
| id | varchar(36) | PK | UUID 主键 |
| src_transaction_id | varchar(36) | FK, INDEX | 原交易 ID |
| target_transaction_id | varchar(36) | FK | 目标交易 ID |
| relation_type | varchar(16) | | strict/fuzzy |
| confidence | int | | 置信度 0-100 |
| status | varchar(16) | | auto/confirmed/rejected |
| reason_json | text | | 判定依据 JSON |
| created_at | datetime | | 创建时间 |

##### 6. LINK_RELATIONS — 链路关系

| 字段 | 类型 | 约束 | 说明 |
|------|------|------|------|
| id | varchar(36) | PK | UUID 主键 |
| parent_transaction_id | varchar(36) | FK | 主交易 ID |
| child_transaction_id | varchar(36) | FK | 子交易 ID |
| link_type | varchar(16) | | transfer/order/refund/fee |
| reason_json | text | | 关联依据 JSON |
| created_at | datetime | | 创建时间 |

##### 7. RULES — 规则定义

| 字段 | 类型 | 约束 | 说明 |
|------|------|------|------|
| id | varchar(36) | PK | UUID 主键 |
| rule_type | varchar(32) | INDEX(组合) | 规则类型 |
| priority | int | INDEX(组合) | 优先级 |
| platform_scope | varchar(32) | | 平台范围 |
| conditions_json | text | | 条件 JSON |
| actions_json | text | | 动作 JSON |
| enabled | boolean | | 是否启用 |
| description | varchar(255) | | 规则描述 |
| created_at | datetime | | 创建时间 |
| updated_at | datetime | | 更新时间 |

##### 8. RULE_HITS — 规则命中记录

| 字段 | 类型 | 约束 | 说明 |
|------|------|------|------|
| id | varchar(36) | PK | UUID 主键 |
| transaction_id | varchar(36) | FK | 交易 ID |
| rule_id | varchar(36) | FK | 规则 ID |
| matched_condition | text | | 命中条件摘要 |
| before_value | text | | 变更前值 |
| after_value | text | | 变更后值 |
| created_at | datetime | | 执行时间 |

##### 9. IMPORT_TASKS — 导入任务

| 字段 | 类型 | 约束 | 说明 |
|------|------|------|------|
| id | varchar(36) | PK | UUID 主键 |
| batch_id | varchar(36) | FK | 批次 ID |
| export_mode | varchar(16) | | api/csv |
| status | varchar(32) | | pending/running/success/partial_failed/failed |
| total_count | int | | 总记录数 |
| success_count | int | | 成功数 |
| failed_count | int | | 失败数 |
| started_at | datetime | | 开始时间 |
| finished_at | datetime | | 完成时间 |

##### 10. IMPORT_RESULTS — 导入结果

| 字段 | 类型 | 约束 | 说明 |
|------|------|------|------|
| id | varchar(36) | PK | UUID 主键 |
| task_id | varchar(36) | FK | 任务 ID |
| transaction_id | varchar(36) | FK | 交易 ID |
| status | varchar(16) | | success/failed |
| error_code | varchar(32) | | 错误码 |
| error_message | text | | 错误描述 |
| firefly_txn_id | varchar(128) | | Firefly 返回 ID |
| retry_count | int | | 重试次数 |
| created_at | datetime | | 创建时间 |

##### 11. AUDIT_LOGS — 审计日志

| 字段 | 类型 | 约束 | 说明 |
|------|------|------|------|
| id | varchar(36) | PK | UUID 主键 |
| entity_type | varchar(64) | | 实体类型 |
| entity_id | varchar(36) | | 实体 ID |
| action | varchar(32) | | 操作类型 |
| before_snapshot | text | | 变更前快照 |
| after_snapshot | text | | 变更后快照 |
| operator | varchar(64) | | 操作者 |
| created_at | datetime | | 操作时间 |

#### GORM Model 示例(Transaction)

```go
type Transaction struct {
    ID             string          `gorm:"primaryKey;type:varchar(36)"`
    TransactionID  string          `gorm:"uniqueIndex;type:varchar(64)"`
    BatchID        string          `gorm:"index;type:varchar(36)"`
    RawRecordID    string          `gorm:"type:varchar(36)"`
    SourcePlatform string          `gorm:"type:varchar(32);index:idx_platform_record"`
    SourceRecordID string          `gorm:"type:varchar(128);index:idx_platform_record"`
    TradeTime      time.Time       `gorm:"index;not null"`
    Amount         decimal.Decimal `gorm:"type:decimal(18,6);not null"`
    Currency       string          `gorm:"type:varchar(16);default:'CNY'"`
    Direction      string          `gorm:"type:varchar(16);not null"`
    Counterparty   string          `gorm:"type:varchar(255)"`
    MerchantName   string          `gorm:"type:varchar(255)"`
    CategoryRaw    string          `gorm:"type:varchar(128)"`
    CategoryMapped string          `gorm:"type:varchar(128)"`
    AccountMapped  string          `gorm:"type:varchar(128)"`
    Tags           string          `gorm:"type:varchar(512)"`
    OrderID        string          `gorm:"index;type:varchar(128)"`
    ParentOrderID  string          `gorm:"type:varchar(128)"`
    PaymentMethod  string          `gorm:"type:varchar(128)"`
    Note           string          `gorm:"type:text"`
    RawPayload     string          `gorm:"type:text"`
    RowFingerprint string          `gorm:"index;type:varchar(64)"`
    Status         string          `gorm:"index;type:varchar(32);default:'PENDING_CLEAN'"`
    FireflyTxnID   string          `gorm:"type:varchar(128)"`
    ImportedAt     *time.Time
    CreatedAt      time.Time
    UpdatedAt      time.Time
}

func (Transaction) TableName() string { return "transactions" }
```

#### SQLite 性能配置

```go
func initDB(dbPath string) *gorm.DB {
    db, err := gorm.Open(sqlite.Open(dbPath), &gorm.Config{})
    if err != nil {
        panic(err) // 本地应用:数据库打不开直接终止
    }
    sqlDB, err := db.DB()
    if err != nil {
        panic(err)
    }
    sqlDB.SetMaxOpenConns(1) // SQLite 单写
    sqlDB.SetMaxIdleConns(10)
    db.Exec("PRAGMA journal_mode=WAL")
    db.Exec("PRAGMA synchronous=NORMAL")
    db.Exec("PRAGMA cache_size=-64000") // 64MB cache
    return db
}
```

#### 事务边界设计

```
事务 1: 文件入库 + 原始记录入库
事务 2: 标准化结果落库
事务 3: 去重/链路关系落库
事务 4: 规则命中落库
事务 5: 导入结果落库
```

每阶段独立事务,使用 `CreateInBatches` 每批 500 条。

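"每批 500 条"的切批逻辑本身与 GORM 无关,可以抽成一个纯函数,既方便单测,也与 `CreateInBatches(items, 500)` 的分批语义一一对应。以下为一个可独立运行的示意(`chunk` 为示例函数名):

```go
package main

import "fmt"

// chunk 将记录切分为固定大小批次;对应每阶段独立事务 + CreateInBatches(500) 的写库方式
func chunk[T any](items []T, size int) [][]T {
	var batches [][]T
	for start := 0; start < len(items); start += size {
		end := start + size
		if end > len(items) {
			end = len(items)
		}
		batches = append(batches, items[start:end])
	}
	return batches
}

func main() {
	ids := make([]int, 1200)
	fmt.Println(len(chunk(ids, 500))) // 1200 条切成 3 批:500 + 500 + 200
}
```

各批次可共用一个事务,也可按批提交;SQLite 单写连接下按批提交能缩短单个事务的持锁时间。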
## 关键索引设计

- **DDS-Section**: 10.2 关键索引设计
- **DDS-Lines**: L1296-L1310

### Extract

| 表名 | 索引名 | 索引列 | 用途 |
|------|--------|--------|------|
| `transactions` | `idx_txn_platform_record` | `source_platform, source_record_id` | 严格去重判定键 1 |
| `transactions` | `idx_txn_fingerprint` | `row_fingerprint` | 严格去重判定键 2 |
| `transactions` | `idx_txn_batch` | `batch_id` | 批次查询 |
| `transactions` | `idx_txn_trade_time` | `trade_time` | 模糊去重时间分桶 |
| `transactions` | `idx_txn_order` | `order_id` | 订单号匹配 |
| `transactions` | `idx_txn_status` | `status` | 状态过滤 |
| `source_files` | `idx_sf_hash` | `file_hash` | 文件重复上传拦截 |
| `raw_records` | `idx_rr_fingerprint` | `row_fingerprint` | 行级去重 |
| `rules` | `idx_rule_type_priority` | `rule_type, priority` | 规则执行顺序 |
| `dedup_relations` | `idx_dedup_src` | `src_transaction_id` | 去重关系查询 |

#### 性能说明
|
||||
|
||||
- `idx_txn_platform_record` 是严格去重最频繁命中的索引,组合索引比两个单列索引效率更高
|
||||
- `idx_txn_trade_time` 用于模糊去重的时间分桶,避免全表 O(N²) 扫描
|
||||
- `idx_rule_type_priority` 确保规则按类型分组、按优先级有序加载
|
||||
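`row_fingerprint` is stored as `varchar(64)`, which matches a hex-encoded SHA-256 digest. A hedged sketch of how such a fingerprint might be derived; the field list and the `"|"` separator are assumptions for illustration, not the project's actual definition:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// rowFingerprint hashes the normalized identifying fields of one raw row.
// Which fields participate, and the separator, are illustrative choices.
func rowFingerprint(platform, recordID, tradeTime, amount string) string {
	payload := strings.Join([]string{platform, recordID, tradeTime, amount}, "|")
	sum := sha256.Sum256([]byte(payload))
	return hex.EncodeToString(sum[:]) // 64 hex chars, fits varchar(64)
}

func main() {
	fp := rowFingerprint("alipay", "2024010112345", "2024-01-01 10:00:00", "12.50")
	fmt.Println(len(fp)) // 64
}
```

Because the digest is deterministic, the same normalized row always maps to the same fingerprint, which is what makes `idx_rr_fingerprint` and `idx_txn_fingerprint` usable as strict dedup keys.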
@@ -0,0 +1,87 @@

## API Catalog

- **DDS-Section**: 11. API Design (GIN RESTful)
- **DDS-Lines**: L1350-L1451

### Extract

#### Import Center API

| Method | Path | Description |
|------|------|------|
| `POST` | `/api/v1/import/batches` | Upload bill files and create a batch (multipart/form-data) |
| `GET` | `/api/v1/import/batches` | List batches |
| `GET` | `/api/v1/import/batches/:batchId` | Get batch details |
| `POST` | `/api/v1/import/batches/:batchId/process` | Trigger the parse-and-clean pipeline |
| `GET` | `/api/v1/import/batches/:batchId/preview` | Get cleaning preview results |
| `DELETE` | `/api/v1/import/batches/:batchId` | Delete a batch |

##### Upload Endpoint Details

```
POST /api/v1/import/batches
Content-Type: multipart/form-data

Parameters:
- files[] (required) bill files
- sourcePlatform (optional) source platform: "alipay", "wechat"
- autoDetect (optional) auto-detect the platform, defaults to true

Response:
{
  "code": 0,
  "message": "ok",
  "data": {
    "batchId": "550e8400-e29b-41d4-a716-446655440000",
    "status": "UPLOADED",
    "filesCount": 2,
    "detectedPlatforms": ["alipay", "wechat"]
  }
}
```

#### Transaction API

| Method | Path | Description |
|------|------|------|
| `GET` | `/api/v1/transactions` | Paginated transaction queries |
| `GET` | `/api/v1/transactions/:id` | Get transaction details (including rule hits) |
| `GET` | `/api/v1/transactions/:id/trace` | Get a transaction's full processing trace |

#### Dedup Review API

| Method | Path | Description |
|------|------|------|
| `GET` | `/api/v1/dedup/reviews` | List suspected duplicates |
| `GET` | `/api/v1/dedup/reviews/:reviewId` | Get duplicate details (including scoring factors) |
| `POST` | `/api/v1/dedup/reviews/:reviewId/confirm` | Confirm the merge |
| `POST` | `/api/v1/dedup/reviews/:reviewId/reject` | Reject the merge |

#### Rule Management API

| Method | Path | Description |
|------|------|------|
| `GET` | `/api/v1/rules` | List rules (filterable by type/platform) |
| `POST` | `/api/v1/rules` | Create a rule |
| `PUT` | `/api/v1/rules/:id` | Update a rule |
| `DELETE` | `/api/v1/rules/:id` | Delete a rule |
| `POST` | `/api/v1/rules/evaluate` | Re-evaluate rules (triggered after rule changes) |
| `POST` | `/api/v1/rules/:id/test` | Preview a rule's matches for testing |

#### Import/Export API

| Method | Path | Description |
|------|------|------|
| `POST` | `/api/v1/import/tasks` | Create an import task (confirm import into Firefly) |
| `GET` | `/api/v1/import/tasks/:taskId` | Get import task details |
| `POST` | `/api/v1/import/tasks/:taskId/retry` | Retry failed items |
| `GET` | `/api/v1/export/csv/:batchId` | Export a batch as a CSV file |

#### Audit & System API

| Method | Path | Description |
|------|------|------|
| `GET` | `/api/v1/audit/logs` | List operation logs |
| `GET` | `/api/v1/settings` | Get system settings |
| `PUT` | `/api/v1/settings` | Update system settings (Firefly connection, etc.) |
| `POST` | `/api/v1/settings/test-connection` | Test the Firefly III connection |
@@ -0,0 +1,57 @@

## Unified Response Format and Error Codes

- **DDS-Section**: 11.1 Unified Response Format + 14.4 Error Handling Strategy
- **DDS-Lines**: L1352-L1369, L1787-L1816

### Extract

#### Unified Response Structures

```go
// Response is the envelope for ordinary responses.
type Response struct {
	Code    int         `json:"code"`    // 0 = success, non-zero = error code
	Message string      `json:"message"` // success/error description
	Data    interface{} `json:"data"`    // business payload
}

// PageResponse is the envelope for paginated responses.
type PageResponse struct {
	Code    int         `json:"code"`
	Message string      `json:"message"`
	Data    interface{} `json:"data"`
	Total   int64       `json:"total"`
	Page    int         `json:"page"`
	Size    int         `json:"size"`
}
```

#### Error Code Definitions

| Code | Constant | Description |
|--------|--------|------|
| 0 | `ErrCodeSuccess` | Success |
| 40000 | `ErrCodeBadRequest` | Generic parameter error |
| 40001 | `ErrCodeFileParseError` | File parsing failed |
| 40002 | `ErrCodeUnknownPlatform` | Unrecognized source platform |
| 40003 | `ErrCodeDuplicateFile` | Duplicate file upload |
| 40004 | `ErrCodeRuleInvalid` | Invalid rule definition |
| 40005 | `ErrCodeExportFailed` | Export/push failed |
| 50000 | `ErrCodeInternal` | Internal server error |

#### Global Error Handling Middleware

```go
func ErrorHandler() gin.HandlerFunc {
	return func(c *gin.Context) {
		c.Next()
		if len(c.Errors) > 0 {
			err := c.Errors.Last()
			c.JSON(http.StatusOK, Response{
				Code:    mapErrorCode(err),
				Message: err.Error(),
			})
		}
	}
}
```
@@ -0,0 +1,65 @@

## Firefly III Export Adaptation

- **DDS-Section**: 9. Firefly III / Data Importer Adaptation Design
- **DDS-Lines**: L1048-L1132

### Extract

#### Two-Stage Rule Mapping Strategy

**Stage 1 — handled by ProjectMoneyX**:
1. Field-level mapping: heterogeneous fields → unified model
2. Business category mapping: counterparty/description → category/tags
3. Merchant name normalization: aliases → canonical names

**Stage 2 — handled by Firefly III**:
1. Final layer of field adaptation
2. Temporary supplemental rules
3. Import format compatibility

#### Export Modes

##### Mode A: API Push (preferred)

```
Export Engine → Data Importer (POST /api/v1/import) → Firefly III
        ↓
Success/failure details returned → update the import_results table
```

##### Mode B: Intermediate File Export

- Generate standard CSV / JSON fully conforming to the Data Importer spec
- The user downloads the file and runs the import in Data Importer manually
- Suitable when the API is unavailable or permissions are restricted

#### Firefly Transaction Type Mapping

| Internal Direction | Firefly Type | Notes |
|----------------|--------------|------|
| `expense` | `withdrawal` | Expense |
| `income` | `deposit` | Income |
| `transfer` | `transfer` | Internal transfer |
| `refund` | `deposit` | Refund (treated as income) |
| `fee` | `withdrawal` | Fee (treated as expense) |
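The mapping table above reduces to a small lookup; a sketch (the function name `fireflyType` is illustrative, not the project's actual API):

```go
package main

import "fmt"

// fireflyType maps an internal direction onto the Firefly III
// transaction type, per the mapping table; ok is false for
// directions the table does not cover.
func fireflyType(direction string) (string, bool) {
	m := map[string]string{
		"expense":  "withdrawal",
		"income":   "deposit",
		"transfer": "transfer",
		"refund":   "deposit",    // refunds are treated as income
		"fee":      "withdrawal", // fees are treated as expenses
	}
	t, ok := m[direction]
	return t, ok
}

func main() {
	t, _ := fireflyType("refund")
	fmt.Println(t) // deposit
}
```

Returning an explicit `ok` flag lets the exporter mark records with unknown directions as FAILED instead of silently defaulting them.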
#### ImportResult Data Structure

```go
type ImportResult struct {
	TaskID        string
	TransactionID string
	Status        string // success / failed
	ErrorCode     string
	ErrorMessage  string
	FireflyTxnID  string // transaction ID returned by Firefly III
	RetryCount    int
	CreatedAt     time.Time
}
```

#### Post-Import Feedback

- Counts of successful and failed imports
- Failure reasons grouped for display (missing fields, format errors, nonexistent accounts, etc.)
- Failed records can be retried individually (no need to redo the whole batch)
@@ -0,0 +1,24 @@

## Pre-Import Validation Checklist

- **DDS-Section**: 9.4 Pre-Import Validation Checklist
- **DDS-Lines**: L1103-L1112

### Extract

#### 6 Required Checks

| # | Check | Description | On Failure |
|---|--------|------|----------|
| 1 | Required-field completeness | `amount`, `trade_time`, `direction` must not be empty | Mark as FAILED and record which fields are missing |
| 2 | Amount format validity | Must be positive, with at most 6 decimal places | Mark as FAILED |
| 3 | Time format validity | Must be a valid datetime | Mark as FAILED |
| 4 | Account mapping completeness | The source platform must map to a Firefly account | Mark as FAILED and prompt the user to configure the mapping |
| 5 | Duplicate-import interception | Check whether the `transaction_id` already exists in Firefly III | Skip and mark |
| 6 | Unconfirmed-record check | Are there suspected duplicates still in `PENDING_REVIEW`? | Block the import and prompt the user to resolve them first |

#### Implementation Notes

- All checks run in one place, the Export Engine's `validate()` method
- Records that fail validation are excluded from the push but do not affect other records
- Validation results are written to the `import_results` table so the frontend can show failure reasons
- Check 6 (unconfirmed records) is a global blocking check: while any such record exists, the whole batch cannot be imported
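Check 2 above (positive amount, at most six decimal places) can be sketched with stdlib parsing; the helper name `validAmount` is illustrative:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// validAmount reports whether s is a positive decimal number
// with at most six digits after the decimal point.
func validAmount(s string) bool {
	v, err := strconv.ParseFloat(s, 64)
	if err != nil || v <= 0 {
		return false
	}
	// count digits after the decimal point on the original string,
	// since parsing alone cannot distinguish 1.1 from 1.1000000
	if i := strings.IndexByte(s, '.'); i >= 0 && len(s)-i-1 > 6 {
		return false
	}
	return true
}

func main() {
	fmt.Println(validAmount("12.50"), validAmount("-3"), validAmount("1.0000001"))
	// true false false
}
```

Checking precision on the string rather than the parsed float avoids false positives from binary floating-point rounding.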
@@ -0,0 +1,82 @@

## Core Frontend Components and Interactions

- **DDS-Section**: 12.2 Page Responsibilities and Interactions
- **DDS-Lines**: L1489-L1572

### Extract

#### Core Reusable Components

| Component | File | Purpose |
|------|------|------|
| File uploader | `FileUploader.vue` | Drag-and-drop + click upload, progress bar, platform auto-detection |
| Transaction table | `TransactionTable.vue` | `v-data-table` + row expansion + batch operations |
| Rule editor | `RuleEditor.vue` | Condition builder + action configuration + test preview |
| Dedup comparison | `DedupCompare.vue` | Side-by-side comparison + scoring factors + diff highlighting |
| Audit timeline | `AuditTimeline.vue` | `v-timeline` + expandable snapshots |

#### 1. FileUploader.vue

- Uses Vuetify `v-file-input` plus a custom drag-and-drop zone
- Calls `POST /api/v1/import/batches` immediately after file selection
- Shows upload progress in real time
- Auto-detects each file's source platform, with manual override allowed
- The batch file list shows filename, size, and detected platform

#### 2. TransactionTable.vue

- Uses Vuetify `v-data-table` for pagination and sorting
- Row expansion (`expanded`) shows a detail panel: raw fields vs. normalized fields
- Rule-hit notes show exactly which rules matched each transaction
- Status markers: colored chips distinguish pending-clean / cleaned / duplicate / pending-review
- Filters: by source platform, category, direction, and status
- Must use `fixed-header` with explicit column widths

#### 3. DedupCompare.vue

- Side-by-side comparison layout
- Differing fields are highlighted
- Scoring factors are expandable (individual scores for all 6 factors)
- Action buttons: confirm merge / reject merge / skip for now
- A link view shows recognized transfer loops and order chains

#### 4. RuleEditor.vue

- Condition builder: supports AND/OR combinations of multiple conditions
- Built with Vuetify `v-select`, `v-text-field`, and `v-chip`
- Rule list grouped by type, with drag-and-drop priority reordering
- The test button triggers `POST /api/v1/rules/:id/test`
- One-click enable/disable toggle

#### 5. AuditTimeline.vue

- Uses the Vuetify `v-timeline` component
- Each node expands to show detailed snapshot data
- Processing chain: source file → raw record → normalization → rule hits → import result

#### Pinia Store List

| Store | File | Managed State |
|-------|------|-----------|
| Import Store | `importStore.ts` | Batch list, upload status, processing progress |
| Transaction Store | `transactionStore.ts` | Transaction pagination, filters, details |
| Rule Store | `ruleStore.ts` | Rule list, edit form, test results |

#### API Module List

| Module | File | Wrapped API Group |
|------|------|-------------|
| Import API | `api/import.ts` | Batch upload, list, details, processing |
| Transaction API | `api/transaction.ts` | Transaction queries, details, trace |
| Dedup API | `api/dedup.ts` | Dedup list, confirm, reject |
| Rule API | `api/rule.ts` | Rule CRUD, test, evaluate |
| Export API | `api/export.ts` | Import tasks, retry, CSV export |
| Audit API | `api/audit.ts` | Audit log queries |

#### TypeScript Type List

| File | Defined Types |
|------|-----------|
| `types/transaction.ts` | Transaction, Direction, TransactionStatus, RawRecord |
| `types/rule.ts` | Rule, RuleType, RuleCondition, RuleHit |
| `types/common.ts` | Response, PageResponse, ImportBatch, SourceFile, ImportTask |
@@ -0,0 +1,95 @@

## Frontend Routes and Pages

- **DDS-Section**: 12.3 Frontend Route Design + 12.1 Information Architecture and Navigation
- **DDS-Lines**: L1454-L1628

### Extract

#### Information Architecture

```
Sidebar navigation → Import Center
                   → Data Cleaning
                   → Dedup Review
                   → Rule Management
                   → Import Tasks
                   → Data Audit
                   → System Settings

Import Center   → file upload page / batch list page / batch detail page
Data Cleaning   → cleaning preview page
Dedup Review    → duplicate-record review page
Rule Management → rule list page / rule edit page
Import Tasks    → import task list page / import result detail page
Data Audit      → audit trace page / transaction processing-chain page
System Settings → Firefly connection config / dedup parameter config
```

#### Route Definitions

```typescript
const routes = [
  { path: '/', redirect: '/import' },
  {
    path: '/import',
    name: 'ImportCenter',
    component: () => import('@/views/ImportCenterView.vue'),
    meta: { title: 'Import Center', icon: 'mdi-upload' }
  },
  {
    path: '/import/batch/:batchId',
    name: 'BatchDetail',
    component: () => import('@/views/BatchDetailView.vue'),
    meta: { title: 'Batch Detail' }
  },
  {
    path: '/preview/:batchId',
    name: 'Preview',
    component: () => import('@/views/PreviewView.vue'),
    meta: { title: 'Cleaning Preview' }
  },
  {
    path: '/dedup',
    name: 'DedupReview',
    component: () => import('@/views/DedupReviewView.vue'),
    meta: { title: 'Dedup Review', icon: 'mdi-content-duplicate' }
  },
  {
    path: '/rules',
    name: 'RuleConfig',
    component: () => import('@/views/RuleConfigView.vue'),
    meta: { title: 'Rule Management', icon: 'mdi-cog-outline' }
  },
  {
    path: '/tasks',
    name: 'ImportTask',
    component: () => import('@/views/ImportTaskView.vue'),
    meta: { title: 'Import Tasks', icon: 'mdi-export' }
  },
  {
    path: '/audit',
    name: 'AuditTrace',
    component: () => import('@/views/AuditTraceView.vue'),
    meta: { title: 'Data Audit', icon: 'mdi-history' }
  },
  {
    path: '/settings',
    name: 'Settings',
    component: () => import('@/views/SettingsView.vue'),
    meta: { title: 'System Settings', icon: 'mdi-tune' }
  }
]
```

#### Page File List

| Page | File | Purpose |
|------|------|------|
| Import Center | `ImportCenterView.vue` | File upload + batch list |
| Batch Detail | `BatchDetailView.vue` | Single-batch details |
| Cleaning Preview | `PreviewView.vue` | Normalization result preview |
| Dedup Review | `DedupReviewView.vue` | Suspected-duplicate confirmation |
| Rule Config | `RuleConfigView.vue` | Rule CRUD |
| Import Tasks | `ImportTaskView.vue` | Import result display |
| Audit Trace | `AuditTraceView.vue` | Full-chain traceability |
| System Settings | `SettingsView.vue` | Firefly configuration |
@@ -0,0 +1,86 @@

## Deployment and Security Design

- **DDS-Section**: 13.2 Security Design + 13.5 Deployment Architecture
- **DDS-Lines**: L1663-L1732

### Extract

#### Security Measures

| Measure | Description |
|----------|------|
| Local deployment | Runs locally by default; sensitive bill data never leaves the machine |
| API token encryption | The Firefly III API token is stored AES-encrypted |
| Audit log masking | Account and order numbers are partially masked in logs (e.g. `138****1234`) |
| File safety | Uploaded files are restricted by type and size (default max 50 MB) |
| CORS configuration | Only local origins may access the API |
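The masking style shown above (`138****1234`) keeps the first three and last four characters. A sketch assuming exactly that policy; the helper name `mask` and the too-short fallback are illustrative:

```go
package main

import (
	"fmt"
	"strings"
)

// mask hides the middle of an identifier, keeping the first three
// and last four characters, e.g. 13812341234 → 138****1234.
func mask(s string) string {
	if len(s) <= 7 {
		// too short to keep a prefix and suffix safely; hide everything
		return strings.Repeat("*", len(s))
	}
	return s[:3] + strings.Repeat("*", len(s)-7) + s[len(s)-4:]
}

func main() {
	fmt.Println(mask("13812341234")) // 138****1234
}
```

Masking at log-write time (rather than at display time) keeps the raw identifier out of the `audit_logs` table entirely.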
#### Deployment Architecture

```
Docker container / local deployment
├── Frontend static assets (Vue3 build → /web)
├── Backend service (Go binary :8080)
└── SQLite database (/data/projectmoneyx.db)

External dependencies (optional)
├── Firefly III (API push)
└── Data Importer (API push)
```

#### Deployment Options

1. **Docker deployment** (recommended): a single container holding frontend, backend, and SQLite
2. **Binary deployment**: cross-compile to a single executable, with frontend assets embedded via Go `embed`

#### Example Dockerfile

```dockerfile
FROM golang:1.21-alpine AS builder
WORKDIR /app
COPY . .
RUN go build -o projectmoneyx ./cmd/server

FROM alpine:latest
WORKDIR /app
COPY --from=builder /app/projectmoneyx .
COPY --from=builder /app/web ./web
VOLUME /data
EXPOSE 8080
CMD ["./projectmoneyx", "--db", "/data/projectmoneyx.db"]
```

#### Configuration Management

```go
type Config struct {
	Server   ServerConfig   `yaml:"server"`
	Database DatabaseConfig `yaml:"database"`
	Firefly  FireflyConfig  `yaml:"firefly"`
	Dedup    DedupConfig    `yaml:"dedup"`
}

type ServerConfig struct {
	Port int    `yaml:"port" default:"8080"`
	Mode string `yaml:"mode" default:"release"`
}

type DatabaseConfig struct {
	Path string `yaml:"path" default:"./data/projectmoneyx.db"`
}

type FireflyConfig struct {
	BaseURL     string `yaml:"base_url"`
	APIToken    string `yaml:"api_token"` // stored AES-encrypted
	ImporterURL string `yaml:"importer_url"`
	Enabled     bool   `yaml:"enabled"`
}

type DedupConfig struct {
	FuzzyTimeWindow    int     `yaml:"fuzzy_time_window" default:"5"`
	FuzzyThresholdHigh int     `yaml:"fuzzy_threshold_high" default:"85"`
	FuzzyThresholdLow  int     `yaml:"fuzzy_threshold_low" default:"60"`
	TransferTimeWindow int     `yaml:"transfer_time_window" default:"30"`
	AmountEpsilon      float64 `yaml:"amount_epsilon" default:"0.01"`
}
```
@@ -0,0 +1,55 @@

## Performance Design

- **DDS-Section**: 13.1 Performance Design + 13.3 Maintainability Design
- **DDS-Lines**: L1632-L1681

### Extract

#### Performance Targets

A single import of 10,000 records should finish parsing and cleaning within the main pipeline, and dedup computation should complete within 30 seconds.

#### Optimization Strategies

| Strategy | Description |
|------|------|
| Batch inserts | `raw_records` and `transactions` use GORM `CreateInBatches` with batches of 500 |
| Key indexes | `source_platform + source_record_id`, `batch_id`, `trade_time`, `order_id` |
| Fuzzy-dedup bucketing | Bucket by `trade_time` to avoid full-table scans |
| Rule pre-filtering | Preload rules by platform and enabled status to cut useless matching |
| Async processing | The ETL pipeline runs in goroutines; the frontend polls for status |
| Connection handling | SQLite runs in WAL mode for better concurrent read/write performance |
#### SQLite Performance Configuration

```go
func initDB(dbPath string) *gorm.DB {
	db, err := gorm.Open(sqlite.Open(dbPath), &gorm.Config{})
	if err != nil {
		log.Fatalf("failed to open database: %v", err)
	}
	sqlDB, err := db.DB()
	if err != nil {
		log.Fatalf("failed to get sql.DB: %v", err)
	}
	sqlDB.SetMaxOpenConns(1) // SQLite allows only a single writer
	sqlDB.SetMaxIdleConns(10)
	db.Exec("PRAGMA journal_mode=WAL")
	db.Exec("PRAGMA synchronous=NORMAL")
	db.Exec("PRAGMA cache_size=-64000") // 64 MB page cache
	return db
}
```

#### Maintainability Design

| Principle | Description |
|------|------|
| Pluggable parsers | Adding a platform only requires implementing and registering the `BillParser` interface |
| JSON rule conditions | Rules are stored as JSON, so match conditions extend flexibly |
| Decoupled exporter | The Export layer is independent; the downstream target is replaceable |
| Layered DTO/VO/Entity | Handler → DTO → Service → Entity → DAO |
| Staged transactions | Each ETL stage runs in its own transaction, avoiding long-lived transactions |

#### Traceability Design

| Traceability | Implementation |
|----------|----------|
| Any import result → source file | `transaction.raw_record_id → raw_record.source_file_id → source_file` |
| Any rule hit → explanation | The `rule_hits` table records matched conditions and before/after field values |
| Any merge operation → rationale | `dedup_relations.reason_json` and `link_relations.reason_json` |
| Any operation → audit log | The `audit_logs` table records entity changes and operator info |
101
.agents/skills/developing-projectmoneyx/scripts/verify.sh
Normal file
@@ -0,0 +1,101 @@

#!/bin/bash
# verify.sh - structure and content verification for the developing-projectmoneyx skill
set -e

PASS=0; FAIL=0
check() {
  if eval "$2"; then
    # note: ((PASS++)) would return a non-zero status while the counter is 0
    # and abort the script under `set -e`; plain assignment avoids that
    echo "✅ PASS: $1"; PASS=$((PASS + 1))
  else
    echo "❌ FAIL: $1"; FAIL=$((FAIL + 1))
  fi
}

SKILL_DIR="$(cd "$(dirname "$0")/.." && pwd)"

# ========================
# Structural integrity
# ========================

check "SKILL.md exists" "test -f '$SKILL_DIR/SKILL.md'"
check "reference/ directory exists" "test -d '$SKILL_DIR/reference'"
check "scripts/ directory exists" "test -d '$SKILL_DIR/scripts'"
check "SKILL.md < 500 lines" "[ $(wc -l < '$SKILL_DIR/SKILL.md') -lt 500 ]"

# ========================
# Frontmatter
# ========================

check "frontmatter contains name" "head -20 '$SKILL_DIR/SKILL.md' | grep -q '^name:'"
check "frontmatter contains description" "head -20 '$SKILL_DIR/SKILL.md' | grep -q '^description:'"
check "frontmatter contains argument-hint" "head -20 '$SKILL_DIR/SKILL.md' | grep -q '^argument-hint:'"
check "frontmatter contains allowed-tools" "head -20 '$SKILL_DIR/SKILL.md' | grep -q '^allowed-tools:'"

# ========================
# Sections
# ========================

check "contains Quick Context section" "grep -q '## Quick Context' '$SKILL_DIR/SKILL.md'"
check "contains Plan section" "grep -q '## Plan' '$SKILL_DIR/SKILL.md'"
check "contains Verify section" "grep -q '## Verify' '$SKILL_DIR/SKILL.md'"
check "contains Execute section" "grep -q '## Execute' '$SKILL_DIR/SKILL.md'"
check "contains Pitfalls section" "grep -q '## Pitfalls' '$SKILL_DIR/SKILL.md'"
check "contains Related References section" "grep -q '## Related References' '$SKILL_DIR/SKILL.md'"

# ========================
# Dynamic injection
# ========================

check "contains at least 2 dynamic-injection commands" "[ $(grep -c '!\`' '$SKILL_DIR/SKILL.md') -ge 2 ]"

# ========================
# Pitfalls referencing reference/
# ========================

check "Pitfalls cites reference/ at least twice" "[ $(grep -A 100 '## Pitfalls' '$SKILL_DIR/SKILL.md' | grep -c 'reference/') -ge 2 ]"

# ========================
# reference directory structure
# ========================

check "reference has 01-architecture subdirectory" "test -d '$SKILL_DIR/reference/01-architecture'"
check "reference has 02-parser-engine subdirectory" "test -d '$SKILL_DIR/reference/02-parser-engine'"
check "reference has 03-dedup-engine subdirectory" "test -d '$SKILL_DIR/reference/03-dedup-engine'"
check "reference has 04-rule-engine subdirectory" "test -d '$SKILL_DIR/reference/04-rule-engine'"
check "reference has 05-database subdirectory" "test -d '$SKILL_DIR/reference/05-database'"
check "reference has 06-api-design subdirectory" "test -d '$SKILL_DIR/reference/06-api-design'"
check "reference has 07-export-engine subdirectory" "test -d '$SKILL_DIR/reference/07-export-engine'"
check "reference has 08-frontend subdirectory" "test -d '$SKILL_DIR/reference/08-frontend'"
check "reference has 09-nonfunctional subdirectory" "test -d '$SKILL_DIR/reference/09-nonfunctional'"

# ========================
# reference content
# ========================

check "reference files carry DDS-Section provenance" "grep -rq 'DDS-Section:' '$SKILL_DIR/reference/' 2>/dev/null"
check "reference files carry DDS-Lines provenance" "grep -rq 'DDS-Lines:' '$SKILL_DIR/reference/' 2>/dev/null"

# ========================
# Key reference files
# ========================

check "system-overview.md exists" "test -f '$SKILL_DIR/reference/01-architecture/system-overview.md'"
check "batch-state-machine.md exists" "test -f '$SKILL_DIR/reference/01-architecture/batch-state-machine.md'"
check "parser-interface.md exists" "test -f '$SKILL_DIR/reference/02-parser-engine/parser-interface.md'"
check "field-mappings.md exists" "test -f '$SKILL_DIR/reference/02-parser-engine/field-mappings.md'"
check "strict-dedup.md exists" "test -f '$SKILL_DIR/reference/03-dedup-engine/strict-dedup.md'"
check "fuzzy-dedup.md exists" "test -f '$SKILL_DIR/reference/03-dedup-engine/fuzzy-dedup.md'"
check "transfer-link.md exists" "test -f '$SKILL_DIR/reference/03-dedup-engine/transfer-link.md'"
check "fingerprint.md exists" "test -f '$SKILL_DIR/reference/03-dedup-engine/fingerprint.md'"
check "rule-conditions.md exists" "test -f '$SKILL_DIR/reference/04-rule-engine/rule-conditions.md'"
check "rule-execution.md exists" "test -f '$SKILL_DIR/reference/04-rule-engine/rule-execution.md'"
check "db-schema.md exists" "test -f '$SKILL_DIR/reference/05-database/db-schema.md'"
check "indexes.md exists" "test -f '$SKILL_DIR/reference/05-database/indexes.md'"
check "api-catalog.md exists" "test -f '$SKILL_DIR/reference/06-api-design/api-catalog.md'"
check "response-format.md exists" "test -f '$SKILL_DIR/reference/06-api-design/response-format.md'"
check "firefly-mapping.md exists" "test -f '$SKILL_DIR/reference/07-export-engine/firefly-mapping.md'"
check "import-validation.md exists" "test -f '$SKILL_DIR/reference/07-export-engine/import-validation.md'"

echo ""
echo "=== Result: $PASS PASS / $FAIL FAIL ==="
[ $FAIL -eq 0 ] && exit 0 || exit 1
141
.agents/skills/frontend-vue3-vuetify/SKILL.md
Normal file
@@ -0,0 +1,141 @@

---
name: frontend-vue3-vuetify
description: Build production-grade Vue 3 + TypeScript + Vuetify 3 interfaces with architectural rigor. Use when creating Vue components, pages, layouts, Pinia stores, or API modules. Enforces strict typing, Composition API patterns, Material Design 3 aesthetics, and bulletproof data handling.
---

This skill guides the construction of architecturally rigorous, type-safe, visually polished Vue 3 + Vuetify 3 code. Every component should meet the standard of a production codebase, the kind a senior engineer is proud of.

User input: $ARGUMENTS (component specs, page requirements, feature requests, or architecture questions)

## Architectural Thinking

Before writing any code, establish a clear picture:

- **Component identity**: Is this a Page, a Layout, a reusable Component, a Composable, a Store, or an API module? Each has its own patterns.
- **Data gravity**: Where does state live? Props flow down, events bubble up. Cross-component state goes in Pinia. Deep hierarchies use `provide/inject`.
- **Scroll strategy**: Which container owns scrolling? Never the body. It must be explicit and controllable.
- **Failure modes**: What happens when data is `null`? An empty array? A network timeout? Design the unhappy paths first.

**Key principle**: production code anticipates chaos. Type everything. Guard everything. Make everything degrade gracefully.

## Core Tenets

### TypeScript Absolutism
- `<script setup lang="ts">` is the only acceptable form
- `any` is banned; use `unknown` + type guards, generics, and utility types
- Every prop, emit, ref, and API response must wear a type
- Type definitions live in `@/types/`, organized by domain: `user.d.ts`, `order.d.ts`

### Composition API Purity
- `ref`, `reactive`, `computed`, `watchEffect`: master these four
- `shallowRef`, `readonly`, `toRaw`: know when to reach for the optimizations
- Lifecycle via `onMounted` and `onUnmounted`; never mix in the Options API
- Pinia stores: typed state, typed getters, typed actions, no exceptions

### Vuetify 3 + Material Design 3
- All UI goes through Vuetify components; no raw HTML for UI elements
- Always theme-aware: `rgb(var(--v-theme-surface))`, never `#ffffff`
- `useDisplay()` for responsive logic; breakpoints are first-class citizens
- Density matters: use `density="compact"` for data-dense interfaces

### Layout Philosophy
```
┌─────────────────────────────────┐
│ Toolbar (flex-shrink-0)         │
├─────────────────────────────────┤
│                                 │
│ Content area                    │
│ (flex-grow-1, overflow-y-auto)  │
│ (min-height: 0) ← critical!     │
│                                 │
├─────────────────────────────────┤
│ Footer (flex-shrink-0)          │
└─────────────────────────────────┘
```
- **No body scrolling**: the viewport is locked; content scrolls inside containers
- **Flexbox trap**: a `flex-grow-1` child must have `min-height: 0`
- **Sticky elements**: filter bars and table headers stay visible while scrolling

## Data Robustness Patterns

Treat all external data as untrusted:

```typescript
// defensive access
const userName = user?.profile?.name ?? 'Unknown'

// array safety check
const items = Array.isArray(response.data) ? response.data : []

// existence guards in templates
<template v-if="user">{{ user.name }}</template>
<v-empty-state v-else />
```

## The UI State Trinity

Every data-driven view must handle three states:

| State | Component | Forbidden behavior |
|------|------|----------|
| **Loading** | `v-skeleton-loader` | Showing stale data or a blank screen |
| **Empty** | `v-empty-state` + an action button | Leaving a blank void |
| **Error** | Snackbar + a retry option | Failing silently |

## Table and List Commandments

- Every `v-data-table` gets `fixed-header`; non-negotiable
- Truncated text must have a `v-tooltip`; users deserve to hover and see the full content
- 100+ rows? Use `v-virtual-scroll` so the DOM node count stays constant
- Column widths are explicit; no layout lottery

## Anti-Patterns (never allowed)

- `.js` files in a TypeScript project
- `any` without a justified reason
- Hardcoded colors: `color="#1976d2"` → should be `color="primary"`
- Body-level scrolling in an SPA layout
- Tables without fixed headers
- Truncated text without a tooltip
- Empty states that are literally empty
- Loading states that freeze the UI
- API calls without error handling

## Reference Files

Consult these for implementation details:

| Need | File |
|------|------|
| Advanced TypeScript patterns | `reference/typescript-rules.md` |
| Complex layout structures | `reference/layout-patterns.md` |
| API client architecture | `reference/api-patterns.md` |
| Tables, lists, forms, feedback | `reference/ui-interaction.md` |

## Project Structure

```
src/
├── api/         # Axios instance + modules
├── components/  # shared components
├── composables/ # reusable hooks
├── layouts/     # page shells
├── pages/       # routed views
├── plugins/     # Vuetify, Pinia, Router
├── store/       # Pinia stores
├── styles/      # global SCSS
├── types/       # type definitions
└── utils/       # pure functions
```

## Output Specification

1. State the architectural approach (2-3 sentences)
2. List the files to create and their purposes
3. Implement every file completely; no placeholders, no TODOs
4. Verify against the anti-pattern checklist
5. Call out any assumptions or trade-offs

---

Remember: you are not writing "code that runs." You are writing code that runs, scales, maintains, and delights. Every `ref` is typed. Every edge case is handled. Every loading state is beautiful. That is what "production-grade" means.
202
.agents/skills/skill-creator/LICENSE.txt
Normal file
@@ -0,0 +1,202 @@

                                 Apache License
                           Version 2.0, January 2004
                        http://www.apache.org/licenses/

   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION

   1. Definitions.

      "License" shall mean the terms and conditions for use, reproduction,
      and distribution as defined by Sections 1 through 9 of this document.

      "Licensor" shall mean the copyright owner or entity authorized by
      the copyright owner that is granting the License.

      "Legal Entity" shall mean the union of the acting entity and all
      other entities that control, are controlled by, or are under common
      control with that entity. For the purposes of this definition,
      "control" means (i) the power, direct or indirect, to cause the
      direction or management of such entity, whether by contract or
      otherwise, or (ii) ownership of fifty percent (50%) or more of the
      outstanding shares, or (iii) beneficial ownership of such entity.

      "You" (or "Your") shall mean an individual or Legal Entity
      exercising permissions granted by this License.

      "Source" form shall mean the preferred form for making modifications,
      including but not limited to software source code, documentation
      source, and configuration files.

      "Object" form shall mean any form resulting from mechanical
      transformation or translation of a Source form, including but
      not limited to compiled object code, generated documentation,
      and conversions to other media types.

      "Work" shall mean the work of authorship, whether in Source or
      Object form, made available under the License, as indicated by a
      copyright notice that is included in or attached to the work
      (an example is provided in the Appendix below).

      "Derivative Works" shall mean any work, whether in Source or Object
      form, that is based on (or derived from) the Work and for which the
      editorial revisions, annotations, elaborations, or other modifications
      represent, as a whole, an original work of authorship. For the purposes
      of this License, Derivative Works shall not include works that remain
      separable from, or merely link (or bind by name) to the interfaces of,
      the Work and Derivative Works thereof.

      "Contribution" shall mean any work of authorship, including
      the original version of the Work and any modifications or additions
      to that Work or Derivative Works thereof, that is intentionally
      submitted to Licensor for inclusion in the Work by the copyright owner
      or by an individual or Legal Entity authorized to submit on behalf of
      the copyright owner. For the purposes of this definition, "submitted"
      means any form of electronic, verbal, or written communication sent
      to the Licensor or its representatives, including but not limited to
      communication on electronic mailing lists, source code control systems,
      and issue tracking systems that are managed by, or on behalf of, the
      Licensor for the purpose of discussing and improving the Work, but
      excluding communication that is conspicuously marked or otherwise
      designated in writing by the copyright owner as "Not a Contribution."

      "Contributor" shall mean Licensor and any individual or Legal Entity
      on behalf of whom a Contribution has been received by Licensor and
      subsequently incorporated within the Work.

   2. Grant of Copyright License. Subject to the terms and conditions of
      this License, each Contributor hereby grants to You a perpetual,
      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
      copyright license to reproduce, prepare Derivative Works of,
      publicly display, publicly perform, sublicense, and distribute the
      Work and such Derivative Works in Source or Object form.

   3. Grant of Patent License. Subject to the terms and conditions of
      this License, each Contributor hereby grants to You a perpetual,
      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
      (except as stated in this section) patent license to make, have made,
      use, offer to sell, sell, import, and otherwise transfer the Work,
      where such license applies only to those patent claims licensable
      by such Contributor that are necessarily infringed by their
      Contribution(s) alone or by combination of their Contribution(s)
      with the Work to which such Contribution(s) was submitted. If You
      institute patent litigation against any entity (including a
      cross-claim or counterclaim in a lawsuit) alleging that the Work
      or a Contribution incorporated within the Work constitutes direct
      or contributory patent infringement, then any patent licenses
      granted to You under this License for that Work shall terminate
      as of the date such litigation is filed.

   4. Redistribution. You may reproduce and distribute copies of the
      Work or Derivative Works thereof in any medium, with or without
      modifications, and in Source or Object form, provided that You
      meet the following conditions:

      (a) You must give any other recipients of the Work or
          Derivative Works a copy of this License; and

      (b) You must cause any modified files to carry prominent notices
          stating that You changed the files; and

      (c) You must retain, in the Source form of any Derivative Works
          that You distribute, all copyright, patent, trademark, and
          attribution notices from the Source form of the Work,
          excluding those notices that do not pertain to any part of
          the Derivative Works; and

      (d) If the Work includes a "NOTICE" text file as part of its
          distribution, then any Derivative Works that You distribute must
          include a readable copy of the attribution notices contained
          within such NOTICE file, excluding those notices that do not
          pertain to any part of the Derivative Works, in at least one
          of the following places: within a NOTICE text file distributed
          as part of the Derivative Works; within the Source form or
          documentation, if provided along with the Derivative Works; or,
          within a display generated by the Derivative Works, if and
          wherever such third-party notices normally appear. The contents
          of the NOTICE file are for informational purposes only and
          do not modify the License. You may add Your own attribution
|
||||
notices within Derivative Works that You distribute, alongside
|
||||
or as an addendum to the NOTICE text from the Work, provided
|
||||
that such additional attribution notices cannot be construed
|
||||
as modifying the License.
|
||||
|
||||
You may add Your own copyright statement to Your modifications and
|
||||
may provide additional or different license terms and conditions
|
||||
for use, reproduction, or distribution of Your modifications, or
|
||||
for any such Derivative Works as a whole, provided Your use,
|
||||
reproduction, and distribution of the Work otherwise complies with
|
||||
the conditions stated in this License.
|
||||
|
||||
5. Submission of Contributions. Unless You explicitly state otherwise,
|
||||
any Contribution intentionally submitted for inclusion in the Work
|
||||
by You to the Licensor shall be under the terms and conditions of
|
||||
this License, without any additional terms or conditions.
|
||||
Notwithstanding the above, nothing herein shall supersede or modify
|
||||
the terms of any separate license agreement you may have executed
|
||||
with Licensor regarding such Contributions.
|
||||
|
||||
6. Trademarks. This License does not grant permission to use the trade
|
||||
names, trademarks, service marks, or product names of the Licensor,
|
||||
except as required for reasonable and customary use in describing the
|
||||
origin of the Work and reproducing the content of the NOTICE file.
|
||||
|
||||
7. Disclaimer of Warranty. Unless required by applicable law or
|
||||
agreed to in writing, Licensor provides the Work (and each
|
||||
Contributor provides its Contributions) on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
|
||||
implied, including, without limitation, any warranties or conditions
|
||||
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
|
||||
PARTICULAR PURPOSE. You are solely responsible for determining the
|
||||
appropriateness of using or redistributing the Work and assume any
|
||||
risks associated with Your exercise of permissions under this License.
|
||||
|
||||
8. Limitation of Liability. In no event and under no legal theory,
|
||||
whether in tort (including negligence), contract, or otherwise,
|
||||
unless required by applicable law (such as deliberate and grossly
|
||||
negligent acts) or agreed to in writing, shall any Contributor be
|
||||
liable to You for damages, including any direct, indirect, special,
|
||||
incidental, or consequential damages of any character arising as a
|
||||
result of this License or out of the use or inability to use the
|
||||
Work (including but not limited to damages for loss of goodwill,
|
||||
work stoppage, computer failure or malfunction, or any and all
|
||||
other commercial damages or losses), even if such Contributor
|
||||
has been advised of the possibility of such damages.
|
||||
|
||||
9. Accepting Warranty or Additional Liability. While redistributing
|
||||
the Work or Derivative Works thereof, You may choose to offer,
|
||||
and charge a fee for, acceptance of support, warranty, indemnity,
|
||||
or other liability obligations and/or rights consistent with this
|
||||
License. However, in accepting such obligations, You may act only
|
||||
on Your own behalf and on Your sole responsibility, not on behalf
|
||||
of any other Contributor, and only if You agree to indemnify,
|
||||
defend, and hold each Contributor harmless for any liability
|
||||
incurred by, or claims asserted against, such Contributor by reason
|
||||
of your accepting any such warranty or additional liability.
|
||||
|
||||
END OF TERMS AND CONDITIONS
|
||||
|
||||
APPENDIX: How to apply the Apache License to your work.
|
||||
|
||||
To apply the Apache License to your work, attach the following
|
||||
boilerplate notice, with the fields enclosed by brackets "[]"
|
||||
replaced with your own identifying information. (Don't include
|
||||
the brackets!) The text should be enclosed in the appropriate
|
||||
comment syntax for the file format. We also recommend that a
|
||||
file or class name and description of purpose be included on the
|
||||
same "printed page" as the copyright notice for easier
|
||||
identification within third-party archives.
|
||||
|
||||
Copyright [yyyy] [name of copyright owner]
|
||||
|
||||
Licensed under the Apache License, Version 2.0 (the "License");
|
||||
you may not use this file except in compliance with the License.
|
||||
You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
479
.agents/skills/skill-creator/SKILL.md
Normal file
@@ -0,0 +1,479 @@
---
name: skill-creator
description: Create new skills, modify and improve existing skills, and measure skill performance. Use when users want to create a skill from scratch, update or optimize an existing skill, run evals to test a skill, benchmark skill performance with variance analysis, or optimize a skill's description for better triggering accuracy.
---

# Skill Creator

A skill for creating new skills and iteratively improving them.

At a high level, the process of creating a skill goes like this:

- Decide what you want the skill to do and roughly how it should do it
- Write a draft of the skill
- Create a few test prompts and run claude-with-access-to-the-skill on them
- Help the user evaluate the results both qualitatively and quantitatively
- While the runs happen in the background, draft some quantitative evals if there aren't any (if there are some, you can either use as is or modify if you feel something needs to change about them). Then explain them to the user (or if they already existed, explain the ones that already exist)
- Use the `eval-viewer/generate_review.py` script to show the user the results for them to look at, and also let them look at the quantitative metrics
- Rewrite the skill based on feedback from the user's evaluation of the results (and also if there are any glaring flaws that become apparent from the quantitative benchmarks)
- Repeat until you're satisfied
- Expand the test set and try again at larger scale

Your job when using this skill is to figure out where the user is in this process and then jump in and help them progress through these stages. So for instance, maybe they're like "I want to make a skill for X". You can help narrow down what they mean, write a draft, write the test cases, figure out how they want to evaluate, run all the prompts, and repeat.

On the other hand, maybe they already have a draft of the skill. In this case you can go straight to the eval/iterate part of the loop.

Of course, you should always be flexible and if the user is like "I don't need to run a bunch of evaluations, just vibe with me", you can do that instead.

Then after the skill is done (but again, the order is flexible), you can also run the skill description improver, which we have a whole separate script for, to optimize the triggering of the skill.

Cool? Cool.

## Communicating with the user

The skill creator is liable to be used by people across a wide range of familiarity with coding jargon. If you haven't heard (and how could you, it's only very recently that it started), there's a trend now where the power of Claude is inspiring plumbers to open up their terminals, parents and grandparents to google "how to install npm". On the other hand, the bulk of users are probably fairly computer-literate.

So please pay attention to context cues to understand how to phrase your communication! In the default case, just to give you some idea:

- "evaluation" and "benchmark" are borderline, but OK
- for "JSON" and "assertion" you want to see serious cues from the user that they know what those things are before using them without explaining them

It's OK to briefly explain a term with a short definition if you're unsure the user will get it.

---

## Creating a skill

### Capture Intent

Start by understanding the user's intent. The current conversation might already contain a workflow the user wants to capture (e.g., they say "turn this into a skill"). If so, extract answers from the conversation history first — the tools used, the sequence of steps, corrections the user made, input/output formats observed. Ask the user to fill any gaps and to confirm before you proceed to the next step.

1. What should this skill enable Claude to do?
2. When should this skill trigger? (what user phrases/contexts)
3. What's the expected output format?
4. Should we set up test cases to verify the skill works? Skills with objectively verifiable outputs (file transforms, data extraction, code generation, fixed workflow steps) benefit from test cases. Skills with subjective outputs (writing style, art) often don't need them. Suggest the appropriate default based on the skill type, but let the user decide.

### Interview and Research

Proactively ask questions about edge cases, input/output formats, example files, success criteria, and dependencies. Wait to write test prompts until you've got this part ironed out.

Check available MCPs - if useful for research (searching docs, finding similar skills, looking up best practices), research in parallel via subagents if available, otherwise inline. Come prepared with context to reduce burden on the user.

### Write the SKILL.md

Based on the user interview, fill in these components:

- **name**: Skill identifier
- **description**: When to trigger, what it does. This is the primary triggering mechanism - include both what the skill does AND specific contexts for when to use it. All "when to use" info goes here, not in the body. Note: currently Claude has a tendency to "undertrigger" skills -- to not use them when they'd be useful. To combat this, please make the skill descriptions a little bit "pushy". So for instance, instead of "How to build a simple fast dashboard to display internal Anthropic data.", you might write "How to build a simple fast dashboard to display internal Anthropic data. Make sure to use this skill whenever the user mentions dashboards, data visualization, internal metrics, or wants to display any kind of company data, even if they don't explicitly ask for a 'dashboard.'"
- **compatibility**: Required tools, dependencies (optional, rarely needed)
- **the rest of the skill :)**

### Skill Writing Guide

#### Anatomy of a Skill

```
skill-name/
├── SKILL.md (required)
│   ├── YAML frontmatter (name, description required)
│   └── Markdown instructions
└── Bundled Resources (optional)
    ├── scripts/ - Executable code for deterministic/repetitive tasks
    ├── references/ - Docs loaded into context as needed
    └── assets/ - Files used in output (templates, icons, fonts)
```

#### Progressive Disclosure

Skills use a three-level loading system:
1. **Metadata** (name + description) - Always in context (~100 words)
2. **SKILL.md body** - In context whenever skill triggers (<500 lines ideal)
3. **Bundled resources** - As needed (unlimited, scripts can execute without loading)

These limits are approximate and you can feel free to go longer if needed.

**Key patterns:**
- Keep SKILL.md under 500 lines; if you're approaching this limit, add an additional layer of hierarchy along with clear pointers about where the model using the skill should go next to follow up.
- Reference files clearly from SKILL.md with guidance on when to read them
- For large reference files (>300 lines), include a table of contents

**Domain organization**: When a skill supports multiple domains/frameworks, organize by variant:
```
cloud-deploy/
├── SKILL.md (workflow + selection)
└── references/
    ├── aws.md
    ├── gcp.md
    └── azure.md
```
Claude reads only the relevant reference file.

#### Principle of Lack of Surprise

This goes without saying, but skills must not contain malware, exploit code, or any content that could compromise system security. A skill's contents should not surprise the user in their intent if described. Don't go along with requests to create misleading skills or skills designed to facilitate unauthorized access, data exfiltration, or other malicious activities. Things like a "roleplay as an XYZ" are OK though.

#### Writing Patterns

Prefer using the imperative form in instructions.

**Defining output formats** - You can do it like this:
```markdown
## Report structure
ALWAYS use this exact template:
# [Title]
## Executive summary
## Key findings
## Recommendations
```

**Examples pattern** - It's useful to include examples. You can format them like this (but if "Input" and "Output" are in the examples you might want to deviate a little):
```markdown
## Commit message format
**Example 1:**
Input: Added user authentication with JWT tokens
Output: feat(auth): implement JWT-based authentication
```

### Writing Style

Try to explain to the model why things are important in lieu of heavy-handed musty MUSTs. Use theory of mind and try to make the skill general and not super-narrow to specific examples. Start by writing a draft and then look at it with fresh eyes and improve it.

### Test Cases

After writing the skill draft, come up with 2-3 realistic test prompts — the kind of thing a real user would actually say. Share them with the user: [you don't have to use this exact language] "Here are a few test cases I'd like to try. Do these look right, or do you want to add more?" Then run them.

Save test cases to `evals/evals.json`. Don't write assertions yet — just the prompts. You'll draft assertions in the next step while the runs are in progress.

```json
{
  "skill_name": "example-skill",
  "evals": [
    {
      "id": 1,
      "prompt": "User's task prompt",
      "expected_output": "Description of expected result",
      "files": []
    }
  ]
}
```
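If you want to script against this file (for instance to spawn one run per prompt), a minimal loader sketch looks like this — the function name and tuple layout are illustrative, not part of the skill's contract:

```python
import json
from pathlib import Path

def load_eval_prompts(path="evals/evals.json"):
    """Read the eval file and return (id, prompt) pairs, one per test case."""
    data = json.loads(Path(path).read_text())
    return [(e["id"], e["prompt"]) for e in data["evals"]]
```
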

See `references/schemas.md` for the full schema (including the `assertions` field, which you'll add later).

## Running and evaluating test cases

This section is one continuous sequence — don't stop partway through. Do NOT use `/skill-test` or any other testing skill.

Put results in `<skill-name>-workspace/` as a sibling to the skill directory. Within the workspace, organize results by iteration (`iteration-1/`, `iteration-2/`, etc.) and within that, each test case gets a directory (`eval-0/`, `eval-1/`, etc.). Don't create all of this upfront — just create directories as you go.

### Step 1: Spawn all runs (with-skill AND baseline) in the same turn

For each test case, spawn two subagents in the same turn — one with the skill, one without. This is important: don't spawn the with-skill runs first and then come back for baselines later. Launch everything at once so it all finishes around the same time.

**With-skill run:**

```
Execute this task:
- Skill path: <path-to-skill>
- Task: <eval prompt>
- Input files: <eval files if any, or "none">
- Save outputs to: <workspace>/iteration-<N>/eval-<ID>/with_skill/outputs/
- Outputs to save: <what the user cares about — e.g., "the .docx file", "the final CSV">
```

**Baseline run** (same prompt, but the baseline depends on context):
- **Creating a new skill**: no skill at all. Same prompt, no skill path, save to `without_skill/outputs/`.
- **Improving an existing skill**: the old version. Before editing, snapshot the skill (`cp -r <skill-path> <workspace>/skill-snapshot/`), then point the baseline subagent at the snapshot. Save to `old_skill/outputs/`.

Write an `eval_metadata.json` for each test case (assertions can be empty for now). Give each eval a descriptive name based on what it's testing — not just "eval-0". Use this name for the directory too. If this iteration uses new or modified eval prompts, create these files for each new eval directory — don't assume they carry over from previous iterations.

```json
{
  "eval_id": 0,
  "eval_name": "descriptive-name-here",
  "prompt": "The user's task prompt",
  "assertions": []
}
```
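The conversion from a task notification into this file is mechanical, so a small helper keeps the three fields consistent. A minimal sketch (the helper name is illustrative; only the three JSON keys are fixed by this skill):

```python
import json
from pathlib import Path

def save_timing(run_dir, total_tokens, duration_ms):
    """Persist timing data from a completed task notification to timing.json
    in the given run directory, deriving the human-readable seconds field."""
    record = {
        "total_tokens": total_tokens,
        "duration_ms": duration_ms,
        "total_duration_seconds": round(duration_ms / 1000, 1),
    }
    (Path(run_dir) / "timing.json").write_text(json.dumps(record, indent=2))
    return record
```
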

This is the only opportunity to capture this data — it comes through the task notification and isn't persisted elsewhere. Process each notification as it arrives rather than trying to batch them.

### Step 4: Grade, aggregate, and launch the viewer

Once all runs are done:

1. **Grade each run** — spawn a grader subagent (or grade inline) that reads `agents/grader.md` and evaluates each assertion against the outputs. Save results to `grading.json` in each run directory. The grading.json expectations array must use the fields `text`, `passed`, and `evidence` (not `name`/`met`/`details` or other variants) — the viewer depends on these exact field names. For assertions that can be checked programmatically, write and run a script rather than eyeballing it — scripts are faster, more reliable, and can be reused across iterations.
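A minimal sketch of the write step, assuming your check script produces (text, passed, evidence) tuples — the helper name and tuple layout are illustrative, but the three field names are the ones the viewer requires:

```python
import json
from pathlib import Path

def write_grading(run_dir, results):
    """Write grading.json using the exact field names the viewer expects:
    each expectation carries `text`, `passed`, and `evidence`."""
    grading = {
        "expectations": [
            {"text": text, "passed": passed, "evidence": evidence}
            for text, passed, evidence in results
        ]
    }
    (Path(run_dir) / "grading.json").write_text(json.dumps(grading, indent=2))
    return grading
```
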

2. **Aggregate into benchmark** — run the aggregation script from the skill-creator directory:

   ```bash
   python -m scripts.aggregate_benchmark <workspace>/iteration-N --skill-name <name>
   ```

   This produces `benchmark.json` and `benchmark.md` with pass_rate, time, and tokens for each configuration, with mean ± stddev and the delta. If generating benchmark.json manually, see `references/schemas.md` for the exact schema the viewer expects.
   Put each with_skill version before its baseline counterpart.

3. **Do an analyst pass** — read the benchmark data and surface patterns the aggregate stats might hide. See `agents/analyzer.md` (the "Analyzing Benchmark Results" section) for what to look for — things like assertions that always pass regardless of skill (non-discriminating), high-variance evals (possibly flaky), and time/token tradeoffs.

4. **Launch the viewer** with both qualitative outputs and quantitative data:

   ```bash
   nohup python <skill-creator-path>/eval-viewer/generate_review.py \
     <workspace>/iteration-N \
     --skill-name "my-skill" \
     --benchmark <workspace>/iteration-N/benchmark.json \
     > /dev/null 2>&1 &
   VIEWER_PID=$!
   ```

   For iteration 2+, also pass `--previous-workspace <workspace>/iteration-<N-1>`.

   **Cowork / headless environments:** If `webbrowser.open()` is not available or the environment has no display, use `--static <output_path>` to write a standalone HTML file instead of starting a server. Feedback will be downloaded as a `feedback.json` file when the user clicks "Submit All Reviews". After download, copy `feedback.json` into the workspace directory for the next iteration to pick up.

   Note: please use generate_review.py to create the viewer; there's no need to write custom HTML.

5. **Tell the user** something like: "I've opened the results in your browser. There are two tabs — 'Outputs' lets you click through each test case and leave feedback, 'Benchmark' shows the quantitative comparison. When you're done, come back here and let me know."

### What the user sees in the viewer

The "Outputs" tab shows one test case at a time:
- **Prompt**: the task that was given
- **Output**: the files the skill produced, rendered inline where possible
- **Previous Output** (iteration 2+): collapsed section showing last iteration's output
- **Formal Grades** (if grading was run): collapsed section showing assertion pass/fail
- **Feedback**: a textbox that auto-saves as they type
- **Previous Feedback** (iteration 2+): their comments from last time, shown below the textbox

The "Benchmark" tab shows the stats summary: pass rates, timing, and token usage for each configuration, with per-eval breakdowns and analyst observations.

Navigation is via prev/next buttons or arrow keys. When done, they click "Submit All Reviews" which saves all feedback to `feedback.json`.

### Step 5: Read the feedback

When the user tells you they're done, read `feedback.json`:

```json
{
  "reviews": [
    {"run_id": "eval-0-with_skill", "feedback": "the chart is missing axis labels", "timestamp": "..."},
    {"run_id": "eval-1-with_skill", "feedback": "", "timestamp": "..."},
    {"run_id": "eval-2-with_skill", "feedback": "perfect, love this", "timestamp": "..."}
  ],
  "status": "complete"
}
```

Empty feedback means the user thought it was fine. Focus your improvements on the test cases where the user had specific complaints.
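Filtering down to the actionable reviews can be sketched like this (the helper name is illustrative; the `reviews`/`feedback` keys come from the file format above):

```python
import json
from pathlib import Path

def actionable_feedback(path="feedback.json"):
    """Return only the reviews with non-empty feedback; an empty feedback
    string means the user considered that output fine."""
    data = json.loads(Path(path).read_text())
    return [r for r in data["reviews"] if r["feedback"].strip()]
```
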
|
||||
|
||||
Kill the viewer server when you're done with it:
|
||||
|
||||
```bash
|
||||
kill $VIEWER_PID 2>/dev/null
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Improving the skill
|
||||
|
||||
This is the heart of the loop. You've run the test cases, the user has reviewed the results, and now you need to make the skill better based on their feedback.
|
||||
|
||||
### How to think about improvements
|
||||
|
||||
1. **Generalize from the feedback.** The big picture thing that's happening here is that we're trying to create skills that can be used a million times (maybe literally, maybe even more who knows) across many different prompts. Here you and the user are iterating on only a few examples over and over again because it helps move faster. The user knows these examples in and out and it's quick for them to assess new outputs. But if the skill you and the user are codeveloping works only for those examples, it's useless. Rather than put in fiddly overfitty changes, or oppressively constrictive MUSTs, if there's some stubborn issue, you might try branching out and using different metaphors, or recommending different patterns of working. It's relatively cheap to try and maybe you'll land on something great.
|
||||
|
||||
2. **Keep the prompt lean.** Remove things that aren't pulling their weight. Make sure to read the transcripts, not just the final outputs — if it looks like the skill is making the model waste a bunch of time doing things that are unproductive, you can try getting rid of the parts of the skill that are making it do that and seeing what happens.
|
||||
|
||||
3. **Explain the why.** Try hard to explain the **why** behind everything you're asking the model to do. Today's LLMs are *smart*. They have good theory of mind and when given a good harness can go beyond rote instructions and really make things happen. Even if the feedback from the user is terse or frustrated, try to actually understand the task and why the user is writing what they wrote, and what they actually wrote, and then transmit this understanding into the instructions. If you find yourself writing ALWAYS or NEVER in all caps, or using super rigid structures, that's a yellow flag — if possible, reframe and explain the reasoning so that the model understands why the thing you're asking for is important. That's a more humane, powerful, and effective approach.
|
||||
|
||||
4. **Look for repeated work across test cases.** Read the transcripts from the test runs and notice if the subagents all independently wrote similar helper scripts or took the same multi-step approach to something. If all 3 test cases resulted in the subagent writing a `create_docx.py` or a `build_chart.py`, that's a strong signal the skill should bundle that script. Write it once, put it in `scripts/`, and tell the skill to use it. This saves every future invocation from reinventing the wheel.
|
||||
|
||||
This task is pretty important (we are trying to create billions a year in economic value here!) and your thinking time is not the blocker; take your time and really mull things over. I'd suggest writing a draft revision and then looking at it anew and making improvements. Really do your best to get into the head of the user and understand what they want and need.
|
||||
|
||||
### The iteration loop
|
||||
|
||||
After improving the skill:
|
||||
|
||||
1. Apply your improvements to the skill
|
||||
2. Rerun all test cases into a new `iteration-<N+1>/` directory, including baseline runs. If you're creating a new skill, the baseline is always `without_skill` (no skill) — that stays the same across iterations. If you're improving an existing skill, use your judgment on what makes sense as the baseline: the original version the user came in with, or the previous iteration.
|
||||
3. Launch the reviewer with `--previous-workspace` pointing at the previous iteration
|
||||
4. Wait for the user to review and tell you they're done
|
||||
5. Read the new feedback, improve again, repeat
|
||||
|
||||
Keep going until:
|
||||
- The user says they're happy
|
||||
- The feedback is all empty (everything looks good)
|
||||
- You're not making meaningful progress
|
||||
|
||||
---
|
||||
|
||||
## Advanced: Blind comparison
|
||||
|
||||
For situations where you want a more rigorous comparison between two versions of a skill (e.g., the user asks "is the new version actually better?"), there's a blind comparison system. Read `agents/comparator.md` and `agents/analyzer.md` for the details. The basic idea is: give two outputs to an independent agent without telling it which is which, and let it judge quality. Then analyze why the winner won.
|
||||
|
||||
This is optional, requires subagents, and most users won't need it. The human review loop is usually sufficient.
|
||||
|
||||
---
|
||||
|
||||
## Description Optimization
|
||||
|
||||
The description field in SKILL.md frontmatter is the primary mechanism that determines whether Claude invokes a skill. After creating or improving a skill, offer to optimize the description for better triggering accuracy.
|
||||
|
||||
### Step 1: Generate trigger eval queries
|
||||
|
||||
Create 20 eval queries — a mix of should-trigger and should-not-trigger. Save as JSON:
|
||||
|
||||
```json
|
||||
[
|
||||
{"query": "the user prompt", "should_trigger": true},
|
||||
{"query": "another prompt", "should_trigger": false}
|
||||
]
|
||||
```
|
||||
|
||||
The queries must be realistic and something a Claude Code or Claude.ai user would actually type. Not abstract requests, but requests that are concrete and specific and have a good amount of detail. For instance, file paths, personal context about the user's job or situation, column names and values, company names, URLs. A little bit of backstory. Some might be in lowercase or contain abbreviations or typos or casual speech. Use a mix of different lengths, and focus on edge cases rather than making them clear-cut (the user will get a chance to sign off on them).
|
||||
|
||||

Bad: `"Format this data"`, `"Extract text from PDF"`, `"Create a chart"`

Good: `"ok so my boss just sent me this xlsx file (its in my downloads, called something like 'Q4 sales final FINAL v2.xlsx') and she wants me to add a column that shows the profit margin as a percentage. The revenue is in column C and costs are in column D i think"`

For the **should-trigger** queries (8-10), think about coverage. You want different phrasings of the same intent — some formal, some casual. Include cases where the user doesn't explicitly name the skill or file type but clearly needs it. Throw in some uncommon use cases and cases where this skill competes with another but should win.

For the **should-not-trigger** queries (8-10), the most valuable ones are the near-misses — queries that share keywords or concepts with the skill but actually need something different. Think adjacent domains, ambiguous phrasing where a naive keyword match would trigger but shouldn't, and cases where the query touches on something the skill does but in a context where another tool is more appropriate.

The key thing to avoid: don't make should-not-trigger queries obviously irrelevant. "Write a fibonacci function" as a negative test for a PDF skill is too easy — it doesn't test anything. The negative cases should be genuinely tricky.
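
For instance, a tricky positive/negative pair for a hypothetical PDF table-extraction skill might look like this (both queries are invented illustrations):

```json
[
  {"query": "can you pull the table on page 3 of ~/Desktop/lease_2024.pdf into a spreadsheet? the rent column keeps coming out garbled", "should_trigger": true},
  {"query": "my scanner saved all our receipts as pdfs but the CMS needs pngs, theres like 40 of them in ~/scans, batch convert them?", "should_trigger": false}
]
```

The second query mentions PDFs but is really an image-conversion task, so a well-tuned extraction description should not fire on it.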
### Step 2: Review with user
|
||||
|
||||
Present the eval set to the user for review using the HTML template:
|
||||
|
||||
1. Read the template from `assets/eval_review.html`
|
||||
2. Replace the placeholders:
|
||||
- `__EVAL_DATA_PLACEHOLDER__` → the JSON array of eval items (no quotes around it — it's a JS variable assignment)
|
||||
- `__SKILL_NAME_PLACEHOLDER__` → the skill's name
|
||||
- `__SKILL_DESCRIPTION_PLACEHOLDER__` → the skill's current description
|
||||
3. Write to a temp file (e.g., `/tmp/eval_review_<skill-name>.html`) and open it: `open /tmp/eval_review_<skill-name>.html`
|
||||
4. The user can edit queries, toggle should-trigger, add/remove entries, then click "Export Eval Set"
|
||||
5. The file downloads to `~/Downloads/eval_set.json` — check the Downloads folder for the most recent version in case there are multiple (e.g., `eval_set (1).json`)
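
Steps 1-3 amount to simple string substitution. A minimal sketch (the function name is illustrative; the placeholder names are the ones listed above):

```python
import json

def render_eval_review(template: str, eval_items: list, skill_name: str, description: str) -> str:
    """Fill the template's three placeholders. The eval data is spliced in as a
    raw JS value (json.dumps output, no surrounding quotes)."""
    return (template
            .replace("__EVAL_DATA_PLACEHOLDER__", json.dumps(eval_items))
            .replace("__SKILL_NAME_PLACEHOLDER__", skill_name)
            .replace("__SKILL_DESCRIPTION_PLACEHOLDER__", description))
```

Read `assets/eval_review.html`, pass its contents through this, write the result to the temp path, and open it.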

This step matters — bad eval queries lead to bad descriptions.

### Step 3: Run the optimization loop

Tell the user: "This will take some time — I'll run the optimization loop in the background and check on it periodically."

Save the eval set to the workspace, then run in the background:

```bash
python -m scripts.run_loop \
  --eval-set <path-to-trigger-eval.json> \
  --skill-path <path-to-skill> \
  --model <model-id-powering-this-session> \
  --max-iterations 5 \
  --verbose
```

Use the model ID from your system prompt (the one powering the current session) so the triggering test matches what the user actually experiences.

While it runs, periodically tail the output to give the user updates on which iteration it's on and what the scores look like.

This handles the full optimization loop automatically. It splits the eval set into 60% train and 40% held-out test, evaluates the current description (running each query 3 times to get a reliable trigger rate), then calls Claude with extended thinking to propose improvements based on what failed. It re-evaluates each new description on both train and test, iterating up to 5 times. When it's done, it opens an HTML report in the browser showing the results per iteration and returns JSON with `best_description` — selected by test score rather than train score to avoid overfitting.
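
The core of that loop can be sketched in a few lines (a simplified illustration, not the actual `scripts/run_loop.py` internals; all names here are made up):

```python
import random

def split_eval_set(evals: list, train_frac: float = 0.6, seed: int = 0):
    """Shuffle and split queries into train (60%) and held-out test (40%)."""
    shuffled = evals[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_frac)
    return shuffled[:cut], shuffled[cut:]

def trigger_accuracy(results: list) -> float:
    """results: (should_trigger, did_trigger) pairs, one per run (3 runs per query)."""
    return sum(should == did for should, did in results) / len(results)

def pick_best(iterations: list) -> dict:
    """Select the description by held-out test score, not train score, to avoid overfitting."""
    return max(iterations, key=lambda it: it["test_score"])
```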

### How skill triggering works

Understanding the triggering mechanism helps design better eval queries. Skills appear in Claude's `available_skills` list with their name + description, and Claude decides whether to consult a skill based on that description. The important thing to know is that Claude only consults skills for tasks it can't easily handle on its own — simple, one-step queries like "read this PDF" may not trigger a skill even if the description matches perfectly, because Claude can handle them directly with basic tools. Complex, multi-step, or specialized queries reliably trigger skills when the description matches.

This means your eval queries should be substantive enough that Claude would actually benefit from consulting a skill. Simple queries like "read file X" are poor test cases — they won't trigger skills regardless of description quality.

### Step 4: Apply the result

Take `best_description` from the JSON output and update the skill's SKILL.md frontmatter. Show the user before/after and report the scores.

---

### Package and Present (only if `present_files` tool is available)

Check whether you have access to the `present_files` tool. If you don't, skip this step. If you do, package the skill and present the .skill file to the user:

```bash
python -m scripts.package_skill <path/to/skill-folder>
```

After packaging, direct the user to the resulting `.skill` file path so they can install it.

---

## Claude.ai-specific instructions

In Claude.ai, the core workflow is the same (draft → test → review → improve → repeat), but because Claude.ai doesn't have subagents, some mechanics change. Here's what to adapt:

**Running test cases**: No subagents means no parallel execution. For each test case, read the skill's SKILL.md, then follow its instructions to accomplish the test prompt yourself. Do them one at a time. This is less rigorous than independent subagents (you wrote the skill and you're also running it, so you have full context), but it's a useful sanity check — and the human review step compensates. Skip the baseline runs — just use the skill to complete the task as requested.

**Reviewing results**: If you can't open a browser (e.g., Claude.ai's VM has no display, or you're on a remote server), skip the browser reviewer entirely. Instead, present results directly in the conversation. For each test case, show the prompt and the output. If the output is a file the user needs to see (like a .docx or .xlsx), save it to the filesystem and tell them where it is so they can download and inspect it. Ask for feedback inline: "How does this look? Anything you'd change?"

**Benchmarking**: Skip the quantitative benchmarking — it relies on baseline comparisons which aren't meaningful without subagents. Focus on qualitative feedback from the user.

**The iteration loop**: Same as before — improve the skill, rerun the test cases, ask for feedback — just without the browser reviewer in the middle. You can still organize results into iteration directories on the filesystem if you have one.

**Description optimization**: This section requires the `claude` CLI tool (specifically `claude -p`) which is only available in Claude Code. Skip it if you're on Claude.ai.

**Blind comparison**: Requires subagents. Skip it.

**Packaging**: The `package_skill.py` script works anywhere with Python and a filesystem. On Claude.ai, you can run it and the user can download the resulting `.skill` file.

---

## Cowork-Specific Instructions

If you're in Cowork, the main things to know are:

- You have subagents, so the main workflow (spawn test cases in parallel, run baselines, grade, etc.) all works. (However, if you run into severe problems with timeouts, it's OK to run the test prompts in series rather than parallel.)
- You don't have a browser or display, so when generating the eval viewer, use `--static <output_path>` to write a standalone HTML file instead of starting a server. Then proffer a link that the user can click to open the HTML in their browser.
- For whatever reason, the Cowork setup seems to discourage Claude from generating the eval viewer after running the tests, so just to reiterate: whether you're in Cowork or in Claude Code, after running tests, you should always generate the eval viewer for the human to look at examples before revising the skill yourself and trying to make corrections, using `generate_review.py` (not writing your own boutique html code). Sorry in advance but I'm gonna go all caps here: GENERATE THE EVAL VIEWER *BEFORE* evaluating inputs yourself. You want to get them in front of the human ASAP!
- Feedback works differently: since there's no running server, the viewer's "Submit All Reviews" button will download `feedback.json` as a file. You can then read it from there (you may have to request access first).
- Packaging works — `package_skill.py` just needs Python and a filesystem.
- Description optimization (`run_loop.py` / `run_eval.py`) should work in Cowork just fine since it uses `claude -p` via subprocess, not a browser, but please save it until you've fully finished making the skill and the user agrees it's in good shape.

---

## Reference files

The agents/ directory contains instructions for specialized subagents. Read them when you need to spawn the relevant subagent.

- `agents/grader.md` — How to evaluate assertions against outputs
- `agents/comparator.md` — How to do blind A/B comparison between two outputs
- `agents/analyzer.md` — How to analyze why one version beat another

The references/ directory has additional documentation:

- `references/schemas.md` — JSON structures for evals.json, grading.json, etc.

---

Repeating one more time the core loop here for emphasis:

- Figure out what the skill is about
- Draft or edit the skill
- Run claude-with-access-to-the-skill on test prompts
- With the user, evaluate the outputs:
  - Create benchmark.json and run `eval-viewer/generate_review.py` to help the user review them
  - Run quantitative evals
- Repeat until you and the user are satisfied
- Package the final skill and return it to the user.

Please add steps to your TodoList, if you have such a thing, to make sure you don't forget. If you're in Cowork, please specifically put "Create evals JSON and run `eval-viewer/generate_review.py` so human can review test cases" in your TodoList to make sure it happens.

Good luck!

---

`.agents/skills/skill-creator/agents/analyzer.md` (new file, 274 lines)

# Post-hoc Analyzer Agent

Analyze blind comparison results to understand WHY the winner won and generate improvement suggestions.

## Role

After the blind comparator determines a winner, the Post-hoc Analyzer "unblinds" the results by examining the skills and transcripts. The goal is to extract actionable insights: what made the winner better, and how can the loser be improved?

## Inputs

You receive these parameters in your prompt:

- **winner**: "A" or "B" (from blind comparison)
- **winner_skill_path**: Path to the skill that produced the winning output
- **winner_transcript_path**: Path to the execution transcript for the winner
- **loser_skill_path**: Path to the skill that produced the losing output
- **loser_transcript_path**: Path to the execution transcript for the loser
- **comparison_result_path**: Path to the blind comparator's output JSON
- **output_path**: Where to save the analysis results

## Process

### Step 1: Read Comparison Result

1. Read the blind comparator's output at comparison_result_path
2. Note the winning side (A or B), the reasoning, and any scores
3. Understand what the comparator valued in the winning output

### Step 2: Read Both Skills

1. Read the winner skill's SKILL.md and key referenced files
2. Read the loser skill's SKILL.md and key referenced files
3. Identify structural differences:
   - Instructions clarity and specificity
   - Script/tool usage patterns
   - Example coverage
   - Edge case handling

### Step 3: Read Both Transcripts

1. Read the winner's transcript
2. Read the loser's transcript
3. Compare execution patterns:
   - How closely did each follow their skill's instructions?
   - What tools were used differently?
   - Where did the loser diverge from optimal behavior?
   - Did either encounter errors or make recovery attempts?

### Step 4: Analyze Instruction Following

For each transcript, evaluate:

- Did the agent follow the skill's explicit instructions?
- Did the agent use the skill's provided tools/scripts?
- Were there missed opportunities to leverage skill content?
- Did the agent add unnecessary steps not in the skill?

Score instruction following 1-10 and note specific issues.

### Step 5: Identify Winner Strengths

Determine what made the winner better:

- Clearer instructions that led to better behavior?
- Better scripts/tools that produced better output?
- More comprehensive examples that guided edge cases?
- Better error handling guidance?

Be specific. Quote from skills/transcripts where relevant.

### Step 6: Identify Loser Weaknesses

Determine what held the loser back:

- Ambiguous instructions that led to suboptimal choices?
- Missing tools/scripts that forced workarounds?
- Gaps in edge case coverage?
- Poor error handling that caused failures?

### Step 7: Generate Improvement Suggestions

Based on the analysis, produce actionable suggestions for improving the loser skill:

- Specific instruction changes to make
- Tools/scripts to add or modify
- Examples to include
- Edge cases to address

Prioritize by impact. Focus on changes that would have changed the outcome.

### Step 8: Write Analysis Results

Save structured analysis to `{output_path}`.

## Output Format

Write a JSON file with this structure:

```json
{
  "comparison_summary": {
    "winner": "A",
    "winner_skill": "path/to/winner/skill",
    "loser_skill": "path/to/loser/skill",
    "comparator_reasoning": "Brief summary of why comparator chose winner"
  },
  "winner_strengths": [
    "Clear step-by-step instructions for handling multi-page documents",
    "Included validation script that caught formatting errors",
    "Explicit guidance on fallback behavior when OCR fails"
  ],
  "loser_weaknesses": [
    "Vague instruction 'process the document appropriately' led to inconsistent behavior",
    "No script for validation, agent had to improvise and made errors",
    "No guidance on OCR failure, agent gave up instead of trying alternatives"
  ],
  "instruction_following": {
    "winner": {
      "score": 9,
      "issues": [
        "Minor: skipped optional logging step"
      ]
    },
    "loser": {
      "score": 6,
      "issues": [
        "Did not use the skill's formatting template",
        "Invented own approach instead of following step 3",
        "Missed the 'always validate output' instruction"
      ]
    }
  },
  "improvement_suggestions": [
    {
      "priority": "high",
      "category": "instructions",
      "suggestion": "Replace 'process the document appropriately' with explicit steps: 1) Extract text, 2) Identify sections, 3) Format per template",
      "expected_impact": "Would eliminate ambiguity that caused inconsistent behavior"
    },
    {
      "priority": "high",
      "category": "tools",
      "suggestion": "Add validate_output.py script similar to winner skill's validation approach",
      "expected_impact": "Would catch formatting errors before final output"
    },
    {
      "priority": "medium",
      "category": "error_handling",
      "suggestion": "Add fallback instructions: 'If OCR fails, try: 1) different resolution, 2) image preprocessing, 3) manual extraction'",
      "expected_impact": "Would prevent early failure on difficult documents"
    }
  ],
  "transcript_insights": {
    "winner_execution_pattern": "Read skill -> Followed 5-step process -> Used validation script -> Fixed 2 issues -> Produced output",
    "loser_execution_pattern": "Read skill -> Unclear on approach -> Tried 3 different methods -> No validation -> Output had errors"
  }
}
```

## Guidelines

- **Be specific**: Quote from skills and transcripts, don't just say "instructions were unclear"
- **Be actionable**: Suggestions should be concrete changes, not vague advice
- **Focus on skill improvements**: The goal is to improve the losing skill, not critique the agent
- **Prioritize by impact**: Which changes would most likely have changed the outcome?
- **Consider causation**: Did the skill weakness actually cause the worse output, or is it incidental?
- **Stay objective**: Analyze what happened, don't editorialize
- **Think about generalization**: Would this improvement help on other evals too?

## Categories for Suggestions

Use these categories to organize improvement suggestions:

| Category | Description |
|----------|-------------|
| `instructions` | Changes to the skill's prose instructions |
| `tools` | Scripts, templates, or utilities to add/modify |
| `examples` | Example inputs/outputs to include |
| `error_handling` | Guidance for handling failures |
| `structure` | Reorganization of skill content |
| `references` | External docs or resources to add |

## Priority Levels

- **high**: Would likely change the outcome of this comparison
- **medium**: Would improve quality but may not change win/loss
- **low**: Nice to have, marginal improvement

---

# Analyzing Benchmark Results

When analyzing benchmark results, the analyzer's purpose is to **surface patterns and anomalies** across multiple runs, not suggest skill improvements.

## Role

Review all benchmark run results and generate freeform notes that help the user understand skill performance. Focus on patterns that wouldn't be visible from aggregate metrics alone.

## Inputs

You receive these parameters in your prompt:

- **benchmark_data_path**: Path to the in-progress benchmark.json with all run results
- **skill_path**: Path to the skill being benchmarked
- **output_path**: Where to save the notes (as JSON array of strings)

## Process

### Step 1: Read Benchmark Data

1. Read the benchmark.json containing all run results
2. Note the configurations tested (with_skill, without_skill)
3. Understand the run_summary aggregates already calculated

### Step 2: Analyze Per-Assertion Patterns

For each expectation across all runs:

- Does it **always pass** in both configurations? (may not differentiate skill value)
- Does it **always fail** in both configurations? (may be broken or beyond capability)
- Does it **always pass with skill but fail without**? (skill clearly adds value here)
- Does it **always fail with skill but pass without**? (skill may be hurting)
- Is it **highly variable**? (flaky expectation or non-deterministic behavior)
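
These buckets reduce to a small decision over per-run pass/fail lists. A sketch, assuming boolean results per configuration (the function name and labels are illustrative, not part of any schema):

```python
def classify_assertion(with_skill: list, without_skill: list) -> str:
    """Bucket one assertion's results across runs (True = passed that run)."""
    if all(with_skill) and all(without_skill):
        return "always passes"        # may not differentiate skill value
    if not any(with_skill) and not any(without_skill):
        return "always fails"         # broken or beyond capability
    if all(with_skill) and not any(without_skill):
        return "skill adds value"
    if not any(with_skill) and all(without_skill):
        return "skill may be hurting"
    return "variable"                 # flaky or non-deterministic
```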

### Step 3: Analyze Cross-Eval Patterns

Look for patterns across evals:

- Are certain eval types consistently harder/easier?
- Do some evals show high variance while others are stable?
- Are there surprising results that contradict expectations?

### Step 4: Analyze Metrics Patterns

Look at time_seconds, tokens, tool_calls:

- Does the skill significantly increase execution time?
- Is there high variance in resource usage?
- Are there outlier runs that skew the aggregates?

### Step 5: Generate Notes

Write freeform observations as a list of strings. Each note should:

- State a specific observation
- Be grounded in the data (not speculation)
- Help the user understand something the aggregate metrics don't show

Examples:

- "Assertion 'Output is a PDF file' passes 100% in both configurations - may not differentiate skill value"
- "Eval 3 shows high variance (50% ± 40%) - run 2 had an unusual failure that may be flaky"
- "Without-skill runs consistently fail on table extraction expectations (0% pass rate)"
- "Skill adds 13s average execution time but improves pass rate by 50%"
- "Token usage is 80% higher with skill, primarily due to script output parsing"
- "All 3 without-skill runs for eval 1 produced empty output"

### Step 6: Write Notes

Save notes to `{output_path}` as a JSON array of strings:

```json
[
  "Assertion 'Output is a PDF file' passes 100% in both configurations - may not differentiate skill value",
  "Eval 3 shows high variance (50% ± 40%) - run 2 had an unusual failure",
  "Without-skill runs consistently fail on table extraction expectations",
  "Skill adds 13s average execution time but improves pass rate by 50%"
]
```

## Guidelines

**DO:**

- Report what you observe in the data
- Be specific about which evals, expectations, or runs you're referring to
- Note patterns that aggregate metrics would hide
- Provide context that helps interpret the numbers

**DO NOT:**

- Suggest improvements to the skill (that's for the improvement step, not benchmarking)
- Make subjective quality judgments ("the output was good/bad")
- Speculate about causes without evidence
- Repeat information already in the run_summary aggregates

---

`.agents/skills/skill-creator/agents/comparator.md` (new file, 202 lines)

# Blind Comparator Agent

Compare two outputs WITHOUT knowing which skill produced them.

## Role

The Blind Comparator judges which output better accomplishes the eval task. You receive two outputs labeled A and B, but you do NOT know which skill produced which. This prevents bias toward a particular skill or approach.

Your judgment is based purely on output quality and task completion.

## Inputs

You receive these parameters in your prompt:

- **output_a_path**: Path to the first output file or directory
- **output_b_path**: Path to the second output file or directory
- **eval_prompt**: The original task/prompt that was executed
- **expectations**: List of expectations to check (optional - may be empty)

## Process

### Step 1: Read Both Outputs

1. Examine output A (file or directory)
2. Examine output B (file or directory)
3. Note the type, structure, and content of each
4. If outputs are directories, examine all relevant files inside

### Step 2: Understand the Task

1. Read the eval_prompt carefully
2. Identify what the task requires:
   - What should be produced?
   - What qualities matter (accuracy, completeness, format)?
   - What would distinguish a good output from a poor one?

### Step 3: Generate Evaluation Rubric

Based on the task, generate a rubric with two dimensions:

**Content Rubric** (what the output contains):

| Criterion | 1 (Poor) | 3 (Acceptable) | 5 (Excellent) |
|-----------|----------|----------------|---------------|
| Correctness | Major errors | Minor errors | Fully correct |
| Completeness | Missing key elements | Mostly complete | All elements present |
| Accuracy | Significant inaccuracies | Minor inaccuracies | Accurate throughout |

**Structure Rubric** (how the output is organized):

| Criterion | 1 (Poor) | 3 (Acceptable) | 5 (Excellent) |
|-----------|----------|----------------|---------------|
| Organization | Disorganized | Reasonably organized | Clear, logical structure |
| Formatting | Inconsistent/broken | Mostly consistent | Professional, polished |
| Usability | Difficult to use | Usable with effort | Easy to use |

Adapt criteria to the specific task. For example:

- PDF form → "Field alignment", "Text readability", "Data placement"
- Document → "Section structure", "Heading hierarchy", "Paragraph flow"
- Data output → "Schema correctness", "Data types", "Completeness"

### Step 4: Evaluate Each Output Against the Rubric

For each output (A and B):

1. **Score each criterion** on the rubric (1-5 scale)
2. **Calculate dimension totals**: Content score, Structure score
3. **Calculate overall score**: Average of dimension scores, scaled to 1-10
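
Concretely, the arithmetic is: average each dimension's 1-5 criterion scores, then double the mean of the two dimensions to land on a 1-10 scale. A sketch (function names are illustrative):

```python
def dimension_score(criteria: dict) -> float:
    """Average the 1-5 criterion scores for one dimension, to one decimal."""
    return round(sum(criteria.values()) / len(criteria), 1)

def overall_score(content: dict, structure: dict) -> float:
    """Average the two dimension scores (still 1-5), then scale to 1-10 by doubling."""
    c = dimension_score(content)
    s = dimension_score(structure)
    return round((c + s) / 2 * 2, 1)
```

This reproduces the example numbers in the Output Format section: content scores of 5, 5, 4 average to 4.7, structure scores of 4, 5, 4 to 4.3, for an overall 9.0.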

### Step 5: Check Assertions (if provided)

If expectations are provided:

1. Check each expectation against output A
2. Check each expectation against output B
3. Count pass rates for each output
4. Use expectation scores as secondary evidence (not the primary decision factor)

### Step 6: Determine the Winner

Compare A and B based on (in priority order):

1. **Primary**: Overall rubric score (content + structure)
2. **Secondary**: Assertion pass rates (if applicable)
3. **Tiebreaker**: If truly equal, declare a TIE

Be decisive - ties should be rare. One output is usually better, even if marginally.

### Step 7: Write Comparison Results

Save results to a JSON file at the path specified (or `comparison.json` if not specified).

## Output Format

Write a JSON file with this structure:

```json
{
  "winner": "A",
  "reasoning": "Output A provides a complete solution with proper formatting and all required fields. Output B is missing the date field and has formatting inconsistencies.",
  "rubric": {
    "A": {
      "content": {
        "correctness": 5,
        "completeness": 5,
        "accuracy": 4
      },
      "structure": {
        "organization": 4,
        "formatting": 5,
        "usability": 4
      },
      "content_score": 4.7,
      "structure_score": 4.3,
      "overall_score": 9.0
    },
    "B": {
      "content": {
        "correctness": 3,
        "completeness": 2,
        "accuracy": 3
      },
      "structure": {
        "organization": 3,
        "formatting": 2,
        "usability": 3
      },
      "content_score": 2.7,
      "structure_score": 2.7,
      "overall_score": 5.4
    }
  },
  "output_quality": {
    "A": {
      "score": 9,
      "strengths": ["Complete solution", "Well-formatted", "All fields present"],
      "weaknesses": ["Minor style inconsistency in header"]
    },
    "B": {
      "score": 5,
      "strengths": ["Readable output", "Correct basic structure"],
      "weaknesses": ["Missing date field", "Formatting inconsistencies", "Partial data extraction"]
    }
  },
  "expectation_results": {
    "A": {
      "passed": 4,
      "total": 5,
      "pass_rate": 0.80,
      "details": [
        {"text": "Output includes name", "passed": true},
        {"text": "Output includes date", "passed": true},
        {"text": "Format is PDF", "passed": true},
        {"text": "Contains signature", "passed": false},
        {"text": "Readable text", "passed": true}
      ]
    },
    "B": {
      "passed": 3,
      "total": 5,
      "pass_rate": 0.60,
      "details": [
        {"text": "Output includes name", "passed": true},
        {"text": "Output includes date", "passed": false},
        {"text": "Format is PDF", "passed": true},
        {"text": "Contains signature", "passed": false},
        {"text": "Readable text", "passed": true}
      ]
    }
  }
}
```

If no expectations were provided, omit the `expectation_results` field entirely.

## Field Descriptions

- **winner**: "A", "B", or "TIE"
- **reasoning**: Clear explanation of why the winner was chosen (or why it's a tie)
- **rubric**: Structured rubric evaluation for each output
  - **content**: Scores for content criteria (correctness, completeness, accuracy)
  - **structure**: Scores for structure criteria (organization, formatting, usability)
  - **content_score**: Average of content criteria (1-5)
  - **structure_score**: Average of structure criteria (1-5)
  - **overall_score**: Combined score scaled to 1-10
- **output_quality**: Summary quality assessment
  - **score**: 1-10 rating (should match rubric overall_score)
  - **strengths**: List of positive aspects
  - **weaknesses**: List of issues or shortcomings
- **expectation_results**: (Only if expectations provided)
  - **passed**: Number of expectations that passed
  - **total**: Total number of expectations
  - **pass_rate**: Fraction passed (0.0 to 1.0)
  - **details**: Individual expectation results

## Guidelines

- **Stay blind**: DO NOT try to infer which skill produced which output. Judge purely on output quality.
- **Be specific**: Cite specific examples when explaining strengths and weaknesses.
- **Be decisive**: Choose a winner unless outputs are genuinely equivalent.
- **Output quality first**: Assertion scores are secondary to overall task completion.
- **Be objective**: Don't favor outputs based on style preferences; focus on correctness and completeness.
- **Explain your reasoning**: The reasoning field should make it clear why you chose the winner.
- **Handle edge cases**: If both outputs fail, pick the one that fails less badly. If both are excellent, pick the one that's marginally better.

---

`.agents/skills/skill-creator/agents/grader.md` (new file, 223 lines)

# Grader Agent

Evaluate expectations against an execution transcript and outputs.

## Role

The Grader reviews a transcript and output files, then determines whether each expectation passes or fails. Provide clear evidence for each judgment.

You have two jobs: grade the outputs, and critique the evals themselves. A passing grade on a weak assertion is worse than useless — it creates false confidence. When you notice an assertion that's trivially satisfied, or an important outcome that no assertion checks, say so.

## Inputs

You receive these parameters in your prompt:

- **expectations**: List of expectations to evaluate (strings)
- **transcript_path**: Path to the execution transcript (markdown file)
- **outputs_dir**: Directory containing output files from execution

## Process

### Step 1: Read the Transcript

1. Read the transcript file completely
2. Note the eval prompt, execution steps, and final result
3. Identify any issues or errors documented

### Step 2: Examine Output Files

1. List files in outputs_dir
2. Read/examine each file relevant to the expectations. If outputs aren't plain text, use the inspection tools provided in your prompt — don't rely solely on what the transcript says the executor produced.
3. Note contents, structure, and quality
### Step 3: Evaluate Each Assertion
|
||||
|
||||
For each expectation:
|
||||
|
||||
1. **Search for evidence** in the transcript and outputs
|
||||
2. **Determine verdict**:
|
||||
- **PASS**: Clear evidence the expectation is true AND the evidence reflects genuine task completion, not just surface-level compliance
|
||||
- **FAIL**: No evidence, or evidence contradicts the expectation, or the evidence is superficial (e.g., correct filename but empty/wrong content)
|
||||
3. **Cite the evidence**: Quote the specific text or describe what you found
|
||||
|
||||
### Step 4: Extract and Verify Claims
|
||||
|
||||
Beyond the predefined expectations, extract implicit claims from the outputs and verify them:
|
||||
|
||||
1. **Extract claims** from the transcript and outputs:
|
||||
- Factual statements ("The form has 12 fields")
|
||||
- Process claims ("Used pypdf to fill the form")
|
||||
- Quality claims ("All fields were filled correctly")
|
||||
|
||||
2. **Verify each claim**:
|
||||
- **Factual claims**: Can be checked against the outputs or external sources
|
||||
- **Process claims**: Can be verified from the transcript
|
||||
- **Quality claims**: Evaluate whether the claim is justified
|
||||
|
||||
3. **Flag unverifiable claims**: Note claims that cannot be verified with available information
|
||||
|
||||
This catches issues that predefined expectations might miss.
|
||||
|
||||
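The claim-verification loop above can be sketched as a small data model. This is a minimal, illustrative sketch — the `Claim` record mirrors the `claims` entries in the grading output below, but `verify_claim` and its substring heuristic are assumptions for illustration, not part of the grader's contract:

```python
from dataclasses import dataclass
from typing import Literal

ClaimType = Literal["factual", "process", "quality"]


@dataclass
class Claim:
    claim: str
    type: ClaimType
    verified: bool
    evidence: str


def verify_claim(text: str, claim_type: ClaimType, transcript: str, outputs: str) -> Claim:
    """Naive check: process claims are matched against the transcript,
    factual/quality claims against the text of the output files."""
    haystack = transcript if claim_type == "process" else outputs
    found = text.lower() in haystack.lower()
    where = "transcript" if claim_type == "process" else "outputs"
    return Claim(
        claim=text,
        type=claim_type,
        verified=found,
        evidence=f"{'Found' if found else 'Not found'} in {where}",
    )
```

For example, `verify_claim("used pypdf to fill the form", "process", transcript_text, "")` marks the claim verified only if the phrase literally appears in the transcript; a real grader reads the evidence rather than matching substrings.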
### Step 5: Read User Notes

If `{outputs_dir}/user_notes.md` exists:

1. Read it and note any uncertainties or issues flagged by the executor
2. Include relevant concerns in the grading output
3. These may reveal problems even when expectations pass

### Step 6: Critique the Evals

After grading, consider whether the evals themselves could be improved. Only surface suggestions when there's a clear gap.

Good suggestions test meaningful outcomes — assertions that are hard to satisfy without actually doing the work correctly. Think about what makes an assertion *discriminating*: it passes when the skill genuinely succeeds and fails when it doesn't.

Suggestions worth raising:

- An assertion that passed but would also pass for a clearly wrong output (e.g., checking filename existence but not file content)
- An important outcome you observed — good or bad — that no assertion covers at all
- An assertion that can't actually be verified from the available outputs

Keep the bar high. The goal is to flag things the eval author would say "good catch" about, not to nitpick every assertion.

### Step 7: Write Grading Results

Save results to `{outputs_dir}/../grading.json` (sibling to outputs_dir).

## Grading Criteria

**PASS when**:

- The transcript or outputs clearly demonstrate the expectation is true
- Specific evidence can be cited
- The evidence reflects genuine substance, not just surface compliance (e.g., a file exists AND contains correct content, not just the right filename)

**FAIL when**:

- No evidence found for the expectation
- Evidence contradicts the expectation
- The expectation cannot be verified from available information
- The evidence is superficial — the assertion is technically satisfied but the underlying task outcome is wrong or incomplete
- The output appears to meet the assertion by coincidence rather than by actually doing the work

**When uncertain**: The burden of proof to pass is on the expectation.

### Step 8: Read Executor Metrics and Timing

1. If `{outputs_dir}/metrics.json` exists, read it and include in grading output
2. If `{outputs_dir}/../timing.json` exists, read it and include timing data

## Output Format

Write a JSON file with this structure:

```json
{
  "expectations": [
    {
      "text": "The output includes the name 'John Smith'",
      "passed": true,
      "evidence": "Found in transcript Step 3: 'Extracted names: John Smith, Sarah Johnson'"
    },
    {
      "text": "The spreadsheet has a SUM formula in cell B10",
      "passed": false,
      "evidence": "No spreadsheet was created. The output was a text file."
    },
    {
      "text": "The assistant used the skill's OCR script",
      "passed": true,
      "evidence": "Transcript Step 2 shows: 'Tool: Bash - python ocr_script.py image.png'"
    }
  ],
  "summary": {
    "passed": 2,
    "failed": 1,
    "total": 3,
    "pass_rate": 0.67
  },
  "execution_metrics": {
    "tool_calls": {
      "Read": 5,
      "Write": 2,
      "Bash": 8
    },
    "total_tool_calls": 15,
    "total_steps": 6,
    "errors_encountered": 0,
    "output_chars": 12450,
    "transcript_chars": 3200
  },
  "timing": {
    "executor_duration_seconds": 165.0,
    "grader_duration_seconds": 26.0,
    "total_duration_seconds": 191.0
  },
  "claims": [
    {
      "claim": "The form has 12 fillable fields",
      "type": "factual",
      "verified": true,
      "evidence": "Counted 12 fields in field_info.json"
    },
    {
      "claim": "All required fields were populated",
      "type": "quality",
      "verified": false,
      "evidence": "Reference section was left blank despite data being available"
    }
  ],
  "user_notes_summary": {
    "uncertainties": ["Used 2023 data, may be stale"],
    "needs_review": [],
    "workarounds": ["Fell back to text overlay for non-fillable fields"]
  },
  "eval_feedback": {
    "suggestions": [
      {
        "assertion": "The output includes the name 'John Smith'",
        "reason": "A hallucinated document that mentions the name would also pass — consider checking it appears as the primary contact with matching phone and email from the input"
      },
      {
        "reason": "No assertion checks whether the extracted phone numbers match the input — I observed incorrect numbers in the output that went uncaught"
      }
    ],
    "overall": "Assertions check presence but not correctness. Consider adding content verification."
  }
}
```

## Field Descriptions

- **expectations**: Array of graded expectations
  - **text**: The original expectation text
  - **passed**: Boolean - true if expectation passes
  - **evidence**: Specific quote or description supporting the verdict
- **summary**: Aggregate statistics
  - **passed**: Count of passed expectations
  - **failed**: Count of failed expectations
  - **total**: Total expectations evaluated
  - **pass_rate**: Fraction passed (0.0 to 1.0)
- **execution_metrics**: Copied from executor's metrics.json (if available)
  - **output_chars**: Total character count of output files (proxy for tokens)
  - **transcript_chars**: Character count of transcript
- **timing**: Wall clock timing from timing.json (if available)
  - **executor_duration_seconds**: Time spent in executor subagent
  - **total_duration_seconds**: Total elapsed time for the run
- **claims**: Extracted and verified claims from the output
  - **claim**: The statement being verified
  - **type**: "factual", "process", or "quality"
  - **verified**: Boolean - whether the claim holds
  - **evidence**: Supporting or contradicting evidence
- **user_notes_summary**: Issues flagged by the executor
  - **uncertainties**: Things the executor wasn't sure about
  - **needs_review**: Items requiring human attention
  - **workarounds**: Places where the skill didn't work as expected
- **eval_feedback**: Improvement suggestions for the evals (only when warranted)
  - **suggestions**: List of concrete suggestions, each with a `reason` and optionally an `assertion` it relates to
  - **overall**: Brief assessment — can be "No suggestions, evals look solid" if nothing to flag

## Guidelines

- **Be objective**: Base verdicts on evidence, not assumptions
- **Be specific**: Quote the exact text that supports your verdict
- **Be thorough**: Check both transcript and output files
- **Be consistent**: Apply the same standard to each expectation
- **Explain failures**: Make it clear why evidence was insufficient
- **No partial credit**: Each expectation is pass or fail, not partial
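A grading file in the format above can be sanity-checked mechanically. The following is a minimal sketch — `check_grading` is an illustrative helper, not part of the grader's contract; it only verifies that `summary` agrees with the per-expectation verdicts:

```python
def check_grading(grading: dict) -> list[str]:
    """Return a list of internal inconsistencies in a grading.json payload."""
    problems: list[str] = []
    expectations = grading.get("expectations", [])
    for i, exp in enumerate(expectations):
        # Each graded expectation needs text, passed, and evidence fields
        for field, typ in (("text", str), ("passed", bool), ("evidence", str)):
            if not isinstance(exp.get(field), typ):
                problems.append(f"expectations[{i}].{field} missing or wrong type")
    summary = grading.get("summary", {})
    passed = sum(1 for e in expectations if e.get("passed") is True)
    if summary.get("passed") != passed:
        problems.append("summary.passed does not match expectation verdicts")
    if summary.get("failed") != len(expectations) - passed:
        problems.append("summary.failed does not match expectation verdicts")
    if summary.get("total") != len(expectations):
        problems.append("summary.total does not match expectation count")
    total = len(expectations)
    # Allow for rounding in pass_rate (the example stores 2/3 as 0.67)
    if total and abs(summary.get("pass_rate", -1.0) - passed / total) > 0.005:
        problems.append("summary.pass_rate inconsistent with passed/total")
    return problems
```

Run against the example payload above, this returns an empty list: 2 passed and 1 failed out of 3 with a pass_rate of 0.67 are internally consistent.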
146
.agents/skills/skill-creator/assets/eval_review.html
Normal file
@@ -0,0 +1,146 @@
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <meta name="viewport" content="width=device-width, initial-scale=1.0">
  <title>Eval Set Review - __SKILL_NAME_PLACEHOLDER__</title>
  <link rel="preconnect" href="https://fonts.googleapis.com">
  <link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
  <link href="https://fonts.googleapis.com/css2?family=Poppins:wght@500;600&family=Lora:wght@400;500&display=swap" rel="stylesheet">
  <style>
    * { box-sizing: border-box; margin: 0; padding: 0; }
    body { font-family: 'Lora', Georgia, serif; background: #faf9f5; padding: 2rem; color: #141413; }
    h1 { font-family: 'Poppins', sans-serif; margin-bottom: 0.5rem; font-size: 1.5rem; }
    .description { color: #b0aea5; margin-bottom: 1.5rem; font-style: italic; max-width: 900px; }
    .controls { margin-bottom: 1rem; display: flex; gap: 0.5rem; }
    .btn { font-family: 'Poppins', sans-serif; padding: 0.5rem 1rem; border: none; border-radius: 6px; cursor: pointer; font-size: 0.875rem; font-weight: 500; }
    .btn-add { background: #6a9bcc; color: white; }
    .btn-add:hover { background: #5889b8; }
    .btn-export { background: #d97757; color: white; }
    .btn-export:hover { background: #c4613f; }
    table { width: 100%; max-width: 1100px; border-collapse: collapse; background: white; border-radius: 6px; overflow: hidden; box-shadow: 0 1px 3px rgba(0,0,0,0.08); }
    th { font-family: 'Poppins', sans-serif; background: #141413; color: #faf9f5; padding: 0.75rem 1rem; text-align: left; font-size: 0.875rem; }
    td { padding: 0.75rem 1rem; border-bottom: 1px solid #e8e6dc; vertical-align: top; }
    tr:nth-child(even) td { background: #faf9f5; }
    tr:hover td { background: #f3f1ea; }
    .section-header td { background: #e8e6dc; font-family: 'Poppins', sans-serif; font-weight: 500; font-size: 0.8rem; color: #141413; text-transform: uppercase; letter-spacing: 0.05em; }
    .query-input { width: 100%; padding: 0.4rem; border: 1px solid #e8e6dc; border-radius: 4px; font-size: 0.875rem; font-family: 'Lora', Georgia, serif; resize: vertical; min-height: 60px; }
    .query-input:focus { outline: none; border-color: #d97757; box-shadow: 0 0 0 2px rgba(217,119,87,0.15); }
    .toggle { position: relative; display: inline-block; width: 44px; height: 24px; }
    .toggle input { opacity: 0; width: 0; height: 0; }
    .toggle .slider { position: absolute; inset: 0; background: #b0aea5; border-radius: 24px; cursor: pointer; transition: 0.2s; }
    .toggle .slider::before { content: ""; position: absolute; width: 18px; height: 18px; left: 3px; bottom: 3px; background: white; border-radius: 50%; transition: 0.2s; }
    .toggle input:checked + .slider { background: #d97757; }
    .toggle input:checked + .slider::before { transform: translateX(20px); }
    .btn-delete { background: #c44; color: white; padding: 0.3rem 0.6rem; border: none; border-radius: 4px; cursor: pointer; font-size: 0.75rem; font-family: 'Poppins', sans-serif; }
    .btn-delete:hover { background: #a33; }
    .summary { margin-top: 1rem; color: #b0aea5; font-size: 0.875rem; }
  </style>
</head>
<body>
  <h1>Eval Set Review: <span id="skill-name">__SKILL_NAME_PLACEHOLDER__</span></h1>
  <p class="description">Current description: <span id="skill-desc">__SKILL_DESCRIPTION_PLACEHOLDER__</span></p>

  <div class="controls">
    <button class="btn btn-add" onclick="addRow()">+ Add Query</button>
    <button class="btn btn-export" onclick="exportEvalSet()">Export Eval Set</button>
  </div>

  <table>
    <thead>
      <tr>
        <th style="width:65%">Query</th>
        <th style="width:18%">Should Trigger</th>
        <th style="width:10%">Actions</th>
      </tr>
    </thead>
    <tbody id="eval-body"></tbody>
  </table>

  <p class="summary" id="summary"></p>

  <script>
    const EVAL_DATA = __EVAL_DATA_PLACEHOLDER__;

    let evalItems = [...EVAL_DATA];

    function render() {
      const tbody = document.getElementById('eval-body');
      tbody.innerHTML = '';

      // Sort: should-trigger first, then should-not-trigger
      const sorted = evalItems
        .map((item, origIdx) => ({ ...item, origIdx }))
        .sort((a, b) => (b.should_trigger ? 1 : 0) - (a.should_trigger ? 1 : 0));

      let lastGroup = null;
      sorted.forEach(item => {
        const group = item.should_trigger ? 'trigger' : 'no-trigger';
        if (group !== lastGroup) {
          const headerRow = document.createElement('tr');
          headerRow.className = 'section-header';
          headerRow.innerHTML = `<td colspan="3">${item.should_trigger ? 'Should Trigger' : 'Should NOT Trigger'}</td>`;
          tbody.appendChild(headerRow);
          lastGroup = group;
        }

        const idx = item.origIdx;
        const tr = document.createElement('tr');
        tr.innerHTML = `
          <td><textarea class="query-input" onchange="updateQuery(${idx}, this.value)">${escapeHtml(item.query)}</textarea></td>
          <td>
            <label class="toggle">
              <input type="checkbox" ${item.should_trigger ? 'checked' : ''} onchange="updateTrigger(${idx}, this.checked)">
              <span class="slider"></span>
            </label>
            <span style="margin-left:8px;font-size:0.8rem;color:#b0aea5">${item.should_trigger ? 'Yes' : 'No'}</span>
          </td>
          <td><button class="btn-delete" onclick="deleteRow(${idx})">Delete</button></td>
        `;
        tbody.appendChild(tr);
      });
      updateSummary();
    }

    function escapeHtml(text) {
      const div = document.createElement('div');
      div.textContent = text;
      return div.innerHTML;
    }

    function updateQuery(idx, value) { evalItems[idx].query = value; updateSummary(); }
    function updateTrigger(idx, value) { evalItems[idx].should_trigger = value; render(); }
    function deleteRow(idx) { evalItems.splice(idx, 1); render(); }

    function addRow() {
      evalItems.push({ query: '', should_trigger: true });
      render();
      const inputs = document.querySelectorAll('.query-input');
      inputs[inputs.length - 1].focus();
    }

    function updateSummary() {
      const trigger = evalItems.filter(i => i.should_trigger).length;
      const noTrigger = evalItems.filter(i => !i.should_trigger).length;
      document.getElementById('summary').textContent =
        `${evalItems.length} queries total: ${trigger} should trigger, ${noTrigger} should not trigger`;
    }

    function exportEvalSet() {
      const valid = evalItems.filter(i => i.query.trim() !== '');
      const data = valid.map(i => ({ query: i.query.trim(), should_trigger: i.should_trigger }));
      const blob = new Blob([JSON.stringify(data, null, 2)], { type: 'application/json' });
      const url = URL.createObjectURL(blob);
      const a = document.createElement('a');
      a.href = url;
      a.download = 'eval_set.json';
      document.body.appendChild(a);
      a.click();
      document.body.removeChild(a);
      URL.revokeObjectURL(url);
    }

    render();
  </script>
</body>
</html>
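The exportEvalSet function above downloads a flat JSON array of `{query, should_trigger}` objects. A minimal sketch of consuming that export follows — `load_eval_set` is an illustrative helper name, not something the viewer ships:

```python
import json
from pathlib import Path


def load_eval_set(path: str) -> tuple[list[str], list[str]]:
    """Split an exported eval_set.json into (should-trigger, should-not-trigger)
    query lists, mirroring the viewer's two table sections."""
    items = json.loads(Path(path).read_text())
    trigger = [i["query"] for i in items if i["should_trigger"]]
    no_trigger = [i["query"] for i in items if not i["should_trigger"]]
    return trigger, no_trigger
```

Because the export already trims empty queries and whitespace, the loader can treat every entry as valid.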
471
.agents/skills/skill-creator/eval-viewer/generate_review.py
Normal file
@@ -0,0 +1,471 @@
#!/usr/bin/env python3
"""Generate and serve a review page for eval results.

Reads the workspace directory, discovers runs (directories with outputs/),
embeds all output data into a self-contained HTML page, and serves it via
a tiny HTTP server. Feedback auto-saves to feedback.json in the workspace.

Usage:
    python generate_review.py <workspace-path> [--port PORT] [--skill-name NAME]
    python generate_review.py <workspace-path> --previous-workspace /path/to/old/workspace

No dependencies beyond the Python stdlib are required.
"""

import argparse
import base64
import json
import mimetypes
import os
import re
import signal
import subprocess
import sys
import time
import webbrowser
from functools import partial
from http.server import HTTPServer, BaseHTTPRequestHandler
from pathlib import Path

# Files to exclude from output listings
METADATA_FILES = {"transcript.md", "user_notes.md", "metrics.json"}

# Extensions we render as inline text
TEXT_EXTENSIONS = {
    ".txt", ".md", ".json", ".csv", ".py", ".js", ".ts", ".tsx", ".jsx",
    ".yaml", ".yml", ".xml", ".html", ".css", ".sh", ".rb", ".go", ".rs",
    ".java", ".c", ".cpp", ".h", ".hpp", ".sql", ".r", ".toml",
}

# Extensions we render as inline images
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif", ".svg", ".webp"}

# MIME type overrides for common types
MIME_OVERRIDES = {
    ".svg": "image/svg+xml",
    ".xlsx": "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
    ".docx": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    ".pptx": "application/vnd.openxmlformats-officedocument.presentationml.presentation",
}


def get_mime_type(path: Path) -> str:
    ext = path.suffix.lower()
    if ext in MIME_OVERRIDES:
        return MIME_OVERRIDES[ext]
    mime, _ = mimetypes.guess_type(str(path))
    return mime or "application/octet-stream"

def find_runs(workspace: Path) -> list[dict]:
    """Recursively find directories that contain an outputs/ subdirectory."""
    runs: list[dict] = []
    _find_runs_recursive(workspace, workspace, runs)
    # eval_id may be None (no metadata found); sort those runs last so the
    # key tuples stay comparable (None < int would raise TypeError)
    runs.sort(
        key=lambda r: (
            r["eval_id"] if r.get("eval_id") is not None else float("inf"),
            r["id"],
        )
    )
    return runs


def _find_runs_recursive(root: Path, current: Path, runs: list[dict]) -> None:
    if not current.is_dir():
        return

    outputs_dir = current / "outputs"
    if outputs_dir.is_dir():
        run = build_run(root, current)
        if run:
            runs.append(run)
        return

    skip = {"node_modules", ".git", "__pycache__", "skill", "inputs"}
    for child in sorted(current.iterdir()):
        if child.is_dir() and child.name not in skip:
            _find_runs_recursive(root, child, runs)


def build_run(root: Path, run_dir: Path) -> dict | None:
    """Build a run dict with prompt, outputs, and grading data."""
    prompt = ""
    eval_id = None

    # Try eval_metadata.json
    for candidate in [run_dir / "eval_metadata.json", run_dir.parent / "eval_metadata.json"]:
        if candidate.exists():
            try:
                metadata = json.loads(candidate.read_text())
                prompt = metadata.get("prompt", "")
                eval_id = metadata.get("eval_id")
            except (json.JSONDecodeError, OSError):
                pass
        if prompt:
            break

    # Fall back to transcript.md
    if not prompt:
        for candidate in [run_dir / "transcript.md", run_dir / "outputs" / "transcript.md"]:
            if candidate.exists():
                try:
                    text = candidate.read_text()
                    match = re.search(r"## Eval Prompt\n\n([\s\S]*?)(?=\n##|$)", text)
                    if match:
                        prompt = match.group(1).strip()
                except OSError:
                    pass
            if prompt:
                break

    if not prompt:
        prompt = "(No prompt found)"

    run_id = str(run_dir.relative_to(root)).replace("/", "-").replace("\\", "-")

    # Collect output files
    outputs_dir = run_dir / "outputs"
    output_files: list[dict] = []
    if outputs_dir.is_dir():
        for f in sorted(outputs_dir.iterdir()):
            if f.is_file() and f.name not in METADATA_FILES:
                output_files.append(embed_file(f))

    # Load grading if present
    grading = None
    for candidate in [run_dir / "grading.json", run_dir.parent / "grading.json"]:
        if candidate.exists():
            try:
                grading = json.loads(candidate.read_text())
            except (json.JSONDecodeError, OSError):
                pass
        if grading:
            break

    return {
        "id": run_id,
        "prompt": prompt,
        "eval_id": eval_id,
        "outputs": output_files,
        "grading": grading,
    }


def embed_file(path: Path) -> dict:
    """Read a file and return an embedded representation."""
    ext = path.suffix.lower()
    mime = get_mime_type(path)

    if ext in TEXT_EXTENSIONS:
        try:
            content = path.read_text(errors="replace")
        except OSError:
            content = "(Error reading file)"
        return {
            "name": path.name,
            "type": "text",
            "content": content,
        }
    elif ext in IMAGE_EXTENSIONS:
        try:
            raw = path.read_bytes()
            b64 = base64.b64encode(raw).decode("ascii")
        except OSError:
            return {"name": path.name, "type": "error", "content": "(Error reading file)"}
        return {
            "name": path.name,
            "type": "image",
            "mime": mime,
            "data_uri": f"data:{mime};base64,{b64}",
        }
    elif ext == ".pdf":
        try:
            raw = path.read_bytes()
            b64 = base64.b64encode(raw).decode("ascii")
        except OSError:
            return {"name": path.name, "type": "error", "content": "(Error reading file)"}
        return {
            "name": path.name,
            "type": "pdf",
            "data_uri": f"data:{mime};base64,{b64}",
        }
    elif ext == ".xlsx":
        try:
            raw = path.read_bytes()
            b64 = base64.b64encode(raw).decode("ascii")
        except OSError:
            return {"name": path.name, "type": "error", "content": "(Error reading file)"}
        return {
            "name": path.name,
            "type": "xlsx",
            "data_b64": b64,
        }
    else:
        # Binary / unknown — base64 download link
        try:
            raw = path.read_bytes()
            b64 = base64.b64encode(raw).decode("ascii")
        except OSError:
            return {"name": path.name, "type": "error", "content": "(Error reading file)"}
        return {
            "name": path.name,
            "type": "binary",
            "mime": mime,
            "data_uri": f"data:{mime};base64,{b64}",
        }

def load_previous_iteration(workspace: Path) -> dict[str, dict]:
    """Load previous iteration's feedback and outputs.

    Returns a map of run_id -> {"feedback": str, "outputs": list[dict]}.
    """
    result: dict[str, dict] = {}

    # Load feedback
    feedback_map: dict[str, str] = {}
    feedback_path = workspace / "feedback.json"
    if feedback_path.exists():
        try:
            data = json.loads(feedback_path.read_text())
            feedback_map = {
                r["run_id"]: r["feedback"]
                for r in data.get("reviews", [])
                if r.get("feedback", "").strip()
            }
        except (json.JSONDecodeError, OSError, KeyError):
            pass

    # Load runs (to get outputs)
    prev_runs = find_runs(workspace)
    for run in prev_runs:
        result[run["id"]] = {
            "feedback": feedback_map.get(run["id"], ""),
            "outputs": run.get("outputs", []),
        }

    # Also add feedback for run_ids that had feedback but no matching run
    for run_id, fb in feedback_map.items():
        if run_id not in result:
            result[run_id] = {"feedback": fb, "outputs": []}

    return result


def generate_html(
    runs: list[dict],
    skill_name: str,
    previous: dict[str, dict] | None = None,
    benchmark: dict | None = None,
) -> str:
    """Generate the complete standalone HTML page with embedded data."""
    template_path = Path(__file__).parent / "viewer.html"
    template = template_path.read_text()

    # Build previous_feedback and previous_outputs maps for the template
    previous_feedback: dict[str, str] = {}
    previous_outputs: dict[str, list[dict]] = {}
    if previous:
        for run_id, data in previous.items():
            if data.get("feedback"):
                previous_feedback[run_id] = data["feedback"]
            if data.get("outputs"):
                previous_outputs[run_id] = data["outputs"]

    embedded = {
        "skill_name": skill_name,
        "runs": runs,
        "previous_feedback": previous_feedback,
        "previous_outputs": previous_outputs,
    }
    if benchmark:
        embedded["benchmark"] = benchmark

    data_json = json.dumps(embedded)

    return template.replace("/*__EMBEDDED_DATA__*/", f"const EMBEDDED_DATA = {data_json};")


# ---------------------------------------------------------------------------
# HTTP server (stdlib only, zero dependencies)
# ---------------------------------------------------------------------------

def _kill_port(port: int) -> None:
    """Kill any process listening on the given port."""
    try:
        result = subprocess.run(
            ["lsof", "-ti", f":{port}"],
            capture_output=True, text=True, timeout=5,
        )
        for pid_str in result.stdout.strip().split("\n"):
            if pid_str.strip():
                try:
                    os.kill(int(pid_str.strip()), signal.SIGTERM)
                except (ProcessLookupError, ValueError):
                    pass
        if result.stdout.strip():
            time.sleep(0.5)
    except subprocess.TimeoutExpired:
        pass
    except FileNotFoundError:
        print("Note: lsof not found, cannot check if port is in use", file=sys.stderr)


class ReviewHandler(BaseHTTPRequestHandler):
    """Serves the review HTML and handles feedback saves.

    Regenerates the HTML on each page load so that refreshing the browser
    picks up new eval outputs without restarting the server.
    """

    def __init__(
        self,
        workspace: Path,
        skill_name: str,
        feedback_path: Path,
        previous: dict[str, dict],
        benchmark_path: Path | None,
        *args,
        **kwargs,
    ):
        self.workspace = workspace
        self.skill_name = skill_name
        self.feedback_path = feedback_path
        self.previous = previous
        self.benchmark_path = benchmark_path
        super().__init__(*args, **kwargs)

    def do_GET(self) -> None:
        if self.path == "/" or self.path == "/index.html":
            # Regenerate HTML on each request (re-scans workspace for new outputs)
            runs = find_runs(self.workspace)
            benchmark = None
            if self.benchmark_path and self.benchmark_path.exists():
                try:
                    benchmark = json.loads(self.benchmark_path.read_text())
                except (json.JSONDecodeError, OSError):
                    pass
            html = generate_html(runs, self.skill_name, self.previous, benchmark)
            content = html.encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "text/html; charset=utf-8")
            self.send_header("Content-Length", str(len(content)))
            self.end_headers()
            self.wfile.write(content)
        elif self.path == "/api/feedback":
            data = b"{}"
            if self.feedback_path.exists():
                data = self.feedback_path.read_bytes()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(data)))
            self.end_headers()
            self.wfile.write(data)
        else:
            self.send_error(404)

    def do_POST(self) -> None:
        if self.path == "/api/feedback":
            length = int(self.headers.get("Content-Length", 0))
            body = self.rfile.read(length)
            try:
                data = json.loads(body)
                if not isinstance(data, dict) or "reviews" not in data:
                    raise ValueError("Expected JSON object with 'reviews' key")
                self.feedback_path.write_text(json.dumps(data, indent=2) + "\n")
                resp = b'{"ok":true}'
                self.send_response(200)
            except (json.JSONDecodeError, OSError, ValueError) as e:
                resp = json.dumps({"error": str(e)}).encode()
                self.send_response(500)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(resp)))
            self.end_headers()
            self.wfile.write(resp)
        else:
            self.send_error(404)

    def log_message(self, format: str, *args: object) -> None:
        # Suppress request logging to keep terminal clean
        pass

def main() -> None:
|
||||
parser = argparse.ArgumentParser(description="Generate and serve eval review")
|
||||
parser.add_argument("workspace", type=Path, help="Path to workspace directory")
|
||||
parser.add_argument("--port", "-p", type=int, default=3117, help="Server port (default: 3117)")
|
||||
parser.add_argument("--skill-name", "-n", type=str, default=None, help="Skill name for header")
|
    parser.add_argument(
        "--previous-workspace", type=Path, default=None,
        help="Path to previous iteration's workspace (shows old outputs and feedback as context)",
    )
    parser.add_argument(
        "--benchmark", type=Path, default=None,
        help="Path to benchmark.json to show in the Benchmark tab",
    )
    parser.add_argument(
        "--static", "-s", type=Path, default=None,
        help="Write standalone HTML to this path instead of starting a server",
    )
    args = parser.parse_args()

    workspace = args.workspace.resolve()
    if not workspace.is_dir():
        print(f"Error: {workspace} is not a directory", file=sys.stderr)
        sys.exit(1)

    runs = find_runs(workspace)
    if not runs:
        print(f"No runs found in {workspace}", file=sys.stderr)
        sys.exit(1)

    skill_name = args.skill_name or workspace.name.replace("-workspace", "")
    feedback_path = workspace / "feedback.json"

    previous: dict[str, dict] = {}
    if args.previous_workspace:
        previous = load_previous_iteration(args.previous_workspace.resolve())

    benchmark_path = args.benchmark.resolve() if args.benchmark else None
    benchmark = None
    if benchmark_path and benchmark_path.exists():
        try:
            benchmark = json.loads(benchmark_path.read_text())
        except (json.JSONDecodeError, OSError):
            pass

    if args.static:
        html = generate_html(runs, skill_name, previous, benchmark)
        args.static.parent.mkdir(parents=True, exist_ok=True)
        args.static.write_text(html)
        print(f"\n  Static viewer written to: {args.static}\n")
        sys.exit(0)

    # Kill any existing process on the target port
    port = args.port
    _kill_port(port)
    handler = partial(ReviewHandler, workspace, skill_name, feedback_path, previous, benchmark_path)
    try:
        server = HTTPServer(("127.0.0.1", port), handler)
    except OSError:
        # Port still in use after kill attempt — find a free one
        server = HTTPServer(("127.0.0.1", 0), handler)
        port = server.server_address[1]

    url = f"http://localhost:{port}"
    print(f"\n  Eval Viewer")
    print(f"  ─────────────────────────────────")
    print(f"  URL:       {url}")
    print(f"  Workspace: {workspace}")
    print(f"  Feedback:  {feedback_path}")
    if previous:
        print(f"  Previous:  {args.previous_workspace} ({len(previous)} runs)")
    if benchmark_path:
        print(f"  Benchmark: {benchmark_path}")
    print(f"\n  Press Ctrl+C to stop.\n")

    webbrowser.open(url)

    try:
        server.serve_forever()
    except KeyboardInterrupt:
        print("\nStopped.")
        server.server_close()


if __name__ == "__main__":
    main()
1325  .agents/skills/skill-creator/eval-viewer/viewer.html  Normal file
File diff suppressed because it is too large

430  .agents/skills/skill-creator/references/schemas.md  Normal file
@@ -0,0 +1,430 @@
# JSON Schemas

This document defines the JSON schemas used by skill-creator.

---

## evals.json

Defines the evals for a skill. Located at `evals/evals.json` within the skill directory.

```json
{
  "skill_name": "example-skill",
  "evals": [
    {
      "id": 1,
      "prompt": "User's example prompt",
      "expected_output": "Description of expected result",
      "files": ["evals/files/sample1.pdf"],
      "expectations": [
        "The output includes X",
        "The skill used script Y"
      ]
    }
  ]
}
```

**Fields:**
- `skill_name`: Name matching the skill's frontmatter
- `evals[].id`: Unique integer identifier
- `evals[].prompt`: The task to execute
- `evals[].expected_output`: Human-readable description of success
- `evals[].files`: Optional list of input file paths (relative to skill root)
- `evals[].expectations`: List of verifiable statements
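These required fields can be checked mechanically before running evals. A minimal structural check (a sketch, not part of skill-creator, with the document inlined for illustration):

```python
import json

# Minimal structural check for an evals.json document (illustration only)
doc = json.loads("""
{
  "skill_name": "example-skill",
  "evals": [
    {"id": 1, "prompt": "p", "expected_output": "d", "expectations": ["The output includes X"]}
  ]
}
""")
required = {"id", "prompt", "expected_output", "expectations"}  # `files` is optional
for e in doc["evals"]:
    missing = required - e.keys()
    assert not missing, f"eval {e.get('id')} missing {missing}"
print("ok")
```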

---

## history.json

Tracks version progression in Improve mode. Located at workspace root.

```json
{
  "started_at": "2026-01-15T10:30:00Z",
  "skill_name": "pdf",
  "current_best": "v2",
  "iterations": [
    {
      "version": "v0",
      "parent": null,
      "expectation_pass_rate": 0.65,
      "grading_result": "baseline",
      "is_current_best": false
    },
    {
      "version": "v1",
      "parent": "v0",
      "expectation_pass_rate": 0.75,
      "grading_result": "won",
      "is_current_best": false
    },
    {
      "version": "v2",
      "parent": "v1",
      "expectation_pass_rate": 0.85,
      "grading_result": "won",
      "is_current_best": true
    }
  ]
}
```

**Fields:**
- `started_at`: ISO timestamp of when improvement started
- `skill_name`: Name of the skill being improved
- `current_best`: Version identifier of the best performer
- `iterations[].version`: Version identifier (v0, v1, ...)
- `iterations[].parent`: Parent version this was derived from
- `iterations[].expectation_pass_rate`: Pass rate from grading
- `iterations[].grading_result`: "baseline", "won", "lost", or "tie"
- `iterations[].is_current_best`: Whether this is the current best version
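The `current_best` pointer and the per-iteration flag are redundant, so they can be cross-checked. A small sketch with inlined data:

```python
# Sketch: current_best should agree with the iteration flagged is_current_best
history = {
    "current_best": "v2",
    "iterations": [
        {"version": "v1", "is_current_best": False},
        {"version": "v2", "is_current_best": True},
    ],
}
best = next(i["version"] for i in history["iterations"] if i["is_current_best"])
assert best == history["current_best"]
print(best)  # v2
```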

---

## grading.json

Output from the grader agent. Located at `<run-dir>/grading.json`.

```json
{
  "expectations": [
    {
      "text": "The output includes the name 'John Smith'",
      "passed": true,
      "evidence": "Found in transcript Step 3: 'Extracted names: John Smith, Sarah Johnson'"
    },
    {
      "text": "The spreadsheet has a SUM formula in cell B10",
      "passed": false,
      "evidence": "No spreadsheet was created. The output was a text file."
    }
  ],
  "summary": {
    "passed": 1,
    "failed": 1,
    "total": 2,
    "pass_rate": 0.5
  },
  "execution_metrics": {
    "tool_calls": {
      "Read": 5,
      "Write": 2,
      "Bash": 8
    },
    "total_tool_calls": 15,
    "total_steps": 6,
    "errors_encountered": 0,
    "output_chars": 12450,
    "transcript_chars": 3200
  },
  "timing": {
    "executor_duration_seconds": 165.0,
    "grader_duration_seconds": 26.0,
    "total_duration_seconds": 191.0
  },
  "claims": [
    {
      "claim": "The form has 12 fillable fields",
      "type": "factual",
      "verified": true,
      "evidence": "Counted 12 fields in field_info.json"
    }
  ],
  "user_notes_summary": {
    "uncertainties": ["Used 2023 data, may be stale"],
    "needs_review": [],
    "workarounds": ["Fell back to text overlay for non-fillable fields"]
  },
  "eval_feedback": {
    "suggestions": [
      {
        "assertion": "The output includes the name 'John Smith'",
        "reason": "A hallucinated document that mentions the name would also pass"
      }
    ],
    "overall": "Assertions check presence but not correctness."
  }
}
```

**Fields:**
- `expectations[]`: Graded expectations with evidence
- `summary`: Aggregate pass/fail counts
- `execution_metrics`: Tool usage and output size (from executor's metrics.json)
- `timing`: Wall clock timing (from timing.json)
- `claims`: Extracted and verified claims from the output
- `user_notes_summary`: Issues flagged by the executor
- `eval_feedback`: (optional) Improvement suggestions for the evals, only present when the grader identifies issues worth raising
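The `summary` block is fully derivable from `expectations[]`. A sketch of that derivation (data inlined for illustration):

```python
# Sketch: summary counts should agree with the graded expectations
expectations = [
    {"text": "a", "passed": True},
    {"text": "b", "passed": False},
]
passed = sum(e["passed"] for e in expectations)
summary = {
    "passed": passed,
    "failed": len(expectations) - passed,
    "total": len(expectations),
    "pass_rate": round(passed / len(expectations), 2),
}
print(summary)  # {'passed': 1, 'failed': 1, 'total': 2, 'pass_rate': 0.5}
```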

---

## metrics.json

Output from the executor agent. Located at `<run-dir>/outputs/metrics.json`.

```json
{
  "tool_calls": {
    "Read": 5,
    "Write": 2,
    "Bash": 8,
    "Edit": 1,
    "Glob": 2,
    "Grep": 0
  },
  "total_tool_calls": 18,
  "total_steps": 6,
  "files_created": ["filled_form.pdf", "field_values.json"],
  "errors_encountered": 0,
  "output_chars": 12450,
  "transcript_chars": 3200
}
```

**Fields:**
- `tool_calls`: Count per tool type
- `total_tool_calls`: Sum of all tool calls
- `total_steps`: Number of major execution steps
- `files_created`: List of output files created
- `errors_encountered`: Number of errors during execution
- `output_chars`: Total character count of output files
- `transcript_chars`: Character count of transcript
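`total_tool_calls` is just the sum of the per-tool counts, which makes a cheap consistency check. A sketch using the example's numbers:

```python
# Sketch: total_tool_calls should equal the sum of the per-tool counts
tool_calls = {"Read": 5, "Write": 2, "Bash": 8, "Edit": 1, "Glob": 2, "Grep": 0}
total = sum(tool_calls.values())
print(total)  # 18
```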

---

## timing.json

Wall clock timing for a run. Located at `<run-dir>/timing.json`.

**How to capture:** When a subagent task completes, the task notification includes `total_tokens` and `duration_ms`. Save these immediately — they are not persisted anywhere else and cannot be recovered after the fact.

```json
{
  "total_tokens": 84852,
  "duration_ms": 23332,
  "total_duration_seconds": 191.0,
  "executor_start": "2026-01-15T10:30:00Z",
  "executor_end": "2026-01-15T10:32:45Z",
  "executor_duration_seconds": 165.0,
  "grader_start": "2026-01-15T10:32:46Z",
  "grader_end": "2026-01-15T10:33:12Z",
  "grader_duration_seconds": 26.0
}
```
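Since these values appear only in the task notification, a small sketch of converting `duration_ms` to seconds before writing timing.json (notification values taken from the example above):

```python
# Sketch: convert a task notification's duration_ms into seconds
notification = {"total_tokens": 84852, "duration_ms": 23332}  # values from the notification
duration_seconds = round(notification["duration_ms"] / 1000, 1)
print(duration_seconds)  # 23.3
```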

---

## benchmark.json

Output from Benchmark mode. Located at `benchmarks/<timestamp>/benchmark.json`.

```json
{
  "metadata": {
    "skill_name": "pdf",
    "skill_path": "/path/to/pdf",
    "executor_model": "claude-sonnet-4-20250514",
    "analyzer_model": "most-capable-model",
    "timestamp": "2026-01-15T10:30:00Z",
    "evals_run": [1, 2, 3],
    "runs_per_configuration": 3
  },

  "runs": [
    {
      "eval_id": 1,
      "eval_name": "Ocean",
      "configuration": "with_skill",
      "run_number": 1,
      "result": {
        "pass_rate": 0.85,
        "passed": 6,
        "failed": 1,
        "total": 7,
        "time_seconds": 42.5,
        "tokens": 3800,
        "tool_calls": 18,
        "errors": 0
      },
      "expectations": [
        {"text": "...", "passed": true, "evidence": "..."}
      ],
      "notes": [
        "Used 2023 data, may be stale",
        "Fell back to text overlay for non-fillable fields"
      ]
    }
  ],

  "run_summary": {
    "with_skill": {
      "pass_rate": {"mean": 0.85, "stddev": 0.05, "min": 0.80, "max": 0.90},
      "time_seconds": {"mean": 45.0, "stddev": 12.0, "min": 32.0, "max": 58.0},
      "tokens": {"mean": 3800, "stddev": 400, "min": 3200, "max": 4100}
    },
    "without_skill": {
      "pass_rate": {"mean": 0.35, "stddev": 0.08, "min": 0.28, "max": 0.45},
      "time_seconds": {"mean": 32.0, "stddev": 8.0, "min": 24.0, "max": 42.0},
      "tokens": {"mean": 2100, "stddev": 300, "min": 1800, "max": 2500}
    },
    "delta": {
      "pass_rate": "+0.50",
      "time_seconds": "+13.0",
      "tokens": "+1700"
    }
  },

  "notes": [
    "Assertion 'Output is a PDF file' passes 100% in both configurations - may not differentiate skill value",
    "Eval 3 shows high variance (50% ± 40%) - may be flaky or model-dependent",
    "Without-skill runs consistently fail on table extraction expectations",
    "Skill adds 13s average execution time but improves pass rate by 50%"
  ]
}
```

**Fields:**
- `metadata`: Information about the benchmark run
  - `skill_name`: Name of the skill
  - `timestamp`: When the benchmark was run
  - `evals_run`: List of eval names or IDs
  - `runs_per_configuration`: Number of runs per config (e.g. 3)
- `runs[]`: Individual run results
  - `eval_id`: Numeric eval identifier
  - `eval_name`: Human-readable eval name (used as section header in the viewer)
  - `configuration`: Must be `"with_skill"` or `"without_skill"` (the viewer uses this exact string for grouping and color coding)
  - `run_number`: Integer run number (1, 2, 3...)
  - `result`: Nested object with `pass_rate`, `passed`, `failed`, `total`, `time_seconds`, `tokens`, `tool_calls`, `errors`
- `run_summary`: Statistical aggregates per configuration
  - `with_skill` / `without_skill`: Each contains `pass_rate`, `time_seconds`, `tokens` objects with `mean` and `stddev` fields
  - `delta`: Difference strings like `"+0.50"`, `"+13.0"`, `"+1700"`
- `notes`: Freeform observations from the analyzer

**Important:** The viewer reads these field names exactly. Using `config` instead of `configuration`, or putting `pass_rate` at the top level of a run instead of nested under `result`, will cause the viewer to show empty/zero values. Always reference this schema when generating benchmark.json manually.
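Given how strictly the viewer reads these names, a tiny pre-flight check is cheap insurance. A sketch over one abridged run object:

```python
# Sketch: pre-flight check for the exact field names the viewer expects
run = {
    "eval_id": 1,
    "configuration": "with_skill",
    "run_number": 1,
    "result": {"pass_rate": 0.85, "passed": 6, "failed": 1, "total": 7},
}
assert run["configuration"] in ("with_skill", "without_skill"), "use 'configuration', not 'config'"
assert "pass_rate" in run["result"], "pass_rate must be nested under result"
print("ok")
```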

---

## comparison.json

Output from blind comparator. Located at `<grading-dir>/comparison-N.json`.

```json
{
  "winner": "A",
  "reasoning": "Output A provides a complete solution with proper formatting and all required fields. Output B is missing the date field and has formatting inconsistencies.",
  "rubric": {
    "A": {
      "content": {
        "correctness": 5,
        "completeness": 5,
        "accuracy": 4
      },
      "structure": {
        "organization": 4,
        "formatting": 5,
        "usability": 4
      },
      "content_score": 4.7,
      "structure_score": 4.3,
      "overall_score": 9.0
    },
    "B": {
      "content": {
        "correctness": 3,
        "completeness": 2,
        "accuracy": 3
      },
      "structure": {
        "organization": 3,
        "formatting": 2,
        "usability": 3
      },
      "content_score": 2.7,
      "structure_score": 2.7,
      "overall_score": 5.4
    }
  },
  "output_quality": {
    "A": {
      "score": 9,
      "strengths": ["Complete solution", "Well-formatted", "All fields present"],
      "weaknesses": ["Minor style inconsistency in header"]
    },
    "B": {
      "score": 5,
      "strengths": ["Readable output", "Correct basic structure"],
      "weaknesses": ["Missing date field", "Formatting inconsistencies", "Partial data extraction"]
    }
  },
  "expectation_results": {
    "A": {
      "passed": 4,
      "total": 5,
      "pass_rate": 0.80,
      "details": [
        {"text": "Output includes name", "passed": true}
      ]
    },
    "B": {
      "passed": 3,
      "total": 5,
      "pass_rate": 0.60,
      "details": [
        {"text": "Output includes name", "passed": true}
      ]
    }
  }
}
```
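The example's rubric numbers are consistent with averaging each sub-score group to one decimal and summing the two averages for `overall_score`. A sketch of that roll-up (an inferred convention, not documented elsewhere):

```python
# Sketch: rubric roll-up as implied by the example values above
content = {"correctness": 5, "completeness": 5, "accuracy": 4}
structure = {"organization": 4, "formatting": 5, "usability": 4}
content_score = round(sum(content.values()) / len(content), 1)        # 4.7
structure_score = round(sum(structure.values()) / len(structure), 1)  # 4.3
overall_score = round(content_score + structure_score, 1)             # 9.0
print(content_score, structure_score, overall_score)
```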

---

## analysis.json

Output from post-hoc analyzer. Located at `<grading-dir>/analysis.json`.

```json
{
  "comparison_summary": {
    "winner": "A",
    "winner_skill": "path/to/winner/skill",
    "loser_skill": "path/to/loser/skill",
    "comparator_reasoning": "Brief summary of why comparator chose winner"
  },
  "winner_strengths": [
    "Clear step-by-step instructions for handling multi-page documents",
    "Included validation script that caught formatting errors"
  ],
  "loser_weaknesses": [
    "Vague instruction 'process the document appropriately' led to inconsistent behavior",
    "No script for validation, agent had to improvise"
  ],
  "instruction_following": {
    "winner": {
      "score": 9,
      "issues": ["Minor: skipped optional logging step"]
    },
    "loser": {
      "score": 6,
      "issues": [
        "Did not use the skill's formatting template",
        "Invented own approach instead of following step 3"
      ]
    }
  },
  "improvement_suggestions": [
    {
      "priority": "high",
      "category": "instructions",
      "suggestion": "Replace 'process the document appropriately' with explicit steps",
      "expected_impact": "Would eliminate ambiguity that caused inconsistent behavior"
    }
  ],
  "transcript_insights": {
    "winner_execution_pattern": "Read skill -> Followed 5-step process -> Used validation script",
    "loser_execution_pattern": "Read skill -> Unclear on approach -> Tried 3 different methods"
  }
}
```
401  .agents/skills/skill-creator/scripts/aggregate_benchmark.py  Normal file
@@ -0,0 +1,401 @@
#!/usr/bin/env python3
"""
Aggregate individual run results into benchmark summary statistics.

Reads grading.json files from run directories and produces:
- run_summary with mean, stddev, min, max for each metric
- delta between with_skill and without_skill configurations

Usage:
    python aggregate_benchmark.py <benchmark_dir>

Example:
    python aggregate_benchmark.py benchmarks/2026-01-15T10-30-00/

The script supports two directory layouts:

Workspace layout (from skill-creator iterations):
    <benchmark_dir>/
    └── eval-N/
        ├── with_skill/
        │   ├── run-1/grading.json
        │   └── run-2/grading.json
        └── without_skill/
            ├── run-1/grading.json
            └── run-2/grading.json

Legacy layout (with runs/ subdirectory):
    <benchmark_dir>/
    └── runs/
        └── eval-N/
            ├── with_skill/
            │   └── run-1/grading.json
            └── without_skill/
                └── run-1/grading.json
"""

import argparse
import json
import math
import sys
from datetime import datetime, timezone
from pathlib import Path
def calculate_stats(values: list[float]) -> dict:
    """Calculate mean, stddev, min, max for a list of values."""
    if not values:
        return {"mean": 0.0, "stddev": 0.0, "min": 0.0, "max": 0.0}

    n = len(values)
    mean = sum(values) / n

    if n > 1:
        variance = sum((x - mean) ** 2 for x in values) / (n - 1)
        stddev = math.sqrt(variance)
    else:
        stddev = 0.0

    return {
        "mean": round(mean, 4),
        "stddev": round(stddev, 4),
        "min": round(min(values), 4),
        "max": round(max(values), 4)
    }
def load_run_results(benchmark_dir: Path) -> dict:
    """
    Load all run results from a benchmark directory.

    Returns dict keyed by config name (e.g. "with_skill"/"without_skill",
    or "new_skill"/"old_skill"), each containing a list of run results.
    """
    # Support both layouts: eval dirs directly under benchmark_dir, or under runs/
    runs_dir = benchmark_dir / "runs"
    if runs_dir.exists():
        search_dir = runs_dir
    elif list(benchmark_dir.glob("eval-*")):
        search_dir = benchmark_dir
    else:
        print(f"No eval directories found in {benchmark_dir} or {benchmark_dir / 'runs'}")
        return {}

    results: dict[str, list] = {}

    for eval_idx, eval_dir in enumerate(sorted(search_dir.glob("eval-*"))):
        metadata_path = eval_dir / "eval_metadata.json"
        if metadata_path.exists():
            try:
                with open(metadata_path) as mf:
                    eval_id = json.load(mf).get("eval_id", eval_idx)
            except (json.JSONDecodeError, OSError):
                eval_id = eval_idx
        else:
            try:
                eval_id = int(eval_dir.name.split("-")[1])
            except ValueError:
                eval_id = eval_idx

        # Discover config directories dynamically rather than hardcoding names
        for config_dir in sorted(eval_dir.iterdir()):
            if not config_dir.is_dir():
                continue
            # Skip non-config directories (inputs, outputs, etc.)
            if not list(config_dir.glob("run-*")):
                continue
            config = config_dir.name
            if config not in results:
                results[config] = []

            for run_dir in sorted(config_dir.glob("run-*")):
                run_number = int(run_dir.name.split("-")[1])
                grading_file = run_dir / "grading.json"

                if not grading_file.exists():
                    print(f"Warning: grading.json not found in {run_dir}")
                    continue

                try:
                    with open(grading_file) as f:
                        grading = json.load(f)
                except json.JSONDecodeError as e:
                    print(f"Warning: Invalid JSON in {grading_file}: {e}")
                    continue

                # Extract metrics
                result = {
                    "eval_id": eval_id,
                    "run_number": run_number,
                    "pass_rate": grading.get("summary", {}).get("pass_rate", 0.0),
                    "passed": grading.get("summary", {}).get("passed", 0),
                    "failed": grading.get("summary", {}).get("failed", 0),
                    "total": grading.get("summary", {}).get("total", 0),
                }

                # Extract timing — check grading.json first, then sibling timing.json
                timing = grading.get("timing", {})
                result["time_seconds"] = timing.get("total_duration_seconds", 0.0)
                timing_file = run_dir / "timing.json"
                if result["time_seconds"] == 0.0 and timing_file.exists():
                    try:
                        with open(timing_file) as tf:
                            timing_data = json.load(tf)
                            result["time_seconds"] = timing_data.get("total_duration_seconds", 0.0)
                            result["tokens"] = timing_data.get("total_tokens", 0)
                    except json.JSONDecodeError:
                        pass

                # Extract metrics if available
                metrics = grading.get("execution_metrics", {})
                result["tool_calls"] = metrics.get("total_tool_calls", 0)
                if not result.get("tokens"):
                    result["tokens"] = metrics.get("output_chars", 0)
                result["errors"] = metrics.get("errors_encountered", 0)

                # Extract expectations — viewer requires fields: text, passed, evidence
                raw_expectations = grading.get("expectations", [])
                for exp in raw_expectations:
                    if "text" not in exp or "passed" not in exp:
                        print(f"Warning: expectation in {grading_file} missing required fields (text, passed, evidence): {exp}")
                result["expectations"] = raw_expectations

                # Extract notes from user_notes_summary
                notes_summary = grading.get("user_notes_summary", {})
                notes = []
                notes.extend(notes_summary.get("uncertainties", []))
                notes.extend(notes_summary.get("needs_review", []))
                notes.extend(notes_summary.get("workarounds", []))
                result["notes"] = notes

                results[config].append(result)

    return results
def aggregate_results(results: dict) -> dict:
    """
    Aggregate run results into summary statistics.

    Returns run_summary with stats for each configuration and delta.
    """
    run_summary = {}
    configs = list(results.keys())

    for config in configs:
        runs = results.get(config, [])

        if not runs:
            run_summary[config] = {
                "pass_rate": {"mean": 0.0, "stddev": 0.0, "min": 0.0, "max": 0.0},
                "time_seconds": {"mean": 0.0, "stddev": 0.0, "min": 0.0, "max": 0.0},
                "tokens": {"mean": 0, "stddev": 0, "min": 0, "max": 0}
            }
            continue

        pass_rates = [r["pass_rate"] for r in runs]
        times = [r["time_seconds"] for r in runs]
        tokens = [r.get("tokens", 0) for r in runs]

        run_summary[config] = {
            "pass_rate": calculate_stats(pass_rates),
            "time_seconds": calculate_stats(times),
            "tokens": calculate_stats(tokens)
        }

    # Calculate delta between the first two configs (if two exist)
    if len(configs) >= 2:
        primary = run_summary.get(configs[0], {})
        baseline = run_summary.get(configs[1], {})
    else:
        primary = run_summary.get(configs[0], {}) if configs else {}
        baseline = {}

    delta_pass_rate = primary.get("pass_rate", {}).get("mean", 0) - baseline.get("pass_rate", {}).get("mean", 0)
    delta_time = primary.get("time_seconds", {}).get("mean", 0) - baseline.get("time_seconds", {}).get("mean", 0)
    delta_tokens = primary.get("tokens", {}).get("mean", 0) - baseline.get("tokens", {}).get("mean", 0)

    run_summary["delta"] = {
        "pass_rate": f"{delta_pass_rate:+.2f}",
        "time_seconds": f"{delta_time:+.1f}",
        "tokens": f"{delta_tokens:+.0f}"
    }

    return run_summary
def generate_benchmark(benchmark_dir: Path, skill_name: str = "", skill_path: str = "") -> dict:
    """
    Generate complete benchmark.json from run results.
    """
    results = load_run_results(benchmark_dir)
    run_summary = aggregate_results(results)

    # Build runs array for benchmark.json
    runs = []
    for config in results:
        for result in results[config]:
            runs.append({
                "eval_id": result["eval_id"],
                "configuration": config,
                "run_number": result["run_number"],
                "result": {
                    "pass_rate": result["pass_rate"],
                    "passed": result["passed"],
                    "failed": result["failed"],
                    "total": result["total"],
                    "time_seconds": result["time_seconds"],
                    "tokens": result.get("tokens", 0),
                    "tool_calls": result.get("tool_calls", 0),
                    "errors": result.get("errors", 0)
                },
                "expectations": result["expectations"],
                "notes": result["notes"]
            })

    # Determine eval IDs from results
    eval_ids = sorted(set(
        r["eval_id"]
        for config in results.values()
        for r in config
    ))

    benchmark = {
        "metadata": {
            "skill_name": skill_name or "<skill-name>",
            "skill_path": skill_path or "<path/to/skill>",
            "executor_model": "<model-name>",
            "analyzer_model": "<model-name>",
            "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
            "evals_run": eval_ids,
            "runs_per_configuration": 3
        },
        "runs": runs,
        "run_summary": run_summary,
        "notes": []  # To be filled by analyzer
    }

    return benchmark
def generate_markdown(benchmark: dict) -> str:
    """Generate human-readable benchmark.md from benchmark data."""
    metadata = benchmark["metadata"]
    run_summary = benchmark["run_summary"]

    # Determine config names (excluding "delta")
    configs = [k for k in run_summary if k != "delta"]
    config_a = configs[0] if len(configs) >= 1 else "config_a"
    config_b = configs[1] if len(configs) >= 2 else "config_b"
    label_a = config_a.replace("_", " ").title()
    label_b = config_b.replace("_", " ").title()

    lines = [
        f"# Skill Benchmark: {metadata['skill_name']}",
        "",
        f"**Model**: {metadata['executor_model']}",
        f"**Date**: {metadata['timestamp']}",
        f"**Evals**: {', '.join(map(str, metadata['evals_run']))} ({metadata['runs_per_configuration']} runs each per configuration)",
        "",
        "## Summary",
        "",
        f"| Metric | {label_a} | {label_b} | Delta |",
        "|--------|------------|---------------|-------|",
    ]

    a_summary = run_summary.get(config_a, {})
    b_summary = run_summary.get(config_b, {})
    delta = run_summary.get("delta", {})

    # Format pass rate
    a_pr = a_summary.get("pass_rate", {})
    b_pr = b_summary.get("pass_rate", {})
    lines.append(f"| Pass Rate | {a_pr.get('mean', 0)*100:.0f}% ± {a_pr.get('stddev', 0)*100:.0f}% | {b_pr.get('mean', 0)*100:.0f}% ± {b_pr.get('stddev', 0)*100:.0f}% | {delta.get('pass_rate', '—')} |")

    # Format time
    a_time = a_summary.get("time_seconds", {})
    b_time = b_summary.get("time_seconds", {})
    lines.append(f"| Time | {a_time.get('mean', 0):.1f}s ± {a_time.get('stddev', 0):.1f}s | {b_time.get('mean', 0):.1f}s ± {b_time.get('stddev', 0):.1f}s | {delta.get('time_seconds', '—')}s |")

    # Format tokens
    a_tokens = a_summary.get("tokens", {})
    b_tokens = b_summary.get("tokens", {})
    lines.append(f"| Tokens | {a_tokens.get('mean', 0):.0f} ± {a_tokens.get('stddev', 0):.0f} | {b_tokens.get('mean', 0):.0f} ± {b_tokens.get('stddev', 0):.0f} | {delta.get('tokens', '—')} |")

    # Notes section
    if benchmark.get("notes"):
        lines.extend([
            "",
            "## Notes",
            ""
        ])
        for note in benchmark["notes"]:
            lines.append(f"- {note}")

    return "\n".join(lines)
def main():
    parser = argparse.ArgumentParser(
        description="Aggregate benchmark run results into summary statistics"
    )
    parser.add_argument(
        "benchmark_dir",
        type=Path,
        help="Path to the benchmark directory"
    )
    parser.add_argument(
        "--skill-name",
        default="",
        help="Name of the skill being benchmarked"
    )
    parser.add_argument(
        "--skill-path",
        default="",
        help="Path to the skill being benchmarked"
    )
    parser.add_argument(
        "--output", "-o",
        type=Path,
        help="Output path for benchmark.json (default: <benchmark_dir>/benchmark.json)"
    )

    args = parser.parse_args()

    if not args.benchmark_dir.exists():
        print(f"Directory not found: {args.benchmark_dir}")
        sys.exit(1)

    # Generate benchmark
    benchmark = generate_benchmark(args.benchmark_dir, args.skill_name, args.skill_path)

    # Determine output paths
    output_json = args.output or (args.benchmark_dir / "benchmark.json")
    output_md = output_json.with_suffix(".md")

    # Write benchmark.json
    with open(output_json, "w") as f:
        json.dump(benchmark, f, indent=2)
    print(f"Generated: {output_json}")

    # Write benchmark.md
    markdown = generate_markdown(benchmark)
    with open(output_md, "w") as f:
        f.write(markdown)
    print(f"Generated: {output_md}")

    # Print summary
    run_summary = benchmark["run_summary"]
    configs = [k for k in run_summary if k != "delta"]
    delta = run_summary.get("delta", {})

    print(f"\nSummary:")
    for config in configs:
        pr = run_summary[config]["pass_rate"]["mean"]
        label = config.replace("_", " ").title()
        print(f"  {label}: {pr*100:.1f}% pass rate")
    print(f"  Delta: {delta.get('pass_rate', '—')}")


if __name__ == "__main__":
    main()
326  .agents/skills/skill-creator/scripts/generate_report.py  Normal file
@@ -0,0 +1,326 @@
#!/usr/bin/env python3
"""Generate an HTML report from run_loop.py output.

Takes the JSON output from run_loop.py and generates a visual HTML report
showing each description attempt with check/x for each test case.
Distinguishes between train and test queries.
"""

import argparse
import html
import json
import sys
from pathlib import Path
def generate_html(data: dict, auto_refresh: bool = False, skill_name: str = "") -> str:
    """Generate HTML report from loop output data. If auto_refresh is True, adds a meta refresh tag."""
    history = data.get("history", [])
    holdout = data.get("holdout", 0)
    title_prefix = html.escape(skill_name + " \u2014 ") if skill_name else ""

    # Get all unique queries from train and test sets, with should_trigger info
    train_queries: list[dict] = []
    test_queries: list[dict] = []
    if history:
        for r in history[0].get("train_results", history[0].get("results", [])):
            train_queries.append({"query": r["query"], "should_trigger": r.get("should_trigger", True)})
        if history[0].get("test_results"):
            for r in history[0].get("test_results", []):
                test_queries.append({"query": r["query"], "should_trigger": r.get("should_trigger", True)})

    refresh_tag = '    <meta http-equiv="refresh" content="5">\n' if auto_refresh else ""

    html_parts = ["""<!DOCTYPE html>
<html>
<head>
    <meta charset="utf-8">
""" + refresh_tag + """    <title>""" + title_prefix + """Skill Description Optimization</title>
    <link rel="preconnect" href="https://fonts.googleapis.com">
    <link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
    <link href="https://fonts.googleapis.com/css2?family=Poppins:wght@500;600&family=Lora:wght@400;500&display=swap" rel="stylesheet">
    <style>
        body {
            font-family: 'Lora', Georgia, serif;
            max-width: 100%;
            margin: 0 auto;
            padding: 20px;
            background: #faf9f5;
            color: #141413;
        }
        h1 { font-family: 'Poppins', sans-serif; color: #141413; }
        .explainer {
            background: white;
            padding: 15px;
            border-radius: 6px;
            margin-bottom: 20px;
            border: 1px solid #e8e6dc;
            color: #b0aea5;
            font-size: 0.875rem;
            line-height: 1.6;
        }
        .summary {
            background: white;
            padding: 15px;
            border-radius: 6px;
            margin-bottom: 20px;
            border: 1px solid #e8e6dc;
        }
        .summary p { margin: 5px 0; }
        .best { color: #788c5d; font-weight: bold; }
        .table-container {
            overflow-x: auto;
            width: 100%;
        }
        table {
            border-collapse: collapse;
            background: white;
            border: 1px solid #e8e6dc;
            border-radius: 6px;
            font-size: 12px;
            min-width: 100%;
        }
        th, td {
            padding: 8px;
            text-align: left;
            border: 1px solid #e8e6dc;
            white-space: normal;
            word-wrap: break-word;
        }
        th {
            font-family: 'Poppins', sans-serif;
            background: #141413;
            color: #faf9f5;
            font-weight: 500;
        }
        th.test-col {
            background: #6a9bcc;
        }
        th.query-col { min-width: 200px; }
        td.description {
            font-family: monospace;
            font-size: 11px;
            word-wrap: break-word;
            max-width: 400px;
        }
        td.result {
            text-align: center;
            font-size: 16px;
            min-width: 40px;
        }
        td.test-result {
            background: #f0f6fc;
        }
        .pass { color: #788c5d; }
        .fail { color: #c44; }
        .rate {
            font-size: 9px;
            color: #b0aea5;
            display: block;
        }
        tr:hover { background: #faf9f5; }
        .score {
            display: inline-block;
            padding: 2px 6px;
            border-radius: 4px;
            font-weight: bold;
            font-size: 11px;
        }
        .score-good { background: #eef2e8; color: #788c5d; }
        .score-ok { background: #fef3c7; color: #d97706; }
        .score-bad { background: #fceaea; color: #c44; }
        .train-label { color: #b0aea5; font-size: 10px; }
        .test-label { color: #6a9bcc; font-size: 10px; font-weight: bold; }
        .best-row { background: #f5f8f2; }
        th.positive-col { border-bottom: 3px solid #788c5d; }
        th.negative-col { border-bottom: 3px solid #c44; }
        th.test-col.positive-col { border-bottom: 3px solid #788c5d; }
        th.test-col.negative-col { border-bottom: 3px solid #c44; }
||||
.legend { font-family: 'Poppins', sans-serif; display: flex; gap: 20px; margin-bottom: 10px; font-size: 13px; align-items: center; }
|
||||
.legend-item { display: flex; align-items: center; gap: 6px; }
|
||||
.legend-swatch { width: 16px; height: 16px; border-radius: 3px; display: inline-block; }
|
||||
.swatch-positive { background: #141413; border-bottom: 3px solid #788c5d; }
|
||||
.swatch-negative { background: #141413; border-bottom: 3px solid #c44; }
|
||||
.swatch-test { background: #6a9bcc; }
|
||||
.swatch-train { background: #141413; }
|
||||
</style>
|
||||
</head>
|
||||
<body>
|
||||
<h1>""" + title_prefix + """Skill Description Optimization</h1>
|
||||
<div class="explainer">
|
||||
<strong>Optimizing your skill's description.</strong> This page updates automatically as Claude tests different versions of your skill's description. Each row is an iteration — a new description attempt. The columns show test queries: green checkmarks mean the skill triggered correctly (or correctly didn't trigger), red crosses mean it got it wrong. The "Train" score shows performance on queries used to improve the description; the "Test" score shows performance on held-out queries the optimizer hasn't seen. When it's done, Claude will apply the best-performing description to your skill.
|
||||
</div>
|
||||
"""]
|
||||
|
||||
# Summary section
|
||||
best_test_score = data.get('best_test_score')
|
||||
best_train_score = data.get('best_train_score')
|
||||
html_parts.append(f"""
|
||||
<div class="summary">
|
||||
<p><strong>Original:</strong> {html.escape(data.get('original_description', 'N/A'))}</p>
|
||||
<p class="best"><strong>Best:</strong> {html.escape(data.get('best_description', 'N/A'))}</p>
|
||||
<p><strong>Best Score:</strong> {data.get('best_score', 'N/A')} {'(test)' if best_test_score else '(train)'}</p>
|
||||
<p><strong>Iterations:</strong> {data.get('iterations_run', 0)} | <strong>Train:</strong> {data.get('train_size', '?')} | <strong>Test:</strong> {data.get('test_size', '?')}</p>
|
||||
</div>
|
||||
""")
|
||||
|
||||
# Legend
|
||||
html_parts.append("""
|
||||
<div class="legend">
|
||||
<span style="font-weight:600">Query columns:</span>
|
||||
<span class="legend-item"><span class="legend-swatch swatch-positive"></span> Should trigger</span>
|
||||
<span class="legend-item"><span class="legend-swatch swatch-negative"></span> Should NOT trigger</span>
|
||||
<span class="legend-item"><span class="legend-swatch swatch-train"></span> Train</span>
|
||||
<span class="legend-item"><span class="legend-swatch swatch-test"></span> Test</span>
|
||||
</div>
|
||||
""")
|
||||
|
||||
# Table header
|
||||
html_parts.append("""
|
||||
<div class="table-container">
|
||||
<table>
|
||||
<thead>
|
||||
<tr>
|
||||
<th>Iter</th>
|
||||
<th>Train</th>
|
||||
<th>Test</th>
|
||||
<th class="query-col">Description</th>
|
||||
""")
|
||||
|
||||
# Add column headers for train queries
|
||||
for qinfo in train_queries:
|
||||
polarity = "positive-col" if qinfo["should_trigger"] else "negative-col"
|
||||
html_parts.append(f' <th class="{polarity}">{html.escape(qinfo["query"])}</th>\n')
|
||||
|
||||
# Add column headers for test queries (different color)
|
||||
for qinfo in test_queries:
|
||||
polarity = "positive-col" if qinfo["should_trigger"] else "negative-col"
|
||||
html_parts.append(f' <th class="test-col {polarity}">{html.escape(qinfo["query"])}</th>\n')
|
||||
|
||||
html_parts.append(""" </tr>
|
||||
</thead>
|
||||
<tbody>
|
||||
""")
|
||||
|
||||
# Find best iteration for highlighting
|
||||
if test_queries:
|
||||
best_iter = max(history, key=lambda h: h.get("test_passed") or 0).get("iteration")
|
||||
else:
|
||||
best_iter = max(history, key=lambda h: h.get("train_passed", h.get("passed", 0))).get("iteration")
|
||||
|
||||
# Add rows for each iteration
|
||||
for h in history:
|
||||
iteration = h.get("iteration", "?")
|
||||
train_passed = h.get("train_passed", h.get("passed", 0))
|
||||
train_total = h.get("train_total", h.get("total", 0))
|
||||
test_passed = h.get("test_passed")
|
||||
test_total = h.get("test_total")
|
||||
description = h.get("description", "")
|
||||
train_results = h.get("train_results", h.get("results", []))
|
||||
test_results = h.get("test_results", [])
|
||||
|
||||
# Create lookups for results by query
|
||||
train_by_query = {r["query"]: r for r in train_results}
|
||||
test_by_query = {r["query"]: r for r in test_results} if test_results else {}
|
||||
|
||||
# Compute aggregate correct/total runs across all retries
|
||||
def aggregate_runs(results: list[dict]) -> tuple[int, int]:
|
||||
correct = 0
|
||||
total = 0
|
||||
for r in results:
|
||||
runs = r.get("runs", 0)
|
||||
triggers = r.get("triggers", 0)
|
||||
total += runs
|
||||
if r.get("should_trigger", True):
|
||||
correct += triggers
|
||||
else:
|
||||
correct += runs - triggers
|
||||
return correct, total
|
||||
|
||||
train_correct, train_runs = aggregate_runs(train_results)
|
||||
test_correct, test_runs = aggregate_runs(test_results)
|
||||
|
||||
# Determine score classes
|
||||
def score_class(correct: int, total: int) -> str:
|
||||
if total > 0:
|
||||
ratio = correct / total
|
||||
if ratio >= 0.8:
|
||||
return "score-good"
|
||||
elif ratio >= 0.5:
|
||||
return "score-ok"
|
||||
return "score-bad"
|
||||
|
||||
train_class = score_class(train_correct, train_runs)
|
||||
test_class = score_class(test_correct, test_runs)
|
||||
|
||||
row_class = "best-row" if iteration == best_iter else ""
|
||||
|
||||
html_parts.append(f""" <tr class="{row_class}">
|
||||
<td>{iteration}</td>
|
||||
<td><span class="score {train_class}">{train_correct}/{train_runs}</span></td>
|
||||
<td><span class="score {test_class}">{test_correct}/{test_runs}</span></td>
|
||||
<td class="description">{html.escape(description)}</td>
|
||||
""")
|
||||
|
||||
# Add result for each train query
|
||||
for qinfo in train_queries:
|
||||
r = train_by_query.get(qinfo["query"], {})
|
||||
did_pass = r.get("pass", False)
|
||||
triggers = r.get("triggers", 0)
|
||||
runs = r.get("runs", 0)
|
||||
|
||||
icon = "✓" if did_pass else "✗"
|
||||
css_class = "pass" if did_pass else "fail"
|
||||
|
||||
html_parts.append(f' <td class="result {css_class}">{icon}<span class="rate">{triggers}/{runs}</span></td>\n')
|
||||
|
||||
# Add result for each test query (with different background)
|
||||
for qinfo in test_queries:
|
||||
r = test_by_query.get(qinfo["query"], {})
|
||||
did_pass = r.get("pass", False)
|
||||
triggers = r.get("triggers", 0)
|
||||
runs = r.get("runs", 0)
|
||||
|
||||
icon = "✓" if did_pass else "✗"
|
||||
css_class = "pass" if did_pass else "fail"
|
||||
|
||||
html_parts.append(f' <td class="result test-result {css_class}">{icon}<span class="rate">{triggers}/{runs}</span></td>\n')
|
||||
|
||||
html_parts.append(" </tr>\n")
|
||||
|
||||
html_parts.append(""" </tbody>
|
||||
</table>
|
||||
</div>
|
||||
""")
|
||||
|
||||
html_parts.append("""
|
||||
</body>
|
||||
</html>
|
||||
""")
|
||||
|
||||
return "".join(html_parts)
|
||||
|
||||
|
||||
def main():
|
||||
parser = argparse.ArgumentParser(description="Generate HTML report from run_loop output")
|
||||
parser.add_argument("input", help="Path to JSON output from run_loop.py (or - for stdin)")
|
||||
parser.add_argument("-o", "--output", default=None, help="Output HTML file (default: stdout)")
|
||||
parser.add_argument("--skill-name", default="", help="Skill name to include in the report title")
|
||||
args = parser.parse_args()
|
||||
|
||||
if args.input == "-":
|
||||
data = json.load(sys.stdin)
|
||||
else:
|
||||
data = json.loads(Path(args.input).read_text())
|
||||
|
||||
html_output = generate_html(data, skill_name=args.skill_name)
|
||||
|
||||
if args.output:
|
||||
Path(args.output).write_text(html_output)
|
||||
print(f"Report written to {args.output}", file=sys.stderr)
|
||||
else:
|
||||
print(html_output)
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
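The pass/fail cells in the report are driven by the nested `aggregate_runs` helper above: a trigger counts as correct for a should-trigger query, and a non-trigger counts as correct for a should-NOT-trigger query. A standalone sketch of that same counting rule, exercisable outside the report generator:

```python
def aggregate_runs(results: list[dict]) -> tuple[int, int]:
    """Count (correct, total) runs across retries, mirroring the report helper:
    for should-trigger queries a trigger is correct; for should-NOT-trigger
    queries a non-trigger is correct."""
    correct = 0
    total = 0
    for r in results:
        runs = r.get("runs", 0)
        triggers = r.get("triggers", 0)
        total += runs
        if r.get("should_trigger", True):
            correct += triggers          # triggering was the right outcome
        else:
            correct += runs - triggers   # NOT triggering was the right outcome
    return correct, total

# One positive query (3/5 triggers) and one negative query (1/5 triggers):
results = [
    {"should_trigger": True, "runs": 5, "triggers": 3},
    {"should_trigger": False, "runs": 5, "triggers": 1},
]
print(aggregate_runs(results))  # → (7, 10): 3 correct triggers + 4 correct non-triggers
```

The asymmetry is the whole point: a 0/5 trigger rate is a perfect score for a negative query, so raw trigger counts cannot simply be summed.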
248
.agents/skills/skill-creator/scripts/improve_description.py
Normal file
@@ -0,0 +1,248 @@
#!/usr/bin/env python3
"""Improve a skill description based on eval results.

Takes eval results (from run_eval.py) and generates an improved description
using Claude with extended thinking.
"""

import argparse
import json
import re
import sys
from pathlib import Path

import anthropic

from scripts.utils import parse_skill_md


def improve_description(
    client: anthropic.Anthropic,
    skill_name: str,
    skill_content: str,
    current_description: str,
    eval_results: dict,
    history: list[dict],
    model: str,
    test_results: dict | None = None,
    log_dir: Path | None = None,
    iteration: int | None = None,
) -> str:
    """Call Claude to improve the description based on eval results."""
    failed_triggers = [
        r for r in eval_results["results"]
        if r["should_trigger"] and not r["pass"]
    ]
    false_triggers = [
        r for r in eval_results["results"]
        if not r["should_trigger"] and not r["pass"]
    ]

    # Build scores summary
    train_score = f"{eval_results['summary']['passed']}/{eval_results['summary']['total']}"
    if test_results:
        test_score = f"{test_results['summary']['passed']}/{test_results['summary']['total']}"
        scores_summary = f"Train: {train_score}, Test: {test_score}"
    else:
        scores_summary = f"Train: {train_score}"

    prompt = f"""You are optimizing a skill description for a Claude Code skill called "{skill_name}". A "skill" is sort of like a prompt, but with progressive disclosure -- there's a title and description that Claude sees when deciding whether to use the skill, and then if it does use the skill, it reads the .md file which has lots more details and potentially links to other resources in the skill folder like helper files and scripts and additional documentation or examples.

The description appears in Claude's "available_skills" list. When a user sends a query, Claude decides whether to invoke the skill based solely on the title and on this description. Your goal is to write a description that triggers for relevant queries, and doesn't trigger for irrelevant ones.

Here's the current description:
<current_description>
"{current_description}"
</current_description>

Current scores ({scores_summary}):
<scores_summary>
"""
    if failed_triggers:
        prompt += "FAILED TO TRIGGER (should have triggered but didn't):\n"
        for r in failed_triggers:
            prompt += f' - "{r["query"]}" (triggered {r["triggers"]}/{r["runs"]} times)\n'
        prompt += "\n"

    if false_triggers:
        prompt += "FALSE TRIGGERS (triggered but shouldn't have):\n"
        for r in false_triggers:
            prompt += f' - "{r["query"]}" (triggered {r["triggers"]}/{r["runs"]} times)\n'
        prompt += "\n"

    if history:
        prompt += "PREVIOUS ATTEMPTS (do NOT repeat these — try something structurally different):\n\n"
        for h in history:
            train_s = f"{h.get('train_passed', h.get('passed', 0))}/{h.get('train_total', h.get('total', 0))}"
            test_s = f"{h.get('test_passed', '?')}/{h.get('test_total', '?')}" if h.get('test_passed') is not None else None
            score_str = f"train={train_s}" + (f", test={test_s}" if test_s else "")
            prompt += f'<attempt {score_str}>\n'
            prompt += f'Description: "{h["description"]}"\n'
            if "results" in h:
                prompt += "Train results:\n"
                for r in h["results"]:
                    status = "PASS" if r["pass"] else "FAIL"
                    prompt += f' [{status}] "{r["query"][:80]}" (triggered {r["triggers"]}/{r["runs"]})\n'
            if h.get("note"):
                prompt += f'Note: {h["note"]}\n'
            prompt += "</attempt>\n\n"

    prompt += f"""</scores_summary>

Skill content (for context on what the skill does):
<skill_content>
{skill_content}
</skill_content>

Based on the failures, write a new and improved description that is more likely to trigger correctly. When I say "based on the failures", it's a bit of a tricky line to walk because we don't want to overfit to the specific cases you're seeing. So what I DON'T want you to do is produce an ever-expanding list of specific queries that this skill should or shouldn't trigger for. Instead, try to generalize from the failures to broader categories of user intent and situations where this skill would be useful or not useful. The reason for this is twofold:

1. Avoid overfitting
2. The list might get loooong and it's injected into ALL queries and there might be a lot of skills, so we don't want to blow too much space on any given description.

Concretely, your description should not be more than about 100-200 words, even if that comes at the cost of accuracy.

Here are some tips that we've found to work well in writing these descriptions:
- The skill should be phrased in the imperative -- "Use this skill for" rather than "this skill does"
- The skill description should focus on the user's intent, what they are trying to achieve, vs. the implementation details of how the skill works.
- The description competes with other skills for Claude's attention — make it distinctive and immediately recognizable.
- If you're getting lots of failures after repeated attempts, change things up. Try different sentence structures or wordings.

I'd encourage you to be creative and mix up the style in different iterations since you'll have multiple opportunities to try different approaches and we'll just grab the highest-scoring one at the end.

Please respond with only the new description text in <new_description> tags, nothing else."""

    response = client.messages.create(
        model=model,
        max_tokens=16000,
        thinking={
            "type": "enabled",
            "budget_tokens": 10000,
        },
        messages=[{"role": "user", "content": prompt}],
    )

    # Extract thinking and text from response
    thinking_text = ""
    text = ""
    for block in response.content:
        if block.type == "thinking":
            thinking_text = block.thinking
        elif block.type == "text":
            text = block.text

    # Parse out the <new_description> tags
    match = re.search(r"<new_description>(.*?)</new_description>", text, re.DOTALL)
    description = match.group(1).strip().strip('"') if match else text.strip().strip('"')

    # Log the transcript
    transcript: dict = {
        "iteration": iteration,
        "prompt": prompt,
        "thinking": thinking_text,
        "response": text,
        "parsed_description": description,
        "char_count": len(description),
        "over_limit": len(description) > 1024,
    }

    # If over 1024 chars, ask the model to shorten it
    if len(description) > 1024:
        shorten_prompt = f"Your description is {len(description)} characters, which exceeds the hard 1024 character limit. Please rewrite it to be under 1024 characters while preserving the most important trigger words and intent coverage. Respond with only the new description in <new_description> tags."
        shorten_response = client.messages.create(
            model=model,
            max_tokens=16000,
            thinking={
                "type": "enabled",
                "budget_tokens": 10000,
            },
            messages=[
                {"role": "user", "content": prompt},
                {"role": "assistant", "content": text},
                {"role": "user", "content": shorten_prompt},
            ],
        )

        shorten_thinking = ""
        shorten_text = ""
        for block in shorten_response.content:
            if block.type == "thinking":
                shorten_thinking = block.thinking
            elif block.type == "text":
                shorten_text = block.text

        match = re.search(r"<new_description>(.*?)</new_description>", shorten_text, re.DOTALL)
        shortened = match.group(1).strip().strip('"') if match else shorten_text.strip().strip('"')

        transcript["rewrite_prompt"] = shorten_prompt
        transcript["rewrite_thinking"] = shorten_thinking
        transcript["rewrite_response"] = shorten_text
        transcript["rewrite_description"] = shortened
        transcript["rewrite_char_count"] = len(shortened)
        description = shortened

    transcript["final_description"] = description

    if log_dir:
        log_dir.mkdir(parents=True, exist_ok=True)
        log_file = log_dir / f"improve_iter_{iteration or 'unknown'}.json"
        log_file.write_text(json.dumps(transcript, indent=2))

    return description


def main():
    parser = argparse.ArgumentParser(description="Improve a skill description based on eval results")
    parser.add_argument("--eval-results", required=True, help="Path to eval results JSON (from run_eval.py)")
    parser.add_argument("--skill-path", required=True, help="Path to skill directory")
    parser.add_argument("--history", default=None, help="Path to history JSON (previous attempts)")
    parser.add_argument("--model", required=True, help="Model for improvement")
    parser.add_argument("--verbose", action="store_true", help="Print progress to stderr")
    args = parser.parse_args()

    skill_path = Path(args.skill_path)
    if not (skill_path / "SKILL.md").exists():
        print(f"Error: No SKILL.md found at {skill_path}", file=sys.stderr)
        sys.exit(1)

    eval_results = json.loads(Path(args.eval_results).read_text())
    history = []
    if args.history:
        history = json.loads(Path(args.history).read_text())

    name, _, content = parse_skill_md(skill_path)
    current_description = eval_results["description"]

    if args.verbose:
        print(f"Current: {current_description}", file=sys.stderr)
        print(f"Score: {eval_results['summary']['passed']}/{eval_results['summary']['total']}", file=sys.stderr)

    client = anthropic.Anthropic()
    new_description = improve_description(
        client=client,
        skill_name=name,
        skill_content=content,
        current_description=current_description,
        eval_results=eval_results,
        history=history,
        model=args.model,
    )

    if args.verbose:
        print(f"Improved: {new_description}", file=sys.stderr)

    # Output as JSON with both the new description and updated history
    output = {
        "description": new_description,
        "history": history + [{
            "description": current_description,
            "passed": eval_results["summary"]["passed"],
            "failed": eval_results["summary"]["failed"],
            "total": eval_results["summary"]["total"],
            "results": eval_results["results"],
        }],
    }
    print(json.dumps(output, indent=2))


if __name__ == "__main__":
    main()
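The response-parsing step in `improve_description` appears twice (once for the first reply, once for the shorten retry): grab the contents of the `<new_description>` tags, fall back to the whole response if the tags are missing, then strip whitespace and surrounding quotes. A minimal sketch of that step as a standalone helper (the function name is hypothetical; the regex and stripping match the source):

```python
import re

def parse_new_description(text: str) -> str:
    """Extract the text inside <new_description>...</new_description>,
    falling back to the whole response, then strip whitespace and any
    surrounding double quotes — mirroring improve_description()."""
    match = re.search(r"<new_description>(.*?)</new_description>", text, re.DOTALL)
    return (match.group(1) if match else text).strip().strip('"')

reply = 'Here you go:\n<new_description>"Use this skill to package things."</new_description>'
print(parse_new_description(reply))  # → Use this skill to package things.
```

`re.DOTALL` matters here: without it, a multi-line description inside the tags would fail to match and the fallback would return the model's entire reply, preamble included.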
136
.agents/skills/skill-creator/scripts/package_skill.py
Normal file
@@ -0,0 +1,136 @@
#!/usr/bin/env python3
"""
Skill Packager - Creates a distributable .skill file of a skill folder

Usage:
    python scripts/package_skill.py <path/to/skill-folder> [output-directory]

Example:
    python scripts/package_skill.py skills/public/my-skill
    python scripts/package_skill.py skills/public/my-skill ./dist
"""

import fnmatch
import sys
import zipfile
from pathlib import Path

from scripts.quick_validate import validate_skill

# Patterns to exclude when packaging skills.
EXCLUDE_DIRS = {"__pycache__", "node_modules"}
EXCLUDE_GLOBS = {"*.pyc"}
EXCLUDE_FILES = {".DS_Store"}
# Directories excluded only at the skill root (not when nested deeper).
ROOT_EXCLUDE_DIRS = {"evals"}


def should_exclude(rel_path: Path) -> bool:
    """Check if a path should be excluded from packaging."""
    parts = rel_path.parts
    if any(part in EXCLUDE_DIRS for part in parts):
        return True
    # rel_path is relative to skill_path.parent, so parts[0] is the skill
    # folder name and parts[1] (if present) is the first subdir.
    if len(parts) > 1 and parts[1] in ROOT_EXCLUDE_DIRS:
        return True
    name = rel_path.name
    if name in EXCLUDE_FILES:
        return True
    return any(fnmatch.fnmatch(name, pat) for pat in EXCLUDE_GLOBS)


def package_skill(skill_path, output_dir=None):
    """
    Package a skill folder into a .skill file.

    Args:
        skill_path: Path to the skill folder
        output_dir: Optional output directory for the .skill file (defaults to current directory)

    Returns:
        Path to the created .skill file, or None if error
    """
    skill_path = Path(skill_path).resolve()

    # Validate skill folder exists
    if not skill_path.exists():
        print(f"❌ Error: Skill folder not found: {skill_path}")
        return None

    if not skill_path.is_dir():
        print(f"❌ Error: Path is not a directory: {skill_path}")
        return None

    # Validate SKILL.md exists
    skill_md = skill_path / "SKILL.md"
    if not skill_md.exists():
        print(f"❌ Error: SKILL.md not found in {skill_path}")
        return None

    # Run validation before packaging
    print("🔍 Validating skill...")
    valid, message = validate_skill(skill_path)
    if not valid:
        print(f"❌ Validation failed: {message}")
        print("   Please fix the validation errors before packaging.")
        return None
    print(f"✅ {message}\n")

    # Determine output location
    skill_name = skill_path.name
    if output_dir:
        output_path = Path(output_dir).resolve()
        output_path.mkdir(parents=True, exist_ok=True)
    else:
        output_path = Path.cwd()

    skill_filename = output_path / f"{skill_name}.skill"

    # Create the .skill file (zip format)
    try:
        with zipfile.ZipFile(skill_filename, 'w', zipfile.ZIP_DEFLATED) as zipf:
            # Walk through the skill directory, excluding build artifacts
            for file_path in skill_path.rglob('*'):
                if not file_path.is_file():
                    continue
                arcname = file_path.relative_to(skill_path.parent)
                if should_exclude(arcname):
                    print(f"  Skipped: {arcname}")
                    continue
                zipf.write(file_path, arcname)
                print(f"  Added: {arcname}")

        print(f"\n✅ Successfully packaged skill to: {skill_filename}")
        return skill_filename

    except Exception as e:
        print(f"❌ Error creating .skill file: {e}")
        return None


def main():
    if len(sys.argv) < 2:
        print("Usage: python scripts/package_skill.py <path/to/skill-folder> [output-directory]")
        print("\nExample:")
        print("  python scripts/package_skill.py skills/public/my-skill")
        print("  python scripts/package_skill.py skills/public/my-skill ./dist")
        sys.exit(1)

    skill_path = sys.argv[1]
    output_dir = sys.argv[2] if len(sys.argv) > 2 else None

    print(f"📦 Packaging skill: {skill_path}")
    if output_dir:
        print(f"   Output directory: {output_dir}")
    print()

    result = package_skill(skill_path, output_dir)

    if result:
        sys.exit(0)
    else:
        sys.exit(1)


if __name__ == "__main__":
    main()
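The subtle part of `should_exclude` is the root-only rule: because archive paths are relative to the skill's parent directory, `parts[0]` is the skill folder itself, so `parts[1]` is the first directory inside the skill. That is why `evals/` is dropped only at the skill root while a nested `docs/evals/` survives. A minimal sketch isolating just the directory rules (the glob and filename checks are omitted; `excluded` is a hypothetical name):

```python
from pathlib import Path

# Same directory rules as should_exclude() in package_skill.py.
EXCLUDE_DIRS = {"__pycache__", "node_modules"}   # excluded at any depth
ROOT_EXCLUDE_DIRS = {"evals"}                    # excluded only at the skill root

def excluded(rel_path: Path) -> bool:
    parts = rel_path.parts
    if any(part in EXCLUDE_DIRS for part in parts):
        return True
    # parts[0] is the skill folder name, so parts[1] is a root-level subdir
    return len(parts) > 1 and parts[1] in ROOT_EXCLUDE_DIRS

print(excluded(Path("my-skill/evals/cases.json")))           # → True  (root-level evals/)
print(excluded(Path("my-skill/docs/evals/cases.json")))      # → False (nested evals/ kept)
print(excluded(Path("my-skill/scripts/__pycache__/x.pyc")))  # → True  (any depth)
```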
103
.agents/skills/skill-creator/scripts/quick_validate.py
Normal file
@@ -0,0 +1,103 @@
#!/usr/bin/env python3
"""
Quick validation script for skills - minimal version
"""

import sys
import re
import yaml
from pathlib import Path


def validate_skill(skill_path):
    """Basic validation of a skill"""
    skill_path = Path(skill_path)

    # Check SKILL.md exists
    skill_md = skill_path / 'SKILL.md'
    if not skill_md.exists():
        return False, "SKILL.md not found"

    # Read and validate frontmatter
    content = skill_md.read_text()
    if not content.startswith('---'):
        return False, "No YAML frontmatter found"

    # Extract frontmatter
    match = re.match(r'^---\n(.*?)\n---', content, re.DOTALL)
    if not match:
        return False, "Invalid frontmatter format"

    frontmatter_text = match.group(1)

    # Parse YAML frontmatter
    try:
        frontmatter = yaml.safe_load(frontmatter_text)
        if not isinstance(frontmatter, dict):
            return False, "Frontmatter must be a YAML dictionary"
    except yaml.YAMLError as e:
        return False, f"Invalid YAML in frontmatter: {e}"

    # Define allowed properties
    ALLOWED_PROPERTIES = {'name', 'description', 'license', 'allowed-tools', 'metadata', 'compatibility'}

    # Check for unexpected properties (excluding nested keys under metadata)
    unexpected_keys = set(frontmatter.keys()) - ALLOWED_PROPERTIES
    if unexpected_keys:
        return False, (
            f"Unexpected key(s) in SKILL.md frontmatter: {', '.join(sorted(unexpected_keys))}. "
            f"Allowed properties are: {', '.join(sorted(ALLOWED_PROPERTIES))}"
        )

    # Check required fields
    if 'name' not in frontmatter:
        return False, "Missing 'name' in frontmatter"
    if 'description' not in frontmatter:
        return False, "Missing 'description' in frontmatter"

    # Extract name for validation
    name = frontmatter.get('name', '')
    if not isinstance(name, str):
        return False, f"Name must be a string, got {type(name).__name__}"
    name = name.strip()
    if name:
        # Check naming convention (kebab-case: lowercase with hyphens)
        if not re.match(r'^[a-z0-9-]+$', name):
            return False, f"Name '{name}' should be kebab-case (lowercase letters, digits, and hyphens only)"
        if name.startswith('-') or name.endswith('-') or '--' in name:
            return False, f"Name '{name}' cannot start/end with hyphen or contain consecutive hyphens"
        # Check name length (max 64 characters per spec)
        if len(name) > 64:
            return False, f"Name is too long ({len(name)} characters). Maximum is 64 characters."

    # Extract and validate description
    description = frontmatter.get('description', '')
    if not isinstance(description, str):
        return False, f"Description must be a string, got {type(description).__name__}"
    description = description.strip()
    if description:
        # Check for angle brackets
        if '<' in description or '>' in description:
            return False, "Description cannot contain angle brackets (< or >)"
        # Check description length (max 1024 characters per spec)
        if len(description) > 1024:
            return False, f"Description is too long ({len(description)} characters). Maximum is 1024 characters."

    # Validate compatibility field if present (optional)
    compatibility = frontmatter.get('compatibility', '')
    if compatibility:
        if not isinstance(compatibility, str):
            return False, f"Compatibility must be a string, got {type(compatibility).__name__}"
        if len(compatibility) > 500:
            return False, f"Compatibility is too long ({len(compatibility)} characters). Maximum is 500 characters."

    return True, "Skill is valid!"


if __name__ == "__main__":
    if len(sys.argv) != 2:
        print("Usage: python quick_validate.py <skill_directory>")
        sys.exit(1)

    valid, message = validate_skill(sys.argv[1])
    print(message)
    sys.exit(0 if valid else 1)
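The name checks in `validate_skill` encode a three-part kebab-case rule: only lowercase letters, digits, and hyphens; no leading, trailing, or consecutive hyphens; and at most 64 characters. A standalone sketch of just that rule (the wrapper name `valid_skill_name` is hypothetical; the regex and checks are copied from the validator):

```python
import re

def valid_skill_name(name: str) -> bool:
    """The kebab-case rule validate_skill() enforces on the frontmatter name."""
    if not re.match(r'^[a-z0-9-]+$', name):
        return False  # only lowercase letters, digits, and hyphens
    if name.startswith('-') or name.endswith('-') or '--' in name:
        return False  # hyphens must separate words, not lead/trail/repeat
    return len(name) <= 64

print(valid_skill_name("skill-creator"))  # → True
print(valid_skill_name("Skill_Creator"))  # → False (uppercase and underscore)
print(valid_skill_name("bad--name"))      # → False (consecutive hyphens)
```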
310
.agents/skills/skill-creator/scripts/run_eval.py
Normal file
@@ -0,0 +1,310 @@
#!/usr/bin/env python3
"""Run trigger evaluation for a skill description.

Tests whether a skill's description causes Claude to trigger (read the skill)
for a set of queries. Outputs results as JSON.
"""

import argparse
import json
import os
import select
import subprocess
import sys
import time
import uuid
from concurrent.futures import ProcessPoolExecutor, as_completed
from pathlib import Path

from scripts.utils import parse_skill_md


def find_project_root() -> Path:
    """Find the project root by walking up from cwd looking for .claude/.

    Mimics how Claude Code discovers its project root, so the command file
    we create ends up where claude -p will look for it.
    """
    current = Path.cwd()
    for parent in [current, *current.parents]:
        if (parent / ".claude").is_dir():
            return parent
    return current


def run_single_query(
    query: str,
    skill_name: str,
    skill_description: str,
    timeout: int,
    project_root: str,
    model: str | None = None,
) -> bool:
    """Run a single query and return whether the skill was triggered.

    Creates a command file in .claude/commands/ so it appears in Claude's
    available_skills list, then runs `claude -p` with the raw query.
    Uses --include-partial-messages to detect triggering early from
    stream events (content_block_start) rather than waiting for the
    full assistant message, which only arrives after tool execution.
    """
    unique_id = uuid.uuid4().hex[:8]
    clean_name = f"{skill_name}-skill-{unique_id}"
    project_commands_dir = Path(project_root) / ".claude" / "commands"
    command_file = project_commands_dir / f"{clean_name}.md"

    try:
        project_commands_dir.mkdir(parents=True, exist_ok=True)
        # Use YAML block scalar to avoid breaking on quotes in description
        indented_desc = "\n  ".join(skill_description.split("\n"))
        command_content = (
            f"---\n"
            f"description: |\n"
            f"  {indented_desc}\n"
            f"---\n\n"
            f"# {skill_name}\n\n"
            f"This skill handles: {skill_description}\n"
        )
        command_file.write_text(command_content)

        cmd = [
            "claude",
            "-p", query,
            "--output-format", "stream-json",
            "--verbose",
            "--include-partial-messages",
        ]
        if model:
            cmd.extend(["--model", model])

        # Remove CLAUDECODE env var to allow nesting claude -p inside a
        # Claude Code session. The guard is for interactive terminal conflicts;
        # programmatic subprocess usage is safe.
        env = {k: v for k, v in os.environ.items() if k != "CLAUDECODE"}

        process = subprocess.Popen(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,
            cwd=project_root,
            env=env,
        )

        triggered = False
        start_time = time.time()
        buffer = ""
        # Track state for stream event detection
        pending_tool_name = None
        accumulated_json = ""

        try:
            while time.time() - start_time < timeout:
                if process.poll() is not None:
                    remaining = process.stdout.read()
                    if remaining:
                        buffer += remaining.decode("utf-8", errors="replace")
                    break

                ready, _, _ = select.select([process.stdout], [], [], 1.0)
                if not ready:
                    continue

                chunk = os.read(process.stdout.fileno(), 8192)
                if not chunk:
                    break
                buffer += chunk.decode("utf-8", errors="replace")

                while "\n" in buffer:
                    line, buffer = buffer.split("\n", 1)
                    line = line.strip()
                    if not line:
                        continue

                    try:
                        event = json.loads(line)
                    except json.JSONDecodeError:
                        continue

                    # Early detection via stream events
                    if event.get("type") == "stream_event":
                        se = event.get("event", {})
                        se_type = se.get("type", "")

                        if se_type == "content_block_start":
                            cb = se.get("content_block", {})
                            if cb.get("type") == "tool_use":
                                tool_name = cb.get("name", "")
                                if tool_name in ("Skill", "Read"):
                                    pending_tool_name = tool_name
                                    accumulated_json = ""
                                else:
                                    return False

                        elif se_type == "content_block_delta" and pending_tool_name:
                            delta = se.get("delta", {})
                            if delta.get("type") == "input_json_delta":
                                accumulated_json += delta.get("partial_json", "")
                                if clean_name in accumulated_json:
                                    return True

                        elif se_type in ("content_block_stop", "message_stop"):
                            if pending_tool_name:
                                return clean_name in accumulated_json
                            if se_type == "message_stop":
                                return False

                    # Fallback: full assistant message
                    elif event.get("type") == "assistant":
                        message = event.get("message", {})
                        for content_item in message.get("content", []):
                            if content_item.get("type") != "tool_use":
                                continue
                            tool_name = content_item.get("name", "")
                            tool_input = content_item.get("input", {})
                            if tool_name == "Skill" and clean_name in tool_input.get("skill", ""):
                                triggered = True
                            elif tool_name == "Read" and clean_name in tool_input.get("file_path", ""):
                                triggered = True
                        return triggered

                    elif event.get("type") == "result":
                        return triggered
        finally:
            # Clean up process on any exit path (return, exception, timeout)
            if process.poll() is None:
                process.kill()
                process.wait()

        return triggered
    finally:
        if command_file.exists():
            command_file.unlink()


def run_eval(
    eval_set: list[dict],
    skill_name: str,
    description: str,
    num_workers: int,
    timeout: int,
    project_root: Path,
    runs_per_query: int = 1,
    trigger_threshold: float = 0.5,
    model: str | None = None,
) -> dict:
    """Run the full eval set and return results."""
    results = []

    with ProcessPoolExecutor(max_workers=num_workers) as executor:
        future_to_info = {}
        for item in eval_set:
            for run_idx in range(runs_per_query):
                future = executor.submit(
                    run_single_query,
                    item["query"],
                    skill_name,
                    description,
                    timeout,
                    str(project_root),
                    model,
                )
                future_to_info[future] = (item, run_idx)

        query_triggers: dict[str, list[bool]] = {}
        query_items: dict[str, dict] = {}
        for future in as_completed(future_to_info):
            item, _ = future_to_info[future]
            query = item["query"]
            query_items[query] = item
            if query not in query_triggers:
                query_triggers[query] = []
            try:
                query_triggers[query].append(future.result())
            except Exception as e:
                print(f"Warning: query failed: {e}", file=sys.stderr)
                query_triggers[query].append(False)

    for query, triggers in query_triggers.items():
        item = query_items[query]
        trigger_rate = sum(triggers) / len(triggers)
        should_trigger = item["should_trigger"]
        if should_trigger:
            did_pass = trigger_rate >= trigger_threshold
        else:
            did_pass = trigger_rate < trigger_threshold
        results.append({
            "query": query,
            "should_trigger": should_trigger,
            "trigger_rate": trigger_rate,
            "triggers": sum(triggers),
            "runs": len(triggers),
            "pass": did_pass,
        })

    passed = sum(1 for r in results if r["pass"])
    total = len(results)

    return {
        "skill_name": skill_name,
        "description": description,
        "results": results,
        "summary": {
            "total": total,
            "passed": passed,
            "failed": total - passed,
        },
    }


def main():
    parser = argparse.ArgumentParser(description="Run trigger evaluation for a skill description")
    parser.add_argument("--eval-set", required=True, help="Path to eval set JSON file")
    parser.add_argument("--skill-path", required=True, help="Path to skill directory")
    parser.add_argument("--description", default=None, help="Override description to test")
    parser.add_argument("--num-workers", type=int, default=10, help="Number of parallel workers")
    parser.add_argument("--timeout", type=int, default=30, help="Timeout per query in seconds")
    parser.add_argument("--runs-per-query", type=int, default=3, help="Number of runs per query")
    parser.add_argument("--trigger-threshold", type=float, default=0.5, help="Trigger rate threshold")
    parser.add_argument("--model", default=None, help="Model to use for claude -p (default: user's configured model)")
    parser.add_argument("--verbose", action="store_true", help="Print progress to stderr")
    args = parser.parse_args()

    eval_set = json.loads(Path(args.eval_set).read_text())
    skill_path = Path(args.skill_path)

    if not (skill_path / "SKILL.md").exists():
        print(f"Error: No SKILL.md found at {skill_path}", file=sys.stderr)
        sys.exit(1)

    name, original_description, content = parse_skill_md(skill_path)
    description = args.description or original_description
    project_root = find_project_root()

    if args.verbose:
        print(f"Evaluating: {description}", file=sys.stderr)

    output = run_eval(
        eval_set=eval_set,
        skill_name=name,
        description=description,
        num_workers=args.num_workers,
        timeout=args.timeout,
        project_root=project_root,
        runs_per_query=args.runs_per_query,
        trigger_threshold=args.trigger_threshold,
        model=args.model,
    )

    if args.verbose:
        summary = output["summary"]
        print(f"Results: {summary['passed']}/{summary['total']} passed", file=sys.stderr)
        for r in output["results"]:
            status = "PASS" if r["pass"] else "FAIL"
            rate_str = f"{r['triggers']}/{r['runs']}"
            print(f"  [{status}] rate={rate_str} expected={r['should_trigger']}: {r['query'][:70]}", file=sys.stderr)

    print(json.dumps(output, indent=2))


if __name__ == "__main__":
    main()
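The pass/fail rule `run_eval` applies to each query's trigger rates can be distilled into a small standalone sketch (the helper name `passes` is hypothetical; the script above inlines this logic):

```python
def passes(triggers: list[bool], should_trigger: bool, threshold: float = 0.5) -> bool:
    """Mirror of run_eval's rule: queries expected to trigger must reach the
    threshold rate; queries expected NOT to trigger must stay below it."""
    rate = sum(triggers) / len(triggers)
    return rate >= threshold if should_trigger else rate < threshold
```

Note the asymmetry at the boundary: a trigger rate exactly equal to the threshold counts as a pass for a positive query but as a failure for a negative one.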
332  .agents/skills/skill-creator/scripts/run_loop.py  Normal file
@@ -0,0 +1,332 @@
#!/usr/bin/env python3
"""Run the eval + improve loop until all pass or max iterations reached.

Combines run_eval.py and improve_description.py in a loop, tracking history
and returning the best description found. Supports train/test split to prevent
overfitting.
"""

import argparse
import json
import random
import sys
import tempfile
import time
import webbrowser
from pathlib import Path

import anthropic

from scripts.generate_report import generate_html
from scripts.improve_description import improve_description
from scripts.run_eval import find_project_root, run_eval
from scripts.utils import parse_skill_md


def split_eval_set(eval_set: list[dict], holdout: float, seed: int = 42) -> tuple[list[dict], list[dict]]:
    """Split eval set into train and test sets, stratified by should_trigger."""
    random.seed(seed)

    # Separate by should_trigger
    trigger = [e for e in eval_set if e["should_trigger"]]
    no_trigger = [e for e in eval_set if not e["should_trigger"]]

    # Shuffle each group
    random.shuffle(trigger)
    random.shuffle(no_trigger)

    # Calculate split points
    n_trigger_test = max(1, int(len(trigger) * holdout))
    n_no_trigger_test = max(1, int(len(no_trigger) * holdout))

    # Split
    test_set = trigger[:n_trigger_test] + no_trigger[:n_no_trigger_test]
    train_set = trigger[n_trigger_test:] + no_trigger[n_no_trigger_test:]

    return train_set, test_set


def run_loop(
    eval_set: list[dict],
    skill_path: Path,
    description_override: str | None,
    num_workers: int,
    timeout: int,
    max_iterations: int,
    runs_per_query: int,
    trigger_threshold: float,
    holdout: float,
    model: str,
    verbose: bool,
    live_report_path: Path | None = None,
    log_dir: Path | None = None,
) -> dict:
    """Run the eval + improvement loop."""
    project_root = find_project_root()
    name, original_description, content = parse_skill_md(skill_path)
    current_description = description_override or original_description

    # Split into train/test if holdout > 0
    if holdout > 0:
        train_set, test_set = split_eval_set(eval_set, holdout)
        if verbose:
            print(f"Split: {len(train_set)} train, {len(test_set)} test (holdout={holdout})", file=sys.stderr)
    else:
        train_set = eval_set
        test_set = []

    client = anthropic.Anthropic()
    history = []
    exit_reason = "unknown"

    for iteration in range(1, max_iterations + 1):
        if verbose:
            print(f"\n{'='*60}", file=sys.stderr)
            print(f"Iteration {iteration}/{max_iterations}", file=sys.stderr)
            print(f"Description: {current_description}", file=sys.stderr)
            print(f"{'='*60}", file=sys.stderr)

        # Evaluate train + test together in one batch for parallelism
        all_queries = train_set + test_set
        t0 = time.time()
        all_results = run_eval(
            eval_set=all_queries,
            skill_name=name,
            description=current_description,
            num_workers=num_workers,
            timeout=timeout,
            project_root=project_root,
            runs_per_query=runs_per_query,
            trigger_threshold=trigger_threshold,
            model=model,
        )
        eval_elapsed = time.time() - t0

        # Split results back into train/test by matching queries
        train_queries_set = {q["query"] for q in train_set}
        train_result_list = [r for r in all_results["results"] if r["query"] in train_queries_set]
        test_result_list = [r for r in all_results["results"] if r["query"] not in train_queries_set]

        train_passed = sum(1 for r in train_result_list if r["pass"])
        train_total = len(train_result_list)
        train_summary = {"passed": train_passed, "failed": train_total - train_passed, "total": train_total}
        train_results = {"results": train_result_list, "summary": train_summary}

        if test_set:
            test_passed = sum(1 for r in test_result_list if r["pass"])
            test_total = len(test_result_list)
            test_summary = {"passed": test_passed, "failed": test_total - test_passed, "total": test_total}
            test_results = {"results": test_result_list, "summary": test_summary}
        else:
            test_results = None
            test_summary = None

        history.append({
            "iteration": iteration,
            "description": current_description,
            "train_passed": train_summary["passed"],
            "train_failed": train_summary["failed"],
            "train_total": train_summary["total"],
            "train_results": train_results["results"],
            "test_passed": test_summary["passed"] if test_summary else None,
            "test_failed": test_summary["failed"] if test_summary else None,
            "test_total": test_summary["total"] if test_summary else None,
            "test_results": test_results["results"] if test_results else None,
            # For backward compat with report generator
            "passed": train_summary["passed"],
            "failed": train_summary["failed"],
            "total": train_summary["total"],
            "results": train_results["results"],
        })

        # Write live report if path provided
        if live_report_path:
            partial_output = {
                "original_description": original_description,
                "best_description": current_description,
                "best_score": "in progress",
                "iterations_run": len(history),
                "holdout": holdout,
                "train_size": len(train_set),
                "test_size": len(test_set),
                "history": history,
            }
            live_report_path.write_text(generate_html(partial_output, auto_refresh=True, skill_name=name))

        if verbose:
            def print_eval_stats(label, results, elapsed):
                pos = [r for r in results if r["should_trigger"]]
                neg = [r for r in results if not r["should_trigger"]]
                tp = sum(r["triggers"] for r in pos)
                pos_runs = sum(r["runs"] for r in pos)
                fn = pos_runs - tp
                fp = sum(r["triggers"] for r in neg)
                neg_runs = sum(r["runs"] for r in neg)
                tn = neg_runs - fp
                total = tp + tn + fp + fn
                precision = tp / (tp + fp) if (tp + fp) > 0 else 1.0
                recall = tp / (tp + fn) if (tp + fn) > 0 else 1.0
                accuracy = (tp + tn) / total if total > 0 else 0.0
                print(f"{label}: {tp+tn}/{total} correct, precision={precision:.0%} recall={recall:.0%} accuracy={accuracy:.0%} ({elapsed:.1f}s)", file=sys.stderr)
                for r in results:
                    status = "PASS" if r["pass"] else "FAIL"
                    rate_str = f"{r['triggers']}/{r['runs']}"
                    print(f"  [{status}] rate={rate_str} expected={r['should_trigger']}: {r['query'][:60]}", file=sys.stderr)

            print_eval_stats("Train", train_results["results"], eval_elapsed)
            if test_summary:
                print_eval_stats("Test ", test_results["results"], 0)

        if train_summary["failed"] == 0:
            exit_reason = f"all_passed (iteration {iteration})"
            if verbose:
                print(f"\nAll train queries passed on iteration {iteration}!", file=sys.stderr)
            break

        if iteration == max_iterations:
            exit_reason = f"max_iterations ({max_iterations})"
            if verbose:
                print(f"\nMax iterations reached ({max_iterations}).", file=sys.stderr)
            break

        # Improve the description based on train results
        if verbose:
            print("\nImproving description...", file=sys.stderr)

        t0 = time.time()
        # Strip test scores from history so improvement model can't see them
        blinded_history = [
            {k: v for k, v in h.items() if not k.startswith("test_")}
            for h in history
        ]
        new_description = improve_description(
            client=client,
            skill_name=name,
            skill_content=content,
            current_description=current_description,
            eval_results=train_results,
            history=blinded_history,
            model=model,
            log_dir=log_dir,
            iteration=iteration,
        )
        improve_elapsed = time.time() - t0

        if verbose:
            print(f"Proposed ({improve_elapsed:.1f}s): {new_description}", file=sys.stderr)

        current_description = new_description

    # Find the best iteration by TEST score (or train if no test set)
    if test_set:
        best = max(history, key=lambda h: h["test_passed"] or 0)
        best_score = f"{best['test_passed']}/{best['test_total']}"
    else:
        best = max(history, key=lambda h: h["train_passed"])
        best_score = f"{best['train_passed']}/{best['train_total']}"

    if verbose:
        print(f"\nExit reason: {exit_reason}", file=sys.stderr)
        print(f"Best score: {best_score} (iteration {best['iteration']})", file=sys.stderr)

    return {
        "exit_reason": exit_reason,
        "original_description": original_description,
        "best_description": best["description"],
        "best_score": best_score,
        "best_train_score": f"{best['train_passed']}/{best['train_total']}",
        "best_test_score": f"{best['test_passed']}/{best['test_total']}" if test_set else None,
        "final_description": current_description,
        "iterations_run": len(history),
        "holdout": holdout,
        "train_size": len(train_set),
        "test_size": len(test_set),
        "history": history,
    }


def main():
    parser = argparse.ArgumentParser(description="Run eval + improve loop")
    parser.add_argument("--eval-set", required=True, help="Path to eval set JSON file")
    parser.add_argument("--skill-path", required=True, help="Path to skill directory")
    parser.add_argument("--description", default=None, help="Override starting description")
    parser.add_argument("--num-workers", type=int, default=10, help="Number of parallel workers")
    parser.add_argument("--timeout", type=int, default=30, help="Timeout per query in seconds")
    parser.add_argument("--max-iterations", type=int, default=5, help="Max improvement iterations")
    parser.add_argument("--runs-per-query", type=int, default=3, help="Number of runs per query")
    parser.add_argument("--trigger-threshold", type=float, default=0.5, help="Trigger rate threshold")
    parser.add_argument("--holdout", type=float, default=0.4, help="Fraction of eval set to hold out for testing (0 to disable)")
    parser.add_argument("--model", required=True, help="Model for improvement")
    parser.add_argument("--verbose", action="store_true", help="Print progress to stderr")
    parser.add_argument("--report", default="auto", help="Generate HTML report at this path (default: 'auto' for temp file, 'none' to disable)")
    parser.add_argument("--results-dir", default=None, help="Save all outputs (results.json, report.html, log.txt) to a timestamped subdirectory here")
    args = parser.parse_args()

    eval_set = json.loads(Path(args.eval_set).read_text())
    skill_path = Path(args.skill_path)

    if not (skill_path / "SKILL.md").exists():
        print(f"Error: No SKILL.md found at {skill_path}", file=sys.stderr)
        sys.exit(1)

    name, _, _ = parse_skill_md(skill_path)

    # Set up live report path
    if args.report != "none":
        if args.report == "auto":
            timestamp = time.strftime("%Y%m%d_%H%M%S")
            live_report_path = Path(tempfile.gettempdir()) / f"skill_description_report_{skill_path.name}_{timestamp}.html"
        else:
            live_report_path = Path(args.report)
        # Open the report immediately so the user can watch
        live_report_path.write_text("<html><body><h1>Starting optimization loop...</h1><meta http-equiv='refresh' content='5'></body></html>")
        webbrowser.open(str(live_report_path))
    else:
        live_report_path = None

    # Determine output directory (create before run_loop so logs can be written)
    if args.results_dir:
        timestamp = time.strftime("%Y-%m-%d_%H%M%S")
        results_dir = Path(args.results_dir) / timestamp
        results_dir.mkdir(parents=True, exist_ok=True)
    else:
        results_dir = None

    log_dir = results_dir / "logs" if results_dir else None

    output = run_loop(
        eval_set=eval_set,
        skill_path=skill_path,
        description_override=args.description,
        num_workers=args.num_workers,
        timeout=args.timeout,
        max_iterations=args.max_iterations,
        runs_per_query=args.runs_per_query,
        trigger_threshold=args.trigger_threshold,
        holdout=args.holdout,
        model=args.model,
        verbose=args.verbose,
        live_report_path=live_report_path,
        log_dir=log_dir,
    )

    # Save JSON output
    json_output = json.dumps(output, indent=2)
    print(json_output)
    if results_dir:
        (results_dir / "results.json").write_text(json_output)

    # Write final HTML report (without auto-refresh)
    if live_report_path:
        live_report_path.write_text(generate_html(output, auto_refresh=False, skill_name=name))
        print(f"\nReport: {live_report_path}", file=sys.stderr)

    if results_dir and live_report_path:
        (results_dir / "report.html").write_text(generate_html(output, auto_refresh=False, skill_name=name))

    if results_dir:
        print(f"Results saved to: {results_dir}", file=sys.stderr)


if __name__ == "__main__":
    main()
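The stratified holdout split used by `run_loop` can be sketched independently. A minimal mirror of `split_eval_set` (same item shape assumed; `split_stratified` is a hypothetical name for illustration):

```python
import random

def split_stratified(items: list[dict], holdout: float, seed: int = 42) -> tuple[list[dict], list[dict]]:
    """Minimal mirror of split_eval_set: stratify by should_trigger and
    hold out at least one item per class, so the test set is never
    missing either positives or negatives."""
    random.seed(seed)
    pos = [e for e in items if e["should_trigger"]]
    neg = [e for e in items if not e["should_trigger"]]
    random.shuffle(pos)
    random.shuffle(neg)
    n_pos = max(1, int(len(pos) * holdout))   # at least one positive held out
    n_neg = max(1, int(len(neg) * holdout))   # at least one negative held out
    test = pos[:n_pos] + neg[:n_neg]
    train = pos[n_pos:] + neg[n_neg:]
    return train, test

# 6 positive, 4 negative queries; holdout=0.4 keeps 2 pos + 1 neg for test
items = [{"query": f"q{i}", "should_trigger": i < 6} for i in range(10)]
train, test = split_stratified(items, holdout=0.4)
```

The `max(1, ...)` floor is what keeps the test-side scores meaningful even on tiny eval sets, at the cost of a slightly larger effective holdout fraction.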
47  .agents/skills/skill-creator/scripts/utils.py  Normal file
@@ -0,0 +1,47 @@
"""Shared utilities for skill-creator scripts."""

from pathlib import Path


def parse_skill_md(skill_path: Path) -> tuple[str, str, str]:
    """Parse a SKILL.md file, returning (name, description, full_content)."""
    content = (skill_path / "SKILL.md").read_text()
    lines = content.split("\n")

    if lines[0].strip() != "---":
        raise ValueError("SKILL.md missing frontmatter (no opening ---)")

    end_idx = None
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            end_idx = i
            break

    if end_idx is None:
        raise ValueError("SKILL.md missing frontmatter (no closing ---)")

    name = ""
    description = ""
    frontmatter_lines = lines[1:end_idx]
    i = 0
    while i < len(frontmatter_lines):
        line = frontmatter_lines[i]
        if line.startswith("name:"):
            name = line[len("name:"):].strip().strip('"').strip("'")
        elif line.startswith("description:"):
            value = line[len("description:"):].strip()
            # Handle YAML multiline indicators (>, |, >-, |-)
            if value in (">", "|", ">-", "|-"):
                continuation_lines: list[str] = []
                i += 1
                while i < len(frontmatter_lines) and (frontmatter_lines[i].startswith(" ") or frontmatter_lines[i].startswith("\t")):
                    continuation_lines.append(frontmatter_lines[i].strip())
                    i += 1
                description = " ".join(continuation_lines)
                continue
            else:
                description = value.strip('"').strip("'")
        i += 1

    return name, description, content
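The frontmatter handling in `parse_skill_md` can be exercised with a self-contained sketch (simplified: it collects all `key: value` pairs rather than just `name` and `description`; `parse_frontmatter` and `SAMPLE` are illustrative names):

```python
SAMPLE = """---
name: demo-skill
description: >
  A multi-line
  description value.
---

# Body
"""

def parse_frontmatter(text: str) -> dict:
    """Simplified sketch of parse_skill_md's frontmatter pass: read
    key: value pairs between the two --- markers, folding >/| block
    scalars onto a single space-joined line."""
    lines = text.split("\n")
    if lines[0].strip() != "---":
        raise ValueError("missing opening ---")
    end = next(i for i, l in enumerate(lines[1:], start=1) if l.strip() == "---")
    fm = lines[1:end]
    fields: dict[str, str] = {}
    i = 0
    while i < len(fm):
        line = fm[i]
        if ":" in line and not line.startswith((" ", "\t")):
            key, value = line.split(":", 1)
            value = value.strip()
            if value in (">", "|", ">-", "|-"):
                # Block scalar: consume indented continuation lines
                parts = []
                i += 1
                while i < len(fm) and fm[i].startswith((" ", "\t")):
                    parts.append(fm[i].strip())
                    i += 1
                fields[key.strip()] = " ".join(parts)
                continue  # i already points past the block
            fields[key.strip()] = value.strip('"').strip("'")
        i += 1
    return fields

parsed = parse_frontmatter(SAMPLE)
```

Like the original, this is a line-oriented approximation of YAML, not a full parser; quoted colons or nested mappings in the frontmatter would need a real YAML library.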
11  .idea/go.imports.xml  generated  Normal file
@@ -0,0 +1,11 @@
<?xml version="1.0" encoding="UTF-8"?>
<project version="4">
  <component name="GoImports">
    <option name="excludedPackages">
      <array>
        <option value="github.com/pkg/errors" />
        <option value="golang.org/x/net/context" />
      </array>
    </option>
  </component>
</project>
7  .idea/sqldialects.xml  generated  Normal file
@@ -0,0 +1,7 @@
<?xml version="1.0" encoding="UTF-8"?>
<project version="4">
  <component name="SqlDialectMappings">
    <file url="file://$PROJECT_DIR$/101-数据库学习/3-SQLite/9-题目/SQLite基础考察.sql" dialect="SQLite" />
    <file url="PROJECT" dialect="SQLite" />
  </component>
</project>
BIN  0-pandoc-失败/epub-失败/output.epub  Normal file  Binary file not shown.
BIN  0-pandoc-失败/output.pdf  Normal file  Binary file not shown.
0  0-pandoc-失败/pdf/mermaid-filter.err  Normal file
@@ -1,141 +0,0 @@
---
|
||||
name: coding-vue3-vuetify
|
||||
description: Build production-grade Vue 3 + TypeScript + Vuetify 3 interfaces with architectural rigor. Use when creating Vue components, pages, layouts, Pinia stores, or API modules. Enforces strict typing, Composition API patterns, Material Design 3 aesthetics, and bulletproof data handling.
|
||||
---
|
||||
|
||||
This skill crafts Vue 3 + Vuetify 3 code that is architecturally sound, type-safe to the bone, and visually polished. Every component should feel like it belongs in a production codebase that senior engineers would be proud to maintain.
|
||||
|
||||
The user provides: $ARGUMENTS (component specs, page requirements, feature requests, or architectural questions).
|
||||
|
||||
## Architectural Thinking
|
||||
|
||||
Before writing a single line, establish clarity:
|
||||
|
||||
- **Component Identity**: Is this a Page, Layout, Reusable Component, Composable, Store, or API Module? Each has distinct patterns.
|
||||
- **Data Gravity**: Where does state live? Props flow down, events bubble up. Pinia for cross-component state. `provide/inject` for deep hierarchies.
|
||||
- **Scroll Strategy**: Which container owns the scroll? Never the body. Always explicit. Always controlled.
|
||||
- **Failure Modes**: What happens when data is `null`? Empty array? Network timeout? Design for the unhappy path first.
|
||||
|
||||
**CRITICAL**: Production code anticipates chaos. Type everything. Guard everything. Gracefully degrade everything.
|
||||
|
||||
## Core Dogma
|
||||
|
||||
### TypeScript Absolutism
|
||||
- `<script setup lang="ts">` — the ONLY acceptable incantation
|
||||
- `any` is forbidden — use `unknown` + type guards, generics, utility types
|
||||
- Every prop, emit, ref, and API response wears its type proudly
|
||||
- Types live in `@/types/`, organized by domain: `user.d.ts`, `order.d.ts`
|
||||
|
||||
### Composition API Purity
|
||||
- `ref`, `reactive`, `computed`, `watchEffect` — master these four
|
||||
- `shallowRef`, `readonly`, `toRaw` — know when to reach for optimization
|
||||
- Lifecycle via `onMounted`, `onUnmounted` — never mix Options API
|
||||
- Pinia stores: typed state, typed getters, typed actions — no exceptions
|
||||
|
||||
### Vuetify 3 + Material Design 3
|
||||
- ALL UI through Vuetify components — no raw HTML for UI elements
|
||||
- Theme-aware always — `rgb(var(--v-theme-surface))`, never `#ffffff`
|
||||
- `useDisplay()` for responsive logic — breakpoints are first-class citizens
|
||||
- Density matters — `density="compact"` for data-heavy interfaces
|
||||
|
||||
### Layout Philosophy
|
||||
```
|
||||
┌─────────────────────────────────┐
|
||||
│ Toolbar (flex-shrink-0) │
|
||||
├─────────────────────────────────┤
|
||||
│ │
|
||||
│ Content Area │
|
||||
│ (flex-grow-1, overflow-y-auto) │
|
||||
│ (min-height: 0) ← CRITICAL │
|
||||
│ │
|
||||
├─────────────────────────────────┤
|
||||
│ Footer (flex-shrink-0) │
|
||||
└─────────────────────────────────┘
|
||||
```
|
||||
- **No body scroll** — viewport locked, content scrolls in containers
|
||||
- **Flexbox trap**: `flex-grow-1` children MUST have `min-height: 0`
|
||||
- **Sticky elements**: filters, table headers — always visible during scroll
|
||||
|
||||
## Data Robustness Patterns
|
||||
|
||||
Treat all external data as hostile:
|
||||
|
||||
```typescript
|
||||
// Defensive access
|
||||
const userName = user?.profile?.name ?? 'Unknown'
|
||||
|
||||
// Array safety
|
||||
const items = Array.isArray(response.data) ? response.data : []
|
||||
|
||||
// Existence guards in templates
|
||||
<template v-if="user">{{ user.name }}</template>
|
||||
<v-empty-state v-else />
|
||||
```
|
||||
|
||||
## UI State Trinity
|
||||
|
||||
Every data-driven view handles THREE states:
|
||||
|
||||
| State | Component | Never Do |
|
||||
|-------|-----------|----------|
|
||||
| **Loading** | `v-skeleton-loader` | Show stale data or blank screen |
|
||||
| **Empty** | `v-empty-state` with action | Leave white void |
|
||||
| **Error** | Snackbar + retry option | Silent failure |

## Table & List Commandments

- `fixed-header` on every `v-data-table` — non-negotiable
- Truncated text gets `v-tooltip` — users deserve full content on hover
- 100+ items? `v-virtual-scroll` — DOM nodes stay constant
- Column widths explicit — no layout lottery

## Anti-Patterns (NEVER)

- `.js` files in a TypeScript project
- `any` without a blood oath and written justification
- Hardcoded colors: `color="#1976d2"` → `color="primary"`
- Body-level scrolling in SPA layouts
- Tables without fixed headers
- Truncated text without tooltips
- Empty states that are literally empty
- Loading states that freeze the UI
- API calls without error handling

## Reference Files

Consult these for implementation details:

| Need | Read |
|------|------|
| Advanced TypeScript patterns | `reference/typescript-rules.md` |
| Complex layout structures | `reference/layout-patterns.md` |
| API client architecture | `reference/api-patterns.md` |
| Tables, lists, forms, feedback | `reference/ui-interaction.md` |

## Project Anatomy

```
src/
├── api/         # Axios instance + modules
├── components/  # Shared components
├── composables/ # Reusable hooks
├── layouts/     # Page shells
├── pages/       # Route views
├── plugins/     # Vuetify, Pinia, Router
├── store/       # Pinia stores
├── styles/      # Global SCSS
├── types/       # Type definitions
└── utils/       # Pure functions
```

## Output Protocol

1. State the architectural approach (2-3 sentences)
2. List files to create with their purposes
3. Implement each file completely — no placeholders, no TODOs
4. Verify against the anti-patterns list
5. Call out any assumptions or trade-offs made

---

Remember: You're not writing code that merely works. You're writing code that works, scales, maintains, and delights. Every `ref` is typed. Every edge case is handled. Every loading state is beautiful. This is what production-grade means.
1-AgentSkills/designing-contracts/SKILL.md (new file, 105 lines)
@@ -0,0 +1,105 @@
---
name: designing-contracts
description: "Guides API contract design, event schema definition, and version compatibility management for RMDC system. Triggered when defining new APIs, modifying request/response structures, designing MQTT message schemas, or planning breaking changes. Keywords: API versioning, backward compatibility, schema evolution, OpenAPI, event contract, MQTT payload, breaking change."
allowed-tools:
- Read
- Glob
- Grep
- Bash
argument-hint: "$ARGUMENTS: <contract-type> [module] — contract-type: api|event|schema|breaking-change"
---

# designing-contracts

## Overview

This skill guides contract design for the RMDC system, covering the definition and version management of API interfaces, event messages, and data schemas.

## Dynamic Context Injection

### Find existing API definitions

!`grep -rn "router\.\(GET\|POST\|PUT\|DELETE\)" --include="*.go" | head -30`

### Find MQTT topic definitions

!`grep -rn "topic\|Topic\|MQTT" --include="*.go" | head -20`

---

## Plan (Planning Phase)

### Classifying the Contract Change

| Change type | Impact scope | Compatibility requirement |
|:---|:---|:---|
| Add optional field | Low | Backward compatible |
| Add required field | High | Breaking change |
| Change field type | High | Breaking change |
| Remove field | High | Breaking change |
| Change field semantics | High | Breaking change |

### Decision Points

- [ ] Is this a breaking change?
- [ ] Does it require versioning (v1 -> v2)?
- [ ] Which downstream modules/clients are affected?
- [ ] Is a transition period needed (old and new versions supported simultaneously)?

---

## Verify (Checklist)

### API Contract Checks

- [ ] New fields have default values
- [ ] No required field has been removed
- [ ] No field type has changed
- [ ] Error codes remain backward compatible
- [ ] Response structure stays consistent

### Event Contract Checks

- [ ] Topic naming follows the convention
- [ ] Payload fields are backward compatible
- [ ] A message version field exists
- [ ] Consumers tolerate new fields

### Verification Commands

```bash
# Diff API changes
git diff HEAD~1 --name-only | grep -E "handler|router"

# Check for breaking changes
./scripts/verify-api-compatibility.sh
```

---

## Execute (Steps)

### Adding an API

1. Define request/response structs
2. Add field validation rules
3. Register routes
4. Update the API documentation
5. Notify downstream modules

### Handling a Breaking Change

1. Create a new versioned route group (/v2/...)
2. Keep the old version available during a compatibility window
3. Add deprecation warnings
4. Notify all consumers
5. Set a retirement date for the old version
---

## Pitfalls (Common Traps)

1. **Removing a field without notifying downstream**: the frontend or other modules may still read the field, causing parse failures.
2. **Changing a field type**: e.g. `string` to `int` breaks JSON parsing.
3. **Changing error-code semantics**: downstream code branches on error codes, so semantic changes cause logic errors.
4. **MQTT payload without a version field**: backward-compatible handling becomes impossible.
5. **Required field without a default value**: old clients can no longer call the API.

---

## Related Files

| Purpose | Path |
|:---|:---|
| Versioning policy | [reference/api-versioning-policy.md](reference/api-versioning-policy.md) |
| Event rules | [reference/event-schema-rules.md](reference/event-schema-rules.md) |
| Breaking-change checklist | [reference/breaking-change-checklist.md](reference/breaking-change-checklist.md) |
@@ -0,0 +1,46 @@

# API Versioning Policy

## Version Number Convention

- URL path versioning: `/api/v1/users`, `/api/v2/users`
- Bump the major version only on breaking changes

## Backward Compatibility Rules

### Compatible changes (no version bump)

- Adding optional request fields
- Adding response fields
- Adding API endpoints
- Adding enum values (consumers must ignore unknown values)
### Breaking changes (version bump required)

- Removing/renaming fields
- Changing field types
- Changing field semantics
- Removing API endpoints
- Changing URL paths
- Changing the authentication scheme

## Version Transition Period

1. When a new version ships, the old version remains available
2. The old version is marked with a `Deprecated` response header
3. Recommended transition window: 2-4 weeks
4. Notify all consumers before the window ends
5. Retire the old version

## Example

```go
// v1 route group
v1 := router.Group("/api/v1")
{
	v1.GET("/users", handler.ListUsersV1)
}

// v2 route group (after a breaking change)
v2 := router.Group("/api/v2")
{
	v2.GET("/users", handler.ListUsersV2)
}
```
@@ -0,0 +1,35 @@

# Breaking Change Checklist

## Before the Change

- [ ] Confirm whether the change really is a breaking change
- [ ] List every affected consumer (modules/frontend/third parties)
- [ ] Assess the impact scope and severity
- [ ] Determine whether a compatible implementation is possible

## Design Phase

- [ ] Create the new API version (e.g. v2)
- [ ] Design the transition plan
- [ ] Write a migration guide
- [ ] Set the retirement schedule for the old version

## Implementation Phase

- [ ] Implement the new version of the interface
- [ ] Mark the old version as Deprecated
- [ ] Update the API documentation
- [ ] Notify all consumers

## Transition Period

- [ ] Monitor call volume on the old version
- [ ] Track consumers' migration progress
- [ ] Provide migration support

## Retirement Phase

- [ ] Confirm there are no active calls to the old version
- [ ] Send the final notice
- [ ] Retire the old version
- [ ] Remove the old code
@@ -0,0 +1,44 @@

# Event Schema Rules

## MQTT Topic Naming Convention

```
rmdc/{module}/{resource}/{action}
```

Examples:
- `rmdc/exchange-hub/command/created`
- `rmdc/watchdog/deployment/started`
- `rmdc/user-auth/user/registered`
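The naming convention above can be enforced mechanically before publishing; a minimal sketch, where the four-part `rmdc/{module}/{resource}/{action}` shape comes from the convention but the per-segment character rules (lowercase, dash-separated) are an assumption inferred from the examples:

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// topicSegment matches one lowercase, dash-separated segment.
var topicSegment = regexp.MustCompile(`^[a-z0-9]+(-[a-z0-9]+)*$`)

// ValidTopic reports whether t has the shape rmdc/{module}/{resource}/{action}.
func ValidTopic(t string) bool {
	parts := strings.Split(t, "/")
	if len(parts) != 4 || parts[0] != "rmdc" {
		return false
	}
	for _, p := range parts[1:] {
		if !topicSegment.MatchString(p) {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(ValidTopic("rmdc/exchange-hub/command/created")) // true
	fmt.Println(ValidTopic("rmdc/UserAuth/user"))                // false
}
```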
## Payload Structure Convention

```json
{
  "version": "1.0",
  "timestamp": "2026-01-23T10:00:00Z",
  "trace_id": "abc-123",
  "event_type": "user.registered",
  "source": "rmdc-user-auth",
  "data": {
    // business data
  }
}
```

## Required Fields

| Field | Type | Description |
|:---|:---|:---|
| version | string | Schema version |
| timestamp | string | ISO 8601 timestamp |
| trace_id | string | Trace ID |
| event_type | string | Event type |
| source | string | Source module |
| data | object | Business data |

## Evolution Rules

- Adding fields: add them inside `data`; consumers ignore unknown fields
- Removing fields: mark as deprecated first, remove two versions later
- Type changes: bump the `version` field
@@ -0,0 +1,57 @@

#!/bin/bash
# verify-api-compatibility.sh - verify API contract compatibility
# Dependencies: git, jq (optional)
# Usage: ./verify-api-compatibility.sh [base-branch]

set -e

BASE_BRANCH=${1:-main}
SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"

echo "=== API contract compatibility check ==="
echo "Comparing against branch: ${BASE_BRANCH}"
echo ""

RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m'

pass() { echo -e "${GREEN}[PASS]${NC} $1"; }
fail() { echo -e "${RED}[FAIL]${NC} $1"; }
warn() { echo -e "${YELLOW}[WARN]${NC} $1"; }

# Collect changed handler files
CHANGED_HANDLERS=$(git diff "${BASE_BRANCH}" --name-only | grep -E "handler.*\.go$" || true)

if [ -z "$CHANGED_HANDLERS" ]; then
    pass "No handler files changed"
    exit 0
fi

echo "Changed handler files:"
echo "$CHANGED_HANDLERS"
echo ""

# Check for removed fields
for file in $CHANGED_HANDLERS; do
    if [ -f "$file" ]; then
        # Deleted diff lines start with "-", so match the struct field after that marker
        DELETED_LINES=$(git diff "${BASE_BRANCH}" -- "$file" | grep "^-" | grep -E "^-\s+\w+\s+\w+" || true)
        if [ -n "$DELETED_LINES" ]; then
            warn "Possible field removal in $file:"
            echo "$DELETED_LINES"
        fi

        # Check for newly added required fields
        ADDED_REQUIRED=$(git diff "${BASE_BRANCH}" -- "$file" | grep "^+" | grep 'binding:"required"' || true)
        if [ -n "$ADDED_REQUIRED" ]; then
            warn "New required field in $file (possible breaking change):"
            echo "$ADDED_REQUIRED"
        fi
    fi
done

echo ""
echo "=== Check complete ==="
echo "Please manually confirm whether the changes above are breaking"
@@ -1,7 +1,8 @@

---
name: developing-project-management
description: Guides development of rmdc-project-management module including project lifecycle management, version control (Git-like), ACL permissions, TOTP authorization, and workflow integration. Triggered when modifying project CRUD, draft/version APIs, permission grants, or authorization features. Keywords: project management, project lifecycle, version snapshot, ACL, TOTP, workflow callback, SuperAdmin.
argument-hint: "<change-type> [target]" where change-type is one of: api|entity|service|migration|frontend|auth. Example: "api draft-submit" or "migration add-field"
name: developing-project-management (project management module development guide)
description: "Guides development of the rmdc-project-management module: full project lifecycle management, Git-like version control (snapshot/diff with drafts, publishing, and rollback), ACL permission control, TOTP secondary authorization, callbacks to the work-procedure (ticket) module, and Vue3 + Vuetify frontend page/component development. Triggered when you modify project CRUD, draft/version APIs, permission grants and authorization features (ACL/TOTP/SuperAdmin), workflow callbacks and state sync, database migrations, or frontend pages and components such as ProjectDetail. Keywords: project lifecycle, version snapshot, diff algorithm, ACL, TOTP, workflow callback, SuperAdmin, optimistic lock, Vue3, Vuetify, ProjectDetail."
argument-hint: "<change-type> [target] — change-type is one of: api|entity|service|migration|frontend|auth|version|component. Examples: \"api draft-submit\" / \"frontend ProjectDetail\" / \"version diff-algorithm\""

allowed-tools:
- Read
- Glob

@@ -17,20 +18,28 @@ allowed-tools:

## Module Positioning

- **Core responsibilities**: project CRUD, version control (Git-like), fine-grained ACL permissions, level-1 TOTP authorization
- **Tech stack**: Go + Gin + GORM + PostgreSQL (JSONB)
- **Core responsibilities**: project CRUD, Git-like version control, fine-grained ACL permissions, level-1 TOTP authorization
- **Backend stack**: Go + Gin + GORM + PostgreSQL (JSONB)
- **Frontend stack**: Vue3 + TypeScript + Vuetify3
- **Architecture**: modular monolith, cooperating with the `rmdc-work-procedure` ticket module via interface injection
- **Version-control model**: Git-like branch management (Master mainline + per-user draft branches)

## Dynamic Context Injection

Fetch the current repository state before use:

```bash
# Inspect the project-management module directory layout
# Inspect the project-management module backend directory layout
!`find . -path "*/rmdc-project-management/*" -name "*.go" | head -20`

# Find lifecycle-status-related code
!`grep -rn "lifecycle_status\|LifecycleStatus" --include="*.go" | head -15`
# Inspect the frontend component directory layout
!`find . -path "*/admin/components/*" -name "*.vue" | head -20`

# Find version-control-related code
!`grep -rn "VersionSnapshot\|CompareVersions\|DiffResult" --include="*.go" | head -15`

# Find frontend lifecycle-status-related code
!`grep -rn "lifecycle_status\|LIFECYCLE_STATUS" --include="*.vue" --include="*.ts" | head -15`
```

---
@@ -41,56 +50,72 @@

Determine the change scope from `$ARGUMENTS`:

| Change type | Artifacts | Affected modules |
|:---|:---|:---|
| `api` | `handler/*.go`, `router.go` | rmdc-core route registration |
| `entity` | `entity/*.go` | DB migration, DTO mapping |
| `service` | `service/*.go` | business logic, version snapshots |
| `migration` | `migrations/*.sql` | database schema |
| `frontend` | `pages/*.vue`, `components/*.vue` | frontend integration |
| `auth` | `service/auth_*.go` | TOTP authorization, Exchange-Hub interaction |
| Change type | Artifacts | Affected modules | Reference doc |
|:---|:---|:---|:---|
| `api` | `handler/*.go`, `router.go` | rmdc-core route registration | `reference/06-api-design/api-endpoints.md` |
| `entity` | `entity/*.go` | DB migration, DTO mapping | `reference/05-database-schema/data-structures.md` |
| `service` | `service/*.go` | business logic, version snapshots | `reference/04-version-control/version-design.md` |
| `migration` | `migrations/*.sql` | database schema | `reference/05-database-schema/database-schema.md` |
| `frontend` | `pages/*.vue`, `components/*.vue` | frontend pages | `reference/07-frontend-design/` |
| `auth` | `service/auth_*.go` | TOTP authorization, Exchange-Hub | `reference/03-permission-model/acl-permission.md` |
| `version` | `service/version_*.go` | version snapshots, diff algorithm | `reference/04-version-control/version-design.md` |
| `component` | `components/*.vue` | frontend component development | `reference/07-frontend-design/component-specifications.md` |

### Decision Points

1. **Does the change touch lifecycle status transitions?**
   - If so, update the state-machine transition logic in step
   - Check `reference/lifecycle-state-machine.md`

2. **Does it modify the version snapshot structure?**
   - If so, assess compatibility with historical versions
   - Update the `VersionSnapshot` struct

3. **Does it change the ACL permission model?**
   - If so, synchronize with the `rmdc-user-auth` module
   - Check `reference/acl-permission-model.md`

4. **Does it affect ticket-module callbacks?**
   - If so, update the `ProjectLifecycleUpdater` interface implementation
   - Check `reference/workflow-state-mapping.md`

1. **Does the change touch lifecycle status transitions?** → check `reference/02-lifecycle-state-machine/lifecycle-states.md`
2. **Does it modify the version snapshot structure?** → check `reference/04-version-control/version-design.md`
3. **Does it involve concurrent-modification conflicts?** → check the optimistic-lock implementation (base_version validation)
4. **Does it change the ACL permission model?** → check `reference/03-permission-model/acl-permission.md`
5. **Does it affect ticket-module callbacks?** → check `reference/02-lifecycle-state-machine/workflow-state-mapping.md`
6. **Does it modify frontend pages?** → check `reference/07-frontend-design/page-architecture.md`
7. **Does it involve user-side vs admin-side differences?** → check `reference/07-frontend-design/user-admin-difference.md`

---

## Verify (Verification Phase)

### Checklist
### Backend Checklist

- [ ] **Lifecycle state-machine completeness**: every state transition has explicit trigger conditions and permission controls
- [ ] **Version snapshot consistency**: the `projects` table and `project_versions` table stay in sync
- [ ] **Optimistic-lock check**: concurrent modifications validate `base_version == current_version`
- [ ] **ACL permission validation**: interface permission annotations match the business logic
- [ ] **SuperAdmin direct-edit versioning**: direct SuperAdmin edits must also produce a version record (atomic transaction)
- [ ] **Diff algorithm correctness**: comparison results are grouped by module, field paths are complete, Chinese display-name mapping is correct
- [ ] **ACL permission validation**: interface permission annotations match the business logic; the authorization module is visible to SuperAdmin only
- [ ] **Idempotent ticket callbacks**: status-update operations are idempotent
- [ ] **Sensitive-field encryption**: password fields are stored AES-256 encrypted
- [ ] **Audit logging**: every write operation is recorded in `rmdc-audit-log`
- [ ] **TOTP authorization safety**: the level-1 secret is accessible to SuperAdmin only
- [ ] **Namespace validation**: conforms to the RFC 1123 DNS label rules
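The RFC 1123 namespace check from the last item can be sketched as follows; this is the same rule Kubernetes applies to Namespace names (lowercase alphanumerics and `-`, starting and ending alphanumeric, at most 63 characters), and the function name is illustrative:

```go
package main

import (
	"fmt"
	"regexp"
)

// rfc1123Label matches a DNS label per RFC 1123.
var rfc1123Label = regexp.MustCompile(`^[a-z0-9]([-a-z0-9]*[a-z0-9])?$`)

// ValidNamespace reports whether ns is a valid RFC 1123 DNS label.
func ValidNamespace(ns string) bool {
	return len(ns) <= 63 && rfc1123Label.MatchString(ns)
}

func main() {
	fmt.Println(ValidNamespace("rmdc-prod-01")) // true
	fmt.Println(ValidNamespace("-bad"))         // false
	fmt.Println(ValidNamespace("Bad_Name"))     // false
}
```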
### Frontend Checklist

- [ ] **Mode separation**: view/edit modes switch correctly; `isEditMode` state is managed properly
- [ ] **Dirty-data detection**: the `hasChanges` computed is calculated correctly; leaving with unsaved edits prompts a confirmation dialog
- [ ] **Role differentiation**: SuperAdmin and ordinary users see the tabs and action buttons the design prescribes
- [ ] **Lifecycle display**: status chip colors, icons, and alert banners follow the spec
- [ ] **Ticket association**: multi-ticket scenarios are handled correctly, with correct jump links
- [ ] **Component reuse**: shared components are properly extracted; Props and Emits are well designed
- [ ] **Responsive layout**: mobile adaptation works; breakpoints follow the Vuetify conventions
- [ ] **TypeScript types**: type definitions are complete, with no `any`

### Verification Commands

```bash
# Check entity fields against the database schema
!`grep -rn "gorm:\"" entity/project.go | head -20`
# Check the version service implementation
!`grep -rn "CompareVersions\|CreateOfficialVersion\|VersionSnapshot" service/*.go`

# Check API route permission annotations
!`grep -rn "RequireRole\|RequirePermission" handler/*.go`
# Check the optimistic-lock implementation
!`grep -rn "base_version\|BaseVersion\|VersionConflict\|409" --include="*.go"`

# Check sensitive-field encryption
!`grep -rn "EncryptAES\|DecryptAES\|admin_password\|ssh_pwd" --include="*.go"`

# Check frontend lifecycle status usage
!`grep -rn "LIFECYCLE_STATUS\|getLifecycleStatusColor" --include="*.vue" --include="*.ts"`

# Check frontend component references
!`grep -rn "BasicInfoReadonly\|SaveConfirmDialog" --include="*.vue"`

# Run the module's unit tests
go test ./internal/project/... -v -cover
@@ -100,54 +125,66 @@ go test ./internal/project/... -v -cover

## Execute (Execution Phase)

### API Development Flow
### Backend API Development Flow

1. **Define request/response structs** → `dto/project_dto.go`
2. **Implement service methods** → `service/project_service.go`
3. **Implement handler methods** → `handler/project_handler.go`
4. **Register routes** → `router.go` (mind the permission middleware)
5. **Write unit tests** → `*_test.go`
1. Define request/response structs → `dto/project_dto.go`
2. Implement service methods → `service/project_service.go`
3. Implement handler methods → `handler/project_handler.go`
4. Register routes → `router.go` (mind the permission middleware)
5. Write unit tests → `*_test.go`

### Version Snapshot Change Flow

1. Update the `VersionSnapshot` struct definition
2. Make sure the `CompareVersions` diff algorithm handles the new fields
3. Add the fields to the diff result's display-name mapping table
1. Update the `VersionSnapshot` struct definition → `reference/05-database-schema/data-structures.md`
2. Update the `fieldNameMap` display-name table → ensures the diff shows Chinese names
3. Make sure the `CompareVersions` diff algorithm handles the new fields
4. Test that viewing historical versions is unaffected

### Lifecycle Status Change Flow
### SuperAdmin Direct-Edit Flow

1. Update the state diagram in `reference/lifecycle-state-machine.md`
2. Modify the transition logic in `service/lifecycle_service.go`
3. Update the `ProjectLifecycleUpdater` interface implementation in step
4. Verify consistency with the ticket module's state mapping table
1. Updating the `projects` table and inserting into `project_versions` **must happen in one transaction**
2. Set `workflow_id` to empty or the `DIRECT_EDIT` marker
3. Record the SuperAdmin ID in `committer_id`
4. Update the `current_version` field
### Authorization Feature Change Flow
### Frontend Component Development Flow

1. Check the `project_auth_configs` table structure
2. Update the `AuthorizationInfo` struct
3. Make sure TOTP secret generation/verification is correct
4. Test authorization command dispatch to Exchange-Hub
1. **Read-only components** → `*Readonly.vue`, laid out with `v-row/v-col`
2. **Form components** → `*Form.vue`, built from Vuetify form components
3. **Component exports** → update `components/index.ts`
4. **Page integration** → reference them from `ProjectDetail.vue` or `UserProjectDetail.vue`
5. **Type definitions** → update `types/*.ts`

### Frontend State Management Flow

1. **Entering edit mode**: deep-copy `masterData` → `editForm`
2. **Change detection**: use the `hasChanges` computed
3. **Pre-save confirmation**: show the diff via `SaveConfirmDialog`
4. **Exit protection**: prompt a confirmation dialog when there are unsaved changes

---

## Pitfalls (Common Issues)

### Backend

1. **SuperAdmin direct edit without a version record**: when a SuperAdmin modifies the `projects` table directly, a `project_versions` record must be inserted as well, or the version chain breaks

2. **Stale draft base version**: user A drafts on top of v3, a SuperAdmin edit produces v4, so user A's submission must detect the conflict and prompt a rebase
2. **Stale draft base version**: user A drafts on top of v3, a SuperAdmin edit produces v4, so user A's submission must detect the conflict (409 Conflict)

3. **Duplicate ticket callbacks**: the ticket module may retry callbacks, so the `ProjectLifecycleUpdater` implementation must be idempotent

4. **ACL misses the authorization module**: the `authorization_info` module is visible to SuperAdmin only and must be filtered out when other roles query

5. **Plaintext password leaks**: fields such as `AdminPassword` and `SSHPwd` must be masked or omitted in responses
5. **Plaintext password leaks**: fields such as `AdminPassword` and `SSHPwd` must be masked in responses (return `********`)

6. **Missing province/city cascade validation**: after the frontend's cascading selection, the backend must validate that the province/city pairing is valid
### Frontend

7. **Namespace uniqueness**: project creation must validate that `namespace` is globally unique and conforms to the RFC 1123 DNS label rules
6. **Edit-mode state not kept in sync**: `isEditMode` state can be lost when switching tabs; use `v-window` rather than conditional rendering

8. **JSONB null handling**: when JSONB fields such as `basic_info` and `deploy_business` are empty, return an empty object `{}` rather than `null`
7. **Incomplete diff computation**: nested fields get missed during comparison; use a recursive JSON diff

8. **Wrong ticket-button display**: in multi-ticket scenarios `workflow_id` may be an array and must be handled accordingly
---

@@ -156,29 +193,42 @@ go test ./internal/project/... -v -cover

```
rmdc-project-management
├── → rmdc-user-auth (user authn, ACL permission queries)
├── → rmdc-work-procedure (ticket creation, status transitions)
├── ↔ rmdc-work-procedure (ticket creation/status transitions + callbacks updating the lifecycle)
├── → rmdc-audit-log (operation audit records)
├── → rmdc-exchange-hub (authorization command dispatch)
└── ← rmdc-core (route registration, dependency injection)
```

## Key Interfaces
## Key Interface Quick Reference

| Category | Path | Permission |
|:---|:---|:---|
| Project list | `POST /api/project/list` | Login |
| Project detail | `POST /api/project/detail` | View ACL |
| Create project | `POST /api/project/create` | SuperAdmin |
| Direct update | `POST /api/project/update` | SuperAdmin |
| Save draft | `POST /api/project/draft/save` | View ACL |
| Submit for review | `POST /api/project/draft/submit` | View ACL |
| Version history | `POST /api/project/version/list` | View ACL |
| Grant permission | `POST /api/project/permission/grant` | SuperAdmin |
| Category | Path | Permission | Notes |
|:---|:---|:---|:---|
| Project list | `POST /api/project/list` | Login | auto-filtered by ACL |
| Project detail | `POST /api/project/detail` | View ACL | Master version |
| Create project | `POST /api/project/create` | SuperAdmin | also creates a fill-in ticket |
| Direct update | `POST /api/project/update` | SuperAdmin | must produce a new version |
| Save draft | `POST /api/project/draft/save` | View ACL | updates the draft snapshot |
| Submit for review | `POST /api/project/draft/submit` | View ACL | detects version conflicts |
| Draft diff | `POST /api/project/draft/diff` | View ACL | draft vs mainline diff |
| Version history | `POST /api/project/version/list` | View ACL | official type only |
| Version diff | `POST /api/project/version/diff` | View ACL | grouped by module |
| Grant permission | `POST /api/project/permission/grant` | SuperAdmin | module-level permissions |

## Related Documents
## Frontend Page Quick Reference

- Lifecycle state machine: `reference/lifecycle-state-machine.md`
- API endpoint list: `reference/api-endpoints.md`
- Database schema: `reference/database-schema.md`
- ACL permission model: `reference/acl-permission-model.md`
- Ticket state mapping: `reference/workflow-state-mapping.md`
| Page | Path | Role | Notes |
|:---|:---|:---|:---|
| Admin project detail | `pages/admin/ProjectDetail.vue` | SuperAdmin | full functionality, incl. authorization/version history |
| User project detail | `pages/user/UserProjectDetail.vue` | User | draft editing, submit for review |

## Related Documents (by chapter)

| Chapter directory | Contents |
|:---|:---|
| `reference/01-architecture-overview/` | module dependency relationships |
| `reference/02-lifecycle-state-machine/` | lifecycle state machine, ticket state mapping |
| `reference/03-permission-model/` | ACL permission model, permission-check flow |
| `reference/04-version-control/` | version snapshots, diff algorithm, optimistic locking |
| `reference/05-database-schema/` | DDL, indexes, data structure definitions |
| `reference/06-api-design/` | API list, business flows |
| `reference/07-frontend-design/` | frontend page architecture, component specs, interaction sequences |
@@ -0,0 +1,54 @@

# Module Dependency Relationships

> DDS-Section: 2. System Architecture
> DDS-Lines: L33-L61

## Module Positioning

`rmdc-project-management` is the core business module of the RMDC system. It maintains the full lifecycle of "projects" at the granularity of a K8s Namespace.

## Dependency Graph

```mermaid
graph TB
    subgraph CoreLayer["Core layer"]
        Core[rmdc-core<br/>API Gateway]
    end

    subgraph BizLayer["Business collaboration"]
        PM[rmdc-project-management]
        WP[rmdc-work-procedure<br/>ticket workflow]
        UA[rmdc-user-auth<br/>user authn/permissions]
        Aud[rmdc-audit-log<br/>audit logging]
    end

    subgraph EdgeLayer["Edge interaction"]
        EH[rmdc-exchange-hub<br/>message gateway]
        WD[rmdc-watchdog]
    end

    Core --> PM
    PM -->|user authn/queries| UA
    PM -->|project permission management| UA
    PM -->|initiate approvals/receive callbacks| WP
    PM -->|record operation logs| Aud
    PM -->|dispatch authorization commands| EH
    EH <--> WD
```

## Dependency Directions

| Direction | Description |
|:---|:---|
| `PM → UA` | user authn, ACL permission queries |
| `PM ↔ WP` | ticket creation/status transitions + callbacks updating the lifecycle |
| `PM → Aud` | operation audit records |
| `PM → EH` | authorization command dispatch |
| `Core → PM` | route registration, dependency injection |

## Core Responsibilities

1. **Full project lifecycle management**: create, maintain (edit/via tickets), release, archive/delete
2. **Version control**: record the change history of project configuration, with version rollback and diffing
3. **Fine-grained permission control**: ACL-based, precise to user and project module
4. **Authorization management**: manage the project's level-1 TOTP authorization data and interact with Watchdog
@@ -0,0 +1,58 @@

# Project Lifecycle State Machine

> DDS-Section: 3. Project Lifecycle Management
> DDS-Lines: L65-L106

## State Definitions

| State | Description | Trigger | Permission |
|:---|:---|:---|:---|
| **INIT** | project metadata created, awaiting detailed data entry | SuperAdmin creates the project | SuperAdmin |
| **DRAFTING** | initial data entry in progress (linked to a fill-in ticket) | assigned editor saves/edits | editor/SuperAdmin |
| **REVIEWING** | initial or changed data submitted for review | submit for review | SuperAdmin |
| **RELEASED** | review passed, running normally | review approved | All (View) |
| **MODIFYING** | an active change ticket exists (mainline unaffected) | change ticket initiated | Owner/SuperAdmin |
| **ARCHIVED** | soft-deleted: hidden but data retained | project deleted | SuperAdmin |

## State Transition Diagram

```mermaid
stateDiagram-v2
    [*] --> INIT: create project

    INIT --> DRAFTING: assign editor

    DRAFTING --> DRAFTING: save draft
    DRAFTING --> REVIEWING: submit for review

    REVIEWING --> DRAFTING: review returned
    REVIEWING --> RELEASED: review approved

    RELEASED --> MODIFYING: initiate change ticket
    RELEASED --> ARCHIVED: archive/delete

    MODIFYING --> MODIFYING: save draft
    MODIFYING --> REVIEWING: submit change for review
    MODIFYING --> RELEASED: change revoked/review approved

    ARCHIVED --> [*]

    note right of RELEASED: certification status = official
    note right of DRAFTING: multiple draft saves supported
    note right of MODIFYING: multiple change tickets may coexist
```

## Transition Trigger Conditions

| From | To | Trigger | Action |
|:---|:---|:---|:---|
| `[*]` | `INIT` | SuperAdmin creates the project | create the Project record, generate project_id |
| `INIT` | `DRAFTING` | assign an editor | create the fill-in ticket, link the user |
| `DRAFTING` | `DRAFTING` | editor saves a draft | update the ProjectVersion draft |
| `DRAFTING` | `REVIEWING` | editor submits for review | ticket status → pending_review |
| `REVIEWING` | `DRAFTING` | reviewer returns it | ticket status → returned |
| `REVIEWING` | `RELEASED` | reviewer approves | create the official version, certification → official |
| `RELEASED` | `MODIFYING` | change ticket initiated | create change ticket + draft |
| `MODIFYING` | `REVIEWING` | change submitted for review | ticket status → pending_review |
| `MODIFYING` | `RELEASED` | change revoked or approved | delete the draft, or merge the change |
| `RELEASED` | `ARCHIVED` | SuperAdmin deletes | soft delete, set deleted_at |
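The transition table can be enforced with a small lookup before any status write; the states mirror the table above, and the Go shape is an illustrative sketch rather than the module's actual lifecycle service:

```go
package main

import "fmt"

// transitions maps each state to the set of states it may move to,
// mirroring the transition table above.
var transitions = map[string][]string{
	"INIT":      {"DRAFTING"},
	"DRAFTING":  {"DRAFTING", "REVIEWING"},
	"REVIEWING": {"DRAFTING", "RELEASED"},
	"RELEASED":  {"MODIFYING", "ARCHIVED"},
	"MODIFYING": {"MODIFYING", "REVIEWING", "RELEASED"},
	"ARCHIVED":  {},
}

// CanTransition reports whether moving from one lifecycle state to
// another is allowed by the state machine.
func CanTransition(from, to string) bool {
	for _, next := range transitions[from] {
		if next == to {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(CanTransition("REVIEWING", "RELEASED")) // true
	fmt.Println(CanTransition("INIT", "RELEASED"))      // false
}
```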
# Ticket State Mapping Table
# Lifecycle and Ticket State Mapping

## Design Principles
> DDS-Section: 3.3-3.6 Lifecycle and ticket state mapping + state sync mechanism design
> DDS-Lines: L108-L296

The project module (`rmdc-project-management`) and the ticket module (`rmdc-work-procedure`) need to collaborate in both directions:
## Ticket Event Mapping Table

| Ticket event | Ticket target state | Project lifecycle state | Notes |
|:---|:---|:---|:---|
| create | created | INIT→DRAFTING | create fill-in ticket |
| draft_save | in_progress | DRAFTING/MODIFYING (unchanged) | save draft |
| complete | pending_review | REVIEWING | submit for review |
| resubmit | pending_review | REVIEWING | resubmit after being returned |
| return | returned | DRAFTING | reviewer returns it |
| approve | approved→closed | RELEASED | reviewer approves |

## State Sync Mechanism Design

### Design Principles

The project module and the ticket module need to collaborate in both directions:
1. **Project → ticket**: the project module calls the ticket module to create/transition tickets
2. **Ticket → project**: ticket state changes are synced back into the project lifecycle state

The current system is a **modular monolith** and uses **interface injection (dependency injection)** for cross-module callbacks.

## Fill-in Ticket (project_detail)
### Interface Definitions

**Interface the project module calls on the ticket module**:

```go
// WorkflowTransitioner triggers ticket state transitions
type WorkflowTransitioner interface {
	// TransitionWorkflow triggers a ticket state transition
	// @param workflowID string - ticket ID
	// @param event string - event type (complete/submit/resubmit, etc.)
	// @param operatorID uint64 - operator ID
	// @param operatorName string - operator name
	// @param remark string - operation remark
	// @return newStatus string - the ticket's new state
	TransitionWorkflow(workflowID, event string, operatorID uint64,
		operatorName string, remark string) (newStatus string, err error)
}
```

**Interface the ticket module calls back on the project module**:

```go
// ProjectLifecycleUpdater updates the project lifecycle state
type ProjectLifecycleUpdater interface {
	UpdateLifecycleStatus(projectID, lifecycleStatus string) error
	SetLifecycleToDrafting(projectID string) error
	SetLifecycleToReviewing(projectID string) error
	SetLifecycleToReleased(projectID string) error
	SetLifecycleToModifying(projectID string) error
}
```

## Fill-in Ticket (project_detail) Mapping

| Ticket event | Ticket from-state | Ticket to-state | Project lifecycle state | Notes |
|:---|:---|:---|:---|:---|
@@ -21,7 +69,7 @@
| `approve` | `pending_review` | `approved` | `RELEASED` | review approved |
| `revoke` | any | `revoked` | `INIT` | ticket revoked |

## Change Ticket (project_modify)
## Change Ticket (project_modify) Mapping

| Ticket event | Ticket from-state | Ticket to-state | Project lifecycle state | Notes |
|:---|:---|:---|:---|:---|
@@ -34,84 +82,16 @@
| `approve` | `pending_review` | `approved` | `RELEASED` | review approved |
| `revoke` | any | `revoked` | `RELEASED` | ticket revoked |

## Callback Interface Definitions

### Callback interface provided by the project module

```go
// ProjectLifecycleUpdater updates the project lifecycle state.
// Injected by rmdc-core at startup; called by the ticket module on state changes.
type ProjectLifecycleUpdater interface {
	// UpdateLifecycleStatus updates the project lifecycle state
	UpdateLifecycleStatus(projectID, lifecycleStatus string) error

	// SetLifecycleToDrafting sets DRAFTING (after a ticket is returned)
	SetLifecycleToDrafting(projectID string) error

	// SetLifecycleToReviewing sets REVIEWING (on submission for review)
	SetLifecycleToReviewing(projectID string) error

	// SetLifecycleToReleased sets RELEASED (on approval)
	SetLifecycleToReleased(projectID string) error

	// SetLifecycleToModifying sets MODIFYING (when a change ticket is initiated)
	SetLifecycleToModifying(projectID string) error
}
```

### Interface the project module calls on the ticket module

```go
// WorkflowTransitioner triggers ticket state transitions.
// Injected by rmdc-core at startup; the project module calls the ticket module through it.
type WorkflowTransitioner interface {
	// TransitionWorkflow triggers a ticket state transition
	TransitionWorkflow(workflowID, event string, operatorID uint64,
		operatorName string, remark string) (newStatus string, err error)
}

// WorkflowCreator creates tickets
type WorkflowCreator interface {
	// CreateProjectWorkflow creates a project-related ticket
	CreateProjectWorkflow(req CreateWorkflowRequest) (workflowID string, err error)
}
```

## Dependency Injection Flow

Cross-module dependency injection is wired up in `rmdc-core/cmd/main.go`:

```go
// 1. Initialize the project and ticket services
projectSvc := projectHandler.RegisterRoutes(r, dbs.Project, authMiddleware)
workflowSvc := workflowHandler.RegisterRoutes(r, dbs.Workflow, authMiddleware)

// 2. Inject the ticket→project callback (state sync)
projectCallbackSvc := initProjectWorkflowCallbacks(dbs.Project)
workflowSvc.SetProjectLifecycleUpdater(projectCallbackSvc)

// 3. Inject the project→ticket calls (ticket creation/transition)
workflowCreator := initProjectWorkflowCreator(workflowSvc)
projectSvc.SetWorkflowCreator(workflowCreator)

workflowTransitioner := initProjectWorkflowTransitioner(workflowSvc)
projectSvc.SetWorkflowTransitioner(workflowTransitioner)
```

## Callback Handling Implementation

After a ticket state transition, the ticket module calls the injected `ProjectLifecycleUpdater` interface to update the project state:

```go
// Ticket module - trigger a project state update after a state transition
func (s *WorkflowService) handleProjectLifecycleCallback(workflow *entity.Workflow, event string) {
	// Read the project ID from the business payload
	projectID, ok := workflow.BusinessPayload["project_id"].(string)
	if !ok || projectID == "" {
		return
	}

	// Update the project lifecycle state according to the event type
	if s.projectLifecycleUpdater != nil {
		switch event {
		case entity.EventApprove:
@@ -125,22 +105,12 @@ func (s *WorkflowService) handleProjectLifecycleCallback(workflow *entity.Workfl
	}
```

## Callback Handling Essentials
## Advantages of Interface Injection

1. **Idempotency**: use `projectID + event + timestamp` as the idempotency key to prevent duplicate handling
2. **Transaction boundary**: the state update and the version snapshot creation should share one transaction
3. **Audit logging**: record state changes to `rmdc-audit-log`
4. **Error handling**: a failed callback must not block the ticket's state transition; log it and continue
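Points 1 and 4 can be sketched together. A real implementation would persist the idempotency keys rather than hold them in memory, and the key here uses the ticket ID instead of the timestamp the text suggests (a retry would carry a fresh timestamp, defeating deduplication); all names are illustrative:

```go
package main

import (
	"fmt"
	"log"
	"sync"
)

// LifecycleUpdater is the minimal shape of the injected callback target.
type LifecycleUpdater struct {
	mu   sync.Mutex
	seen map[string]bool // processed idempotency keys
}

// HandleCallback applies one ticket event to a project, at most once
// per (projectID, event, workflowID) triple.
func (u *LifecycleUpdater) HandleCallback(projectID, event, workflowID string) {
	key := projectID + "|" + event + "|" + workflowID
	u.mu.Lock()
	if u.seen[key] {
		u.mu.Unlock()
		return // retried callback: already handled, do nothing
	}
	u.seen[key] = true
	u.mu.Unlock()

	if err := u.updateStatus(projectID, event); err != nil {
		// Must not block the ticket's own transition: log and continue.
		log.Printf("lifecycle callback failed for %s: %v", projectID, err)
	}
}

// updateStatus stands in for the real lifecycle-state write.
func (u *LifecycleUpdater) updateStatus(projectID, event string) error {
	fmt.Println("updating", projectID, "for event", event)
	return nil
}

func main() {
	u := &LifecycleUpdater{seen: map[string]bool{}}
	u.HandleCallback("p1", "approve", "wf-9") // applied
	u.HandleCallback("p1", "approve", "wf-9") // retry: skipped
}
```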
## Comparison with HTTP Callbacks

| Property | Interface injection (current) | HTTP callback |
|:---|:---|:---|
| **Module decoupling** | ⚠️ interface-level decoupling | ✅ full decoupling |
| **Distributed support** | ❌ not supported | ✅ supported |
| **Performance** | ✅ in-process call | ⚠️ network overhead |
| **Complexity** | ✅ simple and direct | ⚠️ needs retry/idempotency handling |
| **Transactional consistency** | ✅ strong consistency | ⚠️ eventual consistency |
| **Suitable for** | modular monolith | microservice architecture |

> If the system is ever split into microservices, see the alternative design doc for migrating to HTTP callbacks.

| Property | Description |
|:---|:---|
| **Simple and efficient** | in-process calls, no network overhead |
| **Type safety** | interface implementations checked at compile time |
| **Transactional consistency** | can run inside a single database transaction |
| **No retries needed** | no network failures, so no complex retry machinery |
| **Debuggability** | easier debugging with clean stack traces |
@@ -0,0 +1,82 @@
# ACL 权限控制模型

> DDS-Section: 4. 权限控制模型
> DDS-Lines: L300-L347

## 设计原则

权限控制分为 **功能权限** (RBAC) 和 **数据权限** (ACL),**数据权限需精确到项目模块级别**。

> **重要决策**:项目权限相关表设计在 `rmdc-user-auth` 模块中,由该模块统一管理所有权限数据。

## 功能权限 (RBAC)

| 权限代码 | 说明 | 角色 |
|:---|:---|:---|
| `project:create` | 创建项目 | SuperAdmin |
| `project:delete` | 删除/归档项目 | SuperAdmin |
| `project:edit` | 直接编辑项目 | SuperAdmin |
| `project:edit_workflow` | 通过工单编辑项目 | User (有ACL权限) |
| `project:auth_manage` | 一级/二级授权管理 | SuperAdmin |
| `project:permission_manage` | 项目权限分配 | SuperAdmin |

## 数据权限 (ACL) - 模块级别

### 权限模块定义

| 模块代码 | 模块名称 | 说明 |
|:---|:---|:---|
| `basic_info` | 基本信息模块 | 项目名称、命名空间、省份城市等 |
| `business_info` | 部署业务模块 | 部署人、部署时间、系统版本等 |
| `environment_info` | 部署环境模块 | 主机信息、网络环境、域名等 |
| `middleware_info` | 部署中间件模块 | MySQL、Redis、EMQX 等中间件配置 |
| `authorization_info` | 项目授权模块 | TOTP 授权信息(仅 SuperAdmin) |

### 权限类型

| 权限类型 | 说明 |
|:---|:---|
| `view` | 查看权限(可查看项目信息,可发起修改工单) |
| `export` | 导出权限(可导出项目信息) |

> **说明**:编辑权限通过工单系统实现,拥有 `view` 权限的用户可以发起修改工单,由 SuperAdmin 审批后生效。

## 权限规则

1. **SuperAdmin**: 拥有所有项目的所有模块的全部权限,可直接修改
2. **Admin**: 可以访问自己被授权的项目模块,可以向普通用户转授权限
3. **Normal User**: 只能访问被授权的项目模块,修改需通过工单
4. **项目填写人**: 自动获得该项目的查看权限
5. **授权模块**: 仅 SuperAdmin 可见

## 权限表模型定义 (位于 rmdc-user-auth 模块)

```go
// ProjectACL 项目权限表 (模块级别)
type ProjectACL struct {
	ID        int64  `gorm:"primaryKey;autoIncrement" json:"id"`
	ProjectID string `gorm:"type:varchar(64);index;not null" json:"project_id"`
	UserID    int64  `gorm:"index;not null" json:"user_id"`

	// 模块代码: basic_info/business_info/environment_info/middleware_info/authorization_info
	ModuleCode string `gorm:"type:varchar(32);not null" json:"module_code"`

	// 权限类型 (与上文 view/export 两种权限类型对应)
	CanView   bool `gorm:"default:false" json:"can_view"`
	CanExport bool `gorm:"default:false" json:"can_export"`

	// 授权信息
	GrantedBy int64     `json:"granted_by"`
	GrantedAt time.Time `json:"granted_at"`

	UpdatedAt time.Time `json:"updated_at"`
}
```
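
上述权限规则可以落为一个纯函数的判定逻辑。以下是一个示意实现(sketch):`ACLRecord`、`CanViewModule` 等命名为本文假设,实际项目中应由 service 层在通过 GORM 查询出用户的 ACL 记录后调用:

```go
package main

import "fmt"

// ACLRecord 简化的模块级权限记录(与上文 ProjectACL 的关键字段对应)
type ACLRecord struct {
	ModuleCode string
	CanView    bool
}

// CanViewModule 判定用户能否查看某项目模块:
// SuperAdmin 直通;authorization_info 模块仅 SuperAdmin 可见;
// 其余模块按 ACL 记录的 CanView 判定
func CanViewModule(isSuperAdmin bool, acls []ACLRecord, moduleCode string) bool {
	if isSuperAdmin {
		return true
	}
	if moduleCode == "authorization_info" {
		return false // 授权模块仅 SuperAdmin 可见
	}
	for _, a := range acls {
		if a.ModuleCode == moduleCode && a.CanView {
			return true
		}
	}
	return false
}

func main() {
	acls := []ACLRecord{{ModuleCode: "basic_info", CanView: true}}
	fmt.Println(CanViewModule(false, acls, "basic_info"))         // true
	fmt.Println(CanViewModule(false, acls, "authorization_info")) // false
}
```

把判定收敛为纯函数,便于在 handler 中间件与单元测试中复用,避免权限逻辑散落在各个接口里。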

## 权限管理接口 (SuperAdmin)

| 方法 | 路径 | 描述 | 权限 |
|:---|:---|:---|:---|
| POST | `/api/project/permission/list` | 获取项目权限列表 | SuperAdmin |
| POST | `/api/project/permission/grant` | 授予权限 | SuperAdmin |
| POST | `/api/project/permission/revoke` | 撤销权限 | SuperAdmin |
| POST | `/api/project/permission/batch` | 批量设置权限 | SuperAdmin |
@@ -0,0 +1,164 @@
# 版本控制设计 (Git-like)

> DDS-Section: 5. 版本控制设计 (GIT-like)
> DDS-Lines: L350-L567

## 设计原则

采用**统一版本表**设计,将正式版本和草稿版本存储在同一张表中,通过 `version_type` 字段区分。

## 版本类型

| 版本类型 | 代码 | 说明 |
|:---|:---|:---|
| 正式版本 | `official` | 审核通过后的正式版本,构成版本历史 |
| 填写草稿 | `fill_draft` | 项目创建时填写人的草稿 |
| 修改草稿 | `modify_draft` | 发起变更工单时的草稿 |

## 版本与工单关系

1. **填写草稿**: 与填写工单 1:1 关联
2. **修改草稿**: 与修改工单 1:1 关联
3. **正式版本**: 审核通过后由草稿转化而来
4. **一个项目可以有多个修改草稿**(对应多个修改工单)

## 版本快照机制

每次审核通过后,系统自动生成一个**完整快照**存储到 `project_versions` 表中。

### 快照结构

```go
// VersionSnapshot 版本快照结构
type VersionSnapshot struct {
	BasicInfo        *BasicInfo        `json:"basic_info"`
	DeployBusiness   *DeployBusiness   `json:"deploy_business"`
	DeployEnv        *DeployEnv        `json:"deploy_env"`
	DeployMiddleware *DeployMiddleware `json:"deploy_middleware"`
}
```

### 快照生成时机

| 场景 | 版本号 | 版本类型 | 说明 |
|:---|:---|:---|:---|
| 项目首次审批通过 | v1 | official | 项目初始版本 |
| 修改工单审批通过 | v(N+1) | official | 增量版本 |
| **超管直接修改** | v(N+1) | official | **重要:超管直改也必须生成新版本** |
| 用户保存草稿 | 0 | fill_draft/modify_draft | 临时版本,不计入历史 |

### 超级管理员直改与版本一致性

**问题风险**:
如果超级管理员直接修改 `projects` 表数据而不生成版本历史,会导致版本链断裂,后续基于旧版本的工单 Diff 结果将失效或产生误导。

**解决方案**:
SuperAdmin 的 "Direct Edit" 操作必须被视为一次**自动审批通过的事务**:

1. **原子操作**:更新 `projects` 表 + 插入 `project_versions` 表必须在同一数据库事务中完成。
2. **版本归属**:生成的 Version 记录中,`workflow_id` 为空(或特定系统标识),`committer_id` 记录为 SuperAdmin ID。
3. **结果**:确保 `projects.current_version` 永远指向最新的 `project_versions.version`。
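
"两步写同进同退"的结构可以用下面的示意代码表达。这里把事务内的写操作抽象为一个接口以便独立演示;接口与函数命名均为假设,实际实现应包裹在 GORM 的 `db.Transaction(func(tx *gorm.DB) error { ... })` 中,返回非 nil 错误即整体回滚:

```go
package main

import (
	"errors"
	"fmt"
)

// txStore 抽象事务内的两步写操作(示意接口,实际由 GORM 事务对象实现)
type txStore interface {
	UpdateProjectVersion(projectID string, newVersion int) error
	InsertOfficialVersion(projectID string, version int, committerID int64) error
}

// DirectEdit 超管直改:更新主表 current_version + 插入 official 版本,必须同进同退
func DirectEdit(s txStore, projectID string, currentVersion int, superAdminID int64) (int, error) {
	next := currentVersion + 1
	if err := s.UpdateProjectVersion(projectID, next); err != nil {
		return 0, err
	}
	// workflow_id 留空,committer_id 记录为 SuperAdmin ID
	if err := s.InsertOfficialVersion(projectID, next, superAdminID); err != nil {
		return 0, err // 事务内返回错误即整体回滚
	}
	return next, nil
}

// memStore 演示用内存实现
type memStore struct {
	current  map[string]int
	failNext bool
}

func (m *memStore) UpdateProjectVersion(id string, v int) error {
	m.current[id] = v
	return nil
}

func (m *memStore) InsertOfficialVersion(id string, v int, by int64) error {
	if m.failNext {
		return errors.New("insert version failed")
	}
	return nil
}

func main() {
	s := &memStore{current: map[string]int{"p1": 3}}
	v, err := DirectEdit(s, "p1", 3, 1)
	fmt.Println(v, err) // 4 <nil>
}
```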

## 并发修改与冲突检测 (Optimistic Locking)

### 冲突场景

1. 用户 A 基于 v3 版本创建草稿(Draft based on v3)。
2. 超级管理员直接修改项目,版本升级为 v4(Current = v4)。
3. 用户 A 提交草稿审核。

### 处理策略

1. **提交时校验**:工单提交/审核接口需校验 `draft.base_version == project.current_version`。
2. **冲突提示**:如果版本不一致,后端返回 `409 Conflict` 错误。
3. **前端交互**:
   * 提示用户:"项目已被修改,当前草稿已过期"。
   * 提供 **"Rebase" (变基)** 选项。
   * 或者提供 **"Diff Check"**:让用户查看差异。
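
提交时校验可以收敛为一个小函数,错误码与接口设计中的 `VERSION_CONFLICT` 对应。以下为示意实现(`APIError` 结构为假设,实际应复用项目统一的错误类型):

```go
package main

import "fmt"

// APIError 业务错误(示意结构)
type APIError struct {
	HTTPStatus int
	Code       string
	Message    string
}

// CheckVersionConflict 提交/审核前的乐观锁校验:
// draft.base_version 必须等于 project.current_version,否则返回 409
func CheckVersionConflict(baseVersion, currentVersion int) *APIError {
	if baseVersion != currentVersion {
		return &APIError{
			HTTPStatus: 409,
			Code:       "VERSION_CONFLICT",
			Message: fmt.Sprintf("项目已更新至 v%d,基于 v%d 的草稿已过期",
				currentVersion, baseVersion),
		}
	}
	return nil
}

func main() {
	fmt.Println(CheckVersionConflict(3, 3) == nil) // true
	fmt.Println(CheckVersionConflict(3, 4).Code)   // VERSION_CONFLICT
}
```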

## 版本 Diff 算法

采用 **JSON Diff** 算法,对比两个版本快照的差异。

### 差异结构

```go
// DiffResult 差异结果
type DiffResult struct {
	Module     string      `json:"module"`      // 模块名称
	FieldDiffs []FieldDiff `json:"field_diffs"` // 字段差异列表
}

// FieldDiff 字段差异
type FieldDiff struct {
	FieldPath  string      `json:"field_path"`  // 字段路径
	FieldName  string      `json:"field_name"`  // 字段中文名
	OldValue   interface{} `json:"old_value"`   // 旧值
	NewValue   interface{} `json:"new_value"`   // 新值
	ChangeType string      `json:"change_type"` // add/modify/delete
}
```

### Diff 实现

```go
// CompareVersions 比较两个版本的差异
// @param baseVersion 基准版本(通常是较早的版本或 master)
// @param targetVersion 目标版本(通常是较新的版本或草稿)
// @return []DiffResult 差异结果列表,按模块分组
func (s *VersionService) CompareVersions(
	ctx context.Context,
	baseVersion, targetVersion *VersionSnapshot,
) ([]DiffResult, error) {
	var results []DiffResult

	modules := []struct {
		Name   string
		Base   interface{}
		Target interface{}
	}{
		{"基本信息", baseVersion.BasicInfo, targetVersion.BasicInfo},
		{"部署业务", baseVersion.DeployBusiness, targetVersion.DeployBusiness},
		{"部署环境", baseVersion.DeployEnv, targetVersion.DeployEnv},
		{"部署中间件", baseVersion.DeployMiddleware, targetVersion.DeployMiddleware},
	}

	for _, m := range modules {
		diffs := s.diffJSON(m.Base, m.Target)
		if len(diffs) > 0 {
			results = append(results, DiffResult{
				Module:     m.Name,
				FieldDiffs: diffs,
			})
		}
	}
	return results, nil
}
```
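
`diffJSON` 的一种可行实现:先经 `encoding/json` 把对象展平为「字段路径 → 叶子值」的映射,再逐字段比较。以下是一个简化示意(省略了 `FieldName` 中文名映射;路径分隔符、数组的处理方式均为本文假设,并非正式规范):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// FieldDiff 简化的字段差异(相对上文省略了 FieldName)
type FieldDiff struct {
	FieldPath  string
	OldValue   interface{}
	NewValue   interface{}
	ChangeType string // add/modify/delete
}

// flatten 将任意 JSON 结构递归展开为 "a.b.c" -> 叶子值
func flatten(prefix string, v interface{}, out map[string]interface{}) {
	if m, ok := v.(map[string]interface{}); ok {
		for k, val := range m {
			key := k
			if prefix != "" {
				key = prefix + "." + k
			}
			flatten(key, val, out)
		}
		return
	}
	out[prefix] = v
}

// diffJSON 先 JSON 序列化展平,再逐叶子字段比较
func diffJSON(base, target interface{}) []FieldDiff {
	toMap := func(v interface{}) map[string]interface{} {
		b, _ := json.Marshal(v)
		var raw interface{}
		_ = json.Unmarshal(b, &raw)
		out := map[string]interface{}{}
		flatten("", raw, out)
		return out
	}
	bm, tm := toMap(base), toMap(target)
	var diffs []FieldDiff
	for path, ov := range bm {
		nv, ok := tm[path]
		switch {
		case !ok:
			diffs = append(diffs, FieldDiff{path, ov, nil, "delete"})
		case fmt.Sprint(ov) != fmt.Sprint(nv):
			diffs = append(diffs, FieldDiff{path, ov, nv, "modify"})
		}
	}
	for path, nv := range tm {
		if _, ok := bm[path]; !ok {
			diffs = append(diffs, FieldDiff{path, nil, nv, "add"})
		}
	}
	return diffs
}

func main() {
	type Biz struct {
		SystemVersion string `json:"system_version"`
	}
	diffs := diffJSON(Biz{"v1.0.0"}, Biz{"v2.0.0"})
	fmt.Println(diffs[0].FieldPath, diffs[0].ChangeType) // system_version modify
}
```

注意数组在此简化实现中按整体值比较;若需要按元素定位差异,可在 `flatten` 中对 slice 追加下标路径。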

## 版本历史查询

### 版本列表结构

```go
// VersionHistory 版本历史记录
type VersionHistory struct {
	Version       int       `json:"version"`        // 版本号
	VersionType   string    `json:"version_type"`   // 版本类型
	CommitMessage string    `json:"commit_message"` // 变更说明
	CommitterID   int64     `json:"committer_id"`   // 提交人 ID
	CommitterName string    `json:"committer_name"` // 提交人姓名
	WorkflowID    string    `json:"workflow_id"`    // 关联工单 ID
	CreatedAt     time.Time `json:"created_at"`     // 创建时间
	ChangeSummary string    `json:"change_summary"` // 变更摘要
}
```

### 版本历史 API

| 方法 | 路径 | 描述 |
|:---|:---|:---|
| POST | `/api/project/version/list` | 获取版本历史列表 |
| POST | `/api/project/version/detail` | 获取指定版本详情 |
| POST | `/api/project/version/diff` | 对比两个版本差异 |
| POST | `/api/project/version/diff-with-current` | 对比指定版本与当前版本差异 |
@@ -0,0 +1,172 @@
# 数据结构定义

> DDS-Section: 10. 附录 - 结构体定义
> DDS-Lines: L880-L993

## 基本信息结构体 (BasicInfo)

```go
type BasicInfo struct {
	Province        string `json:"province"`         // 省份
	City            string `json:"city"`             // 城市 (级联)
	IndustryContact string `json:"industry_contact"` // 行业组人员姓名
	IndustryPhone   string `json:"industry_phone"`   // 行业组人员电话
	ProjectNature   string `json:"project_nature"`   // 项目性质: research/test/trial/market/sub_platform
}
```

### 项目性质枚举

| 代码 | 说明 |
|:---|:---|
| `research` | 科研 |
| `test` | 测试 |
| `trial` | 试用 |
| `market` | 市场化 |
| `sub_platform` | 二级平台 |
## 业务部署结构体 (DeployBusiness)

```go
type DeployBusiness struct {
	DeployerName    string `json:"deployer_name"`     // 部署人姓名
	DeployerPhone   string `json:"deployer_phone"`    // 部署人电话
	DeployStartTime string `json:"deploy_start_time"` // 部署开始时间
	DeployEndTime   string `json:"deploy_end_time"`   // 部署结束时间
	SystemVersion   string `json:"system_version"`    // 部署系统版本
	SystemType      string `json:"system_type"`       // 系统类型
	MainEntrance    string `json:"main_entrance"`     // 业务主要入口URL
	AdminUsername   string `json:"admin_username"`    // 系统超管用户名
	AdminPassword   string `json:"admin_password"`    // 系统超管密码 (加密存储)
}
```

### 系统类型枚举

| 代码 | 说明 |
|:---|:---|
| `business` | 业务系统 |
| `fly-control` | 飞控系统 |
| `supervisor` | 监管系统 |
## 部署环境结构体 (DeployEnv)

```go
type DeployEnv struct {
	// 主机信息
	Hosts []HostInfo `json:"hosts"`

	// 网络环境
	NetworkType  string `json:"network_type"`   // internal/single_public/full_public
	MainPublicIP string `json:"main_public_ip"` // 主要公网IP
	DomainURL    string `json:"domain_url"`     // 域名URL
	SSLEnabled   bool   `json:"ssl_enabled"`    // 是否开启SSL

	// 管理方式
	ManagementType string `json:"management_type"` // bastion/whitelist/vpn
	ManagementURL  string `json:"management_url"`  // 管理后台URL
	ManagementUser string `json:"management_user"` // 管理后台用户名
	ManagementPwd  string `json:"management_pwd"`  // 管理后台密码 (加密存储)

	// 统计信息
	HostCount    int    `json:"host_count"`    // 主机台数
	TotalCPU     int    `json:"total_cpu"`     // CPU总核数
	CPUModel     string `json:"cpu_model"`     // CPU型号
	TotalMemory  int    `json:"total_memory"`  // 内存总大小(GB)
	TotalStorage int    `json:"total_storage"` // 存储总大小(GB)
}

type HostInfo struct {
	Hostname        string `json:"hostname"`
	InternalIP      string `json:"internal_ip"`
	PublicIP        string `json:"public_ip"`
	CanAccessPublic bool   `json:"can_access_public"` // 能否访问公网
	SSHPort         int    `json:"ssh_port"`
	SSHUser         string `json:"ssh_user"`
	SSHPwd          string `json:"ssh_pwd"` // SSH密码 (加密存储)
	Role            string `json:"role"`    // master/worker/storage
}
```

### 网络类型枚举

| 代码 | 说明 |
|:---|:---|
| `internal` | 完全内网 |
| `single_public` | 单主机公网 |
| `full_public` | 全访问公网 |

### 管理方式枚举

| 代码 | 说明 |
|:---|:---|
| `bastion` | 堡垒机 |
| `whitelist` | 白名单 |
| `vpn` | VPN |

### 主机角色枚举

| 代码 | 说明 |
|:---|:---|
| `master` | 主节点 |
| `worker` | 工作节点 |
| `storage` | 存储节点 |
## 部署中间件结构体 (DeployMiddleware)

```go
type DeployMiddleware struct {
	MySQL        MiddlewareInfo `json:"mysql"`
	Redis        MiddlewareInfo `json:"redis"`
	EMQX         MiddlewareInfo `json:"emqx"`
	MinIO        MiddlewareInfo `json:"minio"`
	InfluxDB     MiddlewareInfo `json:"influxdb"`
	Nacos        MiddlewareInfo `json:"nacos"`
	K8SDashboard MiddlewareInfo `json:"k8s_dashboard"`
}

// MiddlewareInfo 通用中间件信息
type MiddlewareInfo struct {
	PublicIP     string `json:"public_ip"`
	PublicPort   int    `json:"public_port"`
	InternalIP   string `json:"internal_ip"`
	InternalPort int    `json:"internal_port"`
	K8SAddress   string `json:"k8s_address"` // K8S集群内访问地址
	K8SPort      int    `json:"k8s_port"`
	AdminUser    string `json:"admin_user"`
	AdminPwd     string `json:"admin_pwd"` // 超管密码 (加密存储)
	Version      string `json:"version"`   // 中间件版本
}
```
## 项目授权信息结构体 (AuthorizationInfo)

```go
type AuthorizationInfo struct {
	TierOneSecret string `json:"tier_one_secret"` // 一级TOTP密钥
	TimeOffset    int    `json:"time_offset"`     // 允许时间偏移(秒)
	TOTPEnabled   bool   `json:"totp_enabled"`    // 是否开启TOTP
	TierTwoSecret string `json:"tier_two_secret"` // 二级TOTP密钥 (来自Watchdog)

	AuthType     string    `json:"auth_type"`     // permanent/time_limited
	AuthDays     int       `json:"auth_days"`     // 授权有效期(天)
	AuthorizedAt time.Time `json:"authorized_at"` // 授权时间
	RevokedAt    time.Time `json:"revoked_at"`    // 撤销授权时间
	IsOffline    bool      `json:"is_offline"`    // 是否离线授权
}
```

### 授权类型枚举

| 代码 | 说明 |
|:---|:---|
| `permanent` | 永久授权 |
| `time_limited` | 限时授权 |
## 敏感字段加密规则

遵循系统规范,以下字段必须加密存储 (AES-256):

- `DeployBusiness.AdminPassword`
- `DeployEnv.Hosts[].SSHPwd`
- `DeployEnv.ManagementPwd`
- `DeployMiddleware.*.AdminPwd`
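
加密实现可基于标准库 `crypto/aes` 搭配 GCM 模式(带认证的加密)。以下为示意工具函数(密钥须为 32 字节即 AES-256;密钥的存放与轮换不在此展开,示例中的密钥仅用于演示):

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"encoding/base64"
	"fmt"
	"io"
)

// Encrypt 使用 AES-256-GCM 加密敏感字段,随机 nonce 作为密文前缀,base64 输出
func Encrypt(key []byte, plaintext string) (string, error) {
	block, err := aes.NewCipher(key) // key 须为 32 字节 (AES-256)
	if err != nil {
		return "", err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return "", err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := io.ReadFull(rand.Reader, nonce); err != nil {
		return "", err
	}
	ct := gcm.Seal(nonce, nonce, []byte(plaintext), nil)
	return base64.StdEncoding.EncodeToString(ct), nil
}

// Decrypt 解密 Encrypt 的输出
func Decrypt(key []byte, enc string) (string, error) {
	data, err := base64.StdEncoding.DecodeString(enc)
	if err != nil {
		return "", err
	}
	block, err := aes.NewCipher(key)
	if err != nil {
		return "", err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return "", err
	}
	if len(data) < gcm.NonceSize() {
		return "", fmt.Errorf("密文过短")
	}
	nonce, ct := data[:gcm.NonceSize()], data[gcm.NonceSize():]
	pt, err := gcm.Open(nil, nonce, ct, nil)
	if err != nil {
		return "", err
	}
	return string(pt), nil
}

func main() {
	key := []byte("0123456789abcdef0123456789abcdef") // 32 字节演示密钥
	enc, _ := Encrypt(key, "P@ssw0rd")
	dec, _ := Decrypt(key, enc)
	fmt.Println(dec) // P@ssw0rd
}
```

GCM 自带完整性校验,密文被篡改时 `Open` 会直接报错,比单纯的 CBC 更适合存储口令类字段。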
@@ -0,0 +1,195 @@
# 数据库模型设计

> DDS-Section: 6. 数据模型设计
> DDS-Lines: L570-L778

## 设计决策

| 决策点 | 决策 | 理由 |
|:---|:---|:---|
| JSONB vs 分表 | 使用JSONB存储 | 1. 项目信息结构复杂但查询简单 2. 版本快照需完整存储 3. 减少JOIN提升性能 |
| 草稿存储位置 | project_versions表 | 统一版本管理,方便Diff比较 |
| 权限表位置 | rmdc-user-auth模块 | 统一权限管理,便于跨模块授权 |
## ER 图

```mermaid
erDiagram
    projects ||--o{ project_versions : has
    projects ||--o{ project_workflows : has
    projects ||--|| project_auth_configs : has

    projects {
        bigint id PK
        string project_id UK
        string name
        string namespace UK
        string lifecycle_status
        string certification_status
        bigint detail_filler_id
        string detail_filler_name
        int current_version
        jsonb basic_info
        jsonb deploy_business
        jsonb deploy_env
        jsonb deploy_middleware
        datetime created_at
        datetime updated_at
        datetime deleted_at
    }

    project_versions {
        bigint id PK
        string project_id FK
        int version
        string version_type
        bigint user_id
        string workflow_id
        jsonb snapshot_data
        string commit_message
        bigint committer_id
        string committer_name
        datetime created_at
        datetime updated_at
    }

    project_workflows {
        bigint id PK
        string project_id FK
        string workflow_id UK
        string workflow_type
        string status
        datetime created_at
    }

    project_auth_configs {
        bigint id PK
        string project_id FK
        string tier_one_secret
        int time_offset
        bool totp_enabled
        string auth_type
        int auth_days
        datetime created_at
        datetime updated_at
    }
```
## 表关系说明

| 关系 | 说明 |
|:---|:---|
| `projects` → `project_versions` | 一对多,一个项目有多个版本记录(含草稿) |
| `projects` → `project_workflows` | 一对多,一个项目可关联多个工单 |
| `projects` → `project_auth_configs` | 一对一,一个项目对应一个授权配置 |

## 字段说明

| 表 | 字段 | 说明 |
|:---|:---|:---|
| projects | lifecycle_status | INIT/DRAFTING/REVIEWING/RELEASED/MODIFYING/ARCHIVED |
| projects | certification_status | draft/pending/official |
| project_versions | version_type | official/fill_draft/modify_draft |
| project_versions | workflow_id | 关联 rmdc-work-procedure 模块的工单 ID |
| project_workflows | workflow_type | fill(填写)/modify(修改) |
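
`lifecycle_status` 的合法迁移可以用一张状态机表来守卫,非法迁移返回错误码 `INVALID_LIFECYCLE_STATUS`。以下迁移关系是依据业务流程章节整理的示意(具体以 DDS 状态图为准):

```go
package main

import "fmt"

// validTransitions 生命周期状态机(示意,具体迁移以 DDS 状态图为准)
var validTransitions = map[string][]string{
	"INIT":      {"DRAFTING"},
	"DRAFTING":  {"REVIEWING"},
	"REVIEWING": {"RELEASED", "DRAFTING", "MODIFYING"}, // 通过 / 打回填写 / 打回修改
	"RELEASED":  {"MODIFYING", "ARCHIVED"},
	"MODIFYING": {"REVIEWING"},
}

// CanTransition 判断状态迁移是否合法;非法迁移应映射为 INVALID_LIFECYCLE_STATUS
func CanTransition(from, to string) bool {
	for _, t := range validTransitions[from] {
		if t == to {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(CanTransition("REVIEWING", "RELEASED")) // true
	fmt.Println(CanTransition("ARCHIVED", "DRAFTING"))  // false
}
```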

## 主表模型定义 - projects

```go
// Project 项目主表
type Project struct {
	ID        int64  `gorm:"primaryKey;autoIncrement" json:"id"`
	ProjectID string `gorm:"type:varchar(64);uniqueIndex;not null" json:"project_id"`
	Name      string `gorm:"type:varchar(128);not null" json:"name"`
	Namespace string `gorm:"type:varchar(64);uniqueIndex;not null" json:"namespace"`

	// 生命周期状态
	LifecycleStatus string `gorm:"type:varchar(32);default:'INIT'" json:"lifecycle_status"`
	// 认证状态 (draft/pending/official)
	CertificationStatus string `gorm:"type:varchar(32);default:'draft'" json:"certification_status"`

	// 当前正式版本号
	CurrentVersion int `gorm:"default:0" json:"current_version"`

	// 主版本数据 (使用 JSONB 存储)
	BasicInfo        json.RawMessage `gorm:"type:jsonb" json:"basic_info"`
	DeployBusiness   json.RawMessage `gorm:"type:jsonb" json:"deploy_business"`
	DeployEnv        json.RawMessage `gorm:"type:jsonb" json:"deploy_env"`
	DeployMiddleware json.RawMessage `gorm:"type:jsonb" json:"deploy_middleware"`

	// 项目填写人
	DetailFillerID   int64  `json:"detail_filler_id"`
	DetailFillerName string `gorm:"type:varchar(64)" json:"detail_filler_name"`

	// 审计字段
	CreatedBy     int64  `json:"created_by"`
	CreatedByName string `gorm:"type:varchar(64)" json:"created_by_name"`

	common.BaseModel
}
```

## 版本表模型定义 - project_versions

```go
// ProjectVersion 项目版本表 (含草稿)
type ProjectVersion struct {
	ID        int64  `gorm:"primaryKey;autoIncrement" json:"id"`
	ProjectID string `gorm:"type:varchar(64);index;not null" json:"project_id"`

	// 版本号 (正式版本递增, 草稿为0)
	Version int `gorm:"not null;default:0" json:"version"`

	// 版本类型: official/fill_draft/modify_draft
	VersionType string `gorm:"type:varchar(32);not null" json:"version_type"`

	// 草稿所属用户ID (仅草稿类型有值)
	UserID   int64  `gorm:"index" json:"user_id"`
	UserName string `gorm:"type:varchar(64)" json:"user_name"`

	// 关联工单ID (1:1关系)
	WorkflowID string `gorm:"type:varchar(64);index" json:"workflow_id"`

	// 完整快照数据
	SnapshotData json.RawMessage `gorm:"type:jsonb" json:"snapshot_data"`

	// 变更信息
	CommitMessage string `gorm:"type:varchar(255)" json:"commit_message"`
	CommitterID   int64  `json:"committer_id"`
	CommitterName string `gorm:"type:varchar(64)" json:"committer_name"`

	CreatedAt time.Time `json:"created_at"`
	UpdatedAt time.Time `json:"updated_at"`
}
```

## 项目工单关联表模型定义 - project_workflows

```go
// ProjectWorkflow 项目与工单关联表
type ProjectWorkflow struct {
	ID         int64  `gorm:"primaryKey;autoIncrement" json:"id"`
	ProjectID  string `gorm:"type:varchar(64);index;not null" json:"project_id"`
	WorkflowID string `gorm:"type:varchar(64);uniqueIndex;not null" json:"workflow_id"`

	// 工单类型: fill(填写)/modify(修改)
	WorkflowType string `gorm:"type:varchar(32);not null" json:"workflow_type"`

	// 工单状态 (冗余存储)
	Status string `gorm:"type:varchar(32)" json:"status"`

	CreatedAt time.Time `json:"created_at"`
	UpdatedAt time.Time `json:"updated_at"`
}
```

## 索引策略

| 表 | 索引 | 用途 |
|:---|:---|:---|
| projects | `UK: project_id` | 业务主键唯一查询 |
| projects | `UK: namespace` | 唯一性约束 |
| project_versions | `IX: project_id` | 版本历史查询 |
| project_versions | `IX: workflow_id` | 工单关联查询 |
| project_workflows | `UK: workflow_id` | 工单唯一关联 |
| project_workflows | `IX: project_id` | 项目关联查询 |
@@ -0,0 +1,141 @@
# API 接口设计

> DDS-Section: 7. 接口设计 (API)
> DDS-Lines: L781-L818

## 项目管理接口

| 方法 | 路径 | 描述 | 权限 |
|:---|:---|:---|:---|
| POST | `/api/project/list` | 获取项目列表 (自动过滤ACL) | Login |
| POST | `/api/project/detail` | 获取项目详情 (Master版本) | View ACL |
| POST | `/api/project/create` | 创建项目 | SuperAdmin |
| POST | `/api/project/update` | 直接更新项目 | SuperAdmin |
| POST | `/api/project/delete` | 删除项目 (软删除) | SuperAdmin |
| POST | `/api/project/export` | 导出项目信息 | Export ACL |

## 版本管理接口

| 方法 | 路径 | 描述 | 权限 |
|:---|:---|:---|:---|
| POST | `/api/project/version/list` | 获取版本历史列表 | View ACL |
| POST | `/api/project/version/detail` | 获取指定版本详情 | View ACL |
| POST | `/api/project/version/diff` | 获取版本差异 | View ACL |

## 草稿管理接口

| 方法 | 路径 | 描述 | 权限 |
|:---|:---|:---|:---|
| POST | `/api/project/draft/get` | 获取当前用户的草稿 | View ACL |
| POST | `/api/project/draft/save` | 保存草稿 | View ACL |
| POST | `/api/project/draft/submit` | 提交审核 | View ACL |
| POST | `/api/project/draft/discard` | 放弃草稿 | View ACL |
| POST | `/api/project/draft/diff` | 获取草稿与主线的差异 | View ACL |

## 权限管理接口 (SuperAdmin)

| 方法 | 路径 | 描述 | 权限 |
|:---|:---|:---|:---|
| POST | `/api/project/permission/list` | 获取项目权限列表 | SuperAdmin |
| POST | `/api/project/permission/grant` | 授予权限 | SuperAdmin |
| POST | `/api/project/permission/revoke` | 撤销权限 | SuperAdmin |
| POST | `/api/project/permission/batch` | 批量设置权限 | SuperAdmin |
## 请求/响应示例

### 创建项目

**请求**
```json
{
  "project_name": "测试项目",
  "namespace": "test-project-ns",
  "detail_filler_id": 123,
  "detail_filler_name": "张三"
}
```

**响应**
```json
{
  "code": 0,
  "message": "success",
  "data": {
    "project_id": "proj_20260120_001",
    "workflow_id": "wf_20260120_001"
  }
}
```

### 保存草稿

**请求**
```json
{
  "project_id": "proj_20260120_001",
  "form_data": {
    "deploy_business": {
      "deployer_name": "李四",
      "deployer_phone": "13800138000",
      "system_version": "v2.0.0"
    },
    "deploy_env": {
      "network_type": "internal",
      "host_count": 3
    }
  },
  "middlewares": [
    {
      "middleware_type": "mysql",
      "internal_ip": "10.0.0.1",
      "internal_port": 3306
    }
  ]
}
```

### 获取版本差异

**请求**
```json
{
  "project_id": "proj_20260120_001",
  "base_version": 1,
  "target_version": 2
}
```

**响应**
```json
{
  "code": 0,
  "data": {
    "diffs": [
      {
        "module": "部署业务",
        "field_diffs": [
          {
            "field_path": "deploy_business.system_version",
            "field_name": "系统版本",
            "old_value": "v1.0.0",
            "new_value": "v2.0.0",
            "change_type": "modify"
          }
        ]
      }
    ]
  }
}
```

## 错误码定义

| 错误码 | 说明 |
|:---|:---|
| `PROJECT_NOT_FOUND` | 项目不存在 |
| `NAMESPACE_EXISTS` | 命名空间已存在 |
| `VERSION_CONFLICT` | 版本冲突(并发修改) |
| `PERMISSION_DENIED` | 权限不足 |
| `INVALID_LIFECYCLE_STATUS` | 无效的生命周期状态 |
| `DRAFT_NOT_FOUND` | 草稿不存在 |
| `WORKFLOW_ERROR` | 工单操作失败 |
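
业务错误码需要统一映射到 HTTP 状态码后再写入响应包装。以下为一种示意映射(映射关系为本文建议,未登记的错误码回退为 500):

```go
package main

import "fmt"

// errHTTPStatus 业务错误码到 HTTP 状态码的映射(示意;VERSION_CONFLICT 对应前文的 409 Conflict)
var errHTTPStatus = map[string]int{
	"PROJECT_NOT_FOUND":        404,
	"NAMESPACE_EXISTS":         409,
	"VERSION_CONFLICT":         409,
	"PERMISSION_DENIED":        403,
	"INVALID_LIFECYCLE_STATUS": 400,
	"DRAFT_NOT_FOUND":          404,
	"WORKFLOW_ERROR":           500,
}

// HTTPStatus 查询错误码对应的 HTTP 状态码,未登记的错误码统一回退为 500
func HTTPStatus(code string) int {
	if s, ok := errHTTPStatus[code]; ok {
		return s
	}
	return 500
}

func main() {
	fmt.Println(HTTPStatus("VERSION_CONFLICT")) // 409
	fmt.Println(HTTPStatus("UNKNOWN"))          // 500
}
```

集中维护这张映射表,handler 层只需携带业务错误码,HTTP 语义由统一响应层决定。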
@@ -0,0 +1,98 @@
# 业务流程设计

> DDS-Section: 8. 业务流程
> DDS-Lines: L822-L856

## 项目创建流程

```mermaid
sequenceDiagram
    participant Admin as 超级管理员
    participant PM as rmdc-project-management
    participant WP as rmdc-work-procedure
    participant User as 填写人

    Admin->>PM: POST /api/project/create
    PM->>PM: 创建Project记录 (status=INIT)
    PM->>WP: 创建project_detail工单
    WP-->>PM: 返回workflow_id
    PM->>PM: 创建ProjectVersion草稿记录
    PM->>PM: 创建ProjectWorkflow关联记录
    PM-->>Admin: 返回项目创建成功

    PM->>User: 通知填写人
    User->>PM: POST /api/project/draft/save
    PM->>PM: 更新ProjectVersion草稿
    User->>PM: POST /api/project/draft/submit
    PM->>WP: 状态转换为pending_review
    PM->>PM: 更新lifecycle_status=REVIEWING
```

## 项目与工单关系

| 关系类型 | 项目状态 | 说明 |
|:---|:---|:---|
| 项目:填写工单 = 1:1 | INIT/DRAFTING | 项目创建时只能有一个填写工单 |
| 项目:修改工单 = 1:N | RELEASED/MODIFYING | 已发布项目可以有多个修改工单 |
| 用户:修改工单 = 1:1 (per project) | - | 非SuperAdmin用户同一项目只能有一个活跃修改工单 |

## 修改工单流程

```mermaid
sequenceDiagram
    participant User as 普通用户
    participant PM as rmdc-project-management
    participant WP as rmdc-work-procedure
    participant Admin as 超级管理员

    User->>PM: POST /api/project/draft/create (修改工单)
    PM->>PM: 检查用户项目权限
    PM->>WP: 创建project_modify工单
    WP-->>PM: 返回workflow_id
    PM->>PM: 创建ProjectVersion草稿 (base_version=current)
    PM->>PM: 更新lifecycle_status=MODIFYING
    PM-->>User: 返回工单创建成功

    loop 填写草稿
        User->>PM: POST /api/project/draft/save
        PM->>PM: 更新草稿数据
    end

    User->>PM: POST /api/project/draft/submit
    PM->>PM: 检查版本冲突 (base_version == current_version?)
    alt 无冲突
        PM->>WP: 状态转换为pending_review
        PM->>PM: 更新lifecycle_status=REVIEWING
    else 有冲突
        PM-->>User: 返回409 Conflict
    end

    Admin->>PM: POST /api/workflow/approve
    PM->>PM: 合并草稿到主线
    PM->>PM: 生成新版本 (official)
    PM->>PM: 更新lifecycle_status=RELEASED
```

## 超管直改流程

```mermaid
sequenceDiagram
    participant Admin as 超级管理员
    participant PM as rmdc-project-management
    participant DB as PostgreSQL

    Admin->>PM: POST /api/project/update
    PM->>PM: 验证权限 (SuperAdmin)
    PM->>DB: BEGIN TRANSACTION
    PM->>DB: UPDATE projects SET ...
    PM->>DB: INSERT project_versions (official)
    PM->>DB: COMMIT
    PM-->>Admin: 返回更新成功 + 新版本号
```

## 审计日志

所有对 `projects` 表的写操作均需通过 `rmdc-audit-log` 记录:

- **Resource**: `project`
- **Action**: `create`, `update`, `publish_version`, `delete`, `permission_grant`
- **Payload**: 记录关键变更字段
@@ -0,0 +1,151 @@
# 核心组件设计规范

> DDS-Section: Frontend DDS - 8. 组件设计规范
> DDS-Lines: L696-L783

## CopyableField 组件

用于展示可复制的字段值:

```html
<template>
  <div class="copyable-field d-flex align-center gap-2">
    <span class="field-value">{{ displayValue }}</span>
    <v-btn
      icon="mdi-content-copy"
      size="x-small"
      variant="text"
      @click="copyToClipboard"
    >
      <v-tooltip activator="parent" location="top">复制</v-tooltip>
    </v-btn>
  </div>
</template>

<script setup lang="ts">
import { computed } from 'vue'
import { useClipboard } from '@vueuse/core'

const props = defineProps<{
  value: string
  mask?: boolean
}>()

const { copy } = useClipboard()

const displayValue = computed(() => {
  if (props.mask) return '******'
  return props.value
})

const copyToClipboard = () => {
  copy(props.value)
}
</script>
```
## SaveConfirmDialog 组件

保存前的确认对话框,展示变更差异:

```html
<v-dialog v-model="visible" max-width="600">
  <v-card>
    <v-card-title class="bg-primary text-white">
      <v-icon start>mdi-check-circle</v-icon>
      确认保存修改
    </v-card-title>
    <v-card-text>
      <v-alert type="info" variant="tonal" class="mb-4">
        以下字段将被修改:
      </v-alert>
      <v-table density="compact">
        <thead>
          <tr>
            <th>字段</th>
            <th>修改前</th>
            <th>修改后</th>
          </tr>
        </thead>
        <tbody>
          <tr v-for="item in diffItems" :key="item.field">
            <td>{{ item.label }}</td>
            <td class="text-error">{{ item.oldValue || '空' }}</td>
            <td class="text-success">{{ item.newValue || '空' }}</td>
          </tr>
        </tbody>
      </v-table>
    </v-card-text>
    <v-card-actions>
      <v-spacer />
      <v-btn variant="text" @click="cancel">取消</v-btn>
      <v-btn color="primary" :loading="loading" @click="confirm">确认保存</v-btn>
    </v-card-actions>
  </v-card>
</v-dialog>
```

## DiffTextField 组件

编辑模式下显示与主线数据差异的输入框:

```html
<template>
  <v-text-field
    v-model="inputValue"
    :label="label"
    :class="{ 'diff-highlight': hasDiff }"
    :hint="hasDiff ? `主线值: ${masterValue}` : ''"
    persistent-hint
  >
    <template v-slot:prepend-inner v-if="hasDiff">
      <v-icon color="warning" size="small">mdi-alert-circle</v-icon>
    </template>
  </v-text-field>
</template>

<script setup lang="ts">
import { computed } from 'vue'

const props = defineProps<{
  modelValue: string
  masterValue: string
  label: string
}>()

const emit = defineEmits(['update:modelValue'])

const inputValue = computed({
  get: () => props.modelValue,
  set: (val) => emit('update:modelValue', val)
})

const hasDiff = computed(() => props.modelValue !== props.masterValue)
</script>

<style scoped>
.diff-highlight :deep(.v-field__outline) {
  --v-field-border-color: rgb(var(--v-theme-warning));
}
</style>
```

## 组件清单

| 组件 | 文件路径 | 说明 |
|:---|:---|:---|
| BasicInfoForm | `components/BasicInfoForm.vue` | 基本信息编辑表单 |
| BasicInfoReadonly | `components/BasicInfoReadonly.vue` | 基本信息只读 |
| BusinessInfoReadonly | `components/BusinessInfoReadonly.vue` | 业务信息只读 |
| DeploymentBusinessForm | `components/DeploymentBusinessForm.vue` | 业务信息表单 |
| DeploymentEnvironmentForm | `components/DeploymentEnvironmentForm.vue` | 环境信息表单 |
| EnvironmentInfoReadonly | `components/EnvironmentInfoReadonly.vue` | 环境信息只读 |
| HostsInfoReadonly | `components/HostsInfoReadonly.vue` | 主机信息只读 |
| HostsManagement | `components/HostsManagement.vue` | 主机管理 |
| MiddlewareCardsGrid | `components/MiddlewareCardsGrid.vue` | 中间件卡片网格 |
| MiddlewareInfoReadonly | `components/MiddlewareInfoReadonly.vue` | 中间件只读 |
| AuthorizationManagement | `components/AuthorizationManagement.vue` | 授权管理 |
| VersionHistory | `components/VersionHistory.vue` | 版本历史 |
| SaveConfirmDialog | `components/SaveConfirmDialog.vue` | 保存确认对话框 |
| CopyableField | `components/CopyableField.vue` | 可复制字段 |
| ProjectBasicInfoCard | `components/ProjectBasicInfoCard.vue` | 项目基本信息卡片 |
@@ -0,0 +1,128 @@
# 交互时序图

> DDS-Section: Frontend DDS - 9. 交互时序图
> DDS-Lines: L787-L878

## 管理员编辑保存流程

```mermaid
sequenceDiagram
    participant Admin as 超级管理员
    participant Page as ProjectDetail.vue
    participant Dialog as SaveConfirmDialog
    participant API as 后端API

    Admin->>Page: 点击[编辑]按钮
    Page->>Page: isEditMode = true
    Page->>Page: editForm = deepClone(masterData)

    Admin->>Page: 修改字段
    Page->>Page: hasChanges = true

    Admin->>Page: 点击[保存修改]
    Page->>Page: 计算 diffItems
    Page->>Dialog: 显示确认对话框

    Admin->>Dialog: 确认保存
    Dialog->>API: updateProject()
    API-->>Dialog: 成功
    Dialog->>Page: emit('confirm')
    Page->>API: loadProject()
    Page->>Page: isEditMode = false
    Page-->>Admin: Snackbar: 保存成功
```

## 用户草稿提交流程

```mermaid
sequenceDiagram
    participant User as 普通用户
    participant Page as UserProjectDetail.vue
    participant Dialog as SubmitDialog
    participant API as 后端API
    participant WF as 工单模块

    User->>Page: 填写表单
    User->>Page: 点击[保存草稿]
    Page->>API: saveDraft()
    API-->>Page: 草稿保存成功

    User->>Page: 点击[提交审核]
    Page->>Dialog: 显示确认对话框
    User->>Dialog: 填写备注并确认
    Dialog->>API: saveDraft() (最终版本)
    Dialog->>API: submitProjectDetail()
    API->>WF: 触发工单状态转换
    WF-->>API: 成功
    API-->>Dialog: 提交成功
    Page-->>User: 跳转至工单详情页
```

## 管理员审批流程

```mermaid
sequenceDiagram
    participant Admin as 超级管理员
    participant Page as ProjectDetail.vue
    participant Dialog as ApproveDialog/RejectDialog
    participant API as 后端API
    participant WF as 工单模块

    Note over Page: lifecycle_status = reviewing

    Admin->>Page: 查看待审核内容

    alt 通过审批
        Admin->>Page: 点击[通过]
        Page->>Dialog: 显示审批对话框
        Admin->>Dialog: 填写备注并确认
        Dialog->>API: transitionWorkflow(approve)
        API->>WF: 工单状态 → approved
        WF-->>API: 回调更新项目状态
        API-->>Dialog: 成功
        Page->>API: updateProjectCertification('official')
        Page-->>Admin: Snackbar: 审批通过
    else 打回修改
        Admin->>Page: 点击[打回]
        Page->>Dialog: 显示打回对话框
        Admin->>Dialog: 填写打回原因并确认
        Dialog->>API: transitionWorkflow(return)
        API->>WF: 工单状态 → returned
        WF-->>API: 回调更新项目状态
        API-->>Dialog: 成功
        Page-->>Admin: Snackbar: 已打回
    end
```

## 版本冲突处理流程

```mermaid
sequenceDiagram
    participant User as 普通用户
    participant Page as UserProjectDetail.vue
    participant API as 后端API
    participant Admin as 超级管理员

    Note over User,API: 用户 A 基于 v3 版本创建草稿

    Admin->>API: 直接修改项目 (v3 → v4)

    User->>Page: 点击[提交审核]
    Page->>API: submitDraft() (base_version=v3)
    API->>API: 检查 base_version != current_version
    API-->>Page: 返回 409 Conflict

    Page-->>User: 显示冲突提示对话框

    alt 选择 Rebase
        User->>Page: 点击 [Rebase]
        Page->>API: getProject() (最新 v4 版本)
        Page->>Page: 自动合并草稿到 v4
        Page-->>User: 显示合并结果,可能需要手动解决冲突
    else 选择 Diff Check
        User->>Page: 点击 [查看差异]
        Page->>API: getDiff(v3, v4)
        Page-->>User: 显示 v3 与 v4 的差异
        User->>User: 决定是否覆盖或调整
    end
```
|
||||
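The backend conflict check in the diagram above can be sketched as a pure function. The `status`/`message` shape below is illustrative, not the formal API contract:

```typescript
// Minimal sketch of version-conflict detection: a mismatch between the draft's
// base_version and the project's current_version yields a 409.
// The result shape is illustrative; the real response envelope is defined by the backend.
interface SubmitCheckResult {
  status: 200 | 409
  message: string
}

function checkDraftSubmit(baseVersion: number, currentVersion: number): SubmitCheckResult {
  if (baseVersion !== currentVersion) {
    return {
      status: 409,
      message: `Draft is based on v${baseVersion}, but the project is at v${currentVersion}; rebase or view the diff first`
    }
  }
  return { status: 200, message: 'Submitted successfully' }
}
```

The frontend can branch on `status === 409` to open the conflict dialog shown in the diagram.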
@@ -0,0 +1,212 @@
# Lifecycle Status and Workflow Association Display

> DDS-Section: Frontend DDS - 5-6. Project Lifecycle Status Display + Workflow Association and Navigation
> DDS-Lines: L334-L582

## Status Chip Design

The project detail page header shows three status chips, plus an edit-mode indicator:

```html
<div class="d-flex align-center gap-2">
  <h1 class="text-h4 font-weight-bold">{{ masterData.project_name }}</h1>

  <!-- 1. Connection status -->
  <v-chip :color="getStatusColor(masterData.status)" size="small" variant="tonal">
    {{ PROJECT_STATUS[masterData.status] }}
  </v-chip>

  <!-- 2. Lifecycle status -->
  <v-chip :color="getLifecycleStatusColor(masterData.lifecycle_status)"
          size="small" variant="tonal">
    <v-icon start size="small">{{ getLifecycleStatusIcon(masterData.lifecycle_status) }}</v-icon>
    {{ LIFECYCLE_STATUS[masterData.lifecycle_status] }}
  </v-chip>

  <!-- 3. Certification status -->
  <v-chip :color="masterData.project_certification === 'official' ? 'success' : 'warning'"
          size="small" variant="tonal">
    {{ PROJECT_CERTIFICATION[masterData.project_certification] }}
  </v-chip>

  <!-- 4. Edit-mode indicator -->
  <v-chip v-if="isEditMode" color="info" variant="tonal">
    <v-icon start size="small">mdi-pencil</v-icon>
    Edit Mode
  </v-chip>
</div>
```

## Lifecycle Status Configuration

```typescript
// Lifecycle status labels
export const LIFECYCLE_STATUS = {
  init: 'Initialized',
  drafting: 'Drafting',
  reviewing: 'Reviewing',
  released: 'Released',
  modifying: 'Modifying',
  archived: 'Archived'
}

// Status color mapping
export const LIFECYCLE_STATUS_COLORS: Record<string, string> = {
  init: 'grey',
  drafting: 'info',
  reviewing: 'warning',
  released: 'success',
  modifying: 'primary',
  archived: 'grey-darken-1'
}

// Status icons
const getLifecycleStatusIcon = (status: string): string => {
  const icons: Record<string, string> = {
    init: 'mdi-clock-outline',
    drafting: 'mdi-pencil',
    reviewing: 'mdi-eye',
    released: 'mdi-check-circle',
    modifying: 'mdi-sync',
    archived: 'mdi-archive'
  }
  return icons[status] || 'mdi-help-circle'
}
```

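The header template also calls `getLifecycleStatusColor`, which the DDS does not spell out. A minimal sketch derived from the color map (the map is repeated here only to keep the example self-contained; unknown statuses fall back to `grey`):

```typescript
// Color map, identical to the configuration above (repeated for self-containment)
const LIFECYCLE_STATUS_COLORS: Record<string, string> = {
  init: 'grey',
  drafting: 'info',
  reviewing: 'warning',
  released: 'success',
  modifying: 'primary',
  archived: 'grey-darken-1'
}

// Lookup helper with a 'grey' fallback for unknown statuses (sketch)
const getLifecycleStatusColor = (status: string): string =>
  LIFECYCLE_STATUS_COLORS[status] ?? 'grey'
```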
## Lifecycle Alert Banner

Depending on the current lifecycle status, a contextual alert is shown below the header:

```typescript
interface LifecycleAlert {
  type: 'info' | 'warning' | 'success' | 'error'
  message: string
  action?: {
    text: string
    handler: () => void
  }
}

const lifecycleStatusAlert = computed((): LifecycleAlert | null => {
  if (!masterData.value) return null
  const status = masterData.value.lifecycle_status

  // "View workflow" action, shared by the drafting/reviewing/modifying cases
  const workflowAction = masterData.value.workflow_id ? {
    text: 'View workflow',
    handler: () => router.push(`/admin/workflows/${masterData.value?.workflow_id}`)
  } : undefined

  switch (status) {
    case 'init':
      return {
        type: 'info',
        message: 'Project created; waiting for the assigned filler to enter details'
      }
    case 'drafting':
      return {
        type: 'info',
        message: `Project details are being filled in by ${masterData.value.detail_filler_name || 'the assigned filler'}`,
        action: workflowAction
      }
    case 'reviewing':
      return {
        type: 'warning',
        message: 'Project details submitted; awaiting review',
        action: workflowAction
      }
    case 'modifying':
      return {
        type: 'info',
        message: 'The project has an active modification workflow; mainline data is unaffected',
        action: workflowAction
      }
    case 'archived':
      return {
        type: 'warning',
        message: 'Project archived; historical data only'
      }
    default:
      return null
  }
})
```

## Workflow Association and Navigation

### Workflow-to-Project Relationship

| Workflow type | Lifecycle status | Cardinality | Notes |
|:---|:---|:---|:---|
| Fill-in workflow (project_detail) | INIT → DRAFTING | 1:1 | Only one, created with the project |
| Modification workflow (project_modify) | RELEASED → MODIFYING | 1:N | Released projects may have several |

### Workflow Button Visibility

```typescript
// Whether to show the workflow button
const showWorkflowButton = computed(() => {
  if (!masterData.value?.workflow_id) return false
  const status = masterData.value.lifecycle_status
  // Shown in the drafting, reviewing, and modifying states
  return ['drafting', 'reviewing', 'modifying'].includes(status)
})

// Workflow button label
const workflowButtonText = computed(() => {
  if (!masterData.value) return 'View workflow'
  const status = masterData.value.lifecycle_status
  switch (status) {
    case 'drafting':
      return 'View fill-in workflow'
    case 'reviewing':
      return 'View review workflow'
    case 'modifying':
      return 'View modification workflow'
    default:
      return 'View workflow'
  }
})
```

### Handling Multiple Workflows

When the project is in the `MODIFYING` state, several modification workflows may exist at once:

```html
<v-menu v-if="multipleWorkflows">
  <template v-slot:activator="{ props }">
    <v-btn v-bind="props" color="info" variant="tonal" prepend-icon="mdi-sitemap">
      View workflows ({{ workflowCount }})
      <v-icon end>mdi-chevron-down</v-icon>
    </v-btn>
  </template>
  <v-list>
    <v-list-item
      v-for="wf in relatedWorkflows"
      :key="wf.workflow_id"
      @click="navigateToWorkflow(wf.workflow_id)"
    >
      <v-list-item-title>{{ wf.workflow_id }}</v-list-item-title>
      <v-list-item-subtitle>
        {{ wf.creator_name }} | {{ formatDate(wf.created_at) }}
      </v-list-item-subtitle>
    </v-list-item>
  </v-list>
</v-menu>
```

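`formatDate` above is not defined in the DDS. A sketch following the project's Asia/Shanghai timezone convention (the `zh-CN` locale and the exact field set are assumptions):

```typescript
// Sketch of formatDate: render an ISO timestamp in the Asia/Shanghai timezone.
// Locale and formatting options are assumptions, not part of the DDS.
const formatDate = (iso: string): string =>
  new Intl.DateTimeFormat('zh-CN', {
    timeZone: 'Asia/Shanghai',
    year: 'numeric', month: '2-digit', day: '2-digit',
    hour: '2-digit', minute: '2-digit', hour12: false
  }).format(new Date(iso))
```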
### Navigating Back from the Workflow Page

```typescript
// On the workflow detail page
const navigateToProject = () => {
  if (workflowDetail.value?.business_context?.project_id) {
    router.push(`/admin/projects/${workflowDetail.value.business_context.project_id}`)
  }
}
```

@@ -0,0 +1,111 @@
# Module Design Specification

> DDS-Section: Frontend DDS - 7. Module Design Specification
> DDS-Lines: L586-L693

## Basic Info Module

| Field | Type | Read-only mode | Edit mode |
|:---|:---|:---|:---|
| Project name | String | Text + copy | `v-text-field` |
| Namespace | String | Text + copy | `disabled`, not editable |
| Province | Enum | Text | Cascading selector |
| City | Enum | Text | Cascading selector (depends on province) |
| Project nature | Enum | Text | `v-select` |
| Industry group member | String | Text | `v-text-field` |
| Industry group phone | String | Text | `v-text-field` |
| Project description | String | Multi-line text | `v-textarea` |

## Deployment Business Module

| Field | Type | Read-only mode | Edit mode |
|:---|:---|:---|:---|
| Deployer name | String | Text | `v-text-field` or user search |
| Deployer phone | String | Text | `v-text-field` |
| Deployed system | Enum | Text | `v-select` |
| System version | String | Text | `v-text-field` |
| Business entry URL | String | Clickable link | `v-text-field` |
| Super admin account | String | Text + copy | `v-text-field` |
| Super admin password | Password | Masked + reveal button | `v-text-field`, password input |

## Deployment Environment Module

| Field | Type | Read-only mode | Edit mode |
|:---|:---|:---|:---|
| Network environment | Enum | Text | `v-select` |
| Main public IP | String | Text + copy | `v-text-field` with IP validation |
| Domain URL | String | Clickable link | `v-text-field` |
| SSL enabled | Boolean | Icon | `v-switch` |
| Host management method | Enum | Text | `v-select` |
| Management console URL | String | Clickable link | `v-text-field` |
| Host count | Number | Text | `v-text-field type=number` |
| Total CPU cores | Number | Stat card | `v-text-field type=number` |
| Total memory | Number | Stat card | `v-text-field type=number` |
| Total storage | Number | Stat card | `v-text-field type=number` |

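The IP validation on the main public IP field can be expressed as a Vuetify `:rules` entry. A sketch, assuming IPv4 only (the regex and the error message are illustrative):

```typescript
// Sketch of an IPv4 validation rule for v-text-field :rules.
// Returns true when valid, or an error message string otherwise.
const ipv4Rule = (v: string): true | string =>
  /^(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}$/.test(v)
    ? true
    : 'Please enter a valid IPv4 address'
```

Wired up as `<v-text-field :rules="[ipv4Rule]" />`.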
## Middleware Module

Uses a **card grid** design.

### Data Structure

```typescript
interface MiddlewareFormItem {
  middleware_type: string
  public_ip: string
  public_port: number
  internal_ip: string
  internal_port: number
  admin_user: string
  admin_password?: string
}
```

### Read-only Mode

- One card per middleware instance, in a responsive grid
- Each card shows: type icon + title + IP/port info
- Icon mapping:

```typescript
const MIDDLEWARE_ICONS: Record<string, string> = {
  'mysql': 'mdi-database',
  'redis': 'mdi-database-clock',
  'emqx': 'mdi-broadcast',
  'minio': 'mdi-bucket',
  'influxdb': 'mdi-chart-timeline-variant',
  'nacos': 'mdi-cog-outline',
  'k8s-dashboard': 'mdi-kubernetes'
}
```

### Edit Mode

- Existing cards show "Edit" and "Delete" buttons in the top-right corner
- A dashed "Add middleware" card appears at the end of the list
- Clicking add/edit opens a dialog:
  - Type selection: `v-combobox`, supporting presets plus custom values
  - Selecting a preset type auto-fills its default port

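The auto-filled default ports can come from a small lookup table. A sketch; the table below lists the commonly used defaults for the preset types and is an assumption, not part of the DDS:

```typescript
// Sketch of a default-port table for the preset middleware types.
// Values are the services' well-known defaults; confirm against the deployment.
const DEFAULT_PORTS: Record<string, number> = {
  mysql: 3306,
  redis: 6379,
  emqx: 1883,
  minio: 9000,
  influxdb: 8086,
  nacos: 8848
}

// Returns undefined for custom (non-preset) types, leaving the port field empty
const defaultPortFor = (middlewareType: string): number | undefined =>
  DEFAULT_PORTS[middlewareType]
```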
## Hosts Management Module

- Reuses the `HostsManagement.vue` component
- Shows all host information in a table
- Edit mode supports adding/removing hosts

## Authorization Module - SuperAdmin Only

| Feature | Description |
|:---|:---|
| TOTP key display | QR code + text |
| Authorization type toggle | Permanent / time-limited |
| Authorization days setting | Numeric input |
| Issue authorization | Calls the Exchange-Hub API |
| Revoke authorization | Calls the Exchange-Hub API |

## Version History Module - SuperAdmin Only

| Feature | Description |
|:---|:---|
| Version list | Timeline display |
| Version details | Click to view the full snapshot |
| Version comparison | Select two versions to diff |
| Workflow link | Click to jump to the associated workflow |

@@ -0,0 +1,102 @@
# Frontend Page Architecture

> DDS-Section: Frontend DDS - 1-2. Design Overview + Page Architecture
> DDS-Lines: L23-L127

## Background

The project detail page is the core interactive surface of the RMDC system. It must support a complex lifecycle management flow and serve the differing needs of each user role.

## Core Design Goals

| Goal | Description |
|:---|:---|
| **Mode separation** | Clearly separate read-only view mode from edit mode to prevent accidental changes |
| **Role differentiation** | SuperAdmins and normal users see different content and actions, while reusing components wherever possible |
| **Lifecycle visibility** | Clearly display the current project state (INIT/DRAFTING/REVIEWING/RELEASED/MODIFYING/ARCHIVED) |
| **Workflow association** | Support two-way navigation between the project detail and workflow detail pages, including multi-workflow scenarios |
| **Polished look** | Modern Material Design style with attention to whitespace, typography, and subtle motion |
| **High reusability** | Maximize component reuse; user side and admin side share the core form components |

## File Structure

```
frontend/src/modules/admin/
├── pages/
│   ├── admin/
│   │   └── ProjectDetail.vue          # SuperAdmin project detail
│   └── user/
│       └── UserProjectDetail.vue      # Normal-user project detail (filler's view)
├── components/
│   ├── BasicInfoForm.vue              # Basic info edit form
│   ├── BasicInfoReadonly.vue          # Basic info read-only display
│   ├── BusinessInfoReadonly.vue       # Business info read-only display
│   ├── DeploymentBusinessForm.vue     # Deployment business edit form
│   ├── DeploymentEnvironmentForm.vue  # Deployment environment edit form
│   ├── EnvironmentInfoReadonly.vue    # Environment info read-only display
│   ├── HostsInfoReadonly.vue          # Hosts read-only display
│   ├── HostsManagement.vue            # Hosts management component
│   ├── MiddlewareCardsGrid.vue        # Middleware card grid
│   ├── MiddlewareInfoReadonly.vue     # Middleware read-only display
│   ├── AuthorizationManagement.vue    # Authorization management (SuperAdmin only)
│   ├── VersionHistory.vue             # Version history (SuperAdmin only)
│   ├── SaveConfirmDialog.vue          # Save confirmation dialog
│   ├── ProjectBasicInfoCard.vue       # Project basic info card
│   ├── CopyableField.vue              # Copyable field component
│   └── index.ts                       # Unified component exports
```

## Overall Layout

The page uses a three-part layout: **fixed header + fixed tab navigation + scrollable content area**, with the lifecycle alert banner pinned above the header:

```
┌──────────────────────────────────────────────────────────────────────┐
│ [fixed] Lifecycle alert banner                                       │
├──────────────────────────────────────────────────────────────────────┤
│ [fixed] Page header                                                  │
│ ┌────────────────────────────────┬─────────────────────────────────┐ │
│ │ ← Back  Project name           │ [View workflow] [Return] [OK]   │ │
│ │ Namespace | Province City      │ [Download config] [Edit/Save]   │ │
│ │ Status chips                   │                                 │ │
│ └────────────────────────────────┴─────────────────────────────────┘ │
├──────────────────────────────────────────────────────────────────────┤
│ [fixed] Tab navigation bar                                           │
│ ┌──────────────────────────────────────────────────────────────────┐ │
│ │ Basic | Business | Env | Hosts | Middleware | Auth | History     │ │
│ └──────────────────────────────────────────────────────────────────┘ │
├──────────────────────────────────────────────────────────────────────┤
│ [scrollable] Tab content area                                        │
│ ┌──────────────────────────────────────────────────────────────────┐ │
│ │                                                                  │ │
│ │ Form or read-only content for the active tab                     │ │
│ │                                                                  │ │
│ └──────────────────────────────────────────────────────────────────┘ │
└──────────────────────────────────────────────────────────────────────┘
```

## CSS Layout

```css
.project-detail-page {
  height: 100%;
  max-height: 100%;
  overflow: hidden;
  display: flex;
  flex-direction: column;
}

.header-section {
  flex-shrink: 0;
  background: rgb(var(--v-theme-surface));
  z-index: 1;
}

.content-area {
  flex: 1 1 auto;
  overflow-y: auto;
  overflow-x: hidden;
  min-height: 0; /* Key: keeps the flex child from overflowing its parent */
  padding-bottom: 24px;
}
```

@@ -0,0 +1,183 @@
# User Side vs Admin Side Differentiation

> DDS-Section: Frontend DDS - 4. User Side vs Admin Side Differentiation
> DDS-Lines: L220-L331

## Page Comparison

| Aspect | Admin side (ProjectDetail.vue) | User side (UserProjectDetail.vue) |
|:---|:---|:---|
| **Default mode** | View mode | Determined by workflow status |
| **View scope** | All modules | Only ACL-granted modules |
| **Authorization tab** | ✅ Visible | ❌ Hidden |
| **Version history tab** | ✅ Visible | ❌ Hidden |
| **Hosts tab** | ✅ Visible | ❌ Hidden |
| **Basic info** | Editable | Read-only (filled in by admins) |
| **Edit flow** | Direct save ("god mode") | Draft → submit for review (workflow) |
| **Approval actions** | ✅ Approve/Return buttons | ❌ None |
| **Save button** | "Save changes" | "Save draft" |
| **Submit button** | None | "Submit for review" |

## Tab Navigation Differences

### Admin-side tabs

```html
<v-tabs v-model="activeTab">
  <v-tab value="basic">Basic Info</v-tab>
  <v-tab value="business">Deployment Business</v-tab>
  <v-tab value="environment">Deployment Environment</v-tab>
  <v-tab value="hosts">Hosts</v-tab>
  <v-tab value="middlewares">Middleware</v-tab>
  <v-tab value="authorization" v-if="isSuperAdmin">Authorization</v-tab>
  <v-tab value="version-history" v-if="isSuperAdmin">Version History</v-tab>
</v-tabs>
```

### User-side tabs

```html
<v-tabs v-model="activeTab">
  <v-tab value="basic">Basic Info</v-tab>
  <v-tab value="business">Deployment Business</v-tab>
  <v-tab value="environment">Deployment Environment</v-tab>
  <v-tab value="middlewares">Middleware</v-tab>
</v-tabs>
```

## Component Reuse Strategy

```mermaid
graph TB
    subgraph Shared["Shared components"]
        A[BasicInfoForm]
        B[DeploymentBusinessForm]
        C[DeploymentEnvironmentForm]
        D[MiddlewareCardsGrid]
        E[BasicInfoReadonly]
        F[BusinessInfoReadonly]
        G[EnvironmentInfoReadonly]
        H[MiddlewareInfoReadonly]
    end

    subgraph AdminOnly["Admin-side only"]
        I[AuthorizationManagement]
        J[VersionHistory]
        K[HostsManagement]
        L[SaveConfirmDialog]
    end

    subgraph UserOnly["User-side only"]
        M[ProjectBasicInfoCard]
    end

    Admin[ProjectDetail.vue] --> A
    Admin --> B
    Admin --> C
    Admin --> D
    Admin --> E
    Admin --> F
    Admin --> G
    Admin --> H
    Admin --> I
    Admin --> J
    Admin --> K
    Admin --> L

    User[UserProjectDetail.vue] --> B
    User --> C
    User --> D
    User --> M
```

## Permission Logic

### Admin side - edit permission

```typescript
const canEdit = computed(() => {
  if (!masterData.value) return false
  const status = masterData.value.lifecycle_status

  // SuperAdmins may edit in the released and modifying states
  if (isSuperAdmin.value) {
    return status === 'released' || status === 'modifying'
  }
  return false
})
```

### User side - edit permission

```typescript
const canEdit = computed(() => {
  if (!workflowInfo.value) return true // Editable by default when no workflow info is present
  const status = workflowInfo.value.status

  // Editable in the assigned, in-progress, returned, and draft-saved states
  return ['assigned', 'in_progress', 'returned', 'draft_saved'].includes(status)
})
```

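The user-side predicate can be factored into a pure function for unit testing; a sketch, with the status names taken from the list above:

```typescript
// Sketch: user-side edit-permission check as a pure function.
// Status strings mirror the computed property above; the null case stands in
// for "no workflow info loaded yet".
const EDITABLE_WORKFLOW_STATUSES = ['assigned', 'in_progress', 'returned', 'draft_saved']

function userCanEdit(workflowStatus: string | null): boolean {
  if (workflowStatus === null) return true // no workflow info: editable by default
  return EDITABLE_WORKFLOW_STATUSES.includes(workflowStatus)
}
```

The component's `canEdit` computed can then delegate to `userCanEdit(workflowInfo.value?.status ?? null)`.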
## Action Button Differences

### Admin-side header buttons

```html
<template v-if="!isEditMode">
  <v-btn
    v-if="showWorkflowButton"
    color="info"
    variant="tonal"
    @click="navigateToWorkflow"
  >
    {{ workflowButtonText }}
  </v-btn>

  <v-btn
    v-if="canApprove"
    color="success"
    @click="handleApprove"
  >
    Approve
  </v-btn>

  <v-btn
    v-if="canApprove"
    color="error"
    variant="outlined"
    @click="handleReject"
  >
    Return
  </v-btn>

  <v-btn
    v-if="canEdit"
    color="primary"
    @click="enterEditMode"
  >
    Edit
  </v-btn>
</template>

<template v-else>
  <v-btn variant="outlined" @click="handleCancel">Cancel</v-btn>
  <v-btn color="primary" @click="handleSave">Save changes</v-btn>
</template>
```

### User-side header buttons

```html
<template v-if="canEdit">
  <v-btn variant="outlined" @click="handleSaveDraft">
    Save draft
  </v-btn>
  <v-btn color="primary" @click="handleSubmit">
    Submit for review
  </v-btn>
</template>
<template v-else>
  <v-chip color="warning">
    Workflow status: {{ workflowStatusText }}
  </v-chip>
</template>
```

@@ -0,0 +1,128 @@
# View/Edit Mode Separation

> DDS-Section: Frontend DDS - 3. View Mode vs Edit Mode
> DDS-Lines: L130-L217

## Mode Definitions

| Mode | Variable | Description |
|:---|:---|:---|
| **View mode** | `isEditMode = false` | Default; renders the read-only components |
| **Edit mode** | `isEditMode = true` | Renders the form components |

## View Mode (Default)

### Presentation

- Data is shown via the `*Readonly.vue` component family
- `v-row`/`v-col` layout with key-value pairs
- Key fields (IP, URL, password) support one-click copy
- Sensitive fields (passwords) are masked as `******` by default

### Interactions

| Interaction | Description |
|:---|:---|
| **One-click copy** | `CopyableField` component; clicking the copy icon copies to the clipboard |
| **Password reveal** | Clicking the eye icon toggles plaintext/masked |
| **Link navigation** | URL fields open in a new window on click |

### Component List

| Component | Module | Notes |
|:---|:---|:---|
| `BasicInfoReadonly.vue` | Basic info | Project name, namespace, province/city, nature |
| `BusinessInfoReadonly.vue` | Deployment business | Deployer, system version, entry URL |
| `EnvironmentInfoReadonly.vue` | Deployment environment | Network, IPs, domain, host stats |
| `HostsInfoReadonly.vue` | Hosts | Read-only host table |
| `MiddlewareInfoReadonly.vue` | Middleware | Read-only middleware cards |

## Edit Mode

### Entry Conditions

- Click the "Edit" button in the header
- The user must have edit permission (based on role and lifecycle status)

### Data Flow

```mermaid
sequenceDiagram
    participant User as User
    participant View as View mode
    participant Edit as Edit mode
    participant API as Backend API

    User->>View: Click [Edit]
    View->>Edit: isEditMode = true
    Edit->>Edit: Deep-copy masterData → editForm
    Edit-->>User: Render form components

    User->>Edit: Modify fields
    Edit->>Edit: computed hasChanges = true

    User->>Edit: Click [Save]
    Edit->>Edit: Open SaveConfirmDialog
    User->>Edit: Confirm save
    Edit->>API: Call update endpoint
    API-->>Edit: Success
    Edit->>View: isEditMode = false, reload data
```

### Exit Protection

```typescript
// Dirty-state detection
const hasChanges = computed(() => {
  return JSON.stringify(editForm.value) !== JSON.stringify(masterData.value)
})

// Exit confirmation
const exitEditMode = () => {
  if (hasChanges.value) {
    exitConfirmDialog.value = true // open the confirmation dialog
  } else {
    isEditMode.value = false
  }
}
```

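A standalone sketch of this `JSON.stringify` dirty check. It works here because `editForm` is deep-copied from `masterData`, so key order matches and equal serializations mean no changes; comparing objects built from different sources could misreport solely due to key order:

```typescript
// Sketch: JSON.stringify dirty check on a deep-copied form.
// masterData/editForm stand in for the refs used in the component.
const masterData = { project_name: 'p1', city: 'shanghai' }
const editForm = JSON.parse(JSON.stringify(masterData)) // deep copy, same key order

const hasChanges = (): boolean =>
  JSON.stringify(editForm) !== JSON.stringify(masterData)
```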
### Edit-Mode Indicator

In edit mode the header shows a prominent "Edit Mode" chip:

```html
<v-chip v-if="isEditMode" color="info" variant="tonal" class="ml-2">
  <v-icon start size="small">mdi-pencil</v-icon>
  Edit Mode
</v-chip>
```

## Edit-Mode Data Flow Example

```typescript
// Enter edit mode
const enterEditMode = () => {
  editForm.value = JSON.parse(JSON.stringify(masterData.value)) // deep copy
  isEditMode.value = true
}

// Save changes
const handleSave = async () => {
  try {
    await updateProject(projectId.value, editForm.value)
    await loadProject() // reload data
    isEditMode.value = false
    snackbar.success('Saved successfully')
  } catch (error) {
    snackbar.error('Save failed')
  }
}

// Cancel editing
const handleCancel = () => {
  if (hasChanges.value) {
    confirmDialog.value = true
  } else {
    isEditMode.value = false
  }
}
```

@@ -0,0 +1,183 @@
# Visual Design and Responsive Specification

> DDS-Section: Frontend DDS - 10-11. Visual Design + Responsive Design
> DDS-Lines: L882-L980

## Color System

| Purpose | Color | Vuetify class |
|:---|:---|:---|
| Primary | Deep Purple | `color="primary"` |
| Success | Green | `color="success"` |
| Warning | Orange | `color="warning"` |
| Error | Red | `color="error"` |
| Info | Blue | `color="info"` |
| Page background | Light Grey | `bg-grey-lighten-4` |
| Card background | White | `bg-white` |

## Card Design

```html
<v-card elevation="2" rounded="lg" class="pa-4">
  <!-- card content -->
</v-card>
```

Guidelines:

- Corner radius: `rounded-lg` (8px)
- Shadow: `elevation-2`
- Padding: `pa-4` (16px)
- Hover effect: slight lift plus a deeper shadow

## Typography

| Element | Style |
|:---|:---|
| Page title | `text-h4 font-weight-bold` |
| Card title | `text-h6` |
| Field label | `text-medium-emphasis text-body-2` |
| Field value | `text-high-emphasis` |
| Helper text | `text-caption text-grey` |

## Spacing

Follows an 8px grid system:

| Spacing | Vuetify class | Value |
|:---|:---|:---|
| Compact | `pa-2` / `ma-2` | 8px |
| Standard | `pa-4` / `ma-4` | 16px |
| Relaxed | `pa-6` / `ma-6` | 24px |
| Between components | `gap-2` | 8px |
| Between cards | `gap-4` | 16px |

## Motion and Transitions

| Scenario | Effect |
|:---|:---|
| Tab switch | `v-window` default slide transition |
| Mode switch | `v-expand-transition` |
| Button hover | 0.2s easing |
| Card hover | `transform: translateY(-2px)` |
| Loading | `v-skeleton-loader` skeleton |

## Breakpoints

Follows Vuetify's default breakpoints:

| Breakpoint | Width range |
|:---|:---|
| xs | < 600px |
| sm | 600px - 960px |
| md | 960px - 1280px |
| lg | 1280px - 1920px |
| xl | > 1920px |

## Responsive Layout

### Middleware Card Grid

```html
<v-row>
  <v-col
    v-for="mw in middlewares"
    :key="mw.type"
    cols="12"
    sm="6"
    md="4"
    lg="3"
  >
    <MiddlewareCard :data="mw" />
  </v-col>
</v-row>
```

### Mobile Adaptation Notes

1. **Tab navigation**: use `show-arrows` to allow horizontal scrolling
2. **Action buttons**: use `v-bottom-sheet` or collapse them into a menu
3. **Form layout**: single-column stacking
4. **Tables**: use `mobile-breakpoint` to switch to a card view

## TypeScript Type Definitions

```typescript
// Project detail
interface ProjectDetail {
  id: number
  project_id: string
  project_name: string
  namespace: string
  province: string
  city: string
  project_nature: string
  industry_group_member: string
  industry_group_phone: string
  description: string
  status: string
  lifecycle_status: string
  project_certification: string
  workflow_id: string
  detail_filler_id: number
  detail_filler_name: string
  deployment_business: DeploymentBusiness | null
  deployment_environment: DeploymentEnvironment | null
  middlewares: Middleware[]
  hosts: Host[]
  draft_data: Record<string, unknown> | null
  created_at: string
  updated_at: string
}

// Form data types
interface BasicFormData {
  project_name: string
  province: string
  city: string
  industry_group_member: string
  industry_group_phone: string
  project_nature: string
  description: string
}

interface BusinessFormData {
  deployer_name: string
  deployer_phone: string
  deploy_system: string
  system_version: string
  business_entry_url: string
  super_admin_user: string
  super_admin_password: string
}

interface EnvironmentFormData {
  network_environment: string
  main_public_ip: string
  domain_url: string
  enable_ssl: boolean
  host_management_method: string
  management_console_url: string
  host_count: number
  total_cpu: number
  total_memory_gb: number
  total_storage_gb: number
}

interface MiddlewareFormItem {
  middleware_type: string
  public_ip: string
  public_port: number
  internal_ip: string
  internal_port: number
  admin_user: string
  admin_password?: string
}

// Diff item
interface DiffItem {
  field: string
  label: string
  oldValue: string | number | boolean
  newValue: string | number | boolean
}
```

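As one way the `DiffItem` type can be used, here is a sketch of a field-diff helper for the version-comparison feature. The `labels` map and the helper name are illustrative, not part of the DDS:

```typescript
// Sketch: build DiffItem entries by comparing the scalar fields listed in `labels`.
// The DiffItem shape matches the type definitions above; diffFields itself is hypothetical.
interface DiffItem {
  field: string
  label: string
  oldValue: string | number | boolean
  newValue: string | number | boolean
}

type Scalar = string | number | boolean

function diffFields(
  oldObj: Record<string, Scalar>,
  newObj: Record<string, Scalar>,
  labels: Record<string, string>
): DiffItem[] {
  const items: DiffItem[] = []
  for (const [field, label] of Object.entries(labels)) {
    if (oldObj[field] !== newObj[field]) {
      items.push({ field, label, oldValue: oldObj[field], newValue: newObj[field] })
    }
  }
  return items
}
```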
@@ -1,158 +0,0 @@
# ACL Permission Model

## Functional Permissions (RBAC)

| Permission code | Description | Role |
|:---|:---|:---|
| `project:create` | Create projects | SuperAdmin |
| `project:delete` | Delete/archive projects | SuperAdmin |
| `project:edit` | Edit projects directly | SuperAdmin |
| `project:edit_workflow` | Edit projects via workflow | User (with ACL grant) |
| `project:auth_manage` | Primary/secondary authorization management | SuperAdmin |
| `project:permission_manage` | Project permission assignment | SuperAdmin |

## Data Permissions (ACL) - Module Level

### Module Definitions

| Module code | Module | Description |
|:---|:---|:---|
| `basic_info` | Basic info | Project name, namespace, province/city, etc. |
| `business_info` | Deployment business | Deployer, deployment time, system version, etc. |
| `environment_info` | Deployment environment | Hosts, network environment, domain, etc. |
| `middleware_info` | Middleware | MySQL, Redis, EMQX, etc. configuration |
| `authorization_info` | Project authorization | TOTP authorization info (SuperAdmin only) |

### Permission Types

| Type | Description |
|:---|:---|
| `view` | View permission (may view project info and initiate modification workflows) |
| `export` | Export permission (may export project info) |

> **Note**: Editing is handled through the workflow system. Users with `view` permission may initiate a modification workflow, which takes effect after SuperAdmin approval.

## Permission Rules

1. **SuperAdmin**: full permissions on every module of every project; may modify directly
2. **Admin**: may access the project modules granted to them and may delegate permissions to normal users
3. **Normal User**: may access only granted project modules; modifications go through workflows
4. **Project filler**: automatically granted view permission on the project
5. **Authorization module**: visible to SuperAdmin only

## ACL Table Schema (in rmdc-user-auth)

```go
// ProjectACL: project permission table (module level)
type ProjectACL struct {
	ID        int64  `gorm:"primaryKey;autoIncrement" json:"id"`
	ProjectID string `gorm:"type:varchar(64);index;not null" json:"project_id"`
	UserID    int64  `gorm:"index;not null" json:"user_id"`

	// Module code: basic_info/business_info/environment_info/middleware_info/authorization_info
	ModuleCode string `gorm:"type:varchar(32);not null" json:"module_code"`

	// Permission flags
	CanView   bool `gorm:"default:false" json:"can_view"`
	CanExport bool `gorm:"default:false" json:"can_export"`

	// Grant info
	GrantedBy int64     `json:"granted_by"`
	GrantedAt time.Time `json:"granted_at"`

	UpdatedAt time.Time `json:"updated_at"`
}
```

## Permission Check Flow

```go
// CheckProjectModulePermission checks a user's permission on a project module.
func (s *ACLService) CheckProjectModulePermission(
	ctx context.Context,
	userID int64,
	projectID string,
	moduleCode string,
	permType string, // "view" or "export"
) (bool, error) {
	// 1. SuperAdmin bypasses all checks
	if s.IsSuperAdmin(ctx, userID) {
		return true, nil
	}

	// 2. The authorization module is SuperAdmin-only
	if moduleCode == "authorization_info" {
		return false, nil
	}

	// 3. Look up the ACL record; a missing record means no permission
	// (other DB errors are also treated as "no permission" here)
	var acl ProjectACL
	err := s.db.Where("project_id = ? AND user_id = ? AND module_code = ?",
		projectID, userID, moduleCode).First(&acl).Error
	if err != nil {
		return false, nil
	}

	// 4. Check the requested permission type
	switch permType {
	case "view":
		return acl.CanView, nil
	case "export":
		return acl.CanExport, nil
	default:
		return false, nil
	}
}
```

## Permission Grant APIs

### Grant Permissions

```json
// POST /api/project/permission/grant
{
  "project_id": "proj_001",
  "user_id": 123,
  "modules": [
    {
      "module_code": "basic_info",
      "can_view": true,
      "can_export": false
    },
    {
      "module_code": "business_info",
      "can_view": true,
      "can_export": true
    }
  ]
}
```

### Batch Set Permissions

```json
// POST /api/project/permission/batch
{
  "project_id": "proj_001",
  "permissions": [
    {
      "user_id": 123,
      "modules": ["basic_info", "business_info"]
    },
    {
      "user_id": 456,
      "modules": ["basic_info"]
    }
  ]
}
```

## Permission Inheritance Rules

| Scenario | Rule |
|:---|:---|
| Project creation | The filler is automatically granted view on all modules |
| Delegation | Admins may delegate only permissions they themselves hold |
| Revocation | Does not affect existing drafts/workflows |
| Project archive | Permission records are kept, but access is blocked |
