Compare commits

...

2 Commits

Author SHA1 Message Date
zeaslity
a1f208891d Implement the frontend development SKILL 2026-07-01 16:31:30 +08:00
zeaslity
9cd57b92b8 Update the backend development skill 2026-07-01 13:45:30 +08:00
315 changed files with 50006 additions and 1702 deletions


@@ -1,438 +0,0 @@
---
name: backend-go-gin-gorm
description: >
  Generate, write, modify, and review production-ready Go backend code with Gin + GORM.
  Enforces a layered architecture, handler → service → dao/repository (no business logic piled up in handlers; the DAO/repo layer only does data access and query assembly), and a consistent API response envelope
  (code/message/data plus observability fields such as request_id/trace_id). The default API style is POST + JSON RequestBody
  (following REST semantics and idempotency conventions where necessary). Standardizes DTO/VO/DO naming and field-mapping conventions (DTO for input, VO for output, DO/Model for persistence).
  Code comments are written in Chinese for maintainability; time handling defaults to Asia/Shanghai and is time-zone aware.
  Uses structured logging (carrying context such as request_id/trace_id/user_id/path/latency) and follows Gin/GORM engineering best practices
  (transactions, context propagation, error wrapping, pagination, soft delete, optimistic locking when needed).
  Trigger scenarios: Go backend development / creating Gin handlers / implementing GORM DAOs or repositories / code walkthroughs and reviews (refactor suggestions, bug fixes, performance tips).
argument-hint: '"<action> <target>", e.g. "create user-handler", "review service/order.go", "scaffold api/v1/product", "add repo for table/users", "optimize gorm query"'
allowed-tools:
- Read
- Write
- Edit
- Glob
- Grep
- Bash
---
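# Go GIN/GORM Development Standards Skill
## Trigger conditions
- The user asks to create or modify Go backend code
- The user asks for a code review
- The user mentions API development, database operations, unified responses, logging, or time handling
- The user asks to design API endpoints or DTO structures
## Context collection
Collect project information before executing: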
!`ls -la go.mod go.sum 2>/dev/null || echo "No go.mod found"`
!`head -20 go.mod 2>/dev/null || echo ""`
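## Parsing $ARGUMENTS
Expected format: `<action> <target>`
| action | Description |
|--------|------|
| `create` | Create a new file (handler/service/dao/dto) |
| `review` | Review existing code |
| `scaffold` | Generate a complete module skeleton |
| `fix` | Fix code that violates the standards |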
---
## Plan phase
### Deliverables (determined by action)
| action | Deliverable |
|--------|------|
| `create handler` | `/api/xxx_handler.go` or `/internal/handler/xxx.go` |
| `create service` | `/internal/service/xxx_service.go` |
| `create dao` | `/internal/dao/xxx_dao.go` |
| `create dto` | `/internal/model/dto/xxx_dto.go` |
| `scaffold` | All of the above + entity |
### Decision points
1. **Directory style**: check whether the project uses `/api` or `/internal/handler`
2. **Module naming**: extract the resource name from $ARGUMENTS (e.g. `user`, `order`)
3. **Already exists?**: Glob for the target file first
---
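## Execute phase
### Handler layer rules
```
1. Only: parse parameters → call the service → return the response
2. Forbidden: business logic, direct database access
3. Required: use common.ResponseSuccess / common.ResponseError
4. Error handling: gorm.ErrRecordNotFound → CodeNotFound
```
### Service layer rules
```
1. Orchestrate the dao layer to carry out business operations
2. Log key business events (Info level)
3. Wrap errors: fmt.Errorf("xxx: %w", err)
4. Log business exceptions at Warning level
```
### DAO layer rules
```
1. Encapsulate all GORM operations
2. No SQL in the service layer
3. Use Raw/Exec for complex queries
4. Use method chaining freely, but prefer native SQL for complex cases
```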
### Unified response format (mandatory)
```go
// 成功
common.ResponseSuccess(c, data)
common.ResponseSuccessWithMessage(c, data, "创建成功")
// 失败
common.ResponseError(c, common.CodeParamError, "参数错误")
common.ResponseErrorWithDetail(c, common.CodeServerError, "系统错误", err)
```
Error code definitions → read `reference/error-codes.go`
### Comment conventions (Chinese required)
```go
// GetUserByID 根据用户ID获取用户信息
// @param ctx context.Context - 请求上下文
// @param userID int64 - 用户唯一ID
// @return *model.User - 用户信息,未找到返回nil
// @return error - 查询错误
func (s *UserService) GetUserByID(ctx context.Context, userID int64) (*model.User, error)
```
---
## API design standards (mandatory)
### Core principle: POST + RequestBody
```
All APIs should prefer the POST method, with parameters passed in the RequestBody
Avoid PathVariables and RequestParams
```
### Forbidden vs. recommended
| Forbidden | Recommended |
|------|------|
| `GET /api/projects/{project_id}` | `POST /api/projects/detail` + RequestBody |
| `GET /api/users?role=admin&page=1` | `POST /api/users/list` + RequestBody |
| Passing sensitive information in the URL | Passing all parameters in the RequestBody |
### API path naming
| Operation | Suffix | Example |
|------|------|------|
| List query | `/list` | `POST /api/projects/list` |
| Detail query | `/detail` | `POST /api/projects/detail` |
| Create | `/create` | `POST /api/projects/create` |
| Update | `/update` | `POST /api/projects/update` |
| Delete | `/delete` | `POST /api/projects/delete` |
| Sync | `/sync` | `POST /api/jenkins/organizations/sync` |
| Trigger | `/trigger` | `POST /api/builds/trigger` |
### DTO naming
| Type | Naming format | Example |
|------|----------|------|
| List request | `List{Resource}Request` | `ListBuildsRequest` |
| Detail request | `Get{Resource}Request` | `GetBuildRequest` |
| Create request | `Create{Resource}Request` | `CreateProjectRequest` |
| Update request | `Update{Resource}Request` | `UpdateProjectRequest` |
| Delete request | `Delete{Resource}Request` | `DeleteProjectRequest` |
| List response | `List{Resource}Response` | `ListBuildsResponse` |
| Detail response | `{Resource}DetailResponse` | `BuildDetailResponse` |
### Common pagination structures
```go
// 请求
type PageRequest struct {
Page int `json:"page" binding:"required,min=1"`
PageSize int `json:"page_size" binding:"required,min=1,max=100"`
}
// 响应
type ListResponse struct {
List []interface{} `json:"list"`
Total int64 `json:"total"`
Page int `json:"page"`
PageSize int `json:"page_size"`
}
```
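The pagination request above could be embedded in a concrete list DTO and converted into a row offset roughly as follows. This is a minimal sketch: `ListUsersRequest` and the `Offset` helper are hypothetical names introduced for illustration and are not part of the skill.

```go
package main

// PageRequest mirrors the pagination request defined above.
type PageRequest struct {
	Page     int `json:"page" binding:"required,min=1"`
	PageSize int `json:"page_size" binding:"required,min=1,max=100"`
}

// Offset converts the 1-based page number into a row offset (hypothetical helper).
// Pages below 1 are treated as the first page.
func (p PageRequest) Offset() int {
	if p.Page < 1 {
		return 0
	}
	return (p.Page - 1) * p.PageSize
}

// ListUsersRequest is a hypothetical list DTO. Embedding PageRequest keeps
// "page" and "page_size" at the top level of the JSON request body.
type ListUsersRequest struct {
	PageRequest
	Role string `json:"role,omitempty"`
}
```

In a DAO, `req.Offset()` and `req.PageSize` would typically feed GORM's `Offset` and `Limit` calls.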
### Error code ranges by module
| Range | Module |
|------|------|
| 0 | Success |
| 1000-1999 | General errors |
| 2000-2999 | Users/permissions |
| 3000-3999 | Jenkins |
| 4000-4999 | Project management |
| 5000-5999 | Exchange-Hub |
Detailed specification → read `reference/api-design-spec.md`
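The range table can be expressed as a small lookup, which may be handy when routing alerts or grouping metrics by module. The function name `ModuleOf` and the returned labels are assumptions made for this sketch.

```go
package main

// ModuleOf returns the module that owns a business code, following the
// range table above. The labels are illustrative, not defined by the skill.
func ModuleOf(code int) string {
	switch {
	case code == 0:
		return "success"
	case code >= 1000 && code <= 1999:
		return "common"
	case code >= 2000 && code <= 2999:
		return "user/permission"
	case code >= 3000 && code <= 3999:
		return "jenkins"
	case code >= 4000 && code <= 4999:
		return "project"
	case code >= 5000 && code <= 5999:
		return "exchange-hub"
	default:
		return "unknown"
	}
}
```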
---
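## Logging standards (mandatory)
### Designated framework
The project uses `rmdc-common/wdd_log/log_utils.go` throughout
### When to use each log level
| Level | When to use | Example |
|------|----------|------|
| `Debug` | Development debugging: detailed flow, variable values | `log.Debug(ctx, "查询参数", map[string]interface{}{"userID": id})` |
| `Info` | Key business milestones | `log.Info(ctx, "用户登录成功", ...)` / `log.Info(ctx, "订单创建成功", ...)` |
| `Warning` | Expected, non-fatal exceptions; the program can continue | `log.Warning(ctx, "外部API超时,启用备用方案", ...)` |
| `Error` | Severe errors that interrupt the business flow | `log.Error(ctx, "数据库连接失败", ...)` (must record the stack trace) |
### Log content requirements
```
1. Concise and to the point
2. Must include tracing information such as TraceID and UserID
3. Error-level entries must record the full error stack
```
### Where to log
| Layer | What to log |
|------|----------|
| Handler | `ResponseErrorWithDetail` records the Error log automatically |
| Service | Info for key business operations, Warning for business exceptions |
| DAO | Generally no logging; errors are returned upward |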
---
## Time handling (UTC+8 mandatory)
### Core rules
```
Time zone: Asia/Shanghai (UTC+8)
Format: RFC3339
```
### Forbidden vs. required
| Forbidden | Required |
|------|----------|
| `time.Now()` | `TimeUtils.Now()` |
| `time.Parse()` | `TimeUtils.Parse()` |
| Direct formatting | `TimeUtils.Format()` |
### Utility library locations
- Backend: `rmdc-common/utils/TimeUtils.go`
- Frontend: `TonyMask/src/utils/timeUtils.ts`
### Usage example
```go
// ✅ 正确
now := TimeUtils.Now()
timestamp := TimeUtils.Format(TimeUtils.Now(), time.RFC3339)
// ❌ 错误
now := time.Now() // 禁止直接使用
```
---
## Framework usage standards
### GIN framework
#### Route organization (grouping mandatory)
```go
// ✅ 正确:使用路由分组
v1 := r.Group("/api/v1")
{
users := v1.Group("/users")
{
users.GET("/:id", userHandler.GetByID)
users.POST("/", userHandler.Create)
}
}
// ❌ 错误:扁平路由
r.GET("/api/v1/users/:id", ...)
```
#### Middleware usage
```go
// 全局中间件
r.Use(middleware.Recovery()) // 恢复
r.Use(middleware.Logger()) // 日志
r.Use(middleware.CORS()) // 跨域
// 路由组中间件
authGroup := r.Group("/admin")
authGroup.Use(middleware.Auth())
```
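#### Response rules
```
All API responses must go through the unified response functions in pkg/common
Do not call c.JSON(), c.String(), etc. directly
```
### GORM framework
#### Where operations live
```
All GORM operations must be in the dao layer
Never assemble queries in the service layer
```
#### Method chaining vs. native SQL
| Scenario | Recommended approach |
|------|----------|
| Simple CRUD | Method chaining, e.g. `db.Where().First()` |
| Complex queries (multi-table JOINs, subqueries) | Native SQL via `Raw()` / `Exec()` |
| Batch operations | `Raw()` / `Exec()` for performance |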
```go
// 简单查询 - 链式调用
db.Where("status = ?", 1).Find(&users)
// 复杂查询 - 原生 SQL
db.Raw(`
SELECT u.*, COUNT(o.id) as order_count
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
WHERE u.status = ?
GROUP BY u.id
`, 1).Scan(&results)
```
#### Error handling
```go
// 必须处理 ErrRecordNotFound
if errors.Is(err, gorm.ErrRecordNotFound) {
common.ResponseError(c, common.CodeNotFound, "资源不存在")
return
}
```
---
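## Verify phase checklist
### Structure
- [ ] Dependencies point the right way: handler → service → dao, with no reverse references
- [ ] No business logic in the handler layer
- [ ] No service references in the dao layer
- [ ] Private code is protected with the internal package
### Responses
- [ ] All APIs use `common.ResponseSuccess/Error`
- [ ] Error codes come from the `common.Code*` constants
- [ ] Timestamps are in RFC3339 format
- [ ] No direct `c.JSON()` calls
### Code
- [ ] Exported functions/structs have Chinese comments
- [ ] Comment format: `// FunctionName description`
- [ ] No direct `time.Now()` calls
- [ ] No discarded errors (`_ = err` is forbidden)
- [ ] Package names are lowercase with no underscores
### Logging
- [ ] Uses the project's unified logging library
- [ ] Error logs include the full stack
- [ ] Key business operations have Info logs
- [ ] Logs include tracing information such as TraceID
### GORM
- [ ] `gorm.ErrRecordNotFound` is handled
- [ ] Complex queries use Raw/Exec in the dao layer
- [ ] No direct DB operations in the service layer
### GIN
- [ ] APIs are organized with route groups
- [ ] Cross-cutting logic is handled in middleware
- [ ] Responses are returned through the unified functions
### API design
- [ ] Uses POST + RequestBody, not GET + PathVariables
- [ ] API paths use the correct suffix (/list, /detail, /create, etc.)
- [ ] DTO names follow the convention: List/Get/Create/Update/Delete + resource + Request/Response
- [ ] Paginated requests embed PageRequest
- [ ] Paginated responses include list/total/page/page_size
- [ ] No sensitive information in URLs
- [ ] Request bodies are validated (ShouldBindJSON)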
---
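## Common pitfalls
| Pitfall | Correct approach |
|------|----------|
| Business logic in the handler | Move it to the service layer |
| Calling `c.JSON()` directly | Use `common.ResponseSuccess()` |
| Ignoring `ErrRecordNotFound` | Convert it to `CodeNotFound` and return |
| `time.Now()` | `TimeUtils.Now()` |
| English comments | Change to Chinese |
| dao references service | Violates the dependency rule; refactor |
| SQL in the service | Move it to the dao layer |
| Flat routes | Use a Router Group |
| Logs lack context | Add TraceID and UserID |
| Error logs without a stack | Record the full error information |
| `GET /api/users/{id}` | `POST /api/users/detail` + RequestBody |
| Passing parameters in the URL (`?page=1`) | Pass them in the RequestBody |
| Non-standard DTO names | Use `List/Get/Create/Update/Delete` + resource name |
| Sensitive information in the URL | Move it to the RequestBody |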
---
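## Reference file index
| Scenario | File to read |
|------|----------|
| Need the full directory structure | `reference/project-structure.md` |
| Need the response struct definitions | `reference/api-response-spec.md` |
| Need the full list of error codes | `reference/error-codes.go` |
| Need coding standard details | `reference/coding-standards.md` |
| Need detailed logging guidance | `reference/logging-standards.md` |
| Need detailed time handling guidance | `reference/time-handling.md` |
| Need detailed framework usage guidance | `reference/framework-usage.md` |
| Need detailed API design guidance | `reference/api-design-spec.md` |
| Need code examples | `examples/*.go` |
---
## Quick commands
Validate the project structure: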
```bash
./scripts/validate-structure.sh
```


@@ -1,55 +0,0 @@
package dao
import (
"context"
"gorm.io/gorm"
"my-project/internal/model/entity"
)
// UserDAO 用户数据访问对象
type UserDAO struct {
db *gorm.DB
}
// NewUserDAO 创建用户DAO实例
// @param db *gorm.DB - 数据库连接
// @return *UserDAO - DAO实例
func NewUserDAO(db *gorm.DB) *UserDAO {
return &UserDAO{db: db}
}
// FindByID 根据ID查询用户
// @param ctx context.Context - 请求上下文
// @param id int64 - 用户ID
// @return *entity.User - 用户实体
// @return error - 查询错误,未找到返回gorm.ErrRecordNotFound
func (d *UserDAO) FindByID(ctx context.Context, id int64) (*entity.User, error) {
var user entity.User
if err := d.db.WithContext(ctx).First(&user, id).Error; err != nil {
return nil, err
}
return &user, nil
}
// Create 创建用户
// @param ctx context.Context - 请求上下文
// @param user *entity.User - 用户实体
// @return error - 创建错误
func (d *UserDAO) Create(ctx context.Context, user *entity.User) error {
return d.db.WithContext(ctx).Create(user).Error
}
// FindByEmail 根据邮箱查询用户
// @param ctx context.Context - 请求上下文
// @param email string - 用户邮箱
// @return *entity.User - 用户实体
// @return error - 查询错误
func (d *UserDAO) FindByEmail(ctx context.Context, email string) (*entity.User, error) {
var user entity.User
if err := d.db.WithContext(ctx).Where("email = ?", email).First(&user).Error; err != nil {
return nil, err
}
return &user, nil
}


@@ -1,70 +0,0 @@
package handler
import (
"errors"
"strconv"
"github.com/gin-gonic/gin"
"gorm.io/gorm"
"my-project/internal/model/dto"
"my-project/internal/service"
"my-project/pkg/common"
)
// UserHandler 用户相关API处理器
type UserHandler struct {
userService *service.UserService
}
// NewUserHandler 创建用户Handler实例
// @param userService *service.UserService - 用户服务
// @return *UserHandler - Handler实例
func NewUserHandler(userService *service.UserService) *UserHandler {
return &UserHandler{userService: userService}
}
// GetUserByID 根据ID获取用户信息
// @param c *gin.Context - GIN上下文
func (h *UserHandler) GetUserByID(c *gin.Context) {
// 1. 参数解析
idStr := c.Param("id")
userID, err := strconv.ParseInt(idStr, 10, 64)
if err != nil {
common.ResponseError(c, common.CodeParamError, "用户ID格式错误")
return
}
// 2. 调用Service
user, err := h.userService.GetUserByID(c.Request.Context(), userID)
if err != nil {
if errors.Is(err, gorm.ErrRecordNotFound) {
common.ResponseError(c, common.CodeNotFound, "用户不存在")
return
}
common.ResponseErrorWithDetail(c, common.CodeServerError, "获取用户失败", err)
return
}
// 3. 成功响应
common.ResponseSuccess(c, user)
}
// CreateUser 创建用户
// @param c *gin.Context - GIN上下文
func (h *UserHandler) CreateUser(c *gin.Context) {
var req dto.CreateUserRequest
if err := c.ShouldBindJSON(&req); err != nil {
common.ResponseErrorWithDetail(c, common.CodeValidationFail, "参数验证失败", err)
return
}
user, err := h.userService.CreateUser(c.Request.Context(), &req)
if err != nil {
common.ResponseErrorWithDetail(c, common.CodeBusiness, "创建用户失败", err)
return
}
common.ResponseSuccessWithMessage(c, user, "用户创建成功")
}


@@ -1,60 +0,0 @@
package service
import (
"context"
"fmt"
"my-project/internal/dao"
"my-project/internal/model/dto"
"my-project/internal/model/entity"
"my-project/pkg/log"
)
// UserService 用户业务服务
type UserService struct {
userDAO *dao.UserDAO
}
// NewUserService 创建用户服务实例
// @param userDAO *dao.UserDAO - 用户数据访问对象
// @return *UserService - 服务实例
func NewUserService(userDAO *dao.UserDAO) *UserService {
return &UserService{userDAO: userDAO}
}
// GetUserByID 根据用户ID获取用户信息
// @param ctx context.Context - 请求上下文
// @param userID int64 - 用户唯一ID
// @return *entity.User - 用户实体
// @return error - 查询错误
func (s *UserService) GetUserByID(ctx context.Context, userID int64) (*entity.User, error) {
user, err := s.userDAO.FindByID(ctx, userID)
if err != nil {
return nil, fmt.Errorf("查询用户失败: %w", err)
}
return user, nil
}
// CreateUser 创建新用户
// @param ctx context.Context - 请求上下文
// @param req *dto.CreateUserRequest - 创建请求
// @return *entity.User - 创建的用户实体
// @return error - 创建错误
func (s *UserService) CreateUser(ctx context.Context, req *dto.CreateUserRequest) (*entity.User, error) {
user := &entity.User{
Username: req.Username,
Email: req.Email,
}
if err := s.userDAO.Create(ctx, user); err != nil {
return nil, fmt.Errorf("创建用户失败: %w", err)
}
// 记录关键业务日志
log.Info(ctx, "用户创建成功", map[string]interface{}{
"userID": user.ID,
"username": user.Username,
})
return user, nil
}


@@ -1,332 +0,0 @@
# API Design Specification
## Core principles
### 1. Use POST + RequestBody
> **Core rule**: all APIs should prefer the POST method, with parameters passed in the RequestBody
```go
// ✅ 推荐方式
POST /api/jenkins/builds/list
{
"organization_folder": "Backend",
"repository_name": "cmii-fly-center",
"branch_name": "master",
"page": 1,
"page_size": 10
}
// ❌ 避免使用
GET /api/jenkins/organizations/{org}/repositories/{repo}/branches/{branch}/builds?page=1&page_size=10
```
### 2. Avoid PathVariables
```go
// ❌ 不推荐
GET /api/projects/{project_id}
GET /api/builds/{build_id}/console
// ✅ 推荐
POST /api/projects/detail
{
"project_id": "namespace_abc12345"
}
POST /api/builds/console
{
"organization_folder": "Backend",
"repository_name": "cmii-fly-center",
"branch_name": "master",
"build_number": 123
}
```
### 3. Avoid RequestParams
```go
// ❌ 不推荐
GET /api/users/list?role=admin&status=active&page=1
// ✅ 推荐
POST /api/users/list
{
"role": "admin",
"status": "active",
"page": 1,
"page_size": 20
}
```
---
## Unified response format
### Success response
```json
{
"code": 0,
"message": "success",
"data": {
// 业务数据
}
}
```
### Paginated response
```json
{
"code": 0,
"message": "success",
"data": {
"list": [...],
"total": 100,
"page": 1,
"page_size": 20
}
}
```
### Error response
```json
{
"code": 1001,
"message": "参数错误: organization_folder不能为空",
"data": null
}
```
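The three JSON shapes above could be produced from a single envelope type. The sketch below uses only the standard library; `Envelope`, `Success`, `Failure`, and the `JSON` method are hypothetical names for illustration and are not the project's actual `pkg/common` helpers.

```go
package main

import "encoding/json"

// Envelope matches the code/message/data shape shown in the JSON examples above.
type Envelope struct {
	Code    int         `json:"code"`
	Message string      `json:"message"`
	Data    interface{} `json:"data"`
}

// Success wraps business data in a code-0 envelope.
func Success(data interface{}) Envelope {
	return Envelope{Code: 0, Message: "success", Data: data}
}

// Failure builds an error envelope; Data stays nil and is encoded as null.
func Failure(code int, message string) Envelope {
	return Envelope{Code: code, Message: message}
}

// JSON encodes the envelope. If Data cannot be encoded, it falls back to a
// fixed internal-error body (1005 in the general error code table).
func (e Envelope) JSON() string {
	b, err := json.Marshal(e)
	if err != nil {
		return `{"code":1005,"message":"internal error","data":null}`
	}
	return string(b)
}
```

Keeping `data` present even when it is `null` lets clients read the same three fields on every response.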
---
## Request structure conventions
### Common pagination request
```go
type PageRequest struct {
Page int `json:"page" binding:"required,min=1"`
PageSize int `json:"page_size" binding:"required,min=1,max=100"`
}
```
### Common filter request
```go
type ListRequest struct {
PageRequest
Keyword string `json:"keyword,omitempty"` // 搜索关键词
Status string `json:"status,omitempty"` // 状态筛选
SortBy string `json:"sort_by,omitempty"` // 排序字段
SortOrder string `json:"sort_order,omitempty"` // asc/desc
}
```
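Because `sort_by` and `sort_order` come from the client, one cautious way to turn them into an ORDER BY fragment is to map them through a whitelist instead of concatenating them into SQL. This is a sketch under assumptions: the `allowedSortFields` map, the `OrderClause` function, and the `created_at DESC` default are all hypothetical.

```go
package main

import "strings"

// allowedSortFields is a hypothetical whitelist mapping request values to column names.
var allowedSortFields = map[string]string{
	"created_at": "created_at",
	"name":       "name",
}

// OrderClause builds an ORDER BY fragment from SortBy/SortOrder.
// Unknown fields fall back to "created_at DESC", so client input is never
// placed into the SQL text directly. Any order other than "desc" means ASC.
func OrderClause(sortBy, sortOrder string) string {
	column, ok := allowedSortFields[sortBy]
	if !ok {
		return "created_at DESC"
	}
	direction := "ASC"
	if strings.EqualFold(sortOrder, "desc") {
		direction = "DESC"
	}
	return column + " " + direction
}
```

In a DAO the result could be passed to GORM's `Order` method, for example `db.Order(OrderClause(req.SortBy, req.SortOrder))`.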
---
## API naming conventions
### Operation suffixes
| Operation | Suffix | Example |
|------|------|------|
| List query | `/list` | `/api/projects/list` |
| Detail query | `/detail` | `/api/projects/detail` |
| Create | `/create` | `/api/projects/create` |
| Update | `/update` | `/api/projects/update` |
| Delete | `/delete` | `/api/projects/delete` |
| Sync | `/sync` | `/api/jenkins/organizations/sync` |
| Trigger | `/trigger` | `/api/builds/trigger` |
| Export | `/export` | `/api/projects/export` |
### Module prefixes
| Module | Prefix |
|------|------|
| Jenkins | `/api/jenkins/` |
| Project management | `/api/projects/` |
| Users | `/api/users/` |
| Permissions | `/api/permissions/` |
| Permissions (Jenkins) | `/api/permissions/jenkins/` |
| Permissions (projects) | `/api/permissions/projects/` |
| Audit | `/api/audit/` |
| Exchange-Hub | `/api/exchange-hub/` |
| DCU | `/api/dcu/` |
---
## Handler implementation template
```go
// ListBuilds 获取构建列表
// @Summary 获取构建列表
// @Tags 构建管理
// @Accept json
// @Produce json
// @Param request body dto.ListBuildsRequest true "请求参数"
// @Success 200 {object} response.Response{data=dto.ListBuildsResponse}
// @Router /api/jenkins/builds/list [post]
func (h *BuildHandler) ListBuilds(c *gin.Context) {
var req dto.ListBuildsRequest
if err := c.ShouldBindJSON(&req); err != nil {
response.ParamError(c, err)
return
}
resp, err := h.buildService.ListBuilds(c.Request.Context(), &req)
if err != nil {
response.Error(c, err)
return
}
response.Success(c, resp)
}
```
---
## DTO design conventions
### Request DTO naming
```go
// 列表请求: List{资源}Request
type ListBuildsRequest struct {
PageRequest
OrganizationFolder string `json:"organization_folder" binding:"required"`
RepositoryName string `json:"repository_name" binding:"required"`
BranchName string `json:"branch_name,omitempty"`
}
// 详情请求: Get{资源}Request 或 {资源}DetailRequest
type GetBuildRequest struct {
OrganizationFolder string `json:"organization_folder" binding:"required"`
RepositoryName string `json:"repository_name" binding:"required"`
BranchName string `json:"branch_name" binding:"required"`
BuildNumber int `json:"build_number" binding:"required"`
}
// 创建请求: Create{资源}Request
type CreateProjectRequest struct {
Name string `json:"name" binding:"required"`
Namespace string `json:"namespace" binding:"required"`
Province string `json:"province" binding:"required"`
City string `json:"city" binding:"required"`
}
// 更新请求: Update{资源}Request
type UpdateProjectRequest struct {
ProjectID string `json:"project_id" binding:"required"`
Name string `json:"name,omitempty"`
Province string `json:"province,omitempty"`
City string `json:"city,omitempty"`
}
// 删除请求: Delete{资源}Request
type DeleteProjectRequest struct {
ProjectID string `json:"project_id" binding:"required"`
}
```
### Response DTO naming
```go
// 列表响应: List{资源}Response
type ListBuildsResponse struct {
List []*BuildDTO `json:"list"`
Total int64 `json:"total"`
Page int `json:"page"`
PageSize int `json:"page_size"`
}
// 详情响应: {资源}DetailResponse 或直接使用 {资源}DTO
type BuildDetailResponse struct {
*BuildDTO
ConsoleOutput string `json:"console_output,omitempty"`
}
```
---
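## Error code conventions
### Error code ranges
| Range | Module |
|------|------|
| 1000-1999 | General errors |
| 2000-2999 | Users/permissions |
| 3000-3999 | Jenkins module |
| 4000-4999 | Project management |
| 5000-5999 | Exchange-Hub |
| 6000-6999 | Watchdog |
### General error codes
| Code | Description |
|--------|------|
| 0 | Success |
| 1001 | Parameter error |
| 1002 | Unauthorized |
| 1003 | Forbidden |
| 1004 | Resource not found |
| 1005 | Internal error |
---
## Frontend call example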
```typescript
// api/modules/jenkins.ts
export const jenkinsApi = {
// 获取构建列表
listBuilds: (data: ListBuildsRequest) =>
request.post<ListBuildsResponse>('/api/jenkins/builds/list', data),
// 触发构建
triggerBuild: (data: TriggerBuildRequest) =>
request.post<TriggerBuildResponse>('/api/jenkins/builds/trigger', data),
// 获取构建详情
getBuildDetail: (data: GetBuildRequest) =>
request.post<BuildDetailResponse>('/api/jenkins/builds/detail', data),
};
```
---
## Security conventions
### 1. Keep sensitive fields out of URLs
```go
// ❌ 敏感信息泄露到URL
GET /api/auth/login?username=admin&password=123456
// ✅ 使用RequestBody
POST /api/auth/login
{
"username": "admin",
"password": "123456"
}
```
### 2. Always validate the request body
```go
func (h *Handler) CreateProject(c *gin.Context) {
var req dto.CreateProjectRequest
if err := c.ShouldBindJSON(&req); err != nil {
response.ParamError(c, err)
return
}
// 后续处理...
}
```
### 3. Audit sensitive operations
All write operations must be recorded through the audit middleware.


@@ -1,35 +0,0 @@
# API Response Specification
## Unified response structure
```go
type Response struct {
Code int `json:"code"` // 业务状态码0=成功
Status int `json:"status"` // HTTP 状态码
Timestamp string `json:"timestamp"` // RFC3339 东八区
Data interface{} `json:"data"` // 业务数据
Message string `json:"message,omitempty"` // 消息
Error string `json:"error,omitempty"` // 错误详情
}
```
## Functions to use
| Scenario | Function |
|------|------|
| Query succeeded | `ResponseSuccess(c, data)` |
| Operation succeeded | `ResponseSuccessWithMessage(c, data, "msg")` |
| Ordinary error | `ResponseError(c, code, "msg")` |
| Error with details | `ResponseErrorWithDetail(c, code, "msg", err)` |
## HTTP status code mapping
| Business code | HTTP status code |
|--------|-------------|
| CodeSuccess | 200 |
| CodeParamError, CodeValidationFail | 400 |
| CodeUnauthorized | 401 |
| CodeForbidden | 403 |
| CodeNotFound | 404 |
| CodeTimeout | 408 |
| Other | 500 |
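
The mapping table might be implemented as a single switch. In this sketch the constants are copied from `reference/error-codes.go`, while the function name `HTTPStatus` is an assumption; the real `pkg/common` package may structure this differently.

```go
package main

import "net/http"

// Business codes, copied from reference/error-codes.go.
const (
	CodeSuccess        = 0
	CodeParamError     = 10002
	CodeUnauthorized   = 10003
	CodeForbidden      = 10004
	CodeNotFound       = 10005
	CodeTimeout        = 10006
	CodeValidationFail = 10007
)

// HTTPStatus is a hypothetical helper that maps a business code to the HTTP
// status from the table above. Any unlisted code maps to 500.
func HTTPStatus(code int) int {
	switch code {
	case CodeSuccess:
		return http.StatusOK
	case CodeParamError, CodeValidationFail:
		return http.StatusBadRequest
	case CodeUnauthorized:
		return http.StatusUnauthorized
	case CodeForbidden:
		return http.StatusForbidden
	case CodeNotFound:
		return http.StatusNotFound
	case CodeTimeout:
		return http.StatusRequestTimeout
	default:
		return http.StatusInternalServerError
	}
}
```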


@@ -1,44 +0,0 @@
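# Coding Standards
## Naming conventions
| Kind | Rule | Example |
|------|------|------|
| Package names | Lowercase words, no underscores | `service`, `utils` |
| Variables/functions | camelCase | `getUserByID` |
| Exported identifiers | Initial capital | `GetUserByID` |
| Interfaces | Single-method interfaces end in `er` | `Reader`, `Writer` |
## Comment conventions (Chinese, mandatory)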
```go
// GetUserByID 根据用户ID获取用户信息
// @param ctx context.Context - 请求上下文
// @param userID int64 - 用户唯一ID
// @return *model.User - 用户信息
// @return error - 查询错误
func (s *UserService) GetUserByID(ctx context.Context, userID int64) (*model.User, error)
```
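## Error handling
1. Always handle errors with `if err != nil`
2. Wrap errors with `fmt.Errorf("xxx: %w", err)`
3. Never discard an error with `_ = err`
4. The handler layer must return errors through the unified response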
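Wrapping with `%w` matters because it keeps the original error reachable through `errors.Is`, which is how a handler can still recognize a not-found error after the service has added context. The sketch below uses a local sentinel, `ErrRecordNotFound`, as a stand-in for `gorm.ErrRecordNotFound` so that it needs no third-party imports; `findUser`, `getUser`, and `isNotFound` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrRecordNotFound stands in for gorm.ErrRecordNotFound in this sketch.
var ErrRecordNotFound = errors.New("record not found")

// findUser plays the DAO role: it returns the sentinel error unchanged.
func findUser(id int64) error {
	if id != 1 {
		return ErrRecordNotFound
	}
	return nil
}

// getUser plays the service role: it adds context with %w, which keeps the
// wrapped error visible to errors.Is.
func getUser(id int64) error {
	if err := findUser(id); err != nil {
		return fmt.Errorf("get user %d: %w", id, err)
	}
	return nil
}

// isNotFound is the check a handler would make before choosing CodeNotFound.
func isNotFound(err error) bool {
	return errors.Is(err, ErrRecordNotFound)
}
```

Had the service used `%v` instead of `%w`, `isNotFound` would return false and the handler would fall through to a generic server error.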
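## Log levels
| Level | Purpose |
|------|------|
| Debug | Development debugging, detailed flow |
| Info | Key business milestones |
| Warning | Expected, non-fatal exceptions |
| Error | Severe errors; the stack must be recorded |
## Time handling
- Time zone: Asia/Shanghai (UTC+8)
- Format: RFC3339
- Forbidden: `time.Now()`
- Use: `TimeUtils.Now()`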


@@ -1,35 +0,0 @@
package common
// 业务状态码常量
const (
CodeSuccess = 0 // 成功
CodeServerError = 10001 // 服务器内部错误
CodeParamError = 10002 // 参数错误
CodeUnauthorized = 10003 // 未授权
CodeForbidden = 10004 // 禁止访问
CodeNotFound = 10005 // 资源不存在
CodeTimeout = 10006 // 请求超时
CodeValidationFail = 10007 // 验证失败
CodeBusiness = 20001 // 业务逻辑错误 (20001-29999)
)
// CodeMessage 错误码消息映射
var CodeMessage = map[int]string{
CodeSuccess: "success",
CodeServerError: "服务器内部错误",
CodeParamError: "参数错误",
CodeUnauthorized: "未授权,请先登录",
CodeForbidden: "权限不足,禁止访问",
CodeNotFound: "请求的资源不存在",
CodeTimeout: "请求超时",
CodeValidationFail: "数据验证失败",
CodeBusiness: "业务处理失败",
}
// GetMessage 根据错误码获取默认消息
func GetMessage(code int) string {
if msg, ok := CodeMessage[code]; ok {
return msg
}
return "未知错误"
}


@@ -1,264 +0,0 @@
# Framework Usage Standards
## GIN framework
### Route organization
#### Route groups (Router Group) are mandatory
```go
func SetupRouter(r *gin.Engine) {
// API 版本分组
v1 := r.Group("/api/v1")
{
// 用户模块
users := v1.Group("/users")
{
users.GET("/", userHandler.List)
users.GET("/:id", userHandler.GetByID)
users.POST("/", userHandler.Create)
users.PUT("/:id", userHandler.Update)
users.DELETE("/:id", userHandler.Delete)
}
// 订单模块
orders := v1.Group("/orders")
{
orders.GET("/", orderHandler.List)
orders.GET("/:id", orderHandler.GetByID)
orders.POST("/", orderHandler.Create)
}
}
}
```
#### Flat routes are forbidden
```go
// ❌ 错误:扁平路由,难以维护
r.GET("/api/v1/users", ...)
r.GET("/api/v1/users/:id", ...)
r.POST("/api/v1/users", ...)
r.GET("/api/v1/orders", ...)
```
### Middleware usage
#### Global middleware
```go
func SetupMiddleware(r *gin.Engine) {
// Recovery - 恢复 panic防止程序崩溃
r.Use(middleware.Recovery())
// Logger - 请求日志记录
r.Use(middleware.Logger())
// CORS - 跨域处理
r.Use(middleware.CORS())
// TraceID - 请求追踪
r.Use(middleware.TraceID())
}
```
#### Route-group middleware
```go
// 需要认证的路由组
authGroup := r.Group("/api/v1/admin")
authGroup.Use(middleware.Auth())
{
authGroup.GET("/dashboard", adminHandler.Dashboard)
authGroup.GET("/users", adminHandler.ListUsers)
}
// 需要特定权限的路由组
superAdmin := authGroup.Group("/super")
superAdmin.Use(middleware.RequireRole("super_admin"))
{
superAdmin.DELETE("/users/:id", adminHandler.DeleteUser)
}
```
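#### Common middleware responsibilities
| Middleware | Responsibility |
|--------|------|
| Recovery | Catches panics and returns a 500 error |
| Logger | Logs requests (method, path, latency, etc.) |
| CORS | Handles cross-origin requests |
| Auth | Verifies user identity (JWT/Session) |
| TraceID | Generates/propagates the request trace ID |
| RateLimit | Limits request rate |
### Response rules
#### Unified responses are mandatory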
```go
// ✅ 正确:使用统一响应函数
common.ResponseSuccess(c, data)
common.ResponseError(c, common.CodeParamError, "参数错误")
// ❌ 错误:直接使用 GIN 原生方法
c.JSON(200, data)
c.String(200, "success")
c.AbortWithStatusJSON(400, gin.H{"error": "bad request"})
```
---
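## GORM framework
### Where operations live
```
All GORM operations must be implemented in the dao layer
Never operate on the database directly in the service layer
```
### Choosing a query style
#### Simple CRUD - method chaining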
```go
// 单条查询
var user entity.User
db.Where("id = ?", userID).First(&user)
// 列表查询
var users []entity.User
db.Where("status = ?", 1).
Order("created_at DESC").
Limit(10).
Offset(0).
Find(&users)
// 创建
db.Create(&user)
// 更新
db.Model(&user).Updates(map[string]interface{}{
"name": "new name",
"status": 1,
})
// 删除
db.Delete(&user, userID)
```
#### Complex queries - Raw/Exec
**Recommended for**:
- Multi-table JOINs
- Subqueries
- Complex aggregation
- Batch operations
- Performance-sensitive scenarios
```go
// 多表 JOIN 查询
type UserWithOrderCount struct {
entity.User
OrderCount int64 `json:"order_count"`
}
var results []UserWithOrderCount
db.Raw(`
SELECT u.*, COUNT(o.id) as order_count
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
WHERE u.status = ?
GROUP BY u.id
ORDER BY order_count DESC
LIMIT ?
`, 1, 10).Scan(&results)
// 批量更新
db.Exec(`
UPDATE orders
SET status = ?
WHERE user_id = ? AND status = ?
`, "completed", userID, "pending")
// 复杂子查询
db.Raw(`
SELECT * FROM users
WHERE id IN (
SELECT user_id FROM orders
WHERE amount > ?
GROUP BY user_id
HAVING COUNT(*) > ?
)
`, 1000, 5).Scan(&users)
```
### Error handling
#### ErrRecordNotFound must be handled
```go
// DAO 层
func (d *UserDAO) FindByID(ctx context.Context, id int64) (*entity.User, error) {
var user entity.User
if err := d.db.WithContext(ctx).First(&user, id).Error; err != nil {
return nil, err // 包含 ErrRecordNotFound
}
return &user, nil
}
// Handler 层
user, err := h.userService.GetUserByID(ctx, userID)
if err != nil {
if errors.Is(err, gorm.ErrRecordNotFound) {
common.ResponseError(c, common.CodeNotFound, "用户不存在")
return
}
common.ResponseErrorWithDetail(c, common.CodeServerError, "查询失败", err)
return
}
```
### Transactions
```go
// Service 层事务
func (s *OrderService) CreateOrder(ctx context.Context, req *dto.CreateOrderRequest) error {
	// 通过 WithContext 传递上下文,支持超时与取消
	return s.db.WithContext(ctx).Transaction(func(tx *gorm.DB) error {
		// 1. 创建订单
		order := &entity.Order{...}
		if err := tx.Create(order).Error; err != nil {
			return fmt.Errorf("创建订单失败: %w", err)
		}
		// 2. 扣减库存
		res := tx.Model(&entity.Product{}).
			Where("id = ? AND stock >= ?", req.ProductID, req.Quantity).
			Update("stock", gorm.Expr("stock - ?", req.Quantity))
		if res.Error != nil {
			return fmt.Errorf("扣减库存失败: %w", res.Error)
		}
		// 未更新任何行,说明库存不足,返回错误以回滚事务
		if res.RowsAffected == 0 {
			return fmt.Errorf("扣减库存失败: 库存不足")
		}
		// 3. 创建支付记录
		payment := &entity.Payment{...}
		if err := tx.Create(payment).Error; err != nil {
			return fmt.Errorf("创建支付记录失败: %w", err)
		}
		return nil
	})
}
```
### Context propagation
```go
// 必须使用 WithContext 传递上下文
db.WithContext(ctx).First(&user, id)
db.WithContext(ctx).Create(&order)
// 支持超时控制和取消
ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
defer cancel()
db.WithContext(ctx).Find(&users)
```


@@ -1,100 +0,0 @@
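# Logging Standards
## Designated framework
The project uses the internal logging library `rmdc-common/wdd_log/log_utils.go` throughout
## Log level definitions
### Debug
- **Purpose**: development debugging; records detailed information such as execution flow and variable values
- **Scenario**: the default log level in development
- **Example**: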
```go
log.Debug(ctx, "开始处理用户请求", map[string]interface{}{
"userID": userID,
"requestID": requestID,
})
```
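### Info
- **Purpose**: records key business operation milestones
- **Scenario**: key business events such as user login, order creation, and successful payment
- **Example**: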
```go
log.Info(ctx, "用户登录成功", map[string]interface{}{
"userID": user.ID,
"username": user.Username,
"ip": c.ClientIP(),
})
log.Info(ctx, "订单创建成功", map[string]interface{}{
"orderID": order.ID,
"amount": order.Amount,
"userID": order.UserID,
})
```
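### Warning
- **Purpose**: records expected, non-fatal exceptions after which the program can keep running
- **Scenario**: an external API times out and a fallback is used, a missing configuration value falls back to its default, etc.
- **Example**: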
```go
log.Warning(ctx, "外部API调用超时,已启用备用方案", map[string]interface{}{
"api": "payment-gateway",
"timeout": "5s",
"fallback": "local-cache",
})
```
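### Error
- **Purpose**: records severe errors that prevent the current business flow from continuing
- **Scenario**: database connection failure, failed validation of critical parameters, etc.
- **Requirement**: the error message and stack must be recorded in detail
- **Example**: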
```go
log.Error(ctx, "数据库连接失败", map[string]interface{}{
	"host":  dbConfig.Host,
	"port":  dbConfig.Port,
	"error": err.Error(),
	"stack": string(debug.Stack()),
})
```
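## Log content rules
### Required fields
1. **TraceID** - request trace ID
2. **UserID** - user identifier (where applicable)
3. **Operation description** - a concise description in Chinese
4. **Key parameters** - key data related to the operation
### Format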
```go
log.Info(ctx, "操作描述", map[string]interface{}{
"key1": value1,
"key2": value2,
})
```
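One way to make sure every entry carries TraceID and UserID is to keep them in the request context and merge them into the fields map just before logging. This sketch does not reflect how `wdd_log` works internally; `NewTraceContext`, `LogFields`, and the `trace_id`/`user_id` key names are assumptions.

```go
package main

import "context"

// ctxKey is a private key type, so these values cannot collide with other packages.
type ctxKey string

const (
	traceIDKey ctxKey = "trace_id"
	userIDKey  ctxKey = "user_id"
)

// NewTraceContext returns a root context carrying the tracing fields.
// In a Gin service this would normally be done by TraceID/Auth middleware.
func NewTraceContext(traceID, userID string) context.Context {
	ctx := context.WithValue(context.Background(), traceIDKey, traceID)
	return context.WithValue(ctx, userIDKey, userID)
}

// LogFields copies fields and adds trace_id/user_id from ctx when present,
// so the caller's map is left untouched.
func LogFields(ctx context.Context, fields map[string]interface{}) map[string]interface{} {
	out := make(map[string]interface{}, len(fields)+2)
	for k, v := range fields {
		out[k] = v
	}
	for _, key := range []ctxKey{traceIDKey, userIDKey} {
		if v, ok := ctx.Value(key).(string); ok && v != "" {
			out[string(key)] = v
		}
	}
	return out
}
```

A call site could then look like `log.Info(ctx, "订单创建成功", LogFields(ctx, fields))`, assuming the logging library does not already extract these values from `ctx` itself.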
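## Logging responsibilities by layer
### Handler layer
- `ResponseErrorWithDetail` records the Error log automatically
- Generally does not log on its own
### Service layer
- **Info**: key business operation succeeded (order creation, payment, user registration, etc.)
- **Warning**: business logic exception that can be handled
- **Error**: recorded centrally in the handler layer via ResponseErrorWithDetail
### DAO layer
- Generally does not log
- Errors are returned upward and handled centrally by the handler layer
## Prohibited
1. Do not log sensitive information (passwords, tokens, full bank card numbers, etc.)
2. Do not use `fmt.Println` or `log.Println`
3. Do not log heavily inside loops
4. Error logs must not omit stack information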


@@ -1,39 +0,0 @@
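# Project Directory Structure
## Core directories
| Directory | Responsibility | Forbidden |
|------|------|----------|
| `/api` or `/internal/handler` | GIN handler layer: parses requests, calls services, returns responses | No business logic |
| `/internal/service` | Core business logic; orchestrates the dao layer | - |
| `/internal/dao` or `/internal/repository` | Data access layer; encapsulates GORM operations | Must not reference service |
| `/internal/model/entity` | Persistent objects that map to database tables | - |
| `/internal/model/dto` | API data transfer objects (request/response) | - |
| `/pkg/common` | Unified responses, error codes, shared utilities | - |
| `/configs` | Configuration files | - |
| `/cmd` | main.go entry point | - |
## Dependency rules
```
handler → service → dao
↓ ↓ ↓
pkg/common (may be referenced by any layer)
```
**Reverse or cross-layer dependencies are strictly forbidden**
## Internal module references in go.mod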
```go
module my-project
go 1.24
require (
wdd.io/TonyCommon v1.0.0
)
// 本地开发使用 replace
replace wdd.io/TonyCommon => ../TonyCommon
```


@@ -1,120 +0,0 @@
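# Time Handling Standards
## Core principle
All times transmitted between the frontend and backend, and all times stored in the database, **must be in UTC+8 (Asia/Shanghai)**.
## Designated utility libraries
| Side | Library path |
|----|-----------|
| Backend | `rmdc-common/utils/TimeUtils.go` |
| Frontend | `TonyMask/src/utils/timeUtils.ts` |
## Time format
- The `timestamp` field in API responses always uses **RFC3339** format
- Example: `2024-01-15T14:30:00+08:00`
## Forbidden vs. required
### Do not use directly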
```go
// ❌ 禁止
time.Now()
time.Parse(layout, value)
t.Format(layout)
```
### Use the utility library
```go
// ✅ 正确
TimeUtils.Now()
TimeUtils.Parse(layout, value)
TimeUtils.Format(t, layout)
```
## Common scenarios
### Getting the current time
```go
// ❌ 错误
now := time.Now()
// ✅ 正确
now := TimeUtils.Now()
```
### Formatting a timestamp
```go
// ❌ 错误
timestamp := time.Now().Format(time.RFC3339)
// ✅ 正确
timestamp := TimeUtils.Format(TimeUtils.Now(), time.RFC3339)
```
### Parsing a time string
```go
// ❌ 错误
t, err := time.Parse(time.RFC3339, timeStr)
// ✅ 正确
t, err := TimeUtils.Parse(time.RFC3339, timeStr)
```
### Database time fields
```go
type Order struct {
ID int64 `gorm:"primaryKey"`
CreatedAt time.Time `gorm:"autoCreateTime"` // GORM 自动处理
UpdatedAt time.Time `gorm:"autoUpdateTime"` // GORM 自动处理
ExpireAt time.Time // 业务时间使用 TimeUtils
}
// 设置业务时间
order.ExpireAt = TimeUtils.Now().Add(24 * time.Hour)
```
### API response time
```go
type Response struct {
Code int `json:"code"`
Status int `json:"status"`
Timestamp string `json:"timestamp"` // RFC3339 格式
Data interface{} `json:"data"`
}
// 构建响应
resp := Response{
Timestamp: TimeUtils.Format(TimeUtils.Now(), time.RFC3339),
// ...
}
```
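## Common TimeUtils methods
| Method | Description |
|------|------|
| `Now()` | Returns the current time in UTC+8 |
| `Parse(layout, value)` | Parses a time string (UTC+8) |
| `Format(t, layout)` | Formats a time |
| `StartOfDay(t)` | Returns midnight of the given day |
| `EndOfDay(t)` | Returns 23:59:59 of the given day |
| `AddDays(t, days)` | Adds days |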
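
The source of `rmdc-common/utils/TimeUtils.go` is not shown here, so the following is only a sketch of how a few of the listed methods could be written with the standard `time` package. The function bodies, the package-level `shanghai` variable, and the fixed-zone fallback are all assumptions.

```go
package main

import "time"

// shanghai is the UTC+8 location used by every helper below.
var shanghai = loadShanghai()

// loadShanghai prefers the named zone and falls back to a fixed UTC+8 offset
// when the host has no time zone database installed.
func loadShanghai() *time.Location {
	if loc, err := time.LoadLocation("Asia/Shanghai"); err == nil {
		return loc
	}
	return time.FixedZone("Asia/Shanghai", 8*60*60)
}

// Now returns the current time expressed in UTC+8.
func Now() time.Time {
	return time.Now().In(shanghai)
}

// Parse interprets value in UTC+8 when the layout carries no explicit offset.
func Parse(layout, value string) (time.Time, error) {
	return time.ParseInLocation(layout, value, shanghai)
}

// Format converts t to UTC+8 before formatting it.
func Format(t time.Time, layout string) string {
	return t.In(shanghai).Format(layout)
}

// StartOfDay returns midnight (UTC+8) of the day that contains t.
func StartOfDay(t time.Time) time.Time {
	y, m, d := t.In(shanghai).Date()
	return time.Date(y, m, d, 0, 0, 0, 0, shanghai)
}
```

Converting inside `Format` means a `time.Time` that arrives in UTC, for example from a database driver, is still rendered with the `+08:00` offset.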
## Time zone configuration
Make sure the server and database time zones are configured correctly:
```go
// 数据库连接配置
dsn := "user:pass@tcp(host:3306)/db?charset=utf8mb4&parseTime=True&loc=Asia%2FShanghai"
```


@@ -1,51 +0,0 @@
#!/bin/bash
# 验证 Go GIN/GORM 项目结构
set -e
echo "=== Go 项目结构验证 ==="
# 检查 go.mod
if [ ! -f "go.mod" ]; then
echo "❌ 缺少 go.mod"
exit 1
fi
echo "✅ go.mod 存在"
# 检查核心目录
DIRS=("internal/service" "internal/dao" "internal/model" "pkg/common")
for dir in "${DIRS[@]}"; do
if [ -d "$dir" ]; then
echo "✅ $dir 存在"
else
echo "⚠️ $dir 不存在"
fi
done
# 检查 handler 目录(两种风格)
if [ -d "api" ] || [ -d "internal/handler" ]; then
echo "✅ handler 目录存在"
else
echo "⚠️ 缺少 api/ 或 internal/handler/"
fi
# 检查反向依赖dao 不应引用 service
echo ""
echo "=== 检查依赖方向 ==="
if grep -r "internal/service" internal/dao/ 2>/dev/null; then
echo "❌ dao 层存在对 service 的反向依赖"
exit 1
fi
echo "✅ 无反向依赖"
# 检查 time.Now() 使用
echo ""
echo "=== 检查 time.Now() 使用 ==="
if grep -rn "time\.Now()" --include="*.go" internal/ api/ 2>/dev/null | grep -v "_test.go"; then
echo "⚠️ 发现直接使用 time.Now(),应使用 TimeUtils.Now()"
else
echo "✅ 无直接 time.Now() 调用"
fi
echo ""
echo "=== 验证完成 ==="


@@ -187,7 +187,7 @@
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright [yyyy] [name of copyright owner]
Copyright 2026 Anthropic, PBC.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.


@@ -1,6 +1,6 @@
---
name: skill-creator
description: Create new skills, modify and improve existing skills, and measure skill performance. Use when users want to create a skill from scratch, update or optimize an existing skill, run evals to test a skill, benchmark skill performance with variance analysis, or optimize a skill's description for better triggering accuracy.
description: Create new skills, modify and improve existing skills, and measure skill performance. Use when users want to create a skill from scratch, edit, or optimize an existing skill, run evals to test a skill, benchmark skill performance with variance analysis, or optimize a skill's description for better triggering accuracy.
---
# Skill Creator
@@ -391,7 +391,7 @@ Use the model ID from your system prompt (the one powering the current session)
While it runs, periodically tail the output to give the user updates on which iteration it's on and what the scores look like.
This handles the full optimization loop automatically. It splits the eval set into 60% train and 40% held-out test, evaluates the current description (running each query 3 times to get a reliable trigger rate), then calls Claude with extended thinking to propose improvements based on what failed. It re-evaluates each new description on both train and test, iterating up to 5 times. When it's done, it opens an HTML report in the browser showing the results per iteration and returns JSON with `best_description` — selected by test score rather than train score to avoid overfitting.
This handles the full optimization loop automatically. It splits the eval set into 60% train and 40% held-out test, evaluates the current description (running each query 3 times to get a reliable trigger rate), then calls Claude to propose improvements based on what failed. It re-evaluates each new description on both train and test, iterating up to 5 times. When it's done, it opens an HTML report in the browser showing the results per iteration and returns JSON with `best_description` — selected by test score rather than train score to avoid overfitting.
### How skill triggering works
@@ -435,6 +435,11 @@ In Claude.ai, the core workflow is the same (draft → test → review → impro
**Packaging**: The `package_skill.py` script works anywhere with Python and a filesystem. On Claude.ai, you can run it and the user can download the resulting `.skill` file.
**Updating an existing skill**: The user might be asking you to update an existing skill, not create a new one. In this case:
- **Preserve the original name.** Note the skill's directory name and `name` frontmatter field -- use them unchanged. E.g., if the installed skill is `research-helper`, output `research-helper.skill` (not `research-helper-v2`).
- **Copy to a writeable location before editing.** The installed skill path may be read-only. Copy to `/tmp/skill-name/`, edit there, and package from the copy.
- **If packaging manually, stage in `/tmp/` first**, then copy to the output directory -- direct writes may fail due to permissions.
---
## Cowork-Specific Instructions
@@ -447,6 +452,7 @@ If you're in Cowork, the main things to know are:
- Feedback works differently: since there's no running server, the viewer's "Submit All Reviews" button will download `feedback.json` as a file. You can then read it from there (you may have to request access first).
- Packaging works — `package_skill.py` just needs Python and a filesystem.
- Description optimization (`run_loop.py` / `run_eval.py`) should work in Cowork just fine since it uses `claude -p` via subprocess, not a browser, but please save it until you've fully finished making the skill and the user agrees it's in good shape.
- **Updating an existing skill**: The user might be asking you to update an existing skill, not create a new one. Follow the update guidance in the claude.ai section above.
---


@@ -2,22 +2,52 @@
"""Improve a skill description based on eval results.
Takes eval results (from run_eval.py) and generates an improved description
using Claude with extended thinking.
by calling `claude -p` as a subprocess (same auth pattern as run_eval.py —
uses the session's Claude Code auth, no separate ANTHROPIC_API_KEY needed).
"""
import argparse
import json
import os
import re
import subprocess
import sys
from pathlib import Path
import anthropic
from scripts.utils import parse_skill_md
def _call_claude(prompt: str, model: str | None, timeout: int = 300) -> str:
    """Run `claude -p` with the prompt on stdin and return the text response.
    Prompt goes over stdin (not argv) because it embeds the full SKILL.md
    body and can easily exceed comfortable argv length.
    """
    cmd = ["claude", "-p", "--output-format", "text"]
    if model:
        cmd.extend(["--model", model])
    # Remove CLAUDECODE env var to allow nesting claude -p inside a
    # Claude Code session. The guard is for interactive terminal conflicts;
    # programmatic subprocess usage is safe. Same pattern as run_eval.py.
    env = {k: v for k, v in os.environ.items() if k != "CLAUDECODE"}
    result = subprocess.run(
        cmd,
        input=prompt,
        capture_output=True,
        text=True,
        env=env,
        timeout=timeout,
    )
    if result.returncode != 0:
        raise RuntimeError(
            f"claude -p exited {result.returncode}\nstderr: {result.stderr}"
        )
    return result.stdout
def improve_description(
client: anthropic.Anthropic,
skill_name: str,
skill_content: str,
current_description: str,
@@ -99,7 +129,7 @@ Based on the failures, write a new and improved description that is more likely
1. Avoid overfitting
2. The list might get loooong and it's injected into ALL queries and there might be a lot of skills, so we don't want to blow too much space on any given description.
Concretely, your description should not be more than about 100-200 words, even if that comes at the cost of accuracy.
Concretely, your description should not be more than about 100-200 words, even if that comes at the cost of accuracy. There is a hard limit of 1024 characters — descriptions over that will be truncated, so stay comfortably under it.
Here are some tips that we've found to work well in writing these descriptions:
- The skill should be phrased in the imperative -- "Use this skill for" rather than "this skill does"
@@ -111,70 +141,41 @@ I'd encourage you to be creative and mix up the style in different iterations si
Please respond with only the new description text in <new_description> tags, nothing else."""
response = client.messages.create(
model=model,
max_tokens=16000,
thinking={
"type": "enabled",
"budget_tokens": 10000,
},
messages=[{"role": "user", "content": prompt}],
)
text = _call_claude(prompt, model)
# Extract thinking and text from response
thinking_text = ""
text = ""
for block in response.content:
if block.type == "thinking":
thinking_text = block.thinking
elif block.type == "text":
text = block.text
# Parse out the <new_description> tags
match = re.search(r"<new_description>(.*?)</new_description>", text, re.DOTALL)
description = match.group(1).strip().strip('"') if match else text.strip().strip('"')
# Log the transcript
transcript: dict = {
"iteration": iteration,
"prompt": prompt,
"thinking": thinking_text,
"response": text,
"parsed_description": description,
"char_count": len(description),
"over_limit": len(description) > 1024,
}
# If over 1024 chars, ask the model to shorten it
# Safety net: the prompt already states the 1024-char hard limit, but if
# the model blew past it anyway, make one fresh single-turn call that
# quotes the too-long version and asks for a shorter rewrite. (The old
# SDK path did this as a true multi-turn; `claude -p` is one-shot, so we
# inline the prior output into the new prompt instead.)
if len(description) > 1024:
shorten_prompt = f"Your description is {len(description)} characters, which exceeds the hard 1024 character limit. Please rewrite it to be under 1024 characters while preserving the most important trigger words and intent coverage. Respond with only the new description in <new_description> tags."
shorten_response = client.messages.create(
model=model,
max_tokens=16000,
thinking={
"type": "enabled",
"budget_tokens": 10000,
},
messages=[
{"role": "user", "content": prompt},
{"role": "assistant", "content": text},
{"role": "user", "content": shorten_prompt},
],
shorten_prompt = (
f"{prompt}\n\n"
f"---\n\n"
f"A previous attempt produced this description, which at "
f"{len(description)} characters is over the 1024-character hard limit:\n\n"
f'"{description}"\n\n'
f"Rewrite it to be under 1024 characters while keeping the most "
f"important trigger words and intent coverage. Respond with only "
f"the new description in <new_description> tags."
)
shorten_thinking = ""
shorten_text = ""
for block in shorten_response.content:
if block.type == "thinking":
shorten_thinking = block.thinking
elif block.type == "text":
shorten_text = block.text
shorten_text = _call_claude(shorten_prompt, model)
match = re.search(r"<new_description>(.*?)</new_description>", shorten_text, re.DOTALL)
shortened = match.group(1).strip().strip('"') if match else shorten_text.strip().strip('"')
transcript["rewrite_prompt"] = shorten_prompt
transcript["rewrite_thinking"] = shorten_thinking
transcript["rewrite_response"] = shorten_text
transcript["rewrite_description"] = shortened
transcript["rewrite_char_count"] = len(shortened)
@@ -216,9 +217,7 @@ def main():
print(f"Current: {current_description}", file=sys.stderr)
print(f"Score: {eval_results['summary']['passed']}/{eval_results['summary']['total']}", file=sys.stderr)
client = anthropic.Anthropic()
new_description = improve_description(
client=client,
skill_name=name,
skill_content=content,
current_description=current_description,


@@ -15,8 +15,6 @@ import time
import webbrowser
from pathlib import Path
import anthropic
from scripts.generate_report import generate_html
from scripts.improve_description import improve_description
from scripts.run_eval import find_project_root, run_eval
@@ -75,7 +73,6 @@ def run_loop(
train_set = eval_set
test_set = []
client = anthropic.Anthropic()
history = []
exit_reason = "unknown"
@@ -200,7 +197,6 @@ def run_loop(
for h in history
]
new_description = improve_description(
client=client,
skill_name=name,
skill_content=content,
current_description=current_description,


@@ -0,0 +1,258 @@
---
name: wdd-vue3-ts-vuetify3
description: >-
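  Guidance Skill for end-to-end front-end development with Vue 3 + TypeScript +
  Vuetify 3. Enforces the Composition API with <script setup lang="ts"> and
  Vuetify 3 Material Design components; forbids the any type and the Options API.
  Covers Vuetify 3 component selection, light/dark themes, responsive layout,
  container defense, a unified API request layer (axios + error codes aligned with
  the back end), Composable design patterns, Pinia state management, Vue Router 4,
  and page aesthetics and whitespace. Triggers: Vue 3 front-end development,
  Vuetify 3 component usage, TypeScript type design, axios request wrappers,
  theme switching, responsive layout, Pinia Store, component architecture design,
  page UI design, front-end debugging. This Skill must be loaded whenever the user
  mentions Vue, Vuetify, or TypeScript front-end development.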
---
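# Vue 3 + TypeScript + Vuetify 3 Front-End Development Guide
Follow the instructions in this Skill. Run the workflow in order unless the user explicitly asks for a different order.
## §1 Tech stack constraints
| Item | Requirement |
|--------|------|
| Framework | Vue 3 (latest stable release) |
| Language | TypeScript (strict mode; no `any`, no `.js` files) |
| UI library | Vuetify 3 (the only UI library; do not add Element Plus, Ant Design Vue, etc.) |
| API style | Composition API + `<script setup lang="ts">` (Options API is forbidden) |
| Template syntax | SFC template (JSX / TSX is forbidden) |
| State management | Pinia (Setup Store syntax) |
| Routing | Vue Router 4 |
| HTTP | axios (single shared request client; `fetch` is forbidden) |
| Icons | Material Design Icons (`@mdi/font`) |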
## §2 Development workflow
### 2.1 Pre-coding checks (required)
1. Confirm the project uses Vue 3 + TypeScript + Vuetify 3
2. Plan component boundaries: define each component's responsibility in one sentence
3. Design each component's Props / Emits contract
4. Confirm the data flow direction: Props down / Events up
### 2.2 Required reference documents
Before starting any development task, make sure the references below are loaded into the working context:
| Task type | Required reference |
|---------|---------|
| Component selection | [vuetify3-components](references/vuetify3-components.md) |
| Theming | [vuetify3-theme](references/vuetify3-theme.md) |
| Layout design | [vuetify3-responsive](references/vuetify3-responsive.md) |
| Containers / scrolling | [container-defense](references/container-defense.md) |
| TypeScript | [typescript-strict](references/typescript-strict.md) |
| API integration | [api-layer](references/api-layer.md) |
| Composable | [composables-patterns](references/composables-patterns.md) |
| Component architecture | [component-architecture](references/component-architecture.md) |
| State management | [pinia-state](references/pinia-state.md) |
| Routing | [router-patterns](references/router-patterns.md) |
| UI aesthetics | [design-aesthetics](references/design-aesthetics.md) |
| Debugging and troubleshooting | [debug-guide](references/debug-guide.md) |
## §3 Core rules at a glance
### 3.1 SFC section order
```vue
<script setup lang="ts">
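// 1. Imports
// 2. Props / Emits definitions
// 3. Reactive state
// 4. Computed properties
// 5. Watchers
// 6. Methods
// 7. Lifecycle hooks
</script>
<template>
<!-- Declarative template; keep logic in the script block -->
</template>
<style scoped>
/* Use scoped styles and reference Vuetify CSS variables */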
</style>
```
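### 3.2 Core TypeScript constraints
- Every function must declare its return type (including `void`)
- Use `interface` for object shapes and `type` for unions and utility types
- Declare Props with `defineProps<Props>()` and Emits with `defineEmits<Emits>()`
- Use factory functions for mutable defaults: `withDefaults(defineProps<P>(), { items: () => [] })`
- `any`, `@ts-ignore`, and `@ts-nocheck` are forbidden
Full rules → [typescript-strict](references/typescript-strict.md)
### 3.3 Default choice of reactivity primitive
| Scenario | Use | Why |
|------|------|------|
| Primitive values | `ref` | Simple and efficient |
| Objects / arrays | `ref` | Deep tracking |
| Third-party instances | `shallowRef` | Avoids breaking them with proxies |
| Read-only derived values | `computed` | Caching + dependency tracking |
| Creating a ref from a getter/props | `toRef` | Keeps the reactive link |
### 3.4 When to split a component
A component must be split when **any** of the following holds:
- It carries 2+ independent responsibilities
- It contains 3+ independent UI regions
- Its template exceeds 100 lines or its script exceeds 150 lines
- A UI pattern recurs in several places
Standard split for a CRUD feature: FilterBar + Table + Form + Detail + `useFeatureData.ts`
Full rules → [component-architecture](references/component-architecture.md)
### 3.5 Data flow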
```
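Props (parent → child) ─── read-only
Events (child → parent) ── notify via emit
v-model ──────────── two-way binding for form controls only
provide/inject ───── deeply nested context only (theme, layout)
Pinia Store ──────── global state shared across components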
```
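## §4 Vuetify 3 usage rules
### 4.1 Component selection
Check the decision table before coding and match components to the problem; do not guess from keywords.
Prefer Vuetify components; do not use native HTML where a matching component exists.
Full decision table → [vuetify3-components](references/vuetify3-components.md)
### 4.2 Theme configuration
- Both light and dark themes must be provided
- Define all colors through `theme.themes` in `createVuetify`
- In CSS use `rgb(var(--v-theme-<color>))`; hard-coded colors are forbidden
- Switch themes with the `useTheme()` composable
Full template → [vuetify3-theme](references/vuetify3-theme.md)
### 4.3 Responsive layout
- Use the `v-container` + `v-row` + `v-col` grid system
- Use the `useDisplay()` composable for programmatic breakpoints
- Mobile first: start with `cols`, then add `sm` / `md` / `lg`
- Every page must be usable at the xs breakpoint
Full patterns → [vuetify3-responsive](references/vuetify3-responsive.md)
### 4.4 Container design
- Every content area must have an explicit container boundary; use `min-height` to guard against collapse
- Content must not be broken apart: `break-inside: avoid`
- When content exceeds the preset height, use `overflow-y: auto` + `scrollbar-gutter: stable`
- Fixed regions use `flex-shrink: 0`; scrolling regions use `flex-grow: 1` + `min-height: 0`
Full patterns → [container-defense](references/container-defense.md)
## §5 Unified request layer
Front-end types must match the Response structure in the back end's `common-runtime.md`: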
```typescript
interface ApiResponse<T = null> {
code: number // 0 = success, 1xxx = common errors, 2xxx = business errors
message: string
data: T
timestamp: string
request_id: string
}
```
All API calls go through the unified request client (an axios instance + interceptors); importing axios directly is forbidden.
Full implementation → [api-layer](references/api-layer.md)
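
The envelope contract above can be read as "return `data` when `code` is 0, otherwise raise an error that carries `code` and `request_id`". A minimal, framework-free sketch of that reading follows; `unwrapResponse` and `ApiEnvelopeError` are hypothetical names used only for illustration and are not part of the skill's request layer.

```typescript
// Hypothetical helper (assumption): unwraps the ApiResponse envelope without axios.
interface ApiResponse<T = null> {
  code: number
  message: string
  data: T
  timestamp: string
  request_id: string
}

// Illustrative error type; the skill's own error class lives in the api-layer reference.
class ApiEnvelopeError extends Error {
  readonly code: number
  readonly requestId: string

  constructor(code: number, message: string, requestId: string) {
    super(message)
    this.name = 'ApiEnvelopeError'
    this.code = code
    this.requestId = requestId
  }
}

// Returns the payload on success (code 0); throws for any other code.
function unwrapResponse<T>(res: ApiResponse<T>): T {
  if (res.code !== 0) {
    throw new ApiEnvelopeError(res.code, res.message, res.request_id)
  }
  return res.data
}
```

Keeping the unwrap step in one place means callers only ever see the payload type `T`, which is the point of routing every call through a single request client.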
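## §6 Composable design patterns
- Name them `useXxx`; they must be called synchronously inside setup
- Return a named object (not an array) so callers can destructure what they need
- Adaptive inputs: `MaybeRefOrGetter<T>` for read-only inputs, `MaybeRef<T>` for writable ones
- Clean up resources automatically in lifecycle hooks
Full patterns → [composables-patterns](references/composables-patterns.md)
## §7 Component architecture and reuse
Three layers:
| Layer | Responsibility | Naming |
|----|------|------|
| Base | Generic UI wrappers (business-agnostic) | `BaseXxx.vue` |
| Feature | Business feature components | `{Domain}Xxx.vue` |
| Page | Route views; layout orchestration | `{Domain}XxxView.vue` |
UI patterns that recur across the system must be extracted into Base components.
Full architecture → [component-architecture](references/component-architecture.md)
## §8 State management
- Globally shared state → Pinia Store (Setup Store syntax)
- Component-local state → `ref` / `reactive`
- State restorable from the URL → Vue Router query/params
- Destructure Store state with `storeToRefs()`; destructure actions directly
- A Setup Store must return every ref and method it needs to expose
Full patterns → [pinia-state](references/pinia-state.md)
## §9 Routing
- Navigation guards use the return-value pattern (the `next()` callback is forbidden)
- React to route param changes with `watch(() => route.params, ...)` or `:key="$route.fullPath"`
- Type route Meta through TypeScript type augmentation
- Async guards use `async` + `await`
Full patterns → [router-patterns](references/router-patterns.md)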
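
To illustrate the return-value guard pattern listed above without pulling in Vue Router, here is a small sketch. Vue Router 4 guards may return `false` to cancel, a route location to redirect, or `true`/nothing to continue; the `RouteLike` type, the `authGuard` function, the `requiresAuth` meta field, and the `/login` path are all assumptions made for this example.

```typescript
// Simplified stand-in for a route object (assumption, not the Vue Router type).
interface RouteLike {
  path: string
  meta: { requiresAuth?: boolean }
}

// true = continue, false = cancel, string = redirect target.
type GuardResult = boolean | string

// Return-value style guard: no next() callback, the result decides the navigation.
function authGuard(to: RouteLike, isLoggedIn: boolean): GuardResult {
  if (to.meta.requiresAuth && !isLoggedIn) {
    // Redirect to a hypothetical login page and remember where the user was going.
    return `/login?redirect=${encodeURIComponent(to.path)}`
  }
  return true
}
```

Because the guard is a plain function of its inputs, it can be unit-tested without mounting a router.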
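## §10 Page aesthetics
- Spacing follows Vuetify's 4px base system (`pa-4` is the standard padding)
- Responsive spacing: `pa-4 pa-md-6 pa-lg-8`
- Typography uses Vuetify Typography classes (`text-h4` / `text-body-1` / `text-caption`)
- Colors use semantic names (`primary` / `error` / `success`)
- Empty and error states must be deliberately designed; blank pages are forbidden
- Whitespace is a design element; separate sections with `my-6` to `my-8`
Full rules → [design-aesthetics](references/design-aesthetics.md)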
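
As a quick arithmetic check on the 4px base system mentioned above, the sketch below converts a spacing utility class into pixels (level × 4). The `spacingToPx` helper and its regular expression are assumptions for illustration only; they are not a Vuetify API.

```typescript
// Base unit of the spacing scale, as stated in the text above.
const SPACING_BASE_PX = 4

// Hypothetical helper: parses classes such as "pa-4" or "pa-md-6" and returns pixels.
// Returns null when the class is not a margin/padding utility.
function spacingToPx(cls: string): number | null {
  // [mp] = margin/padding, next letter = direction, optional breakpoint, then the level.
  const match = /^[mp][atblrxyse]-(?:(?:sm|md|lg|xl|xxl)-)?(\d+)$/.exec(cls)
  if (!match) return null
  return Number(match[1]) * SPACING_BASE_PX
}
```

Under this reading, `pa-4 pa-md-6 pa-lg-8` steps the padding from 16px to 24px to 32px as the viewport grows.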
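## §11 Debugging and pitfalls
When something goes wrong, check the debugging index first. It covers:
- Reactivity pitfalls (ref / reactive / watch / computed)
- Component pitfalls (Props / Emits / Slots / Lifecycle)
- TypeScript pitfalls (type definitions / template refs / defineProps)
- Common Vuetify 3 issues (styles / icons / themes / performance)
- Common routing issues (param changes / guards / cleanup)
Full index → [debug-guide](references/debug-guide.md)
## §12 Final self-check
Before every commit, go through each item:
- [ ] **Consistent UI**: all components use Vuetify 3; no other UI library is imported
- [ ] **Light/dark themes**: colors are defined through the theme system; CSS uses `--v-theme-*` variables
- [ ] **Responsive**: every page is usable at the xs breakpoint; `useDisplay()` drives breakpoint logic
- [ ] **Containers**: content is not broken apart, scrollbars do not shift the layout, no height collapse
- [ ] **Component reuse**: recurring UI patterns are extracted into Base components
- [ ] **TypeScript**: no `any` / `@ts-ignore`; every function has a return type
- [ ] **Composition API**: every component uses `<script setup lang="ts">`
- [ ] **Unified requests**: every API call goes through the unified request client; error codes match the back end
- [ ] **Aesthetics**: sensible whitespace, designed empty/error states, correct use of semantic colors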


@@ -0,0 +1,341 @@
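# Unified request layer
## Core principles
- Use axios as the only HTTP client; `fetch` and other request libraries are forbidden
- Front-end types must match the `Response` structure in the back end's `common-runtime.md` exactly
- Every API call must go through the unified request client; calling axios directly is forbidden
- Error handling is centralized in the interceptors; the business layer only handles successful data
## Back-end response format (aligned with common-runtime.md)
The back end always returns the following JSON structure: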
```json
{
"code": 0,
"message": "success",
"data": { ... },
"timestamp": "2026-07-01T15:00:00+08:00",
"request_id": "uuid-string"
}
```
For paginated responses, the `data` field has this structure:
```json
{
"list": [ ... ],
"total": 100,
"page": 1,
"page_size": 20
}
```
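
The pagination payload above carries enough information to derive page counts and range labels on the client. The sketch below shows one way to do that; `summarizePage` and `PageSummary` are hypothetical names, and the code assumes `page` is 1-based, as the sample payload (`"page": 1`) suggests.

```typescript
// Mirrors the paginated `data` structure shown above.
interface PageData<T> {
  list: T[]
  total: number
  page: number
  page_size: number
}

// Derived values a pager or "showing 1-20 of 100" label would need (assumption).
interface PageSummary {
  totalPages: number
  hasPrev: boolean
  hasNext: boolean
  firstIndex: number // 1-based index of the first item on this page, 0 if empty
  lastIndex: number // 1-based index of the last item on this page, 0 if empty
}

function summarizePage<T>(data: PageData<T>): PageSummary {
  // Guard against a zero page size to avoid dividing by zero.
  const totalPages = data.page_size > 0 ? Math.ceil(data.total / data.page_size) : 0
  const isEmpty = data.list.length === 0
  const firstIndex = isEmpty ? 0 : (data.page - 1) * data.page_size + 1
  const lastIndex = isEmpty ? 0 : firstIndex + data.list.length - 1
  return {
    totalPages,
    hasPrev: data.page > 1,
    hasNext: data.page < totalPages,
    firstIndex,
    lastIndex,
  }
}
```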
## Front-end type definitions
```typescript
// src/types/api.ts
/** Unified back-end response structure (aligned with Response in common-runtime.md) */
export interface ApiResponse<T = null> {
code: number
message: string
data: T
timestamp: string
request_id: string
}
/** Paginated response data (aligned with PageResponse in common-runtime.md) */
export interface PageData<T> {
list: T[]
total: number
page: number
page_size: number
}
/** Pagination request parameters */
export interface PageParams {
page: number
page_size: number
}
/** Error codes (aligned with codes.go in common-runtime.md) */
export const enum ApiCode {
Success = 0,
// 通用错误 1xxx
ParamError = 1001,
ValidationFail = 1002,
Unauthorized = 1003,
Forbidden = 1004,
NotFound = 1005,
Timeout = 1006,
ServerError = 1007,
Duplicate = 1008,
OperationFail = 1009,
// 业务错误 2xxx
BusinessError = 2001,
DataNotReady = 2002,
StatusInvalid = 2003,
DependencyError = 2004,
ExternalAPIError = 2005,
ResourceLocked = 2006,
QuotaExceeded = 2007,
ConcurrentConflict = 2008,
}
/** 错误码对应的用户友好提示 */
const API_CODE_MESSAGE: Record<number, string> = {
[ApiCode.Success]: '操作成功',
[ApiCode.ParamError]: '参数错误',
[ApiCode.ValidationFail]: '数据验证失败',
[ApiCode.Unauthorized]: '未授权,请先登录',
[ApiCode.Forbidden]: '权限不足,禁止访问',
[ApiCode.NotFound]: '资源不存在',
[ApiCode.Timeout]: '请求超时',
[ApiCode.ServerError]: '服务器内部错误',
[ApiCode.Duplicate]: '数据重复',
[ApiCode.OperationFail]: '操作失败',
[ApiCode.BusinessError]: '业务处理失败',
[ApiCode.DataNotReady]: '数据未就绪',
[ApiCode.StatusInvalid]: '状态不合法',
[ApiCode.DependencyError]: '依赖服务错误',
[ApiCode.ExternalAPIError]: '外部服务调用失败',
[ApiCode.ResourceLocked]: '资源被锁定',
[ApiCode.QuotaExceeded]: '配额超限',
[ApiCode.ConcurrentConflict]: '并发冲突',
}
```
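
The code ranges described above (0 for success, 1xxx for common errors, 2xxx for business errors) lend themselves to a small lookup-with-fallback helper. The following is a sketch under assumptions: `categorize`, `resolveMessage`, and the two sample messages are illustrative names and values, not part of the skill, and a plain object is used instead of the enum so the snippet stands alone.

```typescript
// A tiny sample of known codes (illustrative English messages, assumption).
const KNOWN_MESSAGES: Record<number, string> = {
  1003: 'Not authorized, please log in first',
  2006: 'Resource is locked',
}

type ErrorCategory = 'success' | 'common' | 'business' | 'unknown'

// Classifies a code by the ranges stated in the document.
function categorize(code: number): ErrorCategory {
  if (code === 0) return 'success'
  if (code >= 1000 && code < 2000) return 'common'
  if (code >= 2000 && code < 3000) return 'business'
  return 'unknown'
}

// Prefers the front-end's friendly message, then the server message, then a generic fallback.
function resolveMessage(code: number, serverMessage: string): string {
  return KNOWN_MESSAGES[code] ?? (serverMessage || `Unknown error (${code})`)
}
```

The fallback chain matters when the back end adds a code the front end has not mapped yet: the user still sees the server's message rather than an empty notification.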
## Axios instance and interceptors
```typescript
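// src/utils/request.ts
import axios, {
type AxiosInstance,
type AxiosResponse,
type InternalAxiosRequestConfig,
} from 'axios'
// Assumption: src/types/api.ts exports these names
import { ApiCode, type ApiResponse, type PageData, type PageParams } from '@/types/api'
/** Business error (the back end returned code !== 0) */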
class BusinessError extends Error {
readonly code: number
readonly requestId: string
constructor(code: number, message: string, requestId: string) {
super(message)
this.name = 'BusinessError'
this.code = code
this.requestId = requestId
}
}
function createRequest(baseURL: string): AxiosInstance {
const instance = axios.create({
baseURL,
timeout: 15_000,
headers: { 'Content-Type': 'application/json' },
})
// 请求拦截器:注入 token
instance.interceptors.request.use(
(config: InternalAxiosRequestConfig) => {
const token = localStorage.getItem('access_token')
if (token) {
config.headers.Authorization = `Bearer ${token}`
}
return config
},
(error: unknown) => Promise.reject(error),
)
// 响应拦截器:统一解包 + 错误处理
instance.interceptors.response.use(
(response: AxiosResponse<ApiResponse>) => {
const { code, message, request_id } = response.data
if (code === ApiCode.Success) {
return response
}
// 未授权 → 跳转登录
if (code === ApiCode.Unauthorized) {
localStorage.removeItem('access_token')
window.location.href = '/login'
return Promise.reject(new BusinessError(code, message, request_id))
}
return Promise.reject(new BusinessError(code, message, request_id))
},
(error: unknown) => {
// 网络错误 / 超时
if (axios.isAxiosError(error)) {
const message = error.response
? `服务异常 (${error.response.status})`
: error.code === 'ECONNABORTED'
? '请求超时'
: '网络连接失败'
return Promise.reject(new Error(message))
}
return Promise.reject(error)
},
)
return instance
}
/** 全局唯一请求实例 */
const request = createRequest(import.meta.env.VITE_API_BASE_URL ?? '/api')
export { request, BusinessError, type ApiResponse, type PageData, type PageParams }
```
## API module wrappers
```typescript
// src/api/user.ts
import { request, type ApiResponse, type PageData, type PageParams } from '@/utils/request'
interface UserInfo {
id: number
username: string
email: string
role: string
created_at: string
}
interface CreateUserParams {
username: string
email: string
role: string
}
/** 获取用户列表 */
async function getUserList(params: PageParams): Promise<PageData<UserInfo>> {
const response = await request.get<ApiResponse<PageData<UserInfo>>>('/users', { params })
return response.data.data
}
/** 获取用户详情 */
async function getUserById(id: number): Promise<UserInfo> {
const response = await request.get<ApiResponse<UserInfo>>(`/users/${id}`)
return response.data.data
}
/** 创建用户 */
async function createUser(data: CreateUserParams): Promise<UserInfo> {
const response = await request.post<ApiResponse<UserInfo>>('/users', data)
return response.data.data
}
/** 删除用户 */
async function deleteUser(id: number): Promise<null> {
const response = await request.delete<ApiResponse<null>>(`/users/${id}`)
return response.data.data
}
export { getUserList, getUserById, createUser, deleteUser }
export type { UserInfo, CreateUserParams }
```
## Usage in a component
```vue
<script setup lang="ts">
import { ref, onMounted } from 'vue'
import { getUserList, type UserInfo } from '@/api/user'
import { BusinessError } from '@/utils/request'
import type { PageData } from '@/utils/request'
const loading = ref(false)
const users = ref<UserInfo[]>([])
const total = ref(0)
const page = ref(1)
const pageSize = ref(20)
const errorMessage = ref<string | null>(null)
async function fetchUsers(): Promise<void> {
loading.value = true
errorMessage.value = null
try {
const result: PageData<UserInfo> = await getUserList({
page: page.value,
page_size: pageSize.value,
})
users.value = result.list
total.value = result.total
} catch (error: unknown) {
if (error instanceof BusinessError) {
errorMessage.value = error.message
} else if (error instanceof Error) {
errorMessage.value = error.message
}
} finally {
loading.value = false
}
}
onMounted(fetchUsers)
</script>
```
## Global error notifications
Combine this with Vuetify's `v-snackbar` to show global error messages:
```typescript
// src/composables/useNotification.ts
import { ref } from 'vue'
interface Notification {
message: string
color: 'success' | 'error' | 'warning' | 'info'
timeout?: number
}
const notification = ref<Notification | null>(null)
const visible = ref(false)
export function useNotification() {
function notify(options: Notification): void {
notification.value = { timeout: 3000, ...options }
visible.value = true
}
function notifySuccess(message: string): void {
notify({ message, color: 'success' })
}
function notifyError(message: string): void {
notify({ message, color: 'error' })
}
return { notification, visible, notify, notifySuccess, notifyError }
}
```
## Anti-patterns
### ❌ Using axios directly
```typescript
// ❌ 禁止:绕过统一请求器
import axios from 'axios'
const res = await axios.get('/api/users')
```
### ❌ 组件中处理 HTTP 细节
```typescript
// ❌ 禁止:组件中直接处理状态码
if (response.status === 401) { ... }
```
### ❌ 忽略错误处理
```typescript
// ❌ 禁止:吞掉错误
try { await fetchData() } catch { /* 空 catch */ }
```


@@ -0,0 +1,259 @@
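# Component layering and reuse
## Core principles
- Components are split by responsibility into three layers: Base → Feature → Page
- UI patterns that recur across the system must be extracted into reusable components
- Each component keeps a single responsibility; a component carrying several independent responsibilities must be split
- Data flows Props down / Events up; `v-model` is only for genuine two-way binding
## Three-layer component architecture
### Base components (foundation layer)
Business-agnostic, general-purpose UI components that wrap Vuetify components.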
```
src/components/base/
├── BaseCard.vue # 统一卡片样式(防御高度塌陷)
├── BaseDialog.vue # 统一对话框(响应式宽度 + 滚动)
├── BaseTable.vue # 统一数据表格(分页 + 加载状态)
├── BaseForm.vue # 统一表单容器(验证 + 提交)
├── BaseConfirmDialog.vue # 二次确认对话框
├── BaseEmptyState.vue # 空状态占位
└── BasePageHeader.vue # 页面标题 + 面包屑 + 操作栏
```
Example: BaseDialog
```vue
<!-- src/components/base/BaseDialog.vue -->
<script setup lang="ts">
import { useDisplay } from 'vuetify'
import { computed } from 'vue'
interface Props {
modelValue: boolean
title: string
maxWidth?: string | number
persistent?: boolean
loading?: boolean
}
interface Emits {
(event: 'update:modelValue', value: boolean): void
(event: 'confirm'): void
(event: 'cancel'): void
}
const props = withDefaults(defineProps<Props>(), {
maxWidth: 600,
persistent: false,
loading: false,
})
const emit = defineEmits<Emits>()
const { mobile } = useDisplay()
const dialogFullscreen = computed(() => mobile.value)
function handleCancel(): void {
emit('update:modelValue', false)
emit('cancel')
}
</script>
<template>
<v-dialog
:model-value="modelValue"
:max-width="maxWidth"
:fullscreen="dialogFullscreen"
:persistent="persistent"
@update:model-value="emit('update:modelValue', $event)"
>
<v-card class="d-flex flex-column" style="max-height: 80vh;">
<v-card-title class="flex-shrink-0 d-flex align-center">
<span>{{ title }}</span>
<v-spacer />
<v-btn icon="mdi-close" variant="text" @click="handleCancel" />
</v-card-title>
<v-divider />
<v-card-text class="flex-grow-1 scroll-container" style="min-height: 0;">
<slot />
</v-card-text>
<v-divider />
<v-card-actions class="flex-shrink-0">
<v-spacer />
<slot name="actions">
<v-btn variant="text" :disabled="loading" @click="handleCancel">
取消
</v-btn>
<v-btn
color="primary"
:loading="loading"
@click="emit('confirm')"
>
确认
</v-btn>
</slot>
</v-card-actions>
</v-card>
</v-dialog>
</template>
```
### Feature components (business layer)
Components for a specific business feature, combining Base components with business logic.
```
src/components/
├── user/
│ ├── UserList.vue # 用户列表
│ ├── UserForm.vue # 用户表单(新建/编辑)
│ ├── UserDetail.vue # 用户详情卡片
│ └── UserRoleBadge.vue # 角色标签
├── device/
│ ├── DeviceTable.vue # 设备数据表
│ ├── DeviceStatusChip.vue # 状态芯片
│ └── DeviceFilterBar.vue # 筛选栏
```
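### Page components (page layer)
Route-level view components that only orchestrate layout and assemble data; they contain no concrete UI implementation.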
```
src/views/
├── user/
│ ├── UserListView.vue # 用户列表页(组装 UserList + 筛选 + 分页)
│ └── UserDetailView.vue # 用户详情页
├── device/
│ └── DeviceManageView.vue # 设备管理页
```
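## When to split a component
A component must be split when **any** of the following conditions holds:
| Condition | Notes |
|------|------|
| Multiple responsibilities | Handles data orchestration and several UI regions at once |
| 3+ independent UI regions | e.g. a form, filters, a list, and pagination all in one component |
| Reusable UI pattern | List items, cards, status tags, etc. that appear in several places |
| Template over 100 lines | A sign that readability is slipping |
| Script over 150 lines | Logic should be extracted into a Composable |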
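
The thresholds in the table above are concrete enough to check mechanically, for example in a review script. The sketch below is one hypothetical way to encode them; `ComponentMetrics` and `splitReasons` are invented names, and how the metrics are measured is left open.

```typescript
// Hypothetical metrics for one component (assumption: gathered by a reviewer or a script).
interface ComponentMetrics {
  responsibilities: number
  uiRegions: number
  templateLines: number
  scriptLines: number
  repeatedElsewhere: boolean
}

// Returns every split trigger that applies; an empty array means no split is required.
function splitReasons(m: ComponentMetrics): string[] {
  const reasons: string[] = []
  if (m.responsibilities >= 2) reasons.push('multiple responsibilities')
  if (m.uiRegions >= 3) reasons.push('3+ independent UI regions')
  if (m.repeatedElsewhere) reasons.push('reusable UI pattern')
  if (m.templateLines > 100) reasons.push('template over 100 lines')
  if (m.scriptLines > 150) reasons.push('script over 150 lines')
  return reasons
}
```

Since any single condition is enough, a non-empty result is the signal to split.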
### Standard split for a CRUD feature
```
feature/
├── FeatureListView.vue # Page 层:布局编排
├── FeatureFilterBar.vue # 筛选条件
├── FeatureTable.vue # 数据表格
├── FeatureForm.vue # 新建/编辑表单
├── FeatureDetail.vue # 详情展示
└── composables/
└── useFeatureData.ts # 数据加载/分页/搜索逻辑
```
## Naming conventions
| Type | Rule | Example |
|------|---------|------|
| Base component | `Base` + feature name | `BaseDialog.vue` |
| Feature component | Domain + feature name | `UserForm.vue` |
| Page component | Domain + `View` | `UserListView.vue` |
| Composable | `use` + feature name | `useUserData.ts` |
| Type file | Domain + `.types.ts` | `user.types.ts` |
| API file | Domain + `.ts` | `user.ts` (under `src/api/`) |
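
The naming rules in the table above can be approximated with a few regular expressions, which may be handy in a lint step. This is a sketch under assumptions: `classifyFileName` is a hypothetical helper, the patterns are one interpretation of the table, and API files cannot be told apart by name alone (they are identified by living under `src/api/`).

```typescript
type Layer = 'base' | 'page' | 'composable' | 'types' | 'feature' | 'unknown'

// Classifies a bare file name according to the naming conventions above.
// Order matters: Base and View are checked before the generic feature pattern.
function classifyFileName(name: string): Layer {
  if (/^Base[A-Z][A-Za-z0-9]*\.vue$/.test(name)) return 'base'
  if (/^[A-Z][A-Za-z0-9]*View\.vue$/.test(name)) return 'page'
  if (/^use[A-Z][A-Za-z0-9]*\.ts$/.test(name)) return 'composable'
  if (/^[a-z][A-Za-z0-9]*\.types\.ts$/.test(name)) return 'types'
  if (/^[A-Z][A-Za-z0-9]*\.vue$/.test(name)) return 'feature'
  return 'unknown'
}
```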
## Data flow rules
### Props down / Events up
```vue
<!-- 父组件 -->
<UserForm
:initial-values="formData"
:loading="submitting"
@submit="handleSubmit"
@cancel="showForm = false"
/>
<!-- 子组件 UserForm.vue -->
<script setup lang="ts">
interface Props {
initialValues: UserFormValues
loading: boolean
}
interface Emits {
(event: 'submit', data: UserFormValues): void
(event: 'cancel'): void
}
defineProps<Props>()
const emit = defineEmits<Emits>()
</script>
```
### Two-way binding with v-model
Use `v-model` only when the component is essentially a form control (it reads and writes an input value):
```vue
<!-- Good fit for v-model: a search input component -->
<SearchInput v-model="searchKeyword" />
<!-- Poor fit for v-model: a list component's data should not be two-way bound -->
<UserList v-model="users" /> <!-- use :items="users" instead -->
```
### provide / inject
Use only for deeply nested scenarios (such as theme or layout context), not for general data passing:
```typescript
// Define the injection key
import { type InjectionKey, type Ref } from 'vue'
interface LayoutContext {
sidebarCollapsed: Ref<boolean>
toggleSidebar: () => void
}
const LayoutContextKey: InjectionKey<LayoutContext> = Symbol('LayoutContext')
```
## Anti-patterns
### ❌ The do-everything component
```vue
<!-- 禁止一个组件包含表格 + 表单 + 筛选 + 详情 -->
<template>
<div>
<filter-section />
<data-table />
<edit-form />
<detail-drawer />
<!-- 全部逻辑在一个组件内 -->
</div>
</template>
```
### ❌ 跨层级直接通信
```typescript
// ❌ 禁止:子组件直接调用父组件方法
const parent = getCurrentInstance()?.parent
parent?.exposed?.refresh()
// ✅ 正确:通过 emit 通知
emit('dataChanged')
```


@@ -0,0 +1,231 @@
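# Composable design patterns
## Core principles
- Composables are Vue 3's core mechanism for reusing stateful logic
- Always name them with the `use` prefix: `useXxx`
- They must be called synchronously inside `<script setup>` or `setup()`
- Return an object with named properties (not an array) so callers can destructure what they need
- Every parameter and return value must have an explicit TypeScript type
## Composable design checklist
### 1. Define the responsibility and API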
```typescript
// ✅ 好的 Composable单一职责API 简洁
export function useCounter(initialValue: number = 0) {
const count = ref(initialValue)
function increment(): void { count.value++ }
function decrement(): void { count.value-- }
function reset(): void { count.value = initialValue }
return { count: readonly(count), increment, decrement, reset }
}
```
### 2. Adaptive inputs: MaybeRef / MaybeRefOrGetter
When a Composable needs to accept external reactive data, use adaptive input types so callers can pass a plain value, a ref, or a getter:
```typescript
import { toRef, toValue, watch, type MaybeRefOrGetter, type MaybeRef } from 'vue'
// 只读输入 → MaybeRefOrGetter
export function useDocumentTitle(title: MaybeRefOrGetter<string>): void {
watch(toRef(title), (t) => {
document.title = t
}, { immediate: true })
}
// 可写输入 → MaybeRef
export function useLocalStorage<T>(key: string, defaultValue: MaybeRef<T>) {
const data = toRef(defaultValue)
// 从 localStorage 恢复
const stored = localStorage.getItem(key)
if (stored !== null) {
data.value = JSON.parse(stored) as T
}
// 变化时持久化
watch(data, (val) => {
localStorage.setItem(key, JSON.stringify(val))
}, { deep: true })
return data
}
```
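### Choosing an input type
| Parameter kind | Use | Why |
|---------|------|------|
| Read-only, computable input | `MaybeRefOrGetter<T>` | Accepts a ref / computed / getter / plain value |
| Input that must be writable | `MaybeRef<T>` | Accepts a ref / shallowRef / plain value |
| Parameter that is itself a function (callback / predicate) | Not `MaybeRefOrGetter` | Avoids it being invoked as a getter |
### Normalizing inputs
| Need | Use | Typical case |
|------|------|------|
| Reactive tracking | `toRef(source)` | Dependency source for watch / computed |
| A snapshot of the current value | `toValue(source)` | Reading the value in a non-reactive context |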
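
The warning about function-typed parameters is easier to see with a stripped-down model of getter resolution. The sketch below is framework-free: `MaybeGetter` and `resolveValue` are hypothetical stand-ins that only imitate the "call it if it is a function" part of normalization and ignore refs entirely.

```typescript
// Simplified model (assumption): a value or a zero-argument getter for it.
type MaybeGetter<T> = T | (() => T)

// If the source is a function it is treated as a getter and invoked; otherwise returned as-is.
function resolveValue<T>(source: MaybeGetter<T>): T {
  return typeof source === 'function' ? (source as () => T)() : source
}

// Plain values and getters both resolve to the underlying value.
const fromValue = resolveValue(5)
const fromGetter = resolveValue(() => 5)

// Pitfall: a callback passed where a getter is accepted gets called with no arguments,
// so the caller receives its (meaningless) result instead of the function itself.
const isLarge = (n: number): boolean => n > 1
const resolvedCallback = resolveValue<unknown>(isLarge)
```

Here `resolvedCallback` ends up as a boolean rather than the `isLarge` function, which is why callback and predicate parameters should be typed as plain functions.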
## 典型 Composable 模式
### 异步数据加载
```typescript
// src/composables/useAsyncData.ts
import { ref, type Ref } from 'vue'
interface AsyncDataResult<T> {
data: Ref<T | null>
loading: Ref<boolean>
error: Ref<string | null>
refresh: () => Promise<void>
}
export function useAsyncData<T>(
fetcher: () => Promise<T>,
): AsyncDataResult<T> {
const data = ref<T | null>(null) as Ref<T | null>
const loading = ref(false)
const error = ref<string | null>(null)
async function refresh(): Promise<void> {
loading.value = true
error.value = null
try {
data.value = await fetcher()
} catch (e: unknown) {
error.value = e instanceof Error ? e.message : '未知错误'
} finally {
loading.value = false
}
}
// 初始加载
void refresh()
return { data, loading, error, refresh }
}
```
### 表单状态管理
```typescript
// src/composables/useFormState.ts
import { ref, computed, type Ref } from 'vue'
interface FormState<T extends Record<string, unknown>> {
values: Ref<T>
isDirty: Ref<boolean>
reset: () => void
setValues: (newValues: Partial<T>) => void
}
export function useFormState<T extends Record<string, unknown>>(
initialValues: T,
): FormState<T> {
const values = ref<T>({ ...initialValues }) as Ref<T>
const original = { ...initialValues }
const isDirty = computed(() =>
JSON.stringify(values.value) !== JSON.stringify(original),
)
function reset(): void {
values.value = { ...original } as T
}
function setValues(newValues: Partial<T>): void {
values.value = { ...values.value, ...newValues } as T
}
return { values, isDirty, reset, setValues }
}
```
### 防抖搜索
```typescript
// src/composables/useDebounce.ts
import { ref, watch, toRef, toValue, type MaybeRefOrGetter, type Ref } from 'vue'
export function useDebouncedRef<T>(
source: MaybeRefOrGetter<T>,
delay: number = 300,
): Ref<T> {
const debounced = ref(toValue(source)) as Ref<T>
let timer: ReturnType<typeof setTimeout> | null = null
watch(toRef(source), (val) => {
if (timer) clearTimeout(timer)
timer = setTimeout(() => {
debounced.value = val
}, delay)
})
return debounced
}
```
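
The debounce logic above can also be looked at in isolation from Vue. The sketch below is a framework-free variant with explicit `flush` and `cancel` methods; `createDebouncer` is a hypothetical name, and wiring `cancel` to a scope-cleanup hook such as `onScopeDispose` is a suggestion, not something the original composable does.

```typescript
// Hypothetical framework-free debouncer: only the last pushed value reaches the callback.
function createDebouncer<T>(callback: (value: T) => void, delay: number = 300) {
  let timer: ReturnType<typeof setTimeout> | null = null
  let pending: { value: T } | null = null

  // Drops any scheduled call and the pending value.
  function cancel(): void {
    if (timer !== null) {
      clearTimeout(timer)
      timer = null
    }
    pending = null
  }

  // Delivers the pending value immediately, if there is one.
  function flush(): void {
    if (pending === null) return
    const { value } = pending
    cancel()
    callback(value)
  }

  // Records the latest value and restarts the delay.
  function push(value: T): void {
    if (timer !== null) clearTimeout(timer)
    pending = { value }
    timer = setTimeout(flush, delay)
  }

  return { push, flush, cancel }
}
```

Exposing `cancel` gives the owner a way to stop a pending timer when the component goes away, so a late callback does not write into state that is no longer in use.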
## 生命周期与清理
```typescript
// Composable 中使用生命周期钩子自动清理资源
import { onMounted, onUnmounted } from 'vue'
export function useWindowResize(callback: (width: number, height: number) => void) {
function handler(): void {
callback(window.innerWidth, window.innerHeight)
}
onMounted(() => {
window.addEventListener('resize', handler)
handler() // 初始调用
})
onUnmounted(() => {
window.removeEventListener('resize', handler)
})
}
```
## Anti-patterns
### ❌ Calling a Composable inside an async callback
```typescript
// ❌ 禁止setup 上下文已丢失
onMounted(async () => {
await someAsyncWork()
const { data } = useAsyncData(fetcher) // ← 错误位置
})
```
### ❌ Returning a plain, unwrapped value
```typescript
// ❌ Forbidden: reactivity is lost
export function useBad() {
  const count = ref(0)
  return { count: count.value } // ← reactivity lost
}
// ✅ Correct: return the ref itself
export function useGood() {
  const count = ref(0)
  return { count } // ← stays reactive
}
```
### ❌ Hidden side effects
```typescript
// ❌ Forbidden: a composable that silently mutates global state
export function useBad() {
  document.title = 'Fixed title' // ← hidden side effect the caller does not know about
}
```


@@ -0,0 +1,308 @@
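# Container design and height-collapse defense
## Core principles
- Every content area must have an explicit container boundary to prevent height collapse
- Container content must not be split (`break-inside: avoid`); it must be displayed whole
- When content exceeds the preset height, add a vertical scrollbar that does not affect layout width
- Use `min-height` to guard empty containers against collapse
## Height-collapse defense
### Basic defense strategy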
```css
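/* Global container defense styles */
/* Keep a content area from collapsing to zero height */
.container-defended {
  min-height: 48px;
  display: flex;
  flex-direction: column;
}
/* Keep a flex child from shrinking and clipping its content */
.flex-no-shrink {
  flex-shrink: 0;
}
/* Let the content area fill at least the remaining space */
.flex-fill {
  flex: 1 1 auto;
  min-height: 0; /* lets the flex child scroll */
}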
```
### Defending Vuetify containers
```vue
<template>
  <!-- v-card content defense -->
  <v-card class="d-flex flex-column" style="min-height: 200px;">
    <v-card-title class="flex-shrink-0">Title</v-card-title>
    <v-card-text class="flex-grow-1 overflow-y-auto">
      <!-- Long content area -->
    </v-card-text>
    <v-card-actions class="flex-shrink-0">
      <v-btn>Action</v-btn>
    </v-card-actions>
  </v-card>
</template>
```
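## Showing content whole (no splitting)
### Preventing splits with CSS
```css
/* Keep content from being split across page, print, or multi-column breaks */
.no-break {
  break-inside: avoid;
  page-break-inside: avoid;
}
/* Columns in a Vuetify card grid must not be split */
.card-grid .v-col {
  break-inside: avoid;
}
```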
### Applying it in a template
```vue
<template>
<v-row>
<v-col
v-for="item in items"
:key="item.id"
cols="12"
md="6"
class="no-break"
>
<v-card>
<!-- The whole card is one unsplittable unit -->
<v-card-title>{{ item.title }}</v-card-title>
<v-card-text>{{ item.content }}</v-card-text>
</v-card>
</v-col>
</v-row>
</template>
```
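## Scroll container design
### Core rule: the scrollbar must not affect layout
```css
/* Standard scroll container styles */
.scroll-container {
  overflow-y: auto;
  /* Reserve scrollbar space so the content width does not jump */
  scrollbar-gutter: stable;
}
/* Scrollbar styling (does not affect layout width) */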
.scroll-container::-webkit-scrollbar {
width: 6px;
}
.scroll-container::-webkit-scrollbar-track {
background: transparent;
}
.scroll-container::-webkit-scrollbar-thumb {
background-color: rgba(var(--v-theme-on-surface), 0.2);
border-radius: 3px;
}
.scroll-container::-webkit-scrollbar-thumb:hover {
background-color: rgba(var(--v-theme-on-surface), 0.4);
}
```
### Fixed-height scroll area
```vue
<template>
  <!-- List scroll area with a fixed height -->
  <v-card>
    <v-card-title class="flex-shrink-0">Message list</v-card-title>
<v-card-text
class="scroll-container"
style="max-height: 400px;"
>
<v-list>
<v-list-item
v-for="msg in messages"
:key="msg.id"
:title="msg.title"
:subtitle="msg.content"
/>
</v-list>
</v-card-text>
</v-card>
</template>
```
### Full-page layout scrolling
```vue
<template>
  <v-app>
    <v-app-bar class="flex-shrink-0">
      <!-- Fixed top bar -->
    </v-app-bar>
    <v-navigation-drawer class="flex-shrink-0">
      <!-- Fixed sidebar -->
    </v-navigation-drawer>
    <v-main>
      <!-- Content area adapts its height and scrolls internally -->
      <div class="d-flex flex-column" style="height: 100%;">
        <!-- Fixed page header -->
        <div class="flex-shrink-0 pa-4">
          <h1>Page title</h1>
        </div>
        <!-- Scrolling content area -->
        <div class="flex-grow-1 scroll-container pa-4" style="min-height: 0;">
          <!-- Long content -->
        </div>
        <!-- Fixed page footer -->
        <div class="flex-shrink-0 pa-4">
          <v-btn>Save</v-btn>
        </div>
      </div>
    </v-main>
  </v-app>
</template>
```
## Typical container patterns
### Data table container
```vue
<template>
<v-card class="d-flex flex-column" style="height: 100%;">
<!-- Fixed toolbar -->
<v-card-title class="flex-shrink-0 d-flex align-center">
<span>数据列表</span>
<v-spacer />
<v-btn prepend-icon="mdi-plus" color="primary">新建</v-btn>
</v-card-title>
<!-- Fixed filter bar -->
<v-card-text class="flex-shrink-0 pb-0">
<v-row dense>
<v-col cols="12" sm="4">
<v-text-field
v-model="search"
label="搜索"
density="compact"
hide-details
/>
</v-col>
</v-row>
</v-card-text>
<!-- Scrolling table area -->
<v-card-text class="flex-grow-1 scroll-container" style="min-height: 0;">
<v-data-table
:items="items"
:headers="headers"
:search="search"
/>
</v-card-text>
<!-- Fixed pagination -->
<v-card-actions class="flex-shrink-0 justify-center">
<v-pagination v-model="page" :length="totalPages" />
</v-card-actions>
</v-card>
</template>
```
### Scroll container inside a dialog
```vue
<template>
<v-dialog max-width="600">
<v-card class="d-flex flex-column" style="max-height: 80vh;">
<v-card-title class="flex-shrink-0">编辑表单</v-card-title>
<!-- Form content scrolls -->
<v-card-text class="flex-grow-1 scroll-container" style="min-height: 0;">
<v-form>
<!-- Many form fields -->
</v-form>
</v-card-text>
<v-card-actions class="flex-shrink-0">
<v-spacer />
<v-btn variant="text" @click="close">取消</v-btn>
<v-btn color="primary" @click="submit">保存</v-btn>
</v-card-actions>
</v-card>
</v-dialog>
</template>
```
## Suggested global styles
Add the following defensive styles to the project's entry CSS:
```css
/* src/styles/container-defense.css */
/* Guard every Vuetify card against height collapse by default */
.v-card {
min-height: 48px;
}
/* Shared scroll container styles */
.scroll-container {
overflow-y: auto;
scrollbar-gutter: stable;
}
/* Prevent splitting */
.no-break {
break-inside: avoid;
page-break-inside: avoid;
}
/* Flex layout helper */
.flex-shrink-0 {
flex-shrink: 0;
}
```
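## Anti-patterns
### ❌ Unbounded-height container
```vue
<!-- Forbidden: content grows without limit and has no scroll constraint -->
<v-card>
  <v-card-text>
    <div v-for="i in 1000" :key="i">{{ i }}</div>
  </v-card-text>
</v-card>
```
### ❌ Clipping content with overflow: hidden
```css
/* Forbidden: clips content instead of providing scrolling */
.container { overflow: hidden; height: 300px; }
```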
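### ❌ Scrollbar affecting layout
```css
/* Forbidden: overflow-y: auto without scrollbar-gutter; content shifts when the scrollbar appears */
.container { overflow-y: auto; }
/* Also avoid: overflow-y: scroll keeps a scrollbar visible even when nothing overflows */
.container-always { overflow-y: scroll; }
```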


@@ -0,0 +1,119 @@
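# Debugging and common pitfalls index
This document collects common problems and debugging techniques for Vue 3 + TypeScript + Vuetify 3 development. It is indexed by problem category; check these tables first when something goes wrong.
## Reactivity pitfalls
| Problem | Cause | Fix |
|------|------|------|
| ref value does not update | Forgot `.value` | Access it through `.value` in script; templates unwrap automatically |
| Destructured `reactive` is no longer reactive | Destructuring drops the reactive proxy | Destructure with `toRefs()`, or use `ref` directly |
| Destructured store is no longer reactive | Same as above | Destructure state with `storeToRefs()` |
| ref inside an array is not unwrapped | refs in collections are not unwrapped | Access `.value` manually |
| Wrapping a third-party instance in `reactive` throws | The Proxy breaks the instance's internals | Use `shallowRef` + `markRaw` |
| Several changes in one tick trigger the watcher only once | Vue batches updates | Expected behavior; use `flush: 'sync'` if it must run immediately |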
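## Computed property pitfalls
| Problem | Cause | Fix |
|------|------|------|
| Request sent inside computed | computed should be a pure function | Move it to a watch or a method |
| Sorting/reversing in computed corrupts the source array | `.sort()` / `.reverse()` mutate the array in place | Copy with `.slice()` first, then operate on the copy |
| computed stops updating after a conditional branch | Short-circuiting leaves a dependency untracked | Make sure every branch reads its reactive dependencies |
| Passing arguments to computed | computed takes no arguments | Use a computed that returns a function, or a method |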
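The sort/reverse row above can be illustrated without Vue at all, since the pitfall comes from `Array.prototype.sort` itself. The sketch below uses a hypothetical `sortedCopy` helper (not from the original) to show that copying first leaves the source array untouched.

```typescript
const scores: number[] = [3, 1, 2]

// Array.prototype.sort sorts in place, so calling it inside a computed getter
// would mutate the source array. Copy first, then sort the copy.
function sortedCopy(values: readonly number[]): number[] {
  return values.slice().sort((a, b) => a - b)
}

const sorted = sortedCopy(scores)
```

Inside a component, the same call would sit in the getter, for example `computed(() => sortedCopy(list.value))`.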
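## Watcher pitfalls
| Problem | Cause | Fix |
|------|------|------|
| watch on a reactive property never fires | No getter function was used | `watch(() => state.prop, ...)` |
| Watcher created in an async callback leaks memory | The setup context is lost, so it is not cleaned up automatically | Create it synchronously in setup |
| deep watch receives identical old/new values | Same reference; a deep mutation creates no new reference | Expected behavior; clone if the old value is needed |
| watchEffect loses dependencies after await | Code after the await runs outside synchronous tracking | Read reactive values before the await |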
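## Component pitfalls
| Problem | Cause | Fix |
|------|------|------|
| Parent cannot access the child's ref | `<script setup>` exposes nothing by default | Use `defineExpose` in the child |
| Custom event does not fire | emits not declared | Declare it with `defineEmits` |
| Event fires twice | Undeclared emits let the native listener stack on top | Declare emits so the native listener is not attached as well |
| Component name clash | A same-named component overrides another | Use an explicit namespace or an aliased import |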
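## TypeScript pitfalls
| Problem | Cause | Fix |
|------|------|------|
| `withDefaults` mutable default leaks | The object default is shared by all instances | Use a factory function `() => ({})` |
| template ref type reports null | The ref is null before mount | Annotate `ref<T \| null>(null)` and null-check |
| `defineProps` with an imported type errors | Compiler macro limits on complex external types | Use a local interface or `type ... = ...` |
| DOM event handler type mismatch | Event types under strict mode | Annotate explicitly: `(event: MouseEvent) => void` |
| Dynamic component ref triggers a reactive warning | The Component object is made deeply reactive | Store dynamic components in a `shallowRef` |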
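## Vuetify 3 common problems
| Problem | Cause | Fix |
|------|------|------|
| Styles not applied | Vuetify styles not imported | Make sure `import 'vuetify/styles'` is present |
| Icons not shown | mdi icon font not installed | Install `@mdi/font` and import it in the entry file |
| Theme switch has no effect | The theme object was mutated directly | Use `theme.global.name.value = 'dark'` |
| v-data-table performs poorly | Large data set is not virtualized | Use `v-data-table-virtual` |
| Stacked dialog overlay problems | Multiple overlays pile up | Use `z-index` or avoid nested dialogs |
| scoped styles cannot override Vuetify | Scoped selectors do not reach the child component's inner elements | Use the `:deep()` selector |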
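## Router common problems
| Problem | Cause | Fix |
|------|------|------|
| Page does not refresh when params change | The same route does not remount the component | Watch the route params or use `:key` |
| Navigation guard redirects forever | The guard condition does not exclude the target route | Check `to.name !== 'Login'` |
| Request in an async guard is not awaited | Forgot `await` | Declare the guard `async` and await |
| Event listeners survive component unmount | Not cleaned up on unmount | Clean up in `onUnmounted` |
## Debugging techniques
### Vue DevTools
```typescript
// Expose all state from the store so DevTools can display it
// A setup store must return every ref
return { state1, state2, action1 }
```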
### Reactivity debugging
```typescript
import { watch } from 'vue'
// Debugging: trace where a ref change comes from
watch(someRef, (newVal, oldVal) => {
console.trace('someRef changed:', oldVal, '→', newVal)
})
```
### Component render debugging
```vue
<script setup lang="ts">
import { onRenderTracked, onRenderTriggered } from 'vue'
// Development only
if (import.meta.env.DEV) {
onRenderTracked((event) => {
console.log('render tracked:', event)
})
onRenderTriggered((event) => {
console.log('render triggered:', event)
})
}
</script>
```
### Debugging Vuetify style overrides
```css
/* Use :deep() to pierce scoped styles */
.my-custom-table :deep(.v-data-table-header) {
background-color: rgb(var(--v-theme-surface-variant));
}
```


@@ -0,0 +1,215 @@
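# UI aesthetics and design guidelines
## Core principles
- A page is not a pile of features; it conveys information in an ordered way
- Whitespace is a design element, not wasted space
- Color, type, and spacing choices must be deliberate; do not ship the default template look
- Error states and empty states deserve the same care
## Spacing and whitespace
### Vuetify spacing system
Vuetify uses a spacing system with a 4px base unit (`ma-1` = 4px, `ma-2` = 8px, ...).
| Class | Size | Use |
|------|------|---------|
| `pa-1` / `ma-1` | 4px | Spacing between tightly related elements |
| `pa-2` / `ma-2` | 8px | Spacing between elements in a group |
| `pa-3` / `ma-3` | 12px | Spacing inside a subsection |
| `pa-4` / `ma-4` | 16px | Standard content spacing (recommended default) |
| `pa-6` / `ma-6` | 24px | Spacing between blocks |
| `pa-8` / `ma-8` | 32px | Spacing between large blocks |
| `pa-12` / `ma-12` | 48px | Page-level whitespace |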
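The table follows a single rule: step number times the base unit. A minimal sketch of that arithmetic, assuming the 4px unit stated above is unchanged in the project; `spacingPx` and `SPACING_UNIT_PX` are hypothetical names, not Vuetify APIs.

```typescript
// Assumes the default 4px spacing unit; `spacingPx` is a hypothetical helper.
const SPACING_UNIT_PX = 4

function spacingPx(step: number): number {
  if (!Number.isInteger(step) || step < 0) {
    throw new RangeError(`Spacing step must be a non-negative integer, got ${step}`)
  }
  return step * SPACING_UNIT_PX
}

const standardPadding = spacingPx(4) // corresponds to pa-4
const pageWhitespace = spacingPx(12) // corresponds to pa-12
```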
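### Whitespace rules
```
Page outer padding: pa-4 (mobile) → pa-6 (tablet) → pa-8 (desktop)
Card inner padding: pa-4 (standard) → pa-6 (relaxed)
Card gap: ga-4 (row gap)
Form field spacing: controlled through v-row dense
Title-to-content spacing: mb-4 (below the title)
Block spacing: my-6 or my-8
```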
### Responsive spacing
```vue
<template>
<v-container class="pa-4 pa-md-6 pa-lg-8">
<v-row>
<v-col cols="12" class="mb-4 mb-md-6">
<h1 class="text-h4 text-md-h3">页面标题</h1>
</v-col>
</v-row>
</v-container>
</template>
```
## Typography
### Type hierarchy
Use Vuetify's built-in typography classes and apply them consistently:
| Class | Use |
|------|------|
| `text-h3` / `text-h4` | Main page title |
| `text-h5` / `text-h6` | Block title |
| `text-subtitle-1` | Subtitle |
| `text-subtitle-2` | Small subtitle |
| `text-body-1` | Body text (default) |
| `text-body-2` | Supporting text |
| `text-caption` | Annotations, timestamps |
| `text-overline` | Category labels |
### Text color hierarchy
```vue
<!-- Primary text -->
<span class="text-on-surface">Main title</span>
<!-- Secondary text, using a custom theme color -->
<span class="text-on-surface-muted">Description text</span>
<!-- Disabled state -->
<span class="text-disabled">Unavailable</span>
```
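## Using color
### Where to use semantic colors
| Color | Use |
|------|------|
| `primary` | Primary buttons, links, selected state, brand identity |
| `secondary` | Secondary actions, supporting decoration |
| `error` | Error messages, delete actions, failed validation |
| `warning` | Warnings, states that need attention |
| `success` | Successful actions, healthy state |
| `info` | General informational messages |
### Reusing status colors
```vue
<!-- Status chips always use semantic colors -->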
<v-chip :color="statusColor" size="small">
{{ statusText }}
</v-chip>
<script setup lang="ts">
import { computed } from 'vue'
type StatusType = 'active' | 'inactive' | 'pending' | 'error'
const props = defineProps<{ status: StatusType }>()
const statusColor = computed(() => {
const colorMap: Record<StatusType, string> = {
active: 'success',
inactive: 'grey',
pending: 'warning',
error: 'error',
}
return colorMap[props.status]
})
</script>
```
## Empty state design
Every list or table that can be empty must have a designed empty state:
```vue
<template>
  <!-- Data loading -->
  <v-skeleton-loader v-if="loading" type="table" />
  <!-- Empty state -->
  <v-card v-else-if="items.length === 0" variant="flat" class="text-center pa-12">
    <v-icon icon="mdi-inbox-outline" size="64" color="on-surface-muted" />
    <p class="text-h6 mt-4">No data yet</p>
    <p class="text-body-2 text-on-surface-muted mb-4">
      There are no records yet. Click the button below to create the first one.
    </p>
    <v-btn color="primary" prepend-icon="mdi-plus" @click="emit('create')">
      New
    </v-btn>
  </v-card>
  <!-- Normal data display -->
  <v-data-table v-else :items="items" :headers="headers" />
</template>
```
## Error state design
Error messages should be specific and actionable; do not use a vague "Something went wrong":
```vue
<template>
<v-alert
v-if="error"
type="error"
variant="tonal"
closable
class="mb-4"
@click:close="error = null"
>
<v-alert-title>{{ error }}</v-alert-title>
<template #append>
<v-btn variant="text" size="small" @click="retry">
重试
</v-btn>
</template>
</v-alert>
</template>
```
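## Copywriting guidelines
Core principles borrowed from the frontend-design skill:
| Principle | Description |
|------|------|
| Name things from the user's perspective | "Notification management" rather than "Webhook configuration" |
| Active voice | "Save changes" rather than "Submit" |
| Consistent verbs | The button says "Publish" → the message says "Published" |
| Errors do not apologize | "Username is already taken" rather than "Sorry, something went wrong" |
| An empty state is an invitation | Guide the user to the next step |
| Concise | Labels only label and examples only demonstrate; no text does double duty |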
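## Anti-patterns
### ❌ Cramped interface with no whitespace
```vue
<!-- Forbidden: every element packed together with no spacing -->
<v-card>
  <v-card-title>Title</v-card-title>
  <v-data-table />
  <v-btn>Action</v-btn>
</v-card>
```
### ❌ Misused colors
```vue
<!-- Forbidden: delete button using primary -->
<v-btn color="primary" @click="deleteItem">Delete</v-btn>
<!-- Correct: dangerous actions use error -->
<v-btn color="error" @click="deleteItem">Delete</v-btn>
```
### ❌ Vague error messages
```vue
<!-- Forbidden -->
<v-alert type="error">Operation failed</v-alert>
<!-- Correct -->
<v-alert type="error">Username is already taken. Choose another and try again.</v-alert>
```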


@@ -0,0 +1,165 @@
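# Pinia state management essentials
## Core principles
- Use Pinia for global and cross-component state
- Prefer the Setup Store syntax (consistent with the Composition API)
- Keep temporary component-internal state in `ref` / `reactive`; do not lift it into a store
- Use `storeToRefs()` when destructuring a store so reactivity is kept
## Store definition
### Setup Store (recommended)
```typescript
// src/stores/useUserStore.ts
import { defineStore } from 'pinia'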
import { ref, computed } from 'vue'
import { getUserList, type UserInfo } from '@/api/user'
import type { PageData } from '@/utils/request'
export const useUserStore = defineStore('user', () => {
// State
const users = ref<UserInfo[]>([])
const currentUser = ref<UserInfo | null>(null)
const loading = ref(false)
const total = ref(0)
// Getters
const userCount = computed(() => users.value.length)
const isLoggedIn = computed(() => currentUser.value !== null)
// Actions
async function fetchUsers(page: number, pageSize: number): Promise<void> {
loading.value = true
try {
const result: PageData<UserInfo> = await getUserList({
page,
page_size: pageSize,
})
users.value = result.list
total.value = result.total
} finally {
loading.value = false
}
}
function setCurrentUser(user: UserInfo | null): void {
currentUser.value = user
}
function $reset(): void {
users.value = []
currentUser.value = null
loading.value = false
total.value = 0
}
// A setup store must return every piece of state and every method it exposes
return {
users,
currentUser,
loading,
total,
userCount,
isLoggedIn,
fetchUsers,
setCurrentUser,
$reset,
}
})
```
## Using a store in a component
### Correct destructuring
```vue
<script setup lang="ts">
import { storeToRefs } from 'pinia'
import { useUserStore } from '@/stores/useUserStore'
const userStore = useUserStore()
// ✅ Destructure reactive state with storeToRefs
const { users, loading, userCount } = storeToRefs(userStore)
// ✅ Destructure actions directly (storeToRefs is not needed)
const { fetchUsers, setCurrentUser } = userStore
</script>
```
### ❌ Incorrect destructuring
```typescript
// ❌ Forbidden: direct destructuring loses reactivity
const { users, loading } = useUserStore() // users and loading are no longer reactive
```
## Communication between stores
```typescript
// src/stores/useAuthStore.ts
import { defineStore } from 'pinia'
import { ref, computed } from 'vue'
import { useUserStore } from './useUserStore'
export const useAuthStore = defineStore('auth', () => {
const token = ref<string | null>(null)
const isAuthenticated = computed(() => token.value !== null)
function logout(): void {
token.value = null
localStorage.removeItem('access_token')
// Tell the other store to reset
const userStore = useUserStore()
userStore.$reset()
}
return { token, isAuthenticated, logout }
})
```
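## Deciding where state belongs
| State type | Where it lives | Examples |
|---------|---------|------|
| Globally shared state | Pinia Store | User info, permissions, theme preference |
| Page-level shared state | Pinia Store or composable | List filters, pagination state |
| Component-internal state | `ref` / `reactive` | Form input values, dialog open flag |
| State recoverable from the URL | Vue Router query/params | Search keyword, page number, filters |
| Temporary UI state | `ref` | loading, hover, expanded/collapsed |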
## Anti-patterns
### ❌ Putting every piece of state in a store
```typescript
// ❌ Forbidden: a dialog open flag is not global state
export const useDialogStore = defineStore('dialog', () => {
  const showCreateDialog = ref(false) // ← belongs in the component
})
```
### ❌ Touching the DOM directly in a store
```typescript
// ❌ Forbidden
export const useAppStore = defineStore('app', () => {
  function showAlert(msg: string): void {
    window.alert(msg) // ← a store should have no DOM side effects
  }
})
```
### ❌ Forgetting to return state from a setup store
```typescript
// ❌ Wrong: DevTools cannot see internalState
export const useBadStore = defineStore('bad', () => {
  const internalState = ref(0) // ← not returned
  return { /* internalState was forgotten */ }
})
```


@@ -0,0 +1,216 @@
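# Vue Router 4 pattern essentials
## Core principles
- Use Vue Router 4 (paired with Vue 3)
- Configure routes with strict TypeScript types
- Write navigation guards in the return-value style (avoid the legacy `next()` callback)
- Load async data in a guard or in the component's `onMounted`
## Route configuration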
```typescript
// src/router/index.ts
import { createRouter, createWebHistory, type RouteRecordRaw } from 'vue-router'
const routes: RouteRecordRaw[] = [
{
path: '/login',
name: 'Login',
component: () => import('@/views/auth/LoginView.vue'),
meta: { requiresAuth: false, title: '登录' },
},
{
path: '/',
component: () => import('@/layouts/MainLayout.vue'),
meta: { requiresAuth: true },
children: [
{
path: '',
name: 'Dashboard',
component: () => import('@/views/dashboard/DashboardView.vue'),
meta: { title: '仪表板' },
},
{
path: 'users',
name: 'UserList',
component: () => import('@/views/user/UserListView.vue'),
meta: { title: '用户管理' },
},
{
path: 'users/:id',
name: 'UserDetail',
component: () => import('@/views/user/UserDetailView.vue'),
meta: { title: '用户详情' },
props: true,
},
],
},
{
path: '/:pathMatch(.*)*',
name: 'NotFound',
component: () => import('@/views/error/NotFoundView.vue'),
meta: { title: '页面未找到' },
},
]
const router = createRouter({
history: createWebHistory(import.meta.env.BASE_URL),
routes,
})
export default router
```
## Extending the route meta type
```typescript
// src/types/router.d.ts
import 'vue-router'
declare module 'vue-router' {
  interface RouteMeta {
    /** Whether login is required */
    requiresAuth?: boolean
    /** Page title */
    title?: string
    /** Required permissions */
    permissions?: string[]
  }
}
```
## Navigation guards
### Global before guard
```typescript
// src/router/guards.ts
import type { Router } from 'vue-router'
import { useAuthStore } from '@/stores/useAuthStore'
export function setupRouterGuards(router: Router): void {
router.beforeEach((to) => {
    const authStore = useAuthStore()
    // Auth required but not logged in → go to the login page
    if (to.meta.requiresAuth !== false && !authStore.isAuthenticated) {
      return { name: 'Login', query: { redirect: to.fullPath } }
    }
    // Logged in and visiting the login page → go to the home page
    if (to.name === 'Login' && authStore.isAuthenticated) {
      return { name: 'Dashboard' }
    }
    // Set the page title ("App name" is a placeholder)
    if (to.meta.title) {
      document.title = `${to.meta.title} - App name`
    }
    // Allow the navigation
    return true
  })
}
```
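The branching in the guard above can be isolated as a pure function, which makes the redirect rules easy to check without a router instance. This is only a sketch: `GuardTarget`, `GuardResult`, and `decideNavigation` are hypothetical names that mirror just the fields the guard reads, and they are not Vue Router types.

```typescript
// Hypothetical, framework-free types mirroring only the fields the guard reads.
interface GuardTarget {
  name: string
  fullPath: string
  requiresAuth?: boolean
}

type GuardResult = true | { name: string; query?: Record<string, string> }

function decideNavigation(to: GuardTarget, isAuthenticated: boolean): GuardResult {
  // Auth is required unless the route opts out with requiresAuth: false.
  if (to.requiresAuth !== false && !isAuthenticated) {
    return { name: 'Login', query: { redirect: to.fullPath } }
  }
  // A logged-in user has no reason to see the login page again.
  if (to.name === 'Login' && isAuthenticated) {
    return { name: 'Dashboard' }
  }
  return true
}

const blocked = decideNavigation({ name: 'UserList', fullPath: '/users' }, false)
const loginPage = decideNavigation({ name: 'Login', fullPath: '/login', requiresAuth: false }, false)
const alreadyIn = decideNavigation({ name: 'Login', fullPath: '/login', requiresAuth: false }, true)
```

Because the login route opts out of authentication, an unauthenticated visit to it is allowed, which is what keeps the guard from redirecting in a loop.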
### ❌ Legacy next() style
```typescript
// ❌ Forbidden: using the next() callback
router.beforeEach((to, from, next) => {
  if (authenticated) {
    next()
  } else {
    next('/login')
  }
})
// ✅ Correct: use the return value
router.beforeEach((to) => {
  if (!authenticated) {
    return { path: '/login' }
  }
  // Returning nothing, or returning true, allows the navigation
})
```
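## Param changes do not trigger lifecycle hooks
Navigating within the same route with different params (e.g. `/users/1` → `/users/2`) does not remount the component.
### Fix 1: watch the route params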
```vue
<script setup lang="ts">
import { watch } from 'vue'
import { useRoute } from 'vue-router'
const route = useRoute()
watch(
() => route.params.id,
(newId) => {
if (newId) {
void fetchUserDetail(Number(newId))
}
},
{ immediate: true },
)
</script>
```
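The watcher above passes `route.params.id` straight to `Number(...)`. A route param can be a string or, for repeatable params, an array of strings, so a small parser makes the conversion explicit. This is a sketch; `parseRouteId` is a hypothetical helper, and the rule that ids are positive integers is an assumption.

```typescript
// Hypothetical helper: turn a route param into a positive integer id, or null.
// Assumes ids are positive integers; adjust if the backend uses another scheme.
function parseRouteId(param: string | string[] | undefined): number | null {
  // Repeatable params arrive as arrays; take the first segment.
  const raw = Array.isArray(param) ? param[0] : param
  if (raw === undefined || raw.trim() === '') return null
  const id = Number(raw)
  return Number.isInteger(id) && id > 0 ? id : null
}

const validId = parseRouteId('42')
const invalidId = parseRouteId('abc')
const firstOfMany = parseRouteId(['7', '8'])
```

In the watcher, the callback could then skip the fetch whenever the parser returns `null`.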
### Fix 2: force a remount with key
```vue
<!-- Bind key on the router-view in the parent component -->
<router-view :key="$route.fullPath" />
```
## Async guard pattern
```typescript
router.beforeEach(async (to) => {
if (to.meta.permissions) {
const authStore = useAuthStore()
const hasPermission = await authStore.checkPermissions(to.meta.permissions)
if (!hasPermission) {
return { name: 'Forbidden' }
}
}
})
```
## Cleanup when leaving a page
```vue
<script setup lang="ts">
import { onBeforeRouteLeave } from 'vue-router'
// Run checks before leaving the page (e.g. an unsaved-changes prompt)
onBeforeRouteLeave((to, from) => {
if (isDirty.value) {
const answer = window.confirm('You have unsaved changes. Leave anyway?')
if (!answer) return false
}
})
</script>
```
## Anti-patterns
### ❌ Infinite redirect loop
```typescript
// ❌ Forbidden: every route redirects to another route that the guard also blocks
router.beforeEach((to) => {
  if (!auth) return { name: 'Login' } // Login needs auth too → infinite loop
})
// ✅ Correct: exclude routes that do not need authentication
router.beforeEach((to) => {
  if (to.meta.requiresAuth !== false && !auth) {
    return { name: 'Login' }
  }
})
```


@@ -0,0 +1,246 @@
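# Strict TypeScript rules
## Core principles
- All code must be TypeScript; `.js` / `.jsx` files are forbidden
- The `any` type is forbidden; provide a concrete type definition
- Every function must declare its return type (annotate `void` explicitly too)
- Use `interface` for object shapes and `type` for unions and utility types
## tsconfig strict mode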
```json
{
"compilerOptions": {
"strict": true,
"noImplicitAny": true,
"strictNullChecks": true,
"strictFunctionTypes": true,
"strictPropertyInitialization": true,
"noImplicitReturns": true,
"noFallthroughCasesInSwitch": true,
"noUncheckedIndexedAccess": true,
"exactOptionalPropertyTypes": false,
"forceConsistentCasingInFileNames": true,
"skipLibCheck": true
}
}
```
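Of the options above, `noUncheckedIndexedAccess` changes day-to-day code the most: indexed reads gain `| undefined`. The sketch below shows the shape of code that option pushes toward. `firstOrFallback` is a hypothetical helper, and the snippet runs the same way with or without the flag.

```typescript
// With "noUncheckedIndexedAccess": true, `names[0]` has type `string | undefined`,
// so the compiler requires the fallback below instead of leaving it optional.
function firstOrFallback(names: readonly string[], fallback: string): string {
  const first = names[0]
  return first ?? fallback
}

const fromEmpty = firstOrFallback([], 'anonymous')
const fromList = firstOrFallback(['ann', 'bob'], 'anonymous')
```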
## Type definition rules
### Choosing interface vs type
```typescript
// ✅ interface: for object shapes (extensible)
interface UserInfo {
id: number
username: string
email: string
role: UserRole
createdAt: string
}
// ✅ type: for union types, intersection types, and utility types
type UserRole = 'admin' | 'editor' | 'viewer'
type Nullable<T> = T | null
type AsyncData<T> = {
data: T | null
loading: boolean
error: string | null
}
```
### Typing props
```vue
<script setup lang="ts">
// ✅ Correct: define props with a TypeScript type
interface Props {
title: string
count?: number
items: ReadonlyArray<ListItem>
variant?: 'outlined' | 'tonal' | 'elevated'
}
const props = withDefaults(defineProps<Props>(), {
count: 0,
variant: 'tonal',
})
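// ✅ Correct: mutable defaults use a factory function
// (a separate component: defineProps may be called only once per <script setup>)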
interface FormProps {
initialValues?: FormValues
}
const formProps = withDefaults(defineProps<FormProps>(), {
initialValues: () => ({ name: '', email: '' }),
})
</script>
```
### Typing emits
```vue
<script setup lang="ts">
interface Emits {
(event: 'update:modelValue', value: string): void
(event: 'submit', data: FormData): void
(event: 'delete', id: number): void
}
const emit = defineEmits<Emits>()
</script>
```
### Ref types
```typescript
import { ref, shallowRef, computed, type Ref } from 'vue'
import type { ECharts } from 'echarts'
// ✅ No explicit annotation needed when inference is sufficient
const count = ref(0) // Ref<number>
const name = ref('') // Ref<string>
// ✅ Cases that need an explicit annotation
const user = ref<UserInfo | null>(null) // the initial value cannot reveal the target type
const items = ref<ListItem[]>([]) // an empty array cannot reveal the element type
const el = ref<HTMLDivElement | null>(null) // DOM ref
// ✅ shallowRef for objects not managed by Vue (ECharts, map instances, etc.)
const chartInstance = shallowRef<ECharts | null>(null)
```
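### Choosing a reactive primitive
Choose the right reactive primitive (following the vuetify0 strategy):
| Scenario | Use | Reason |
|------|------|------|
| Primitives (string / number / boolean) | `ref` | Unwrapped automatically; simple and efficient |
| Objects/arrays (deep reactivity needed) | `ref` | Deep reactive tracking |
| External object instances not managed by Vue | `shallowRef` | Keeps proxies from breaking third-party instances |
| Read-only derived values | `computed` | Cached, with automatic dependency tracking |
| Derived ref (passed to a watcher) | `toRef` | Creates a ref from a getter or props |
| Reading the current snapshot of a reactive value | `toValue` | Unwraps when the current value is needed |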
## Forbidden patterns
### ❌ Using any
```typescript
// ❌ Forbidden
const data: any = response.data
function process(input: any): any { ... }
// ✅ Correct: define concrete types
const data: ApiResponse<UserInfo> = response.data
function process(input: ProcessInput): ProcessResult { ... }
```
### ❌ Using @ts-ignore / @ts-nocheck
```typescript
// ❌ Forbidden
// @ts-ignore
someFunction(wrongType)
// ✅ Correct: fix the type problem or use a type guard
if (isUserInfo(data)) {
  someFunction(data)
}
```
### ❌ Overusing type assertions
```typescript
// ❌ Forbidden: an assertion with nothing to back it up
const user = data as UserInfo
// ✅ Correct: use a type guard
function isUserInfo(data: unknown): data is UserInfo {
  return (
    typeof data === 'object' &&
    data !== null &&
    'id' in data &&
    'username' in data
  )
}
```
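The guard above only checks that two keys exist, so an object such as `{ id: 'x', username: 1 }` would still pass. A stricter sketch that also checks field types is shown below. It assumes the `UserInfo` shape and the three `UserRole` values defined earlier in this document; the single cast to `Record<string, unknown>` happens after the object check, so it is backed by a runtime test.

```typescript
type UserRole = 'admin' | 'editor' | 'viewer'

interface UserInfo {
  id: number
  username: string
  email: string
  role: UserRole
  createdAt: string
}

const USER_ROLES: readonly string[] = ['admin', 'editor', 'viewer']

// Stricter guard: verifies the runtime type of every field, not just its presence.
function isUserInfo(data: unknown): data is UserInfo {
  if (typeof data !== 'object' || data === null) return false
  // Safe after the check above: any non-null object can be read by string key.
  const record = data as Record<string, unknown>
  const role = record['role']
  return (
    typeof record['id'] === 'number' &&
    typeof record['username'] === 'string' &&
    typeof record['email'] === 'string' &&
    typeof role === 'string' &&
    USER_ROLES.includes(role) &&
    typeof record['createdAt'] === 'string'
  )
}

const validUser: unknown = {
  id: 1,
  username: 'ann',
  email: 'ann@example.com',
  role: 'admin',
  createdAt: '2026-07-01T00:00:00Z',
}
const wrongTypes: unknown = { id: 'x', username: 1 }
```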
## Common type utilities
```typescript
// src/types/utils.ts
/** Allow null on every property of T (properties stay required) */
type NullableFields<T> = { [K in keyof T]: T[K] | null }
/** Remove null and undefined from every property of T */
type NonNullableFields<T> = { [K in keyof T]: NonNullable<T[K]> }
/** Pagination request params */
interface PaginationParams {
  page: number
  pageSize: number
}
/** Sort params */
interface SortParams {
  sortBy: string
  sortOrder: 'asc' | 'desc'
}
/** List query params */
type ListQueryParams = PaginationParams & Partial<SortParams> & {
  search?: string
}
/** Form field state */
interface FieldState<T> {
value: T
error: string | null
dirty: boolean
touched: boolean
}
```
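As a usage sketch for `ListQueryParams`, the helper below serializes the params into a query string and skips optional fields that are not set. `toQueryString` is a hypothetical helper, and the camelCase parameter names are an assumption; a backend that expects snake_case keys would need a mapping step.

```typescript
interface PaginationParams {
  page: number
  pageSize: number
}

interface SortParams {
  sortBy: string
  sortOrder: 'asc' | 'desc'
}

type ListQueryParams = PaginationParams & Partial<SortParams> & {
  search?: string
}

// Hypothetical helper: serialize list params, skipping unset optional fields.
function toQueryString(params: ListQueryParams): string {
  const entries: Array<[string, string]> = [
    ['page', String(params.page)],
    ['pageSize', String(params.pageSize)],
  ]
  if (params.sortBy !== undefined) entries.push(['sortBy', params.sortBy])
  if (params.sortOrder !== undefined) entries.push(['sortOrder', params.sortOrder])
  if (params.search !== undefined && params.search !== '') {
    entries.push(['search', params.search])
  }
  return entries
    .map(([key, value]) => `${encodeURIComponent(key)}=${encodeURIComponent(value)}`)
    .join('&')
}

const basicQuery = toQueryString({ page: 1, pageSize: 20 })
const fullQuery = toQueryString({
  page: 2,
  pageSize: 10,
  sortBy: 'name',
  sortOrder: 'desc',
  search: 'a b',
})
```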
## Vue 3 + TypeScript specifics
### template ref types
```vue
<script setup lang="ts">
import { ref, useTemplateRef } from 'vue'
import type { VForm } from 'vuetify/components'
// Recommended for Vue 3.5+
const formRef = useTemplateRef<InstanceType<typeof VForm>>('formRef')
// Or use a plain ref
const inputRef = ref<HTMLInputElement | null>(null)
</script>
<template>
<v-form ref="formRef">
<input ref="inputRef" />
</v-form>
</template>
```
### Type-safe provide / inject
```typescript
import { type InjectionKey, provide, inject } from 'vue'
// Define a type-safe injection key
const UserContextKey: InjectionKey<UserContext> = Symbol('UserContext')
// Provide (in an ancestor component)
provide(UserContextKey, userContext)
// Inject (in a descendant component; type-safe)
const userContext = inject(UserContextKey)
if (!userContext) {
throw new Error('UserContext not provided')
}
```


@@ -0,0 +1,158 @@
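# Vuetify 3 component usage rules
## Core principles
- All UI must be built with Vuetify 3 components; other UI libraries (Element Plus, Ant Design Vue, etc.) are forbidden
- Prefer Vuetify's built-in components; consider a custom implementation only when Vuetify truly does not provide one
- Write component props in kebab-case (`:item-value`) and events in the `@update:model-value` form
## Component selection tables
Check these tables before coding and match components by problem, not by guessing from keywords.
### Layout and containers
| Scenario | Component | Notes |
|---------|---------|------|
| Overall page layout | `v-app` + `v-main` | Application root container; must wrap all content |
| Side navigation | `v-navigation-drawer` | Supports `rail` (collapsed) mode and responsive breakpoints |
| Top app bar | `v-app-bar` | `density` adjusts the height |
| Bottom navigation (mobile) | `v-bottom-navigation` | Bottom tab bar on mobile |
| Grid layout | `v-container` + `v-row` + `v-col` | 12-column grid system |
| Card container | `v-card` | Standard content container; supports `variant` |
| Generic container | `v-sheet` | Lightweight container for custom layouts |
| Toolbar | `v-toolbar` | Standalone toolbar (not the app bar) |
| Content divider | `v-divider` | Horizontal/vertical divider |
| Spacing control | `v-spacer` | Flexible flex spacer |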
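### Data display
| Scenario | Component | Notes |
|---------|---------|------|
| Tabular data (with sorting/pagination) | `v-data-table` | Full-featured data table |
| Simple table (no sorting) | `v-table` | Wrapper around a native table |
| Long list with virtual scrolling | `v-virtual-scroll` | Rendering optimization for large data sets |
| List display | `v-list` + `v-list-item` | Vertical list with grouping support |
| Chips/tags | `v-chip` | Status tags, filter tags |
| Badges | `v-badge` | Numeric or status badge |
| Avatar | `v-avatar` | User avatar or icon container |
| Progress indication | `v-progress-linear` / `v-progress-circular` | Progress bars |
| Skeleton screen | `v-skeleton-loader` | Placeholder while data loads |
| Timeline | `v-timeline` | Time-ordered display |
| Tree structure | `v-treeview` | Hierarchical data display |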
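### Forms and input
| Scenario | Component | Notes |
|---------|---------|------|
| Form container (with validation) | `v-form` | Single entry point for form validation |
| Text input | `v-text-field` | Supports `type`, `rules`, `variant` |
| Multi-line text | `v-textarea` | Multi-line input |
| Dropdown select | `v-select` | Single/multiple select dropdown |
| Searchable select (autocomplete) | `v-autocomplete` | Dropdown with search |
| Combined input (free text + select) | `v-combobox` | Accepts values that are not in the list |
| Switch | `v-switch` | Boolean toggle |
| Checkbox | `v-checkbox` | Single/multiple checkboxes |
| Radio | `v-radio-group` + `v-radio` | Mutually exclusive choice |
| Slider | `v-slider` / `v-range-slider` | Numeric range selection |
| File upload | `v-file-input` | File selection |
| Date selection | `v-date-input` + `v-date-picker` | Date input |
| Color selection | `v-color-picker` | Color input |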
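### Feedback and interaction
| Scenario | Component | Notes |
|---------|---------|------|
| Modal dialog | `v-dialog` | Centered popup |
| Bottom sheet | `v-bottom-sheet` | Slides up from the bottom on mobile |
| Temporary notification | `v-snackbar` | Message that disappears automatically |
| Tooltip | `v-tooltip` | Hover hint |
| Popup menu | `v-menu` | Dropdown/context menu |
| Confirming an action | `v-dialog` + custom confirmation content | Second confirmation for dangerous actions |
| Full-screen mask/loading | `v-overlay` | Overlay layer |
| Alert | `v-alert` | Page-level message |
| Banner notification | `v-banner` | Persistent notification |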
### Navigation
| Scenario | Component | Notes |
|---------|---------|------|
| Page tabs | `v-tabs` + `v-tab` + `v-tabs-window` | Tab switching |
| Breadcrumbs | `v-breadcrumbs` | Path navigation |
| Pagination | `v-pagination` | Page-number navigation |
| Stepper | `v-stepper` | Multi-step wizard |
| Accordion | `v-expansion-panels` | Group of collapsible panels |
### Buttons and actions
| Scenario | Component | Notes |
|---------|---------|------|
| Standard button | `v-btn` | Supports `variant`, `color`, `size` |
| Icon button | `v-btn` + `icon` prop | Icon-only button |
| Floating action button | `v-btn` + `position="fixed"` | FAB |
| Button group | `v-btn-toggle` | Exclusive/multi-select button group |
| Icon | `v-icon` | Material Design Icons |
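## Component usage rules
### Choosing a variant
```
Components that support variant (v-btn, v-card, and similar):
- elevated → emphasized actions (primary buttons, important cards)
- flat → secondary emphasis
- tonal → soft emphasis (recommended default)
- outlined → bordered style
- text → no background, no border
- plain → lowest visual weight
v-text-field uses its own variant set (filled, outlined, underlined, solo, plain, ...).
```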
### Choosing a density
```
Applies to v-text-field / v-select / v-list and similar:
- default → standard spacing
- comfortable → slightly tighter
- compact → compact mode (data-dense screens)
```
### Using icons
Use Material Design Icons (mdi) throughout:
```vue
<v-icon icon="mdi-account" />
<v-btn prepend-icon="mdi-plus">New</v-btn>
```
Do not mix several icon libraries. Configure the icon set once, in the project's `vuetify.ts` plugin.
## Anti-patterns
### ❌ Mixing UI libraries
```vue
<!-- Forbidden: mixing in Element Plus -->
<el-button>Submit</el-button>
<v-card>...</v-card>
```
### ❌ Native HTML in place of Vuetify components
```vue
<!-- Forbidden: using native HTML when Vuetify has a matching component -->
<table>...</table> <!-- use v-data-table or v-table -->
<input type="text" /> <!-- use v-text-field -->
<select>...</select> <!-- use v-select -->
<button>...</button> <!-- use v-btn -->
```
### ❌ Hard-coded color values
```vue
<!-- Forbidden: hard-coded color -->
<v-card style="background: #1976d2">
<!-- Correct: use a Vuetify theme color -->
<v-card color="primary">
```


@@ -0,0 +1,207 @@
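# Vuetify 3 responsive design
## Core principles
- Build responsive layouts with Vuetify's grid system (`v-container` / `v-row` / `v-col`)
- Use the `useDisplay()` composable for breakpoint checks in code
- Mobile first: design for small screens, then enhance for larger ones
- Do not use native CSS media queries in place of Vuetify's breakpoint system
## Breakpoint definitions
Vuetify 3 default breakpoints:
| Breakpoint | Range | Devices |
|--------|------|------|
| `xs` | 0–599px | Small phones |
| `sm` | 600–959px | Large phones / small tablets |
| `md` | 960–1279px | Tablets / small laptops |
| `lg` | 1280–1919px | Desktop monitors |
| `xl` | 1920–2559px | Large monitors |
| `xxl` | 2560px+ | Extra-large screens |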
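The ranges above can be expressed as a lookup over each breakpoint's lower bound. The sketch below is only an illustration of the table: `breakpointFor` is a hypothetical helper that assumes the default thresholds are not customized. In a component, `useDisplay()` remains the source of truth.

```typescript
type BreakpointName = 'xs' | 'sm' | 'md' | 'lg' | 'xl' | 'xxl'

// Lower bounds taken from the default breakpoint table, largest first.
const BREAKPOINT_MIN_WIDTHS: ReadonlyArray<readonly [BreakpointName, number]> = [
  ['xxl', 2560],
  ['xl', 1920],
  ['lg', 1280],
  ['md', 960],
  ['sm', 600],
  ['xs', 0],
]

// Hypothetical helper: map a viewport width in pixels to a breakpoint name.
function breakpointFor(width: number): BreakpointName {
  for (const [name, minWidth] of BREAKPOINT_MIN_WIDTHS) {
    if (width >= minWidth) return name
  }
  return 'xs' // negative or NaN widths fall back to the smallest breakpoint
}

const phone = breakpointFor(375)
const tablet = breakpointFor(960)
const desktop = breakpointFor(1440)
```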
## Grid system
### Basic layout
```vue
<template>
<v-container>
<v-row>
<!-- Phone: full width; tablet: 6 columns; desktop: 4 columns -->
<v-col cols="12" sm="6" md="4">
<v-card>内容A</v-card>
</v-col>
<v-col cols="12" sm="6" md="4">
<v-card>内容B</v-card>
</v-col>
<v-col cols="12" sm="12" md="4">
<v-card>内容C</v-card>
</v-col>
</v-row>
</v-container>
</template>
```
### Responsive spacing
```vue
<v-row dense> <!-- compact gutters -->
<v-row no-gutters> <!-- no gutters -->
<v-row :dense="mobile"> <!-- conditional gutters -->
```
### Alignment control
```vue
<v-row align="center" justify="space-between">
<v-col cols="auto">Left content</v-col>
<v-col cols="auto">Right content</v-col>
</v-row>
```
## Breakpoint checks in code
### useDisplay() composable
```typescript
import { useDisplay } from 'vuetify'
import { computed } from 'vue'
const { mobile, mdAndUp, lgAndUp, name, width } = useDisplay()
// Common checks
const isMobile = computed(() => mobile.value)
const isTabletOrAbove = computed(() => mdAndUp.value)
const isDesktop = computed(() => lgAndUp.value)
```
### Responsive component props
```vue
<script setup lang="ts">
import { useDisplay } from 'vuetify'
import { computed } from 'vue'
const { mobile, mdAndUp } = useDisplay()
const drawerRail = computed(() => !mdAndUp.value)
const cardCols = computed(() => mobile.value ? 12 : 6)
const tableDensity = computed<'default' | 'compact'>(() =>
mobile.value ? 'compact' : 'default'
)
</script>
<template>
<v-navigation-drawer :rail="drawerRail" />
<v-col :cols="cardCols">...</v-col>
<v-data-table :density="tableDensity" />
</template>
```
## Typical responsive patterns
### Sidebar pattern
```vue
<script setup lang="ts">
import { useDisplay } from 'vuetify'
import { ref, watch } from 'vue'
const { mdAndUp } = useDisplay()
const drawerOpen = ref(true)
// Collapse the sidebar by default on mobile
watch(mdAndUp, (isDesktop) => {
drawerOpen.value = isDesktop
}, { immediate: true })
</script>
<template>
<v-navigation-drawer
v-model="drawerOpen"
:temporary="!mdAndUp"
:permanent="mdAndUp"
>
<!-- Navigation content -->
</v-navigation-drawer>
</template>
```
### Switching between list and cards
```vue
<script setup lang="ts">
import { useDisplay } from 'vuetify'
import { computed } from 'vue'
const { mobile } = useDisplay()
const viewMode = computed(() => mobile.value ? 'list' : 'grid')
</script>
<template>
<!-- Mobile: list view -->
<v-list v-if="viewMode === 'list'">
<v-list-item v-for="item in items" :key="item.id" :title="item.name" />
</v-list>
<!-- Desktop: card grid -->
<v-row v-else>
<v-col v-for="item in items" :key="item.id" cols="4">
<v-card :title="item.name" />
</v-col>
</v-row>
</template>
```
### Responsive dialog
```vue
<script setup lang="ts">
import { useDisplay } from 'vuetify'
import { computed } from 'vue'
const { mobile } = useDisplay()
const dialogWidth = computed(() => mobile.value ? '100%' : '600')
const dialogFullscreen = computed(() => mobile.value)
</script>
<template>
<v-dialog
:width="dialogWidth"
:fullscreen="dialogFullscreen"
>
<!-- Dialog content -->
</v-dialog>
</template>
```
## Anti-patterns
### ❌ Native media query in place of Vuetify breakpoints
```css
/* Forbidden: using a native media query */
@media (max-width: 960px) {
  .sidebar { display: none; }
}
```
```vue
<!-- Correct: use Vuetify breakpoints -->
<v-navigation-drawer v-if="mdAndUp" />
```
### ❌ Fixed pixel widths
```vue
<!-- Forbidden: fixed pixel width -->
<v-col style="width: 400px">
<!-- Correct: use the grid -->
<v-col cols="12" md="4">
```
### ❌ Ignoring mobile
Every page must be usable, with readable content, at the xs breakpoint (< 600px).


@@ -0,0 +1,208 @@
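# Vuetify 3 light and dark theme configuration
## Core principles
- All colors must be defined through the Vuetify theme system; hard-coded color values in components are forbidden
- Both a light and a dark palette must be provided
- Add custom colors by extending `colors`; do not override Vuetify's default semantic colors (primary / secondary / error, etc.)
- In CSS, reference colors through `rgb(var(--v-theme-<color>))` variables
## Theme configuration template
### vuetify plugin configuration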
```typescript
// src/plugins/vuetify.ts
import { createVuetify, type ThemeDefinition } from 'vuetify'
import 'vuetify/styles'
const lightTheme: ThemeDefinition = {
dark: false,
colors: {
background: '#FAFAFA',
surface: '#FFFFFF',
'surface-variant': '#F5F5F5',
primary: '#1867C0',
'primary-darken-1': '#1459A3',
secondary: '#5CBBF6',
'secondary-darken-1': '#4BA3D8',
error: '#B00020',
info: '#2196F3',
success: '#4CAF50',
warning: '#FB8C00',
// Custom extension colors
'on-surface-muted': '#757575',
'border-color': '#E0E0E0',
},
}
const darkTheme: ThemeDefinition = {
dark: true,
colors: {
background: '#121212',
surface: '#1E1E1E',
'surface-variant': '#2C2C2C',
primary: '#2196F3',
'primary-darken-1': '#1976D2',
secondary: '#54B4EB',
'secondary-darken-1': '#4BA3D8',
error: '#CF6679',
info: '#2196F3',
success: '#4CAF50',
warning: '#FB8C00',
// 自定义扩展色
'on-surface-muted': '#9E9E9E',
'border-color': '#424242',
},
}
export default createVuetify({
theme: {
defaultTheme: 'light',
themes: {
light: lightTheme,
dark: darkTheme,
},
},
})
```
## Theme switching
### Using the useTheme() composable
```typescript
// src/composables/useThemeToggle.ts
import { useTheme } from 'vuetify'
import { computed } from 'vue'
export function useThemeToggle() {
const theme = useTheme()
const isDark = computed(() => theme.global.current.value.dark)
function toggleTheme(): void {
theme.global.name.value = isDark.value ? 'light' : 'dark'
}
return { isDark, toggleTheme }
}
```
### Usage in a template
```vue
<script setup lang="ts">
import { useThemeToggle } from '@/composables/useThemeToggle'
const { isDark, toggleTheme } = useThemeToggle()
</script>
<template>
<v-btn
:icon="isDark ? 'mdi-weather-sunny' : 'mdi-weather-night'"
@click="toggleTheme"
/>
</template>
```
## Referencing theme colors in CSS
### Using CSS variables
```css
/* 正确:通过 Vuetify CSS 变量引用主题色 */
.custom-border {
border: 1px solid rgb(var(--v-theme-border-color));
}
.muted-text {
color: rgb(var(--v-theme-on-surface-muted));
}
/* 带透明度 */
.overlay-bg {
background-color: rgba(var(--v-theme-surface), 0.85);
}
```
### Forbidden patterns
```css
/* ❌ 禁止:硬编码颜色 */
.custom-bg { background: #1976d2; }
/* ❌ 禁止:不走主题系统的 CSS 变量 */
.custom-bg { background: var(--my-custom-color); }
/* ✅ 正确:走 Vuetify 主题 */
.custom-bg { background-color: rgb(var(--v-theme-primary)); }
```
## Theme colors at the component level
```vue
<!-- 通过 color prop 引用主题色 -->
<v-card color="surface-variant">
<v-card-title class="text-primary">标题</v-card-title>
<v-card-text class="text-on-surface-muted">描述文字</v-card-text>
</v-card>
<!-- 使用 Vuetify 工具类 -->
<div class="bg-surface text-on-surface">内容</div>
```
## Theme-aware conditional styles
```vue
<script setup lang="ts">
import { useTheme } from 'vuetify'
import { computed } from 'vue'
const theme = useTheme()
const cardElevation = computed(() =>
theme.global.current.value.dark ? 0 : 2
)
</script>
<template>
<v-card :elevation="cardElevation" />
</template>
```
## Theme persistence
Store the user's theme preference in localStorage and restore it when the application initializes:
```typescript
// src/composables/useThemeToggle.ts
import { useTheme } from 'vuetify'
import { computed, watch } from 'vue'
const THEME_STORAGE_KEY = 'user-theme-preference'
export function useThemeToggle() {
const theme = useTheme()
const isDark = computed(() => theme.global.current.value.dark)
function toggleTheme(): void {
theme.global.name.value = isDark.value ? 'light' : 'dark'
}
// 持久化
watch(
() => theme.global.name.value,
(themeName) => {
localStorage.setItem(THEME_STORAGE_KEY, themeName)
},
)
return { isDark, toggleTheme }
}
export function restoreThemePreference(): string {
return localStorage.getItem(THEME_STORAGE_KEY) ?? 'light'
}
```

View File

@@ -0,0 +1,91 @@
---
name: developing-go-gin-gorm
description: >
  Use to create / modify / review production-ready Go Gin/GORM backend code. This skill enforces
  POST + JSON RequestBody APIs, handler-service-dao layering, a self-contained common runtime, AppError error mapping,
  single-DB / multi-DB design, the Unit of Work / TxDAO transaction pattern, Asia/Shanghai time handling,
  structured logging, JWT / security / audit middleware, and cross-platform validation.
---
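# Go Gin/GORM Engineering Skill
## Core workflow
1. Read the target module before changing it. Identify the existing handlers, services, DAOs, DTOs, entities, middleware, config, and database connection approach.
2. If the target project has no local common foundation, scaffold it from `reference/common-runtime.md`; do not depend on an external common repository or on a directory that exists only on one machine.
3. Add or modify code following the `handler -> service -> dao` layering; DTOs and entities must stay separate.
4. Design every business API as `POST + JSON RequestBody`; do not add GET, PUT, PATCH, DELETE, path-param, or query-param style business endpoints.
5. DAOs return low-level errors; services convert infrastructure errors into domain errors or `AppError`; handlers only map `AppError` to the unified response and must not know GORM details.
6. Choose a single-database or multi-database design based on the module's actual situation; write operations that involve a transaction must use the Unit of Work / TxDAO pattern.
7. Use only the common runtime conventions defined in this skill for time and logging; time is fixed to `Asia/Shanghai`, and API timestamps use RFC3339 with a `+08:00` offset.
8. Protected APIs must apply the security and audit rules, including JWT claims, admin checks, permission middleware, rate limiting, CORS, redaction of sensitive log fields, and auditing of critical operations.
9. When finished, run the cross-platform validator `scripts/validate_go_gin_gorm.py`; when the target project can be compiled locally, also run `gofmt` and `go test ./...`.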
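## Mandatory rules
- API routes must use `POST("/xxx/action", handler.Method)`, and request parameters must be bound with `ShouldBindJSON`.
- Handlers must not call `c.JSON`, `AbortWithStatusJSON`, `c.Param`, `c.Query`, or `ShouldBindQuery`, and must not import `gorm`.
- Handlers must not carry business rules, permission-check details, transaction boundaries, or database access.
- Services own business orchestration, domain validation, error mapping, transaction boundaries, and key business logs.
- DAOs own all GORM operations, and every GORM call must use `WithContext(ctx)`.
- DAOs return low-level errors; services convert them into `common.AppError` or an explicit domain error.
- DAO methods that take part in a transaction must accept `tx *gorm.DB`; all DAO calls within one transaction must use the same `tx`.
- DAOs must not open transactions themselves; only a service may open one, through the Unit of Work.
- Time must go through `common.Now()`, `common.ParseTime()`, and `common.FormatTime()`; outside the common time runtime, calling `time.Now()` or `time.Parse()` directly is forbidden.
- Logging must use `common.Debug/Info/Warn/Error`; never log passwords, tokens, private keys, secrets, full Authorization headers, or other sensitive data.
- Exported Go identifiers and non-obvious business logic must carry meaningful Chinese comments.
- Prefer code that is clear, small, and stable; introduce a design pattern only when it removes real complexity or protects a real boundary.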
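## Implementation principles
- Start from first principles: solve real business flows, data consistency, security risks, observability gaps, and core engineering problems first.
- Apply senior-architect judgment to improve maintainability, extensibility, observability, stability, security, and the cost of putting the system into practice.
- Avoid show-off designs, performance theater, premature abstraction, and complexity with no business payoff.
- Prefer explicit DTOs, explicit error mapping, explicit transaction ownership, and explicit logs at business boundaries.
- Optimize selectively, and only when the current code path, data volume, or failure mode shows the cost is real.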
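## Design pattern guidelines
- Use layered architecture to constrain the dependency direction among handler, service, and dao.
- Use DTO + Mapper to keep database entities from leaking into the API contract.
- Use `AppError` as the stable error contract that services expose to handlers.
- Use Unit of Work to manage transaction boundaries in the service layer.
- Use the TxDAO pattern: pass `tx *gorm.DB` into DAO methods that run inside a transaction.
- Repository/DAO handles persistence access only and carries no business decisions.
- Create Handler, Service, DAO, DB Manager, and Middleware instances through constructors / factory functions.
- Use Strategy only when several genuinely interchangeable algorithms exist.
- For integrations with external systems such as CI, payment, identity, or object storage, prefer an Adapter to isolate the boundary.
- Compose cross-cutting capabilities such as Auth, Admin, Permission, RequestID, Audit, RateLimit, CORS, and Recovery in Middleware / Decorator style.
- Do not design for the sake of patterns; do not introduce a pattern that has a single implementation and no boundary benefit.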
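## Reference loading rules
| Scenario | File to read |
|------|----------|
| Local common runtime: unified response, error codes, AppError, time, logging, request context | `reference/common-runtime.md` |
| Coding standards, comments, naming, error handling, DTO/entity separation | `reference/coding-standards.md` |
| API routes, DTOs, pagination, response conventions | `reference/api-design-spec.md`, `reference/api-response-spec.md` |
| Single database, multiple databases, Unit of Work, TxDAO, GORM rules | `reference/database-patterns.md`, `reference/framework-usage.md` |
| Log levels, debug mode, sensitive-field redaction | `reference/logging-standards.md` |
| UTC+8 time handling | `reference/time-handling.md` |
| JWT, permissions, admin checks, audit, CORS, rate limiting, security rules | `reference/security-audit.md` |
| Directory structure and dependency boundaries | `reference/project-structure.md` |
| Code examples | `examples/*.go` |
| Cross-platform validation | `scripts/validate_go_gin_gorm.py` |
> `reference/common-runtime.md` is the single source of truth for error codes and `AppError`. Do not maintain a separate `error-codes.go` reference, to avoid drift between two sources.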
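## Completion checklist
- [ ] Every business API is POST + RequestBody.
- [ ] Handlers only bind DTOs, call services, and return the common unified response.
- [ ] Handlers do not import GORM and do not inspect GORM errors.
- [ ] Services map DAO/infrastructure errors to `common.AppError`.
- [ ] DAO methods use `WithContext(ctx)` and return low-level errors.
- [ ] Transactions use Unit of Work and pass the same `tx` to every participating DAO method.
- [ ] Single-database or multi-database ownership is explicit.
- [ ] Time uses `common.Now/ParseTime/FormatTime`.
- [ ] Logging uses the common structured logger and redacts sensitive fields.
- [ ] Endpoints that need protection apply the Auth/Admin/Permission/Audit/RateLimit/CORS rules.
- [ ] `python scripts/validate_go_gin_gorm.py <project-root>` passes.
- [ ] `gofmt` and `go test ./...` pass; if the environment prevents running them, state that explicitly.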

View File

@@ -0,0 +1,73 @@
package dao
import (
"context"
"my-project/internal/model/dto"
"my-project/internal/model/entity"
"gorm.io/gorm"
)
// UserDAO 用户数据访问对象。DAO 只封装 GORM 操作,不做业务决策。
type UserDAO struct {
db *gorm.DB
}
// NewUserDAO 创建用户 DAO 实例。
func NewUserDAO(db *gorm.DB) *UserDAO {
return &UserDAO{db: db}
}
func (d *UserDAO) session(ctx context.Context, tx *gorm.DB) *gorm.DB {
if tx != nil {
return tx.WithContext(ctx)
}
return d.db.WithContext(ctx)
}
// FindByID 根据用户 ID 查询用户。未找到时返回 GORM 原始错误。
func (d *UserDAO) FindByID(ctx context.Context, tx *gorm.DB, userID int64) (*entity.User, error) {
var user entity.User
if err := d.session(ctx, tx).First(&user, "id = ?", userID).Error; err != nil {
return nil, err
}
return &user, nil
}
// ExistsByUsername 判断用户名是否已存在。
func (d *UserDAO) ExistsByUsername(ctx context.Context, tx *gorm.DB, username string) (bool, error) {
var count int64
if err := d.session(ctx, tx).Model(&entity.User{}).Where("username = ?", username).Count(&count).Error; err != nil {
return false, err
}
return count > 0, nil
}
// Create 创建用户。事务内调用必须传入 tx。
func (d *UserDAO) Create(ctx context.Context, tx *gorm.DB, user *entity.User) error {
return d.session(ctx, tx).Create(user).Error
}
// List 分页查询用户列表。
func (d *UserDAO) List(ctx context.Context, tx *gorm.DB, req *dto.ListUsersRequest) ([]*entity.User, int64, error) {
query := d.session(ctx, tx).Model(&entity.User{})
if req.Keyword != "" {
query = query.Where("username LIKE ?", "%"+req.Keyword+"%")
}
if req.Status != "" {
query = query.Where("status = ?", req.Status)
}
var total int64
if err := query.Count(&total).Error; err != nil {
return nil, 0, err
}
var users []*entity.User
offset := (req.Page - 1) * req.PageSize
if err := query.Order("created_at DESC").Limit(req.PageSize).Offset(offset).Find(&users).Error; err != nil {
return nil, 0, err
}
return users, total, nil
}

View File

@@ -0,0 +1,62 @@
package handler
import (
"my-project/internal/common"
"my-project/internal/model/dto"
"my-project/internal/service"
"github.com/gin-gonic/gin"
)
// UserHandler 用户相关 API 处理器。
type UserHandler struct {
userService *service.UserService
}
// NewUserHandler 创建用户 Handler 实例。
func NewUserHandler(userService *service.UserService) *UserHandler {
return &UserHandler{userService: userService}
}
// RegisterUserRoutes 注册用户路由。业务 API 强制使用 POST + RequestBody。
func RegisterUserRoutes(group *gin.RouterGroup, h *UserHandler) {
users := group.Group("/users")
{
users.POST("/detail", h.GetUserDetail)
users.POST("/create", h.CreateUser)
}
}
// GetUserDetail 根据请求体中的用户 ID 获取用户详情。
func (h *UserHandler) GetUserDetail(c *gin.Context) {
var req dto.GetUserDetailRequest
if err := c.ShouldBindJSON(&req); err != nil {
common.ResponseAppError(c, common.WrapAppError(common.CodeParamError, "请求参数错误", err))
return
}
resp, err := h.userService.GetUserDetail(c.Request.Context(), &req)
if err != nil {
common.ResponseAppError(c, err)
return
}
common.ResponseSuccess(c, resp)
}
// CreateUser 创建用户。
func (h *UserHandler) CreateUser(c *gin.Context) {
var req dto.CreateUserRequest
if err := c.ShouldBindJSON(&req); err != nil {
common.ResponseAppError(c, common.WrapAppError(common.CodeParamError, "请求参数错误", err))
return
}
resp, err := h.userService.CreateUser(c.Request.Context(), &req)
if err != nil {
common.ResponseAppError(c, err)
return
}
common.ResponseSuccessWithMessage(c, resp, "用户创建成功")
}

View File

@@ -0,0 +1,79 @@
package service
import (
"context"
"errors"
"my-project/internal/common"
"my-project/internal/dao"
"my-project/internal/model/dto"
"my-project/internal/model/entity"
"my-project/internal/model/mapper"
"gorm.io/gorm"
)
// UnitOfWork 定义 service 使用的事务边界。
type UnitOfWork interface {
Transaction(ctx context.Context, fn func(tx *gorm.DB) error) error
}
// UserService 用户业务服务。
type UserService struct {
uow UnitOfWork
userDAO *dao.UserDAO
}
// NewUserService 创建用户服务实例。
func NewUserService(uow UnitOfWork, userDAO *dao.UserDAO) *UserService {
return &UserService{uow: uow, userDAO: userDAO}
}
// GetUserDetail 根据用户 ID 获取用户详情。
func (s *UserService) GetUserDetail(ctx context.Context, req *dto.GetUserDetailRequest) (*dto.UserDetailResponse, error) {
user, err := s.userDAO.FindByID(ctx, nil, req.UserID)
if err != nil {
return nil, mapUserLookupError(err)
}
return mapper.ToUserDetailResponse(user), nil
}
// CreateUser 创建用户并通过 UnitOfWork 保证事务一致性。
func (s *UserService) CreateUser(ctx context.Context, req *dto.CreateUserRequest) (*dto.UserDetailResponse, error) {
var created *entity.User
err := s.uow.Transaction(ctx, func(tx *gorm.DB) error {
exists, err := s.userDAO.ExistsByUsername(ctx, tx, req.Username)
if err != nil {
return common.WrapAppError(common.CodeServerError, "检查用户名失败", err)
}
if exists {
return common.NewAppError(common.CodeDuplicate, "用户名已存在")
}
now := common.Now()
user := &entity.User{
Username: req.Username,
Email: req.Email,
CreatedAt: now,
UpdatedAt: now,
}
if err := s.userDAO.Create(ctx, tx, user); err != nil {
return common.WrapAppError(common.CodeServerError, "创建用户失败", err)
}
created = user
return nil
})
if err != nil {
return nil, err
}
common.Info(ctx, "用户创建成功", "user_id", created.ID, "username", created.Username)
return mapper.ToUserDetailResponse(created), nil
}
func mapUserLookupError(err error) error {
if errors.Is(err, gorm.ErrRecordNotFound) {
return common.WrapAppError(common.CodeNotFound, "用户不存在", err)
}
return common.WrapAppError(common.CodeServerError, "查询用户失败", err)
}

View File

@@ -0,0 +1,125 @@
# API Design Specification
Every business API must use `POST + JSON RequestBody`. This is a hard rule for generated code.
## Routing rules
- List, detail, create, update, delete, sync, trigger, export, internal-check, and similar endpoints all use `POST`.
- All business parameters go in the JSON request body.
- Path variables are forbidden.
- Query parameters are forbidden.
- `ShouldBindQuery` is forbidden.
- Keep route names action-oriented and semantically stable.
| Operation | Route suffix | Example |
|------|----------|------|
| List | `/list` | `/api/users/list` |
| Detail | `/detail` | `/api/users/detail` |
| Create | `/create` | `/api/users/create` |
| Update | `/update` | `/api/users/update` |
| Delete | `/delete` | `/api/users/delete` |
| Sync | `/sync` | `/api/ci/resources/sync` |
| Trigger | `/trigger` | `/api/builds/trigger` |
| Export | `/export` | `/api/audit/logs/export` |
| Permission check | `/check` | `/api/permissions/check` |
## Route registration
```go
func RegisterUserRoutes(group *gin.RouterGroup, h *UserHandler) {
users := group.Group("/users")
{
users.POST("/list", h.ListUsers)
users.POST("/detail", h.GetUserDetail)
users.POST("/create", h.CreateUser)
users.POST("/update", h.UpdateUser)
users.POST("/delete", h.DeleteUser)
}
}
```
## DTO rules
```go
type PageRequest struct {
Page int `json:"page" binding:"required,min=1"`
PageSize int `json:"page_size" binding:"required,min=1,max=100"`
}
type ListUsersRequest struct {
PageRequest
Keyword string `json:"keyword,omitempty"`
Status string `json:"status,omitempty"`
}
type GetUserDetailRequest struct {
UserID int64 `json:"user_id" binding:"required,min=1"`
}
type CreateUserRequest struct {
Username string `json:"username" binding:"required"`
Email string `json:"email" binding:"omitempty,email"`
}
type UpdateUserRequest struct {
UserID int64 `json:"user_id" binding:"required,min=1"`
Email string `json:"email" binding:"omitempty,email"`
Status string `json:"status,omitempty"`
}
type DeleteUserRequest struct {
UserID int64 `json:"user_id" binding:"required,min=1"`
}
```
DTO naming rules:
| Type | Name |
|------|------|
| List request | `List{Resource}Request` |
| Detail request | `Get{Resource}DetailRequest` |
| Create request | `Create{Resource}Request` |
| Update request | `Update{Resource}Request` |
| Delete request | `Delete{Resource}Request` |
| List response | `List{Resource}Response` |
| Detail response | `{Resource}DetailResponse` |
## Handler template
```go
func (h *UserHandler) GetUserDetail(c *gin.Context) {
var req dto.GetUserDetailRequest
if err := c.ShouldBindJSON(&req); err != nil {
common.ResponseAppError(c, common.WrapAppError(common.CodeParamError, "请求参数错误", err))
return
}
resp, err := h.userService.GetUserDetail(c.Request.Context(), &req)
if err != nil {
common.ResponseAppError(c, err)
return
}
common.ResponseSuccess(c, resp)
}
```
## Paginated response
Prefer a typed response with explicit fields; generic utilities or ad hoc cases may use `common.PageResponse`.
```go
type ListUsersResponse struct {
List []*UserDTO `json:"list"`
Total int64 `json:"total"`
Page int `json:"page"`
PageSize int `json:"page_size"`
}
```
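## Security notes
- Sensitive values must travel only in the request body, and even then must never be written to logs.
- Map parameter binding and validation errors to `CodeParamError` or `CodeValidationFail`.
- Auth, Admin, and Permission middleware must not extract business resource identifiers from path or query parameters.
- Delete endpoints should prefer soft deletion when business recovery or audit traceability is needed.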

View File

@@ -0,0 +1,64 @@
# API Response Specification
All handlers must use the local common response functions defined in `reference/common-runtime.md`.
## Response body
```go
type Response struct {
Code int `json:"code"`
Message string `json:"message"`
Data any `json:"data"`
Timestamp string `json:"timestamp"`
RequestID string `json:"request_id"`
}
```
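Rules:
- Success responses use `CodeSuccess` and keep `message=success`.
- Failure responses use a non-zero business code with `data=null`.
- `timestamp` uses Asia/Shanghai RFC3339, for example `2026-06-29T10:30:00+08:00`.
- `request_id` must come from the request context or the RequestID middleware.
- Handlers return errors through `ResponseAppError(c, err)`.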
## Response functions
```go
common.ResponseSuccess(c, data)
common.ResponseSuccessWithMessage(c, data, "创建成功")
common.ResponseError(c, common.CodeParamError, "请求参数错误")
common.ResponseAppError(c, err)
```
Business handlers must not call Gin's native response methods directly.
## Error mapping
| Error source | Mapped by | Response code |
|----------|------------|--------|
| JSON binding error | Handler | `CodeParamError` or `CodeValidationFail` |
| GORM record not found | Service | `CodeNotFound` |
| Unique key conflict | Service | `CodeDuplicate` |
| Domain validation failure | Service | `CodeValidationFail` or `CodeBusinessError` |
| Insufficient permission | Middleware/Service | `CodeForbidden` |
| External API failure | Service/Adapter | `CodeExternalAPIError` |
| Unknown infrastructure failure | Service or `ToAppError` fallback | `CodeServerError` |
## Pagination structures
```go
type PageRequest struct {
Page int `json:"page" binding:"required,min=1"`
PageSize int `json:"page_size" binding:"required,min=1,max=100"`
}
type PageResponse struct {
List any `json:"list"`
Total int64 `json:"total"`
Page int `json:"page"`
PageSize int `json:"page_size"`
}
```
When the response structure is known, prefer a typed list response; use `PageResponse` only for generic utilities.
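The `page` / `page_size` fields translate into an offset and a page count. A small sketch of that arithmetic follows, with a defensive clamp for callers that bypass request binding. `normalizePage`, `offsetOf`, `totalPages`, and the default page size of 20 are assumptions for illustration; only the maximum of 100 comes from the binding tag above.

```go
package main

import "fmt"

const (
	defaultPageSize = 20  // assumed default, not defined by the spec
	maxPageSize     = 100 // matches the max=100 binding rule
)

// normalizePage clamps page and pageSize into the accepted range.
func normalizePage(page, pageSize int) (int, int) {
	if page < 1 {
		page = 1
	}
	if pageSize < 1 {
		pageSize = defaultPageSize
	}
	if pageSize > maxPageSize {
		pageSize = maxPageSize
	}
	return page, pageSize
}

// offsetOf returns the row offset for a 1-based page number.
func offsetOf(page, pageSize int) int {
	page, pageSize = normalizePage(page, pageSize)
	return (page - 1) * pageSize
}

// totalPages returns how many pages are needed to show total rows.
func totalPages(total int64, pageSize int) int64 {
	_, pageSize = normalizePage(1, pageSize)
	return (total + int64(pageSize) - 1) / int64(pageSize)
}

func main() {
	fmt.Println(offsetOf(3, 20))      // 40
	fmt.Println(offsetOf(0, 1000))    // 0
	fmt.Println(totalPages(101, 20))  // 6
	fmt.Println(totalPages(101, 500)) // 2
}
```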

View File

@@ -0,0 +1,57 @@
# Coding Standards
## Naming conventions
| Object | Rule | Example |
|------|------|------|
| package | Lowercase, short, no underscores | `service`, `dao`, `common` |
| Exported identifier | PascalCase, with a meaningful Chinese comment | `CreateUser` |
| Unexported identifier | camelCase | `mapUserError` |
| Request DTO | `{Action}{Resource}Request` | `CreateUserRequest` |
| Response DTO | `{Resource}{Shape}Response` | `UserDetailResponse` |
| DAO | `{Resource}DAO` | `UserDAO` |
| Service | `{Resource}Service` | `UserService` |
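## Comment conventions
- Exported types, funcs, and methods must have Chinese comments.
- Comments should explain intent, boundaries, or business meaning.
- Avoid empty comments that only restate what the code does.
- Add a short note before non-obvious validation, transaction, audit, or compensation logic.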
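## Error handling
- Do not discard errors with `_ = err`.
- When a service wraps an infrastructure error, it must add business context.
- Handlers convert parameter-binding errors into `common.AppError`.
- Errors returned by DAOs are converted into `common.AppError` by the service.
- Handlers must not inspect GORM errors.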
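## DTO and entity separation
- Request DTOs carry client input.
- Response DTOs define client output.
- Entities define the database persistence structure.
- Mappers convert between entities and DTOs.
- Unless an entity is explicitly designed as an API contract, an API must not return the entity directly.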
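## Implementation discipline
- Solve real business problems first.
- Keep code small enough to review comfortably.
- Add an abstraction only when there is a clear boundary or repeated complexity.
- Prefer explicit dependency injection through constructors.
- Avoid global mutable state, except for the logger and read-only configuration infrastructure.
- Log at workflow boundaries; do not log every line.
- Make failures observable: logs should include key fields such as request ID, user ID, operation, resource ID, and latency.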
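## Forbidden in business code
- Calling `time.Now()` directly outside the common time runtime.
- Using `fmt.Println`, `log.Println`, or ad hoc logging.
- Handlers returning Gin JSON responses directly.
- Binding path/query parameters in business APIs.
- Services importing Gin.
- Handlers importing GORM.
- DAOs importing service or handler packages.
- Building SQL by string concatenation that mixes in user input.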

View File

@@ -0,0 +1,373 @@
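# Self-contained Common Runtime
Use these templates when the target project has no equivalent local foundation. Do not depend on an external common repository or on a fixed path on one machine. Application projects should prefer `internal/common`; use `pkg/common` only when the module genuinely needs to expose these helpers to other modules.
## Package layout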
```text
internal/common/
app_error.go
codes.go
response.go
time.go
logging.go
request_context.go
```
## Error codes and AppError
```go
package common
import (
"errors"
"fmt"
)
const (
CodeSuccess = 0
CodeParamError = 1001
CodeValidationFail = 1002
CodeUnauthorized = 1003
CodeForbidden = 1004
CodeNotFound = 1005
CodeTimeout = 1006
CodeServerError = 1007
CodeDuplicate = 1008
CodeOperationFail = 1009
CodeBusinessError = 2001
CodeDataNotReady = 2002
CodeStatusInvalid = 2003
CodeDependencyError = 2004
CodeExternalAPIError = 2005
CodeResourceLocked = 2006
CodeQuotaExceeded = 2007
CodeConcurrentConflict = 2008
)
var CodeMessage = map[int]string{
CodeSuccess: "success",
CodeParamError: "参数错误",
CodeValidationFail: "数据验证失败",
CodeUnauthorized: "未授权,请先登录",
CodeForbidden: "权限不足,禁止访问",
CodeNotFound: "资源不存在",
CodeTimeout: "请求超时",
CodeServerError: "服务器内部错误",
CodeDuplicate: "数据重复",
CodeOperationFail: "操作失败",
CodeBusinessError: "业务处理失败",
CodeDataNotReady: "数据未就绪",
CodeStatusInvalid: "状态不合法",
CodeDependencyError: "依赖服务错误",
CodeExternalAPIError: "外部服务调用失败",
CodeResourceLocked: "资源被锁定",
CodeQuotaExceeded: "配额超限",
CodeConcurrentConflict: "并发冲突",
}
// AppError 是 service 层向 handler 层暴露的稳定错误契约。
// DAO 返回底层错误,service 负责转换为 AppError。
type AppError struct {
Code int
Message string
Cause error
Fields map[string]any
}
func NewAppError(code int, message string) *AppError {
if message == "" {
message = CodeMessage[code]
}
return &AppError{Code: code, Message: message}
}
func WrapAppError(code int, message string, cause error) *AppError {
appErr := NewAppError(code, message)
appErr.Cause = cause
return appErr
}
func (e *AppError) Error() string {
if e == nil {
return ""
}
if e.Cause == nil {
return fmt.Sprintf("%d:%s", e.Code, e.Message)
}
return fmt.Sprintf("%d:%s: %v", e.Code, e.Message, e.Cause)
}
func (e *AppError) Unwrap() error {
if e == nil {
return nil
}
return e.Cause
}
func AsAppError(err error) (*AppError, bool) {
var appErr *AppError
if errors.As(err, &appErr) {
return appErr, true
}
return nil, false
}
func ToAppError(err error) *AppError {
if err == nil {
return nil
}
if appErr, ok := AsAppError(err); ok {
return appErr
}
return WrapAppError(CodeServerError, CodeMessage[CodeServerError], err)
}
```
## Unified response
```go
package common
import (
"net/http"
"github.com/gin-gonic/gin"
)
// Response 是所有 API 的统一响应体。失败时 Data 必须为 nil。
type Response struct {
Code int `json:"code"`
Message string `json:"message"`
Data any `json:"data"`
Timestamp string `json:"timestamp"`
RequestID string `json:"request_id"`
}
type PageResponse struct {
List any `json:"list"`
Total int64 `json:"total"`
Page int `json:"page"`
PageSize int `json:"page_size"`
}
func ResponseSuccess(c *gin.Context, data any) {
respond(c, http.StatusOK, CodeSuccess, CodeMessage[CodeSuccess], data)
}
func ResponseSuccessWithMessage(c *gin.Context, data any, message string) {
respond(c, http.StatusOK, CodeSuccess, message, data)
}
func ResponseError(c *gin.Context, code int, message string) {
respond(c, httpStatusByCode(code), code, message, nil)
}
func ResponseAppError(c *gin.Context, err error) {
appErr := ToAppError(err)
Error(c.Request.Context(), appErr.Message, "code", appErr.Code, "error", appErr.Cause)
respond(c, httpStatusByCode(appErr.Code), appErr.Code, appErr.Message, nil)
}
func respond(c *gin.Context, httpStatus int, code int, message string, data any) {
if message == "" {
message = CodeMessage[code]
}
c.JSON(httpStatus, Response{
Code: code,
Message: message,
Data: data,
Timestamp: FormatTime(Now()),
RequestID: RequestIDFromGin(c),
})
}
func httpStatusByCode(code int) int {
switch code {
case CodeUnauthorized:
return http.StatusUnauthorized
case CodeForbidden:
return http.StatusForbidden
case CodeParamError, CodeValidationFail:
return http.StatusBadRequest
case CodeNotFound:
return http.StatusNotFound
default:
return http.StatusOK
}
}
```
## Request context
```go
package common
import (
"context"
"github.com/gin-gonic/gin"
)
type contextKey string
const (
ContextKeyRequestID contextKey = "request_id"
ContextKeyUserID contextKey = "user_id"
ContextKeyUsername contextKey = "username"
ContextKeyRole contextKey = "role"
)
func WithRequestID(ctx context.Context, requestID string) context.Context {
return context.WithValue(ctx, ContextKeyRequestID, requestID)
}
func RequestIDFromContext(ctx context.Context) string {
if value, ok := ctx.Value(ContextKeyRequestID).(string); ok {
return value
}
return ""
}
func RequestIDFromGin(c *gin.Context) string {
if value, exists := c.Get(string(ContextKeyRequestID)); exists {
if requestID, ok := value.(string); ok {
return requestID
}
}
return RequestIDFromContext(c.Request.Context())
}
func UserIDFromContext(ctx context.Context) int64 {
if value, ok := ctx.Value(ContextKeyUserID).(int64); ok {
return value
}
return 0
}
```
## Asia/Shanghai time
```go
package common
import "time"
const TimeFormat = time.RFC3339
var shanghaiLocation = mustLoadShanghaiLocation()
func mustLoadShanghaiLocation() *time.Location {
location, err := time.LoadLocation("Asia/Shanghai")
if err != nil {
return time.FixedZone("Asia/Shanghai", 8*60*60)
}
return location
}
// Now 返回东八区当前时间。只有 common/time.go 允许直接调用 time.Now。
func Now() time.Time {
return time.Now().In(shanghaiLocation)
}
func FormatTime(t time.Time) string {
return t.In(shanghaiLocation).Format(TimeFormat)
}
func ParseTime(value string) (time.Time, error) {
parsed, err := time.Parse(TimeFormat, value)
if err != nil {
return time.Time{}, err
}
return parsed.In(shanghaiLocation), nil
}
func StartOfDay(t time.Time) time.Time {
local := t.In(shanghaiLocation)
return time.Date(local.Year(), local.Month(), local.Day(), 0, 0, 0, 0, shanghaiLocation)
}
func EndOfDay(t time.Time) time.Time {
return StartOfDay(t).Add(24*time.Hour - time.Nanosecond)
}
```
## Structured logging
```go
package common
import (
"context"
"fmt"
"log/slog"
"os"
"strings"
)
var logger = slog.New(slog.NewJSONHandler(os.Stdout, &slog.HandlerOptions{Level: slog.LevelInfo}))
func InitLogger(debug bool) {
level := slog.LevelInfo
if debug {
level = slog.LevelDebug
}
logger = slog.New(slog.NewJSONHandler(os.Stdout, &slog.HandlerOptions{Level: level}))
}
func Debug(ctx context.Context, message string, fields ...any) {
logger.DebugContext(ctx, message, appendContextFields(ctx, sanitizeFields(fields)...)...)
}
func Info(ctx context.Context, message string, fields ...any) {
logger.InfoContext(ctx, message, appendContextFields(ctx, sanitizeFields(fields)...)...)
}
func Warn(ctx context.Context, message string, fields ...any) {
logger.WarnContext(ctx, message, appendContextFields(ctx, sanitizeFields(fields)...)...)
}
func Error(ctx context.Context, message string, fields ...any) {
logger.ErrorContext(ctx, message, appendContextFields(ctx, sanitizeFields(fields)...)...)
}
func appendContextFields(ctx context.Context, fields ...any) []any {
if requestID := RequestIDFromContext(ctx); requestID != "" {
fields = append(fields, "request_id", requestID)
}
if userID := UserIDFromContext(ctx); userID > 0 {
fields = append(fields, "user_id", userID)
}
return fields
}
func sanitizeFields(fields []any) []any {
sanitized := make([]any, 0, len(fields))
for i := 0; i < len(fields); i += 2 {
key := fields[i]
if i+1 >= len(fields) {
sanitized = append(sanitized, key)
break
}
value := fields[i+1]
if isSensitiveKey(key) {
value = "***REDACTED***"
}
sanitized = append(sanitized, key, value)
}
return sanitized
}
func isSensitiveKey(key any) bool {
name := strings.ToLower(strings.TrimSpace(strings.ReplaceAll(fmt.Sprint(key), "_", "")))
sensitive := []string{"password", "passwd", "token", "authorization", "secret", "privatekey", "apikey", "cookie"}
for _, item := range sensitive {
if strings.Contains(name, item) {
return true
}
}
return false
}
```

View File

@@ -0,0 +1,226 @@
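# Database Design Patterns
Read this file when the task involves GORM, DAO design, transactions, migrations, or multiple databases.
## General rules
- DAOs own all GORM calls.
- DAOs return low-level errors and do not convert them into response codes.
- Services map DAO errors to `common.AppError` or an explicit domain error.
- Handlers must not import `gorm` or check `gorm.ErrRecordNotFound`.
- Every GORM operation must use `WithContext(ctx)`.
- DAO methods that may take part in a transaction must accept `tx *gorm.DB`.
- DAOs must not call `Transaction`; transactions are opened by services through the Unit of Work.
- Within one transactional workflow, do not mix transactional and non-transactional DAO calls.
- Updates must name their fields explicitly; avoid saving a whole request DTO directly.
- List APIs must use a pagination limit and a stable sort order.
## Single-database design
Use the single-database design when all of a module's data lives in one physical database and a single consistency boundary is enough.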
```text
internal/db/
database.go
internal/dao/
user_dao.go
internal/service/
user_service.go
```
```go
package db
import (
"context"
"gorm.io/gorm"
)
// UnitOfWork 负责单数据库事务边界。
type UnitOfWork struct {
db *gorm.DB
}
func NewUnitOfWork(db *gorm.DB) *UnitOfWork {
return &UnitOfWork{db: db}
}
func (u *UnitOfWork) DB() *gorm.DB {
return u.db
}
func (u *UnitOfWork) Transaction(ctx context.Context, fn func(tx *gorm.DB) error) error {
return u.db.WithContext(ctx).Transaction(func(tx *gorm.DB) error {
return fn(tx)
})
}
```
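The service constructor takes `*db.UnitOfWork` (or an interface it satisfies) and the DAOs it needs. Non-transactional reads pass a `nil` tx; transactional writes must pass the `tx` provided by the Unit of Work to every DAO method.
## Multi-database design
Use the multi-database design when a module reads or writes several physical databases, for example `core`, `user`, `audit`, `ci`, and `billing`.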
```go
package db
import (
"context"
"fmt"
"gorm.io/gorm"
)
type DatabaseName string
const (
DBCore DatabaseName = "core"
DBUser DatabaseName = "user"
DBAudit DatabaseName = "audit"
DBCI DatabaseName = "ci"
)
type Manager struct {
dbs map[DatabaseName]*gorm.DB
}
func NewManager(dbs map[DatabaseName]*gorm.DB) *Manager {
return &Manager{dbs: dbs}
}
func (m *Manager) DB(name DatabaseName) (*gorm.DB, error) {
db, ok := m.dbs[name]
if !ok || db == nil {
return nil, fmt.Errorf("database %s is not configured", name)
}
return db, nil
}
func (m *Manager) Transaction(ctx context.Context, name DatabaseName, fn func(tx *gorm.DB) error) error {
db, err := m.DB(name)
if err != nil {
return err
}
return db.WithContext(ctx).Transaction(func(tx *gorm.DB) error {
return fn(tx)
})
}
```
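Multi-database constraints:
- Every DAO must declare which database it belongs to, in its constructor or file comment.
- One transaction may cover only one physical database.
- Do not fake a cross-database transaction with nested GORM transactions.
- Cross-database workflows should use idempotency keys, an outbox table, retries, reconciliation, or explicit compensating actions.
- Unless the business explicitly requires the audit record to be persisted before success is returned, a failed audit write should not break the main business transaction.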
## DAO with a tx parameter
```go
package dao
import (
"context"
"gorm.io/gorm"
)
type UserDAO struct {
db *gorm.DB
}
func NewUserDAO(db *gorm.DB) *UserDAO {
return &UserDAO{db: db}
}
func (d *UserDAO) session(ctx context.Context, tx *gorm.DB) *gorm.DB {
if tx != nil {
return tx.WithContext(ctx)
}
return d.db.WithContext(ctx)
}
// Create 创建用户。事务内调用必须传入 tx;非事务调用传 nil。
func (d *UserDAO) Create(ctx context.Context, tx *gorm.DB, user *User) error {
return d.session(ctx, tx).Create(user).Error
}
// FindByID 根据用户ID查询用户。未找到时返回 GORM 原始错误。
func (d *UserDAO) FindByID(ctx context.Context, tx *gorm.DB, userID int64) (*User, error) {
var user User
if err := d.session(ctx, tx).First(&user, "id = ?", userID).Error; err != nil {
return nil, err
}
return &user, nil
}
```
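The `User` type above stands for a local entity type. In a real project the entity belongs in `internal/model/entity`; the example is shortened here to highlight the TxDAO rule.
## Service-layer transaction pattern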
```go
func (s *UserService) CreateUser(ctx context.Context, req *dto.CreateUserRequest) (*dto.UserDTO, error) {
var created *entity.User
err := s.uow.Transaction(ctx, func(tx *gorm.DB) error {
exists, err := s.userDAO.ExistsByUsername(ctx, tx, req.Username)
if err != nil {
return common.WrapAppError(common.CodeServerError, "检查用户名失败", err)
}
if exists {
return common.NewAppError(common.CodeDuplicate, "用户名已存在")
}
user := &entity.User{Username: req.Username, Email: req.Email}
if err := s.userDAO.Create(ctx, tx, user); err != nil {
return common.WrapAppError(common.CodeServerError, "创建用户失败", err)
}
created = user
return nil
})
if err != nil {
return nil, err
}
common.Info(ctx, "用户创建成功", "user_id", created.ID, "username", created.Username)
return mapper.ToUserDTO(created), nil
}
```
## GORM error mapping
Error mapping is best done in the service:
```go
func mapUserLookupError(err error) error {
if err == nil {
return nil
}
if errors.Is(err, gorm.ErrRecordNotFound) {
return common.WrapAppError(common.CodeNotFound, "用户不存在", err)
}
return common.WrapAppError(common.CodeServerError, "查询用户失败", err)
}
```
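GORM import rules per layer:
| Layer | May import GORM | Reason |
|------|----------------------|------|
| Handler | No | Handlers should only know HTTP, DTOs, services, and the common unified response |
| Service | Limited | For error mapping and the Unit of Work transaction type |
| DAO | Yes | DAOs own GORM persistence access |
| Entity | Yes | For GORM tags |
| Common runtime | No by default | Exception only when DB/UnitOfWork helpers are explicitly placed there |
## Migration rules
- Do not call `AutoMigrate` casually on the request path.
- Run migrations only at startup or in a dedicated migration command.
- Production schema changes should be idempotent, reviewed, and reversible where possible.
- Index design must consider query patterns, uniqueness constraints, and the pagination approach.
- Backfills on large tables must run in batches and stay observable throughout.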
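The batching rule for large-table backfills can be sketched as a loop over ID windows. This is an illustration under the assumption that the table has a monotonically increasing integer primary key; `backfillInBatches` and its callback are hypothetical, and the callback is where a DAO would run one bounded `UPDATE` per window and emit a progress log.

```go
package main

import "fmt"

// backfillInBatches walks the ID range (0, maxID] in windows of batchSize and
// calls apply once per window. It returns how many windows completed.
func backfillInBatches(maxID, batchSize int64, apply func(fromID, toID int64) error) (int, error) {
	if batchSize <= 0 {
		return 0, fmt.Errorf("batch size must be positive, got %d", batchSize)
	}
	batches := 0
	for from := int64(0); from < maxID; from += batchSize {
		to := from + batchSize
		if to > maxID {
			to = maxID
		}
		if err := apply(from, to); err != nil {
			return batches, fmt.Errorf("batch (%d, %d]: %w", from, to, err)
		}
		batches++
	}
	return batches, nil
}

func main() {
	batches, err := backfillInBatches(2500, 1000, func(fromID, toID int64) error {
		// A real DAO would run a bounded statement here,
		// for example: UPDATE ... WHERE id > ? AND id <= ?
		fmt.Printf("backfill ids in (%d, %d]\n", fromID, toID)
		return nil
	})
	fmt.Println(batches, err)
}
```

Returning the number of completed windows lets a rerun resume from a known point instead of starting over.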

View File

@@ -0,0 +1,153 @@
# Gin and GORM Usage Guidelines
## Gin
### Route groups
Use router groups to express module boundaries, and register business routes only as `POST`.
```go
func RegisterRoutes(r *gin.Engine, userHandler *UserHandler, auditHandler *AuditHandler) {
api := r.Group("/api")
{
users := api.Group("/users")
{
users.POST("/list", userHandler.ListUsers)
users.POST("/detail", userHandler.GetUserDetail)
users.POST("/create", userHandler.CreateUser)
users.POST("/update", userHandler.UpdateUser)
users.POST("/delete", userHandler.DeleteUser)
}
audit := api.Group("/audit")
{
audit.POST("/logs/list", auditHandler.ListLogs)
audit.POST("/logs/export", auditHandler.ExportLogs)
}
}
}
```
### Handler rules
Allowed:
- `ShouldBindJSON`
- `c.Request.Context()`
- `common.ResponseSuccess`
- `common.ResponseAppError`
Forbidden:
- Path parameters
- Query parameters
- Calling Gin response methods directly
- GORM imports
- Business decisions
- Database calls
### Middleware
Register cross-cutting middleware once so that business handlers stay small and clear.
```go
func SetupMiddleware(r *gin.Engine, cfg Config, auditWriter AuditWriter) {
r.Use(common.Recovery())
r.Use(common.RequestIDMiddleware())
r.Use(common.CORSMiddleware(cfg.CORS))
r.Use(common.RateLimitMiddleware(cfg.RateLimit))
r.Use(common.AccessLogMiddleware())
r.Use(common.AuditMiddleware(auditWriter))
}
```
Protected route group example:
```go
users := api.Group("/users")
users.Use(common.AuthMiddleware(jwtSecret))
{
users.POST("/list", userHandler.ListUsers)
users.POST("/detail", userHandler.GetUserDetail)
}
adminUsers := api.Group("/admin/users")
adminUsers.Use(common.AuthMiddleware(jwtSecret), common.RequireAdmin())
{
adminUsers.POST("/create", userHandler.CreateUser)
adminUsers.POST("/delete", userHandler.DeleteUser)
}
```
## GORM
### Only DAOs touch GORM
All GORM calls must live in DAO files. A service touches GORM only to the limited extent needed for transaction types and infrastructure error mapping.
```go
func (d *UserDAO) List(ctx context.Context, tx *gorm.DB, req *dto.ListUsersRequest) ([]*entity.User, int64, error) {
	session := d.session(ctx, tx).Model(&entity.User{})
	if req.Keyword != "" {
		session = session.Where("username ILIKE ?", "%"+req.Keyword+"%")
	}
	if req.Status != "" {
		session = session.Where("status = ?", req.Status)
	}

	var total int64
	if err := session.Count(&total).Error; err != nil {
		return nil, 0, err
	}

	var users []*entity.User
	offset := (req.Page - 1) * req.PageSize
	if err := session.Order("created_at DESC").Limit(req.PageSize).Offset(offset).Find(&users).Error; err != nil {
		return nil, 0, err
	}
	return users, total, nil
}
```
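The `List` example derives the offset from `req.Page` and `req.PageSize`, so a page of zero or a negative page would produce a negative offset unless the values are validated first. A minimal sketch of one way to clamp them follows; the helper name `normalizePage` and the limits `defaultPageSize` and `maxPageSize` are assumptions for illustration and do not appear in the original.

```go
package main

import "fmt"

// Hypothetical limits; the original document does not define them.
const (
	defaultPageSize = 20
	maxPageSize     = 100
)

// normalizePage clamps page and pageSize to safe values and returns the
// limit/offset pair that a DAO could hand to Limit() and Offset().
func normalizePage(page, pageSize int) (limit, offset int) {
	if page < 1 {
		page = 1
	}
	if pageSize < 1 {
		pageSize = defaultPageSize
	}
	if pageSize > maxPageSize {
		pageSize = maxPageSize
	}
	return pageSize, (page - 1) * pageSize
}

func main() {
	limit, offset := normalizePage(3, 50)
	fmt.Println(limit, offset)
}
```

Whether clamping belongs in DTO validation, the Service, or the DAO is a project decision; the sketch only shows the arithmetic.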
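### Raw SQL
Use Raw/Exec only in DAOs, for complex SQL, performance-sensitive queries, or database-specific features. Parameters must be bound; never concatenate untrusted input into SQL.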
```go
func (d *UserDAO) CountActiveUsersByRole(ctx context.Context, tx *gorm.DB) ([]RoleCount, error) {
	var rows []RoleCount
	err := d.session(ctx, tx).Raw(`
		SELECT role, COUNT(*) AS count
		FROM users
		WHERE status = ?
		GROUP BY role
	`, "active").Scan(&rows).Error
	return rows, err
}
```
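### Transactions
See `reference/database-patterns.md` for the full Unit of Work and TxDAO rules. In short:
- The Service opens the transaction.
- The DAO receives `tx *gorm.DB`.
- Every DAO call inside a transaction uses the same `tx`.
- A DAO never opens a transaction on its own.
### Context
Every GORM call must use the request context: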
```go
d.session(ctx, tx).Create(entity)
d.session(ctx, tx).First(&entity.User{}, "id = ?", userID)
```
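### Update rules
- Use `Updates(map[string]any{...})` or an explicit mapping from the update DTO.
- Never persist a request DTO directly.
- Never use `Save` for partial updates.
- State transitions must be validated in the Service before the DAO performs the update.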
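One way to follow the update rules above is to build the column map explicitly from the fields the caller actually provided. The sketch below assumes a hypothetical `UpdateUserRequest` DTO with pointer fields and hypothetical column names `nickname` and `status`; none of these come from the original.

```go
package main

import "fmt"

// UpdateUserRequest is a hypothetical request DTO. Pointer fields let the
// Service tell "not provided" apart from an explicit zero value.
type UpdateUserRequest struct {
	ID       int64
	Nickname *string
	Status   *string
}

// buildUserUpdates copies only the provided fields into an explicit column
// map, which a DAO could then pass to Updates(map[string]any{...}).
func buildUserUpdates(req UpdateUserRequest) map[string]any {
	updates := make(map[string]any)
	if req.Nickname != nil {
		updates["nickname"] = *req.Nickname
	}
	if req.Status != nil {
		updates["status"] = *req.Status
	}
	return updates
}

func main() {
	empty := ""
	// An explicit empty nickname is kept; the untouched status is left out.
	updates := buildUserUpdates(UpdateUserRequest{ID: 1, Nickname: &empty})
	fmt.Println(len(updates), updates)
}
```

Because the map lists columns by name, an intentional zero value such as an empty nickname is still written, while fields the caller never sent are left alone.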


@@ -0,0 +1,95 @@
# Logging conventions
Use only the local common logging functions defined in `reference/common-runtime.md`.
## Logging API
```go
common.InitLogger(debug)
common.Debug(ctx, "调试信息", "key", value)
common.Info(ctx, "业务节点", "key", value)
common.Warn(ctx, "可预期异常", "key", value)
common.Error(ctx, "不可恢复错误", "key", value, "error", err)
```
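Debug mode:
- In development or troubleshooting, debug mode can be enabled through configuration or an environment variable.
- Production defaults to the info level.
- Debug logs must also mask sensitive fields.
## Log level rules
| Level | When to use | Examples |
|------|----------|------|
| Debug | Local diagnostics and detailed flow | Parsed filter conditions, branch decisions, cache hits/misses |
| Info | Important business events that succeeded | Login succeeded, user created, permission granted |
| Warn | Expected but abnormal situations | Validation failure, insufficient permission, rate limit triggered |
| Error | Unexpected failure in the current workflow | Database error, external API failure, transaction rollback |
## Required fields
Include these fields when available:
- `request_id`
- `user_id`
- `username`
- `action`
- `resource_type`
- `resource_id`
- `elapsed_ms`
- `error`
- `external_service`
The common logger should automatically attach the request ID and user ID from the context when available.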
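How the common logger attaches context fields is defined in `reference/common-runtime.md`, which is not shown here. As a rough sketch only, assuming the runtime is built on the standard `log/slog` package and stores the values under private context keys, it could look like this; `ctxKey`, `contextAttrs`, `demoContext`, and this `Info` signature are all hypothetical.

```go
package main

import (
	"context"
	"log/slog"
	"os"
)

// ctxKey is a private key type so context values cannot collide.
type ctxKey string

const (
	requestIDKey ctxKey = "request_id"
	userIDKey    ctxKey = "user_id"
)

// contextAttrs collects the request-scoped fields that are present in ctx.
func contextAttrs(ctx context.Context) []any {
	var attrs []any
	if v, ok := ctx.Value(requestIDKey).(string); ok && v != "" {
		attrs = append(attrs, "request_id", v)
	}
	if v, ok := ctx.Value(userIDKey).(int64); ok && v != 0 {
		attrs = append(attrs, "user_id", v)
	}
	return attrs
}

// Info is a stand-in for common.Info: it prepends the context fields to the
// caller's key/value pairs before delegating to the logger.
func Info(ctx context.Context, logger *slog.Logger, msg string, kv ...any) {
	logger.InfoContext(ctx, msg, append(contextAttrs(ctx), kv...)...)
}

// demoContext builds a context the way a middleware might.
func demoContext() context.Context {
	ctx := context.WithValue(context.Background(), requestIDKey, "req-123")
	return context.WithValue(ctx, userIDKey, int64(42))
}

func main() {
	logger := slog.New(slog.NewJSONHandler(os.Stdout, nil))
	Info(demoContext(), logger, "user created", "resource_id", "u-1")
}
```

Keeping the lookup in one helper means call sites never pass `request_id` by hand, which is the behavior the rule above asks for.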
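## Logging responsibilities by layer
| Layer | Logging responsibility |
|------|----------|
| Handler | Usually does not log; binding failures are handled uniformly through the common response |
| Middleware | Request start/end, authentication failure, permission denial, rate limiting, audit |
| Service | Business success, business rejection, transaction failure |
| DAO | No routine logging by default; errors are returned upward |
| External adapter | Request failure, timeout, retry, fallback |
## Mandatory logs
The following scenarios must be logged:
- Application startup and debug mode status, without secrets.
- Database connection success/failure, without passwords.
- Login success/failure; failure logs use a masked username.
- Registration, password change, user status change.
- Permission grant, revocation, copy, and role change.
- Admin denial or permission denial.
- External call failure, with elapsed time and a masked error.
- Batch results, with aggregate counts.
- Transaction rollback, with the action and resource identifier.
## Sensitive data
Never log:
- Passwords or password hashes.
- Tokens, the Authorization header, cookies, refresh tokens.
- Private keys, secrets, API keys.
- MFA secrets or verification codes.
- Raw request bodies that may contain secrets.
Use masked identifiers: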
```go
common.Warn(ctx, "用户登录失败",
	"username", maskUsername(req.Username),
	"ip", clientIP,
	"reason", "invalid_credential",
)
```
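The example above calls `maskUsername` without defining it. The following is one possible implementation, offered as an assumption rather than the project's actual helper: it keeps the first and last character and hides everything in between.

```go
package main

import (
	"fmt"
	"strings"
)

// maskUsername keeps the first and last character and replaces the rest
// with asterisks. Names of one or two characters are fully masked.
// Working on runes keeps multi-byte usernames intact.
func maskUsername(name string) string {
	runes := []rune(name)
	if len(runes) <= 2 {
		return strings.Repeat("*", len(runes))
	}
	return string(runes[0]) + strings.Repeat("*", len(runes)-2) + string(runes[len(runes)-1])
}

func main() {
	fmt.Println(maskUsername("alice"))
}
```

This variant preserves the length of the name; a stricter policy could emit a fixed number of asterisks instead.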
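## Anti-patterns
- Never use `fmt.Println`, `log.Println`, or an ad hoc global logger.
- Never log per item inside a tight loop; use aggregate counts instead.
- Never log and return the same error at every layer; log it once at the boundary that has meaningful context.
- Do not print a stack trace for ordinary validation or permission failures.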


@@ -0,0 +1,67 @@
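# Project structure conventions
Unless the target repository already has a compatible local convention, prefer the structure below. Generated code must be self-contained within the target module.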
```text
cmd/
server/
internal/
common/ # response, AppError, time, logging, request context
config/
db/ # single-database UnitOfWork or multi-database Manager
handler/ # Gin handlers: bind DTO, call service, return response
middleware/ # auth, admin, permission, request ID, audit, rate limit, CORS
service/ # business orchestration, AppError mapping, transactions, logging
dao/ # GORM persistence only
model/
dto/ # request/response DTO
entity/ # GORM entity
mapper/ # entity <-> DTO conversion
configs/
scripts/
```
## Dependency direction
```text
handler -> service -> dao -> entity
| | |
+----------+--------+-> common
middleware -> common
middleware -> service only when permission/auth logic genuinely needs business data
mapper -> dto + entity
```
Forbidden dependencies:
- `dao -> service`
- `dao -> handler`
- `service -> handler`
- `entity -> dto`
- `handler -> gorm`
- `handler -> dao`
- `common -> project business packages`
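## Layer responsibilities
| Layer | Responsible for | Must not handle |
|------|------|----------|
| Handler | JSON binding, Service calls, common unified response | Business rules, GORM, transactions |
| Service | Business rules, AppError mapping, transactions, business logging | Gin context, HTTP responses |
| DAO | GORM queries and persistence | Business decisions, response codes |
| Entity | Database structure and GORM tags | API presentation structure |
| DTO | API request/response structure | Database tags and persistence behavior |
| Mapper | Entity/DTO conversion | Database calls |
| Common | Cross-cutting foundations | Domain business behavior |
## New module checklist
- Create the DTO first, then write the handler.
- Create or reuse an entity only when persistence is needed.
- Create a mapper whenever entity data is returned externally.
- Wrap every persistence operation in a DAO method.
- Service methods map DAO errors to `common.AppError`.
- Handler methods only bind JSON and call the Service.
- Register only `POST` routes with an action suffix.
- Extend validation script coverage to the changed files.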
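The checklist item about mapping DAO errors to `common.AppError` can be sketched with the standard `errors` package. The shape of `AppError`, the numeric codes, and the sentinel errors below are assumptions for illustration; with GORM, the not-found sentinel would typically be `gorm.ErrRecordNotFound`.

```go
package main

import (
	"errors"
	"fmt"
)

// Stand-ins for infrastructure errors returned by a DAO.
var (
	ErrRecordNotFound = errors.New("record not found")
	errConnLost       = errors.New("connection lost")
)

// Hypothetical response codes.
const (
	CodeNotFound = 404
	CodeInternal = 500
)

// AppError is a hypothetical shape for common.AppError.
type AppError struct {
	Code    int
	Message string
	Cause   error
}

func (e *AppError) Error() string { return e.Message }
func (e *AppError) Unwrap() error { return e.Cause }

// mapDAOError converts infrastructure errors into an AppError at the
// Service boundary, so Handlers never see persistence details.
func mapDAOError(err error) *AppError {
	switch {
	case err == nil:
		return nil
	case errors.Is(err, ErrRecordNotFound):
		return &AppError{Code: CodeNotFound, Message: "resource not found", Cause: err}
	default:
		return &AppError{Code: CodeInternal, Message: "internal error", Cause: err}
	}
}

func main() {
	wrapped := fmt.Errorf("load user: %w", ErrRecordNotFound)
	appErr := mapDAOError(wrapped)
	fmt.Println(appErr.Code, appErr.Message)
	fmt.Println(mapDAOError(errConnLost).Code)
}
```

Using `errors.Is` rather than `==` keeps the mapping correct when the DAO wraps the underlying error with extra context.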


@@ -0,0 +1,170 @@
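# Security and audit rules
Read this file for tasks involving authentication, authorization, JWT claims, admin checks, audit logging, rate limiting, CORS, or sensitive data handling.
## Middleware order
Register middleware in the following order: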
```text
Recovery -> RequestID -> CORS -> RateLimit -> AccessLog -> Auth -> RequireAdmin/RequirePermission -> Audit
```
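Rules:
- Public authentication endpoints may skip `Auth`, but login/registration attempts should still go through RequestID, CORS, RateLimit, AccessLog, and Audit.
- Protected endpoints must use `Auth`.
- Admin endpoints must use `Auth` first, then `RequireAdmin`.
- Permission-guarded endpoints must use `Auth` first, then `RequirePermission`.
- Endpoints that modify data must create an audit event.
## JWT claims
Use a local claims type. The raw token string must not leave the authentication middleware.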
```go
type Claims struct {
	UserID   int64  `json:"user_id"`
	Username string `json:"username"`
	Role     string `json:"role"`
}
```
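Auth middleware responsibilities:
- Read only the `Authorization: Bearer <token>` request header.
- Verify the signature, the expiry, and the issuer/audience required by configuration.
- Write `user_id`, `username`, and `role` into the Gin context and the request context.
- Return `CodeUnauthorized` when the token is missing, malformed, expired, or invalid.
- Log only the request ID, masked username, IP, path, and failure reason; never log token contents.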
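The first responsibility, reading only the `Authorization: Bearer <token>` header, can be sketched with the standard library alone. The helper name `bearerToken` and the sentinel `errUnauthorized` are hypothetical; signature and expiry checks are deliberately left out because they depend on the JWT library the project chooses.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// errUnauthorized is a stand-in for whatever maps to CodeUnauthorized.
var errUnauthorized = errors.New("unauthorized")

// bearerToken extracts the token from an Authorization header value.
// The scheme name is matched case-insensitively; a missing header, another
// scheme, or an empty token all fail the same way.
func bearerToken(header string) (string, error) {
	const prefix = "Bearer "
	if len(header) <= len(prefix) || !strings.EqualFold(header[:len(prefix)], prefix) {
		return "", errUnauthorized
	}
	token := strings.TrimSpace(header[len(prefix):])
	if token == "" {
		return "", errUnauthorized
	}
	return token, nil
}

func main() {
	// Only the outcome is printed; the token itself is never logged.
	_, err := bearerToken("Bearer example-token")
	fmt.Println(err == nil)
}
```

Returning one error for every failure mode keeps the response uniform, and the caller can still log a specific reason without including token contents.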
## Admin and permission checks
```go
func RequireAdmin() gin.HandlerFunc {
	return func(c *gin.Context) {
		role, _ := c.Get("role")
		if role != "admin" && role != "superadmin" {
			common.ResponseError(c, common.CodeForbidden, "需要管理员权限")
			c.Abort()
			return
		}
		c.Next()
	}
}
```
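Permission middleware must:
- Read permission inputs from the JSON body DTO or the context, never from path or query parameters.
- Fail closed when the resource identifier is missing.
- Return `CodeForbidden` when access is denied.
- Log denials at warn level with the fields `user_id`, `resource_type`, `resource_id`, `action`, and `request_id`.
## Sensitive data masking
Never log the following values:
- Passwords, password hashes, old passwords, new passwords
- Tokens, refresh tokens, the Authorization request header, cookies
- Private keys, secrets, API keys, access keys
- MFA secrets, verification codes
- Full phone numbers or email addresses unless strictly necessary
Recommended log fields: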
```go
common.Info(ctx, "用户登录成功", "user_id", user.ID, "username", user.Username, "ip", clientIP)
common.Warn(ctx, "用户登录失败", "username", maskUsername(req.Username), "ip", clientIP, "reason", "invalid_password")
```
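## Audit events
The following operations must be audited:
| Category | Events |
|------|------|
| Authentication | Login success, login failure, logout, token refresh, password change |
| User | Registration, user creation, user update, user disablement, user deletion |
| Permission | Permission grant, permission copy, permission revocation, role change |
| Security | Admin access denial, permission denial, rate limit triggered |
| External impact | Triggering a build, deployment, sync, export, or deletion of remote resources |
Minimum fields of an audit record: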
```go
type AuditEvent struct {
	RequestID    string
	UserID       int64
	Username     string
	Action       string
	ResourceType string
	ResourceID   string
	Result       string
	Reason       string
	IP           string
	UserAgent    string
	CreatedAt    time.Time
}
```
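Audit rules:
- Audit records must not contain passwords, tokens, private keys, or secrets from the request body.
- When a login fails before the user lookup, record the masked username and the IP.
- Permission changes must record the operator, target user, resource, and the old and new values when available.
- An audit write failure must be logged as an error; whether it aborts the request is decided by the business domain.
## Rate limiting
Add rate limiting to at least the following endpoints:
- Login/registration/password reset endpoints
- Token refresh endpoints
- Export endpoints
- Expensive list/search endpoints
- Endpoints that trigger external actions
Suggested defaults:
| API type | Default |
|----------|--------|
| Login | 5 per minute per IP + username |
| Registration | 10 per minute per IP |
| Export | 3 per minute per user |
| External trigger | 10 per minute per user |
| List/search | 60 per minute per user |
A single instance may use in-memory rate limiting; multiple instances must use Redis or another shared store.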
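For the single-instance case, an in-memory limiter can be as small as a fixed-window counter per key. This is a sketch under stated assumptions, not the project's `common.RateLimitMiddleware`: the type `FixedWindowLimiter`, the helpers `newLoginLimiter` and `demoStart`, and the key format are hypothetical, and the current time is passed in so the sketch stays independent of how the project obtains it.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// window tracks how many requests one key made in the current window.
type window struct {
	start time.Time
	count int
}

// FixedWindowLimiter is a single-instance, in-memory limiter.
// It never evicts idle keys; a real one would need cleanup.
type FixedWindowLimiter struct {
	mu      sync.Mutex
	limit   int
	period  time.Duration
	windows map[string]*window
}

func NewFixedWindowLimiter(limit int, period time.Duration) *FixedWindowLimiter {
	return &FixedWindowLimiter{limit: limit, period: period, windows: make(map[string]*window)}
}

// Allow reports whether key may proceed at time now.
func (l *FixedWindowLimiter) Allow(key string, now time.Time) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	w, ok := l.windows[key]
	if !ok || now.Sub(w.start) >= l.period {
		l.windows[key] = &window{start: now, count: 1}
		return true
	}
	if w.count >= l.limit {
		return false
	}
	w.count++
	return true
}

// newLoginLimiter mirrors the suggested login default: 5 per minute.
func newLoginLimiter() *FixedWindowLimiter { return NewFixedWindowLimiter(5, time.Minute) }

// demoStart is a fixed instant used only for the demonstration.
func demoStart() time.Time { return time.Date(2026, 6, 29, 10, 30, 0, 0, time.UTC) }

func main() {
	limiter := newLoginLimiter()
	key := "203.0.113.7|alice" // IP + username
	for i := 1; i <= 6; i++ {
		fmt.Println(i, limiter.Allow(key, demoStart()))
	}
}
```

A fixed window allows short bursts at window boundaries; a sliding window or token bucket is smoother, and a shared store is still required once more than one instance is running.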
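## CORS
CORS rules:
- Never use a wildcard origin in production.
- Allow only the frontend origins listed in configuration.
- Allow only the methods that are needed; business APIs should allow only `POST` and `OPTIONS`.
- Allow the `Authorization`, `Content-Type`, and request ID headers.
- Keep credentials disabled unless cookies are explicitly used.
## Password and token handling
- Store passwords with bcrypt, argon2, or a project-approved password hashing function; never store plaintext.
- Verify password hashes with the constant-time comparison provided by the library.
- The JWT signing secret must be kept out of code and example configuration.
- Secret rotation must have an explicit operational plan.
- On login failure, do not reveal whether the username or the password was wrong; return a uniform authentication failure message.
## Required logs
The following must be logged:
- Startup configuration summary, without secrets.
- Database connection success/failure, without passwords.
- Authentication success/failure, masked.
- Permission denials and admin denials.
- Successful data-modifying operations, with the resource identifier.
- External service call failures, with the endpoint name, elapsed time, and a masked error.
- Transaction failures, with the operation, the entity ID when available, and the request ID.
Never emit noisy per-row logs inside loops; use aggregate counts instead.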


@@ -0,0 +1,66 @@
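# Time handling conventions
Time is a hard consistency rule, not a presentation-layer preference.
## Mandatory standard
- Time zone: `Asia/Shanghai` (`UTC+08:00`)
- API format: RFC3339 with offset, for example `2026-06-29T10:30:00+08:00`
- Storage type: `time.Time`
- Runtime functions: the local `common.Now()`, `common.FormatTime()`, and `common.ParseTime()`
## Forbidden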
```go
time.Now()
time.Parse(layout, value)
t.Format(layout)
```
Outside the local common time runtime, never call the functions above directly. Use instead:
```go
now := common.Now()
timestamp := common.FormatTime(now)
parsed, err := common.ParseTime(req.StartTime)
```
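The common time runtime itself is not shown in this file. As a hedged sketch of what `Now`, `FormatTime`, and `ParseTime` might look like inside that package (the only place where direct `time` calls are permitted), assuming RFC3339 input and a fixed-zone fallback for hosts without a time zone database:

```go
package main

import (
	"fmt"
	"time"
)

// shanghai is resolved once at startup.
var shanghai = loadShanghai()

// loadShanghai falls back to a fixed +08:00 zone when the host has no
// tzdata, so behavior does not depend on the machine's configuration.
func loadShanghai() *time.Location {
	loc, err := time.LoadLocation("Asia/Shanghai")
	if err != nil {
		return time.FixedZone("Asia/Shanghai", 8*60*60)
	}
	return loc
}

// Now returns the current time in Asia/Shanghai.
func Now() time.Time { return time.Now().In(shanghai) }

// FormatTime renders t as RFC3339 with the Asia/Shanghai offset.
func FormatTime(t time.Time) string { return t.In(shanghai).Format(time.RFC3339) }

// ParseTime accepts RFC3339 input and normalizes it to Asia/Shanghai.
func ParseTime(value string) (time.Time, error) {
	parsed, err := time.Parse(time.RFC3339, value)
	if err != nil {
		return time.Time{}, err
	}
	return parsed.In(shanghai), nil
}

func main() {
	parsed, err := ParseTime("2026-06-29T02:30:00Z")
	if err != nil {
		fmt.Println("parse failed:", err)
		return
	}
	fmt.Println(FormatTime(parsed))
}
```

Because `FormatTime` converts before formatting, a value read back from the database in UTC still leaves the API with the `+08:00` offset.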
## GORM entities
```go
type User struct {
	ID        int64     `gorm:"primaryKey;autoIncrement" json:"id"`
	CreatedAt time.Time `gorm:"autoCreateTime" json:"created_at"`
	UpdatedAt time.Time `gorm:"autoUpdateTime" json:"updated_at"`
	ExpiresAt time.Time `json:"expires_at"`
}
```
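Rules:
- Persistence fields may use GORM auto timestamps.
- Business times must be assigned with `common.Now()`.
- Times in API responses are formatted uniformly by the mapper.
- Handlers must not format times manually.
## Mapper example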
```go
func ToUserDTO(user *entity.User) *dto.UserDTO {
	if user == nil {
		return nil
	}
	return &dto.UserDTO{
		ID:        user.ID,
		Username:  user.Username,
		CreatedAt: common.FormatTime(user.CreatedAt),
		UpdatedAt: common.FormatTime(user.UpdatedAt),
	}
}
```
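## Database configuration
- PostgreSQL deployments should set the server/session time zone explicitly.
- Application code remains responsible for converting API output to Asia/Shanghai.
- Never rely on the host machine's local time zone.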


@@ -0,0 +1,21 @@
param(
    [string]$ProjectRoot = "."
)

$ScriptDir = Split-Path -Parent $MyInvocation.MyCommand.Path
$Validator = Join-Path $ScriptDir "validate_go_gin_gorm.py"

$Python = Get-Command python -ErrorAction SilentlyContinue
if ($Python) {
    & $Python.Source $Validator $ProjectRoot
    exit $LASTEXITCODE
}

$PyLauncher = Get-Command py -ErrorAction SilentlyContinue
if ($PyLauncher) {
    & $PyLauncher.Source -3 $Validator $ProjectRoot
    exit $LASTEXITCODE
}

Write-Error "Python 3 is required to run the validator."
exit 2


@@ -0,0 +1,16 @@
#!/usr/bin/env sh
set -eu

SCRIPT_DIR=$(CDPATH= cd -- "$(dirname -- "$0")" && pwd)
PROJECT_ROOT=${1:-.}

if command -v python3 >/dev/null 2>&1; then
    exec python3 "$SCRIPT_DIR/validate_go_gin_gorm.py" "$PROJECT_ROOT"
fi

if command -v python >/dev/null 2>&1; then
    exec python "$SCRIPT_DIR/validate_go_gin_gorm.py" "$PROJECT_ROOT"
fi

echo "Python 3 is required to run the validator." >&2
exit 2


@@ -0,0 +1,214 @@
#!/usr/bin/env python3
"""Cross-platform validator for the developing-go-gin-gorm skill."""

from __future__ import annotations

import argparse
import os
import re
import shutil
import subprocess
import sys
from dataclasses import dataclass
from pathlib import Path

SKIP_DIRS = {
    ".git",
    ".idea",
    ".vscode",
    "vendor",
    "node_modules",
    "dist",
    "build",
    "tmp",
}

FORBIDDEN_ROUTE_METHOD = re.compile(r"\.\s*(GET|PUT|PATCH|DELETE|HEAD|OPTIONS)\s*\(")
FORBIDDEN_GIN_INPUT = re.compile(r"\.\s*(Param|Query|DefaultQuery|ShouldBindQuery)\s*\(")
FORBIDDEN_DIRECT_RESPONSE = re.compile(r"\.\s*(JSON|AbortWithStatusJSON|String|XML|YAML)\s*\(")
FORBIDDEN_TIME = re.compile(r"\btime\s*\.\s*(Now|Parse)\s*\(")
FORBIDDEN_PRINT = re.compile(r"\b(fmt\.Println|log\.Println|println)\s*\(")


@dataclass
class Finding:
    severity: str
    path: Path
    line: int
    message: str

    def render(self, root: Path) -> str:
        rel = self.path.relative_to(root) if self.path.is_relative_to(root) else self.path
        return f"[{self.severity}] {rel}:{self.line}: {self.message}"


def iter_files(root: Path, suffixes: set[str]) -> list[Path]:
    files: list[Path] = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames[:] = [name for name in dirnames if name not in SKIP_DIRS]
        base = Path(dirpath)
        for filename in filenames:
            path = base / filename
            if path.suffix in suffixes:
                files.append(path)
    return sorted(files)


def read_text(path: Path) -> str:
    return path.read_text(encoding="utf-8", errors="replace")


def is_under(path: Path, *names: str) -> bool:
    parts = path.parts
    if len(parts) < len(names):
        return False
    for i in range(0, len(parts) - len(names) + 1):
        if parts[i : i + len(names)] == names:
            return True
    return False


def is_common_runtime(path: Path) -> bool:
    return is_under(path, "internal", "common") or is_under(path, "pkg", "common")


def is_handler(path: Path) -> bool:
    return "handler" in path.parts


def is_service(path: Path) -> bool:
    return "service" in path.parts


def is_dao(path: Path) -> bool:
    return "dao" in path.parts or "repository" in path.parts


def scan_text(root: Path, path: Path, text: str) -> list[Finding]:
    findings: list[Finding] = []
    lines = text.splitlines()

    for index, line in enumerate(lines, start=1):
        if FORBIDDEN_ROUTE_METHOD.search(line):
            findings.append(Finding("ERROR", path, index, "business routes must use POST only"))
        if FORBIDDEN_GIN_INPUT.search(line):
            findings.append(Finding("ERROR", path, index, "business APIs must bind JSON body, not path/query input"))
        if FORBIDDEN_DIRECT_RESPONSE.search(line) and not is_common_runtime(path):
            findings.append(Finding("ERROR", path, index, "use common response helpers instead of direct Gin responses"))
        if FORBIDDEN_TIME.search(line) and not is_common_runtime(path):
            findings.append(Finding("ERROR", path, index, "use common.Now/common.ParseTime instead of direct time calls"))
        if FORBIDDEN_PRINT.search(line):
            findings.append(Finding("ERROR", path, index, "use common structured logging instead of print/log.Println"))
        if is_dao(path) and ".Transaction(" in line:
            findings.append(Finding("ERROR", path, index, "DAO must not start transactions; service owns Unit of Work"))

    if is_handler(path) and "gorm.io/" in text:
        findings.append(Finding("ERROR", path, 1, "handler must not import GORM"))
    if is_handler(path) and "/dao" in text:
        findings.append(Finding("ERROR", path, 1, "handler must not import DAO directly"))
    if is_service(path) and "github.com/gin-gonic/gin" in text:
        findings.append(Finding("ERROR", path, 1, "service must not import Gin"))
    if is_dao(path) and ("/service" in text or "/handler" in text):
        findings.append(Finding("ERROR", path, 1, "DAO must not import service or handler"))

    return findings


def scan_crlf(root: Path) -> list[Finding]:
    findings: list[Finding] = []
    for path in iter_files(root, {".go", ".md", ".sh", ".py", ".ps1", ".bat"}):
        data = path.read_bytes()
        if b"\r\n" in data:
            findings.append(Finding("ERROR", path, 1, "file uses CRLF; use LF to keep scripts portable"))
    return findings


def run_gofmt(root: Path, go_files: list[Path]) -> list[Finding]:
    if not go_files or shutil.which("gofmt") is None:
        return []

    findings: list[Finding] = []
    chunk_size = 100
    for start in range(0, len(go_files), chunk_size):
        chunk = go_files[start : start + chunk_size]
        result = subprocess.run(
            ["gofmt", "-l", *[str(path) for path in chunk]],
            cwd=root,
            text=True,
            capture_output=True,
            check=False,
        )
        if result.returncode != 0:
            findings.append(Finding("ERROR", root, 1, f"gofmt failed: {result.stderr.strip()}"))
            continue
        for filename in result.stdout.splitlines():
            findings.append(Finding("ERROR", Path(filename), 1, "file is not gofmt-formatted"))
    return findings


def run_go_test(root: Path) -> list[Finding]:
    if shutil.which("go") is None:
        return [Finding("WARN", root, 1, "go binary not found; skipped go test")]
    if not (root / "go.mod").exists():
        return [Finding("WARN", root, 1, "go.mod not found; skipped go test")]

    result = subprocess.run(
        ["go", "test", "./..."],
        cwd=root,
        text=True,
        capture_output=True,
        check=False,
    )
    if result.returncode == 0:
        return []

    output = (result.stdout + "\n" + result.stderr).strip()
    first_line = output.splitlines()[0] if output else "go test failed"
    return [Finding("ERROR", root, 1, first_line)]


def validate(root: Path, run_tests: bool, skip_gofmt: bool) -> list[Finding]:
    findings: list[Finding] = []
    go_files = iter_files(root, {".go"})

    findings.extend(scan_crlf(root))
    for path in go_files:
        findings.extend(scan_text(root, path, read_text(path)))
    if not skip_gofmt:
        findings.extend(run_gofmt(root, go_files))
    if run_tests:
        findings.extend(run_go_test(root))
    return findings


def main() -> int:
    parser = argparse.ArgumentParser(description="Validate Go Gin/GORM engineering rules.")
    parser.add_argument("root", nargs="?", default=".", help="project root to validate")
    parser.add_argument("--run-go-test", action="store_true", help="also run go test ./...")
    parser.add_argument("--skip-gofmt", action="store_true", help="skip gofmt -l check")
    args = parser.parse_args()

    root = Path(args.root).resolve()
    if not root.exists():
        print(f"target root does not exist: {root}", file=sys.stderr)
        return 2

    findings = validate(root, run_tests=args.run_go_test, skip_gofmt=args.skip_gofmt)
    errors = [finding for finding in findings if finding.severity == "ERROR"]
    warnings = [finding for finding in findings if finding.severity == "WARN"]

    for finding in findings:
        print(finding.render(root))

    if errors:
        print(f"\nValidation failed: {len(errors)} error(s), {len(warnings)} warning(s).")
        return 1

    print(f"Validation passed: 0 error(s), {len(warnings)} warning(s).")
    return 0


if __name__ == "__main__":
    raise SystemExit(main())


@@ -0,0 +1,76 @@
---
name: create-adaptable-composable
description: Create a library-grade Vue composable that accepts maybe-reactive inputs (MaybeRef / MaybeRefOrGetter) so callers can pass a plain value, ref, or getter. Normalize inputs with toValue()/toRef() inside reactive effects (watch/watchEffect) to keep behavior predictable and reactive. Use this skill when user asks for creating adaptable or reusable composables.
license: MIT
metadata:
author: github.com/vuejs-ai
version: "17.0.0"
compatibility: Requires Vue 3 (or above) or Nuxt 3 (or above) project
---
# Create Adaptable Composable
Adaptable composables are reusable functions that can accept both reactive and non-reactive inputs. This allows developers to use the composable in a variety of contexts without worrying about the reactivity of the inputs.
Steps to design an adaptable composable in Vue.js:
1. Confirm the composable's purpose, its API design, and its expected inputs and outputs.
2. Identify the input parameters that should be reactive (MaybeRef / MaybeRefOrGetter).
3. Use `toValue()` or `toRef()` to normalize inputs inside reactive effects.
4. Implement the core logic of the composable using Vue's reactivity APIs.
## Core Type Concepts
### Type Utilities
```ts
/**
* value or writable ref (value/ref/shallowRef/writable computed)
*/
export type MaybeRef<T = any> = T | Ref<T> | ShallowRef<T> | WritableComputedRef<T>;
/**
* MaybeRef<T> + ComputedRef<T> + () => T
*/
export type MaybeRefOrGetter<T = any> = MaybeRef<T> | ComputedRef<T> | (() => T);
```
### Policy and Rules
- Read-only, computed-friendly input: use `MaybeRefOrGetter`
- Needs to be writable / two-way input: use `MaybeRef`
- Parameter might be a function value (callback/predicate/comparator): do not use `MaybeRefOrGetter`, or you may accidentally invoke it as a getter.
- DOM/Element targets: if you want computed/derived targets, use `MaybeRefOrGetter`.
When `MaybeRefOrGetter` or `MaybeRef` is used:
- resolve reactive value using `toRef()` (e.g. watcher source)
- resolve non-reactive value using `toValue()`
### Examples
Adaptable `useDocumentTitle` Composable: read-only title parameter
```ts
import { watch, toRef } from 'vue'
import type { MaybeRefOrGetter } from 'vue'
export function useDocumentTitle(title: MaybeRefOrGetter<string>) {
  watch(toRef(title), (t) => {
    document.title = t
  }, { immediate: true })
}
```
Adaptable `useCounter` Composable: two-way writable count parameter
```ts
import { toRef } from 'vue'
import type { MaybeRef } from 'vue'

function useCounter(count: MaybeRef<number>) {
  const countRef = toRef(count)
  function add() {
    countRef.value++
  }
  return { add }
}
```


@@ -0,0 +1,177 @@
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS


@@ -0,0 +1,55 @@
---
name: frontend-design
description: Guidance for distinctive, intentional visual design when building new UI or reshaping an existing one. Helps with aesthetic direction, typography, and making choices that don't read as templated defaults.
license: Complete terms in LICENSE.txt
---
# Frontend Design
Approach this as the design lead at a small studio known for giving every client a visual identity that could not be mistaken for anyone else's. This client has already rejected proposals that felt templated, and is paying for a distinctive point of view: make deliberate, opinionated choices about palette, typography, and layout that are specific to this brief, and take one real aesthetic risk you can justify.
## Ground it in the subject
If the brief does not pin down what the product or subject is, pin it yourself before designing: name one concrete subject, its audience, and the page's single job, and state your choice. If there's any information in your memory about the human's preferences, context about what they're building, or designs you've made before, use that as a hint. The subject's own world, its materials, instruments, artifacts, and vernacular, is where distinctive choices come from. Build with the brief's real content and subject matter throughout.
## Design principles
For web designs, the hero is a thesis. Open with the most characteristic thing in the subject's world, in whatever form makes sense for it: a headline, an image, an animation, a live demo, an interactive moment. Be deliberate with your choice: a big number with a small label, supporting stats, and a gradient accent is the template answer; use it only if that's truly the best option.
Typography carries the personality of the page. Pair the display and body faces deliberately, not the same families you would reach for on any other project, and set a clear type scale with intentional weights, widths, and spacing. Make the type treatment itself a memorable part of the design, not a neutral delivery vehicle for the content.
Structure is information. Structural devices, numbering, eyebrows, dividers, labels, should encode something true about the content, not decorate it. Many generic designs use numbered markers (01 / 02 / 03), but that's only appropriate if the content actually is a sequence - like a real process or a typed timeline where order carries information the reader needs. Question if choices like numbered markers actually make sense before incorporating them.
Leverage motion deliberately. Think about where and if animation can serve the subject: a page-load sequence, a scroll-triggered reveal, hover micro-interactions, ambient atmosphere. An orchestrated moment usually lands harder than scattered effects; choose what the direction calls for. However, sometimes less is more, and extra animation contributes to the feeling that the design is AI-generated.
Match complexity to the vision. Maximalist directions need elaborate execution; minimal directions need precision in spacing, type, and detail. Elegance is executing the chosen vision well.
Consider written content carefully. Often a design brief may not contain real content, and it's up to you to come up with copy. Copy can make a design feel as templated as the design itself. See the below section on writing for more guidance.
## Process: brainstorm, explore, plan, critique, build, critique again
For calibration: AI-generated design right now clusters around three looks: (1) a warm cream background (near #F4F1EA) with a high-contrast serif display and a terracotta accent; (2) a near-black background with a single bright acid-green or vermilion accent; (3) a broadsheet-style layout with hairline rules, zero border-radius, and dense newspaper-like columns. All three are legitimate for some briefs, but they are defaults rather than choices, and they appear regardless of subject. Where the brief pins down a visual direction, follow it exactly — the brief's own words always win, including when it asks for one of these looks. Where it leaves an axis free, don't spend that freedom on one of these defaults. Just like a human designer who's hired, there's often a careful balance between doing what you're good at and taking each project as a chance to experiment and learn.
Work in two passes. First, brainstorm a short design plan based on the human's design brief: create a compact token system with color, type, layout, and signature. Color: describe the palette as 4-6 named hex values. Type: the typefaces for 2+ roles (a characterful display face that's used with restraint, a complementary body face, and a utility face for captions or data if needed). Layout: a layout concept, using one-sentence prose descriptions and ASCII wireframes to ideate and compare. Signature: the single unique element this page will be remembered by that embodies the brief in an appropriate way.
Then review that plan against the brief before building: if any part of it reads like the generic default you would produce for any similar page (work through a similar prompt to see if you arrive somewhere similar) rather than a choice made for this specific brief — revise that part, say what you changed and why. Only after you've confirmed the relative uniqueness of your design plan should you start to write the code, following the revised plan exactly and deriving every color and type decision from it.
When writing the code, be careful about how you structure CSS selector specificity. It's easy to generate CSS classes that cancel each other out (especially with a type-based selector like .section and an element-based selector like .cta). This happens often with paddings/margins between sections.
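As a rough illustration of the point above, the following Python sketch estimates specificity for simple selectors and shows why two single-class rules are settled by source order alone. It is not part of the original guidance: the function names and the sample selectors are hypothetical, and the parser only counts ids, classes, and bare type names.

```python
import re

def specificity(selector: str) -> tuple:
    """Approximate (ids, classes, types) for a simple selector.

    Assumption: only #id, .class, and bare type names are counted;
    attribute selectors, pseudo-classes, and :not() are ignored.
    """
    ids = len(re.findall(r"#[\w-]+", selector))
    classes = len(re.findall(r"\.[\w-]+", selector))
    # A type name sits at the start or right after whitespace or a combinator.
    types = len(re.findall(r"(?:^|[\s>+~])[a-zA-Z][\w-]*", selector))
    return (ids, classes, types)

def winning_rule(rules):
    """Return the selector whose declaration wins for one property.

    `rules` lists selectors in source order that all match the same element
    and set the same property. Highest specificity wins; ties go to the
    rule that appears last.
    """
    ranked = [(specificity(sel), order, sel) for order, sel in enumerate(rules)]
    return max(ranked)[2]

# Hypothetical case: both rules set `padding` on an element with class="section cta".
print(specificity(".section"), specificity(".cta"))  # equal specificity
print(winning_rule([".section", ".cta"]))            # the later rule wins the tie
print(winning_rule([".section.cta", ".cta"]))        # the compound selector wins outright
```

When two rules tie like this, reordering the stylesheet silently changes the spacing, which is one reason to give section-level and element-level spacing distinct properties or an explicit compound selector.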
Try to do a lot of this planning and iteration in your thinking, and only show ideas to the user when you have higher confidence it'll delight them.
## Restraint and self-critique
Spend your boldness in one place. Let the signature element be the one memorable thing, keep everything around it quiet and disciplined, and cut any decoration that does not serve the brief. Not taking a risk can be a risk itself! Build to a quality floor without announcing it: responsive down to mobile, visible keyboard focus, reduced motion respected. Critique your own work as you build, taking screenshots if your environment supports it; a picture is worth 1000 tokens. Consider Chanel's advice: before leaving the house, take a look in the mirror and remove one accessory. Human creators have memory and always try to do something new, so if you have a space to quickly jot down notes about what you've tried, it can help you in future passes.
## More on writing in design
Words appear in a design for one reason: to make it easier to understand, and therefore easier to use. They are design material, not decoration. Bring the same intentionality to copy that you would bring to spacing and color. Before writing anything, ask what the design needs to say, and how it can best be said to help the person navigate the experience.
Write from the end user's side of the screen. Name things by what people control and recognize, never by how the system is built. A person manages notifications, not webhook config. Describe what something does in plain terms rather than selling it. Being specific is always better than being clever.
Use active voice as default. A control should say exactly what happens when it's used: "Save changes," not "Submit." An action keeps the same name through the whole flow, so the button that says "Publish" produces a toast that says "Published." The vocabulary of an interface is the signposting for someone navigating the product. Cohesion and consistency are how people learn their way around.
Treat failure and emptiness as moments for direction, not mood. Explain what went wrong and how to fix it, in the interface's voice rather than a person's. Errors don't apologize, and they are never vague about what happened. An empty screen is an invitation to act.
Keep the register conversational and tuned: plain verbs, sentence case, no filler, with tone matched to the brand and the audience. Let each element do exactly one job. A label labels, an example demonstrates, and nothing quietly does double duty.

View File

@@ -0,0 +1,202 @@
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright 2026 Anthropic, PBC.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

View File

@@ -0,0 +1,485 @@
---
name: skill-creator
description: Create new skills, modify and improve existing skills, and measure skill performance. Use when users want to create a skill from scratch, edit, or optimize an existing skill, run evals to test a skill, benchmark skill performance with variance analysis, or optimize a skill's description for better triggering accuracy.
---
# Skill Creator
A skill for creating new skills and iteratively improving them.
At a high level, the process of creating a skill goes like this:
- Decide what you want the skill to do and roughly how it should do it
- Write a draft of the skill
- Create a few test prompts and run claude-with-access-to-the-skill on them
- Help the user evaluate the results both qualitatively and quantitatively
- While the runs happen in the background, draft some quantitative evals if there aren't any (if there are some, you can either use as is or modify if you feel something needs to change about them). Then explain them to the user (or if they already existed, explain the ones that already exist)
- Use the `eval-viewer/generate_review.py` script to show the user the results for them to look at, and also let them look at the quantitative metrics
- Rewrite the skill based on feedback from the user's evaluation of the results (and also if there are any glaring flaws that become apparent from the quantitative benchmarks)
- Repeat until you're satisfied
- Expand the test set and try again at larger scale
Your job when using this skill is to figure out where the user is in this process and then jump in and help them progress through these stages. So for instance, maybe they're like "I want to make a skill for X". You can help narrow down what they mean, write a draft, write the test cases, figure out how they want to evaluate, run all the prompts, and repeat.
On the other hand, maybe they already have a draft of the skill. In this case you can go straight to the eval/iterate part of the loop.
Of course, you should always be flexible and if the user is like "I don't need to run a bunch of evaluations, just vibe with me", you can do that instead.
Then after the skill is done (but again, the order is flexible), you can also run the skill description improver, which we have a whole separate script for, to optimize the triggering of the skill.
Cool? Cool.
## Communicating with the user
The skill creator is liable to be used by people across a wide range of familiarity with coding jargon. If you haven't heard (and how could you, it's only very recently that it started), there's a trend now where the power of Claude is inspiring plumbers to open up their terminals, parents and grandparents to google "how to install npm". On the other hand, the bulk of users are probably fairly computer-literate.
So please pay attention to context cues to understand how to phrase your communication! In the default case, just to give you some idea:
- "evaluation" and "benchmark" are borderline, but OK
- for "JSON" and "assertion" you want to see serious cues from the user that they know what those things are before using them without explaining them
It's OK to briefly explain a term with a short definition if you're unsure whether the user will know it.
---
## Creating a skill
### Capture Intent
Start by understanding the user's intent. The current conversation might already contain a workflow the user wants to capture (e.g., they say "turn this into a skill"). If so, extract answers from the conversation history first — the tools used, the sequence of steps, corrections the user made, input/output formats observed. The user may need to fill the gaps, and should confirm before proceeding to the next step.
1. What should this skill enable Claude to do?
2. When should this skill trigger? (what user phrases/contexts)
3. What's the expected output format?
4. Should we set up test cases to verify the skill works? Skills with objectively verifiable outputs (file transforms, data extraction, code generation, fixed workflow steps) benefit from test cases. Skills with subjective outputs (writing style, art) often don't need them. Suggest the appropriate default based on the skill type, but let the user decide.
### Interview and Research
Proactively ask questions about edge cases, input/output formats, example files, success criteria, and dependencies. Wait to write test prompts until you've got this part ironed out.
Check available MCPs - if useful for research (searching docs, finding similar skills, looking up best practices), research in parallel via subagents if available, otherwise inline. Come prepared with context to reduce burden on the user.
### Write the SKILL.md
Based on the user interview, fill in these components:
- **name**: Skill identifier
- **description**: When to trigger, what it does. This is the primary triggering mechanism - include both what the skill does AND specific contexts for when to use it. All "when to use" info goes here, not in the body. Note: currently Claude has a tendency to "undertrigger" skills -- to not use them when they'd be useful. To combat this, please make the skill descriptions a little bit "pushy". So for instance, instead of "How to build a simple fast dashboard to display internal Anthropic data.", you might write "How to build a simple fast dashboard to display internal Anthropic data. Make sure to use this skill whenever the user mentions dashboards, data visualization, internal metrics, or wants to display any kind of company data, even if they don't explicitly ask for a 'dashboard.'"
- **compatibility**: Required tools, dependencies (optional, rarely needed)
- **the rest of the skill :)**
### Skill Writing Guide
#### Anatomy of a Skill
```
skill-name/
├── SKILL.md (required)
│   ├── YAML frontmatter (name, description required)
│   └── Markdown instructions
└── Bundled Resources (optional)
    ├── scripts/    - Executable code for deterministic/repetitive tasks
    ├── references/ - Docs loaded into context as needed
    └── assets/     - Files used in output (templates, icons, fonts)
```
#### Progressive Disclosure
Skills use a three-level loading system:
1. **Metadata** (name + description) - Always in context (~100 words)
2. **SKILL.md body** - In context whenever skill triggers (<500 lines ideal)
3. **Bundled resources** - As needed (unlimited, scripts can execute without loading)
These word counts are approximate and you can feel free to go longer if needed.
**Key patterns:**
- Keep SKILL.md under 500 lines; if you're approaching this limit, add an additional layer of hierarchy along with clear pointers about where the model using the skill should go next to follow up.
- Reference files clearly from SKILL.md with guidance on when to read them
- For large reference files (>300 lines), include a table of contents
**Domain organization**: When a skill supports multiple domains/frameworks, organize by variant:
```
cloud-deploy/
├── SKILL.md (workflow + selection)
└── references/
    ├── aws.md
    ├── gcp.md
    └── azure.md
```
Claude reads only the relevant reference file.
#### Principle of Lack of Surprise
This goes without saying, but skills must not contain malware, exploit code, or any content that could compromise system security. A skill's contents should not surprise the user in their intent if described. Don't go along with requests to create misleading skills or skills designed to facilitate unauthorized access, data exfiltration, or other malicious activities. Things like a "roleplay as an XYZ" are OK though.
#### Writing Patterns
Prefer using the imperative form in instructions.
**Defining output formats** - You can do it like this:
```markdown
## Report structure
ALWAYS use this exact template:
# [Title]
## Executive summary
## Key findings
## Recommendations
```
**Examples pattern** - It's useful to include examples. You can format them like this (but if "Input" and "Output" are in the examples you might want to deviate a little):
```markdown
## Commit message format
**Example 1:**
Input: Added user authentication with JWT tokens
Output: feat(auth): implement JWT-based authentication
```
### Writing Style
Try to explain to the model why things are important in lieu of heavy-handed musty MUSTs. Use theory of mind and try to make the skill general and not super-narrow to specific examples. Start by writing a draft and then look at it with fresh eyes and improve it.
### Test Cases
After writing the skill draft, come up with 2-3 realistic test prompts — the kind of thing a real user would actually say. Share them with the user: [you don't have to use this exact language] "Here are a few test cases I'd like to try. Do these look right, or do you want to add more?" Then run them.
Save test cases to `evals/evals.json`. Don't write assertions yet — just the prompts. You'll draft assertions in the next step while the runs are in progress.
```json
{
  "skill_name": "example-skill",
  "evals": [
    {
      "id": 1,
      "prompt": "User's task prompt",
      "expected_output": "Description of expected result",
      "files": []
    }
  ]
}
```
See `references/schemas.md` for the full schema (including the `assertions` field, which you'll add later).
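If it helps to generate the file rather than hand-write it, here is a minimal Python sketch that builds the structure shown above and saves it under `evals/evals.json`. The helper names `build_evals` and `save_evals` are hypothetical, and numbering ids from 1 simply follows the example; `references/schemas.md` remains the authority on the full schema.

```python
import json
import tempfile
from pathlib import Path

def build_evals(skill_name, cases):
    """Build the evals.json structure from (prompt, expected_output) pairs.

    Assertions are deliberately left out; they are drafted in a later step.
    """
    return {
        "skill_name": skill_name,
        "evals": [
            {"id": i, "prompt": prompt, "expected_output": expected, "files": []}
            for i, (prompt, expected) in enumerate(cases, start=1)
        ],
    }

def save_evals(skill_dir, data):
    """Write the structure to <skill_dir>/evals/evals.json and return the path."""
    path = Path(skill_dir) / "evals" / "evals.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(data, indent=2), encoding="utf-8")
    return path

# Demo in a throwaway directory so nothing real is overwritten.
with tempfile.TemporaryDirectory() as tmp:
    data = build_evals(
        "example-skill",
        [("User's task prompt", "Description of expected result")],
    )
    saved = save_evals(Path(tmp) / "example-skill", data)
    print(saved.read_text(encoding="utf-8"))
```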
## Running and evaluating test cases
This section is one continuous sequence — don't stop partway through. Do NOT use `/skill-test` or any other testing skill.
Put results in `<skill-name>-workspace/` as a sibling to the skill directory. Within the workspace, organize results by iteration (`iteration-1/`, `iteration-2/`, etc.) and within that, each test case gets a directory (`eval-0/`, `eval-1/`, etc.). Don't create all of this upfront — just create directories as you go.
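The layout described above can be sketched as a small path helper. This is only an illustration: `run_dir` is a hypothetical name, and the `with_skill/outputs` tail is taken from the run template later in this section.

```python
from pathlib import Path

def run_dir(skill_dir, iteration, eval_name, config):
    """Return <skill-name>-workspace/iteration-N/<eval>/<config>/outputs.

    The workspace is a sibling of the skill directory, not a child of it.
    """
    skill_dir = Path(skill_dir)
    workspace = skill_dir.parent / f"{skill_dir.name}-workspace"
    return workspace / f"iteration-{iteration}" / eval_name / config / "outputs"

path = run_dir("skills/pdf-filler", 1, "eval-0", "with_skill")
print(path.as_posix())
# Create it only when a run actually needs it:
# path.mkdir(parents=True, exist_ok=True)
```

Computing the path on demand, and calling `mkdir` only at the point of use, matches the advice to create directories as you go.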
### Step 1: Spawn all runs (with-skill AND baseline) in the same turn
For each test case, spawn two subagents in the same turn — one with the skill, one without. This is important: don't spawn the with-skill runs first and then come back for baselines later. Launch everything at once so it all finishes around the same time.
**With-skill run:**
```
Execute this task:
- Skill path: <path-to-skill>
- Task: <eval prompt>
- Input files: <eval files if any, or "none">
- Save outputs to: <workspace>/iteration-<N>/eval-<ID>/with_skill/outputs/
- Outputs to save: <what the user cares about — e.g., "the .docx file", "the final CSV">
```
**Baseline run** (same prompt, but the baseline depends on context):
- **Creating a new skill**: no skill at all. Same prompt, no skill path, save to `without_skill/outputs/`.
- **Improving an existing skill**: the old version. Before editing, snapshot the skill (`cp -r <skill-path> <workspace>/skill-snapshot/`), then point the baseline subagent at the snapshot. Save to `old_skill/outputs/`.
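For the snapshot step, a Python equivalent of the `cp -r` command might look like the sketch below. `snapshot_skill` is a hypothetical helper, and refusing to overwrite an existing snapshot is an assumption made here so the baseline cannot be clobbered by accident.

```python
import shutil
import tempfile
from pathlib import Path

def snapshot_skill(skill_path, workspace):
    """Copy the skill directory to <workspace>/skill-snapshot before editing."""
    dest = Path(workspace) / "skill-snapshot"
    if dest.exists():
        # Assumption: an existing snapshot is the baseline and must be kept.
        raise FileExistsError(f"snapshot already exists: {dest}")
    shutil.copytree(skill_path, dest)
    return dest

# Demo in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    skill = Path(tmp) / "my-skill"
    skill.mkdir()
    (skill / "SKILL.md").write_text("version 1", encoding="utf-8")
    snap = snapshot_skill(skill, Path(tmp) / "my-skill-workspace")
    (skill / "SKILL.md").write_text("version 2", encoding="utf-8")
    print((snap / "SKILL.md").read_text(encoding="utf-8"))  # prints "version 1"
```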
Write an `eval_metadata.json` for each test case (assertions can be empty for now). Give each eval a descriptive name based on what it's testing — not just "eval-0". Use this name for the directory too. If this iteration uses new or modified eval prompts, create these files for each new eval directory — don't assume they carry over from previous iterations.
```json
{
  "eval_id": 0,
  "eval_name": "descriptive-name-here",
  "prompt": "The user's task prompt",
  "assertions": []
}
```
### Step 2: While runs are in progress, draft assertions
Don't just wait for the runs to finish — you can use this time productively. Draft quantitative assertions for each test case and explain them to the user. If assertions already exist in `evals/evals.json`, review them and explain what they check.
Good assertions are objectively verifiable and have descriptive names — they should read clearly in the benchmark viewer so someone glancing at the results immediately understands what each one checks. Subjective skills (writing style, design quality) are better evaluated qualitatively — don't force assertions onto things that need human judgment.
Update the `eval_metadata.json` files and `evals/evals.json` with the assertions once drafted. Also explain to the user what they'll see in the viewer — both the qualitative outputs and the quantitative benchmark.
### Step 3: As runs complete, capture timing data
When each subagent task completes, you receive a notification containing `total_tokens` and `duration_ms`. Save this data immediately to `timing.json` in the run directory:
```json
{
  "total_tokens": 84852,
  "duration_ms": 23332,
  "total_duration_seconds": 23.3
}
```
This is the only opportunity to capture this data — it comes through the task notification and isn't persisted elsewhere. Process each notification as it arrives rather than trying to batch them.
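One way to make the capture mechanical is a tiny helper called as each notification arrives. The sketch below is an assumption-laden illustration: `save_timing` is a hypothetical name, and deriving `total_duration_seconds` by rounding `duration_ms / 1000` to one decimal is inferred from the example values rather than stated anywhere.

```python
import json
import tempfile
from pathlib import Path

def save_timing(run_dir, total_tokens, duration_ms):
    """Write timing.json into the run directory and return its path."""
    timing = {
        "total_tokens": total_tokens,
        "duration_ms": duration_ms,
        # Assumption: seconds are the millisecond value rounded to one decimal.
        "total_duration_seconds": round(duration_ms / 1000, 1),
    }
    run_dir = Path(run_dir)
    run_dir.mkdir(parents=True, exist_ok=True)
    path = run_dir / "timing.json"
    path.write_text(json.dumps(timing, indent=2), encoding="utf-8")
    return path

# Demo with the numbers from the example, written to a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    saved = save_timing(Path(tmp) / "with_skill", total_tokens=84852, duration_ms=23332)
    print(saved.read_text(encoding="utf-8"))
```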
### Step 4: Grade, aggregate, and launch the viewer
Once all runs are done:
1. **Grade each run** — spawn a grader subagent (or grade inline) that reads `agents/grader.md` and evaluates each assertion against the outputs. Save results to `grading.json` in each run directory. The grading.json expectations array must use the fields `text`, `passed`, and `evidence` (not `name`/`met`/`details` or other variants) — the viewer depends on these exact field names. For assertions that can be checked programmatically, write and run a script rather than eyeballing it — scripts are faster, more reliable, and can be reused across iterations.
2. **Aggregate into benchmark** — run the aggregation script from the skill-creator directory:
```bash
python -m scripts.aggregate_benchmark <workspace>/iteration-N --skill-name <name>
```
This produces `benchmark.json` and `benchmark.md` with pass_rate, time, and tokens for each configuration, with mean ± stddev and the delta. If generating benchmark.json manually, see `references/schemas.md` for the exact schema the viewer expects.
Put each with_skill version before its baseline counterpart.
3. **Do an analyst pass** — read the benchmark data and surface patterns the aggregate stats might hide. See `agents/analyzer.md` (the "Analyzing Benchmark Results" section) for what to look for — things like assertions that always pass regardless of skill (non-discriminating), high-variance evals (possibly flaky), and time/token tradeoffs.
4. **Launch the viewer** with both qualitative outputs and quantitative data:
```bash
nohup python <skill-creator-path>/eval-viewer/generate_review.py \
  <workspace>/iteration-N \
  --skill-name "my-skill" \
  --benchmark <workspace>/iteration-N/benchmark.json \
  > /dev/null 2>&1 &
VIEWER_PID=$!
```
For iteration 2+, also pass `--previous-workspace <workspace>/iteration-<N-1>`.
**Cowork / headless environments:** If `webbrowser.open()` is not available or the environment has no display, use `--static <output_path>` to write a standalone HTML file instead of starting a server. Feedback will be downloaded as a `feedback.json` file when the user clicks "Submit All Reviews". After download, copy `feedback.json` into the workspace directory for the next iteration to pick up.
Note: please use generate_review.py to create the viewer; there's no need to write custom HTML.
5. **Tell the user** something like: "I've opened the results in your browser. There are two tabs — 'Outputs' lets you click through each test case and leave feedback, 'Benchmark' shows the quantitative comparison. When you're done, come back here and let me know."
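Because the viewer depends on exact field names in `grading.json` (item 1 above), a quick check before launching it can catch variants like `name`/`met`/`details`. The sketch below is hypothetical: `check_grading` is an invented helper, and it verifies only the `expectations` array and its three required fields, since the rest of the schema lives in `references/schemas.md`.

```python
REQUIRED_FIELDS = ("text", "passed", "evidence")

def check_grading(grading):
    """Return a list of human-readable problems found in a grading dict."""
    expectations = grading.get("expectations")
    if not isinstance(expectations, list):
        return ["missing 'expectations' array"]
    problems = []
    for i, item in enumerate(expectations):
        for field in REQUIRED_FIELDS:
            if field not in item:
                problems.append(f"expectations[{i}] is missing '{field}'")
    return problems

# Illustrative data only: the second entry uses the wrong field names.
sample = {
    "expectations": [
        {"text": "Output is a .docx file", "passed": True, "evidence": "report.docx exists"},
        {"name": "Has a title", "met": True, "details": "Title found on page 1"},
    ]
}
for problem in check_grading(sample):
    print(problem)
```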
### What the user sees in the viewer
The "Outputs" tab shows one test case at a time:
- **Prompt**: the task that was given
- **Output**: the files the skill produced, rendered inline where possible
- **Previous Output** (iteration 2+): collapsed section showing last iteration's output
- **Formal Grades** (if grading was run): collapsed section showing assertion pass/fail
- **Feedback**: a textbox that auto-saves as they type
- **Previous Feedback** (iteration 2+): their comments from last time, shown below the textbox
The "Benchmark" tab shows the stats summary: pass rates, timing, and token usage for each configuration, with per-eval breakdowns and analyst observations.
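To make the "mean ± stddev and the delta" figures concrete, here is a small sketch of the arithmetic. It is not the bundled aggregation script: `summarize` and `compare` are hypothetical names, the sample pass rates are made up, and using the sample standard deviation is an assumption.

```python
from statistics import mean, stdev

def summarize(values):
    """Mean and sample standard deviation of one metric across runs."""
    return {
        "mean": mean(values),
        # Assumption: sample stddev; a single run has no spread to report.
        "stddev": stdev(values) if len(values) > 1 else 0.0,
    }

def compare(with_skill, baseline):
    """Summarize both configurations and report with_skill minus baseline."""
    a, b = summarize(with_skill), summarize(baseline)
    return {"with_skill": a, "baseline": b, "delta": a["mean"] - b["mean"]}

# Made-up pass rates for three evals per configuration.
result = compare(with_skill=[1.0, 0.5, 1.0], baseline=[0.5, 0.5, 0.0])
for name in ("with_skill", "baseline"):
    stats = result[name]
    print(f"{name}: {stats['mean']:.2f} ± {stats['stddev']:.2f}")
print(f"delta: {result['delta']:+.2f}")
```

A large delta paired with a large stddev is worth a second look, since it may come from one flaky eval rather than a real improvement.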
Navigation is via prev/next buttons or arrow keys. When done, they click "Submit All Reviews" which saves all feedback to `feedback.json`.
### Step 5: Read the feedback
When the user tells you they're done, read `feedback.json`:
```json
{
  "reviews": [
    {"run_id": "eval-0-with_skill", "feedback": "the chart is missing axis labels", "timestamp": "..."},
    {"run_id": "eval-1-with_skill", "feedback": "", "timestamp": "..."},
    {"run_id": "eval-2-with_skill", "feedback": "perfect, love this", "timestamp": "..."}
  ],
  "status": "complete"
}
```
Empty feedback means the user thought it was fine. Focus your improvements on the test cases where the user had specific complaints.
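A short sketch of that filtering step, assuming the file has the shape shown above (`comments_by_run` is a hypothetical helper name):

```python
import json

def comments_by_run(feedback):
    """Map run_id to feedback text, skipping reviews left empty."""
    return {
        review["run_id"]: review["feedback"]
        for review in feedback["reviews"]
        if review["feedback"].strip()
    }

# Same content as the example feedback.json above.
raw = """
{
  "reviews": [
    {"run_id": "eval-0-with_skill", "feedback": "the chart is missing axis labels", "timestamp": "..."},
    {"run_id": "eval-1-with_skill", "feedback": "", "timestamp": "..."},
    {"run_id": "eval-2-with_skill", "feedback": "perfect, love this", "timestamp": "..."}
  ],
  "status": "complete"
}
"""
for run_id, comment in comments_by_run(json.loads(raw)).items():
    print(f"{run_id}: {comment}")
```

Non-empty feedback can still be praise, as with `eval-2-with_skill` here, so the comments need reading before they are treated as complaints.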
Kill the viewer server when you're done with it:
```bash
kill $VIEWER_PID 2>/dev/null
```
---
## Improving the skill
This is the heart of the loop. You've run the test cases, the user has reviewed the results, and now you need to make the skill better based on their feedback.
### How to think about improvements
1. **Generalize from the feedback.** The big picture thing that's happening here is that we're trying to create skills that can be used a million times (maybe literally, maybe even more who knows) across many different prompts. Here you and the user are iterating on only a few examples over and over again because it helps move faster. The user knows these examples in and out and it's quick for them to assess new outputs. But if the skill you and the user are codeveloping works only for those examples, it's useless. Rather than put in fiddly overfitty changes, or oppressively constrictive MUSTs, if there's some stubborn issue, you might try branching out and using different metaphors, or recommending different patterns of working. It's relatively cheap to try and maybe you'll land on something great.
2. **Keep the prompt lean.** Remove things that aren't pulling their weight. Make sure to read the transcripts, not just the final outputs — if it looks like the skill is making the model waste a bunch of time doing things that are unproductive, you can try getting rid of the parts of the skill that are making it do that and seeing what happens.
3. **Explain the why.** Try hard to explain the **why** behind everything you're asking the model to do. Today's LLMs are *smart*. They have good theory of mind and when given a good harness can go beyond rote instructions and really make things happen. Even if the feedback from the user is terse or frustrated, try to actually understand the task, what the user actually wrote, and why they wrote it, and then transmit this understanding into the instructions. If you find yourself writing ALWAYS or NEVER in all caps, or using super rigid structures, that's a yellow flag: if possible, reframe and explain the reasoning so that the model understands why the thing you're asking for is important. That's a more humane, powerful, and effective approach.
4. **Look for repeated work across test cases.** Read the transcripts from the test runs and notice if the subagents all independently wrote similar helper scripts or took the same multi-step approach to something. If all 3 test cases resulted in the subagent writing a `create_docx.py` or a `build_chart.py`, that's a strong signal the skill should bundle that script. Write it once, put it in `scripts/`, and tell the skill to use it. This saves every future invocation from reinventing the wheel.
This task is pretty important (we are trying to create billions a year in economic value here!) and your thinking time is not the blocker; take your time and really mull things over. I'd suggest writing a draft revision and then looking at it anew and making improvements. Really do your best to get into the head of the user and understand what they want and need.
### The iteration loop
After improving the skill:
1. Apply your improvements to the skill
2. Rerun all test cases into a new `iteration-<N+1>/` directory, including baseline runs. If you're creating a new skill, the baseline is always `without_skill` (no skill) — that stays the same across iterations. If you're improving an existing skill, use your judgment on what makes sense as the baseline: the original version the user came in with, or the previous iteration.
3. Launch the reviewer with `--previous-workspace` pointing at the previous iteration
4. Wait for the user to review and tell you they're done
5. Read the new feedback, improve again, repeat
Keep going until:
- The user says they're happy
- The feedback is all empty (everything looks good)
- You're not making meaningful progress
---
## Advanced: Blind comparison
For situations where you want a more rigorous comparison between two versions of a skill (e.g., the user asks "is the new version actually better?"), there's a blind comparison system. Read `agents/comparator.md` and `agents/analyzer.md` for the details. The basic idea is: give two outputs to an independent agent without telling it which is which, and let it judge quality. Then analyze why the winner won.
This is optional, requires subagents, and most users won't need it. The human review loop is usually sufficient.
---
## Description Optimization
The description field in SKILL.md frontmatter is the primary mechanism that determines whether Claude invokes a skill. After creating or improving a skill, offer to optimize the description for better triggering accuracy.
### Step 1: Generate trigger eval queries
Create 20 eval queries — a mix of should-trigger and should-not-trigger. Save as JSON:
```json
[
{"query": "the user prompt", "should_trigger": true},
{"query": "another prompt", "should_trigger": false}
]
```
The queries must be realistic and something a Claude Code or Claude.ai user would actually type. Not abstract requests, but requests that are concrete and specific and have a good amount of detail. For instance, file paths, personal context about the user's job or situation, column names and values, company names, URLs. A little bit of backstory. Some might be in lowercase or contain abbreviations or typos or casual speech. Use a mix of different lengths, and focus on edge cases rather than making them clear-cut (the user will get a chance to sign off on them).
Bad: `"Format this data"`, `"Extract text from PDF"`, `"Create a chart"`
Good: `"ok so my boss just sent me this xlsx file (its in my downloads, called something like 'Q4 sales final FINAL v2.xlsx') and she wants me to add a column that shows the profit margin as a percentage. The revenue is in column C and costs are in column D i think"`
For the **should-trigger** queries (8-10), think about coverage. You want different phrasings of the same intent — some formal, some casual. Include cases where the user doesn't explicitly name the skill or file type but clearly needs it. Throw in some uncommon use cases and cases where this skill competes with another but should win.
For the **should-not-trigger** queries (8-10), the most valuable ones are the near-misses — queries that share keywords or concepts with the skill but actually need something different. Think adjacent domains, ambiguous phrasing where a naive keyword match would trigger but shouldn't, and cases where the query touches on something the skill does but in a context where another tool is more appropriate.
The key thing to avoid: don't make should-not-trigger queries obviously irrelevant. "Write a fibonacci function" as a negative test for a PDF skill is too easy — it doesn't test anything. The negative cases should be genuinely tricky.
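Before showing the set to the user, a quick structural check can catch malformed entries. This is a sketch only: `check_eval_set` is a hypothetical helper, and the minimum of 8 queries per class is taken from the guidance above as a soft check, not a rule enforced by any bundled script.

```python
def check_eval_set(items: list, min_per_class: int = 8) -> list:
    """Return human-readable problems; an empty list means the set looks sane."""
    problems = []
    for i, item in enumerate(items):
        query = item.get("query")
        if not isinstance(query, str) or not query.strip():
            problems.append(f"item {i}: missing or empty 'query'")
        if not isinstance(item.get("should_trigger"), bool):
            problems.append(f"item {i}: 'should_trigger' must be true or false")
    # Count each class separately so an unbalanced set is flagged.
    positives = sum(1 for item in items if item.get("should_trigger") is True)
    negatives = sum(1 for item in items if item.get("should_trigger") is False)
    if positives < min_per_class:
        problems.append(f"only {positives} should-trigger queries")
    if negatives < min_per_class:
        problems.append(f"only {negatives} should-not-trigger queries")
    return problems
```

Note that this checks shape and balance only. Whether the negatives are tricky enough is still a judgment call for you and the user.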
### Step 2: Review with user
Present the eval set to the user for review using the HTML template:
1. Read the template from `assets/eval_review.html`
2. Replace the placeholders:
- `__EVAL_DATA_PLACEHOLDER__` → the JSON array of eval items (no quotes around it — it's a JS variable assignment)
- `__SKILL_NAME_PLACEHOLDER__` → the skill's name
- `__SKILL_DESCRIPTION_PLACEHOLDER__` → the skill's current description
3. Write to a temp file (e.g., `/tmp/eval_review_<skill-name>.html`) and open it: `open /tmp/eval_review_<skill-name>.html`
4. The user can edit queries, toggle should-trigger, add/remove entries, then click "Export Eval Set"
5. The file downloads to `~/Downloads/eval_set.json` — check the Downloads folder for the most recent version in case there are multiple (e.g., `eval_set (1).json`)
This step matters — bad eval queries lead to bad descriptions.
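Picking the most recent export can be sketched as follows, assuming the browser names repeat downloads `eval_set (1).json`, `eval_set (2).json`, and so on. `latest_eval_set` is a hypothetical helper, not one of the skill's scripts:

```python
from pathlib import Path
from typing import Optional


def latest_eval_set(downloads_dir: str = "~/Downloads") -> Optional[Path]:
    """Return the most recently modified eval_set*.json, or None if there is none."""
    candidates = list(Path(downloads_dir).expanduser().glob("eval_set*.json"))
    if not candidates:
        return None
    # Modification time is used as a proxy for "most recent export".
    return max(candidates, key=lambda p: p.stat().st_mtime)
```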
### Step 3: Run the optimization loop
Tell the user: "This will take some time — I'll run the optimization loop in the background and check on it periodically."
Save the eval set to the workspace, then run in the background:
```bash
python -m scripts.run_loop \
--eval-set <path-to-trigger-eval.json> \
--skill-path <path-to-skill> \
--model <model-id-powering-this-session> \
--max-iterations 5 \
--verbose
```
Use the model ID from your system prompt (the one powering the current session) so the triggering test matches what the user actually experiences.
While it runs, periodically tail the output to give the user updates on which iteration it's on and what the scores look like.
This handles the full optimization loop automatically. It splits the eval set into 60% train and 40% held-out test, evaluates the current description (running each query 3 times to get a reliable trigger rate), then calls Claude to propose improvements based on what failed. It re-evaluates each new description on both train and test, iterating up to 5 times. When it's done, it opens an HTML report in the browser showing the results per iteration and returns JSON with `best_description` — selected by test score rather than train score to avoid overfitting.
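To make the mechanics concrete, here is a rough sketch of the three ideas in that paragraph: the 60/40 split, a trigger rate averaged over repeated runs, and selection by held-out score. It is not the actual `run_loop` implementation. The function names, the fixed shuffle seed, the 0.5 pass threshold, and the `test_score` key are all assumptions made for illustration.

```python
import random


def split_eval_set(items: list, train_fraction: float = 0.6, seed: int = 0):
    """Shuffle a copy of the eval set and split it into (train, test)."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def trigger_rate(outcomes: list) -> float:
    """Fraction of repeated runs (e.g. 3) in which the skill triggered."""
    return sum(1 for o in outcomes if o) / len(outcomes)


def query_passes(outcomes: list, should_trigger: bool, threshold: float = 0.5) -> bool:
    """A query passes when its trigger rate falls on the expected side of the threshold."""
    rate = trigger_rate(outcomes)
    return rate >= threshold if should_trigger else rate < threshold


def pick_best(history: list) -> str:
    """Choose the description with the highest held-out test score, ignoring train score."""
    return max(history, key=lambda h: h["test_score"])["description"]
```

Selecting on the test score is the point of the split: a description that merely memorizes the training queries will look good on train but not on the held-out 40%.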
### How skill triggering works
Understanding the triggering mechanism helps design better eval queries. Skills appear in Claude's `available_skills` list with their name + description, and Claude decides whether to consult a skill based on that description. The important thing to know is that Claude only consults skills for tasks it can't easily handle on its own — simple, one-step queries like "read this PDF" may not trigger a skill even if the description matches perfectly, because Claude can handle them directly with basic tools. Complex, multi-step, or specialized queries reliably trigger skills when the description matches.
This means your eval queries should be substantive enough that Claude would actually benefit from consulting a skill. Simple queries like "read file X" are poor test cases — they won't trigger skills regardless of description quality.
### Step 4: Apply the result
Take `best_description` from the JSON output and update the skill's SKILL.md frontmatter. Show the user before/after and report the scores.
---
### Package and Present (only if `present_files` tool is available)
Check whether you have access to the `present_files` tool. If you don't, skip this step. If you do, package the skill and present the .skill file to the user:
```bash
python -m scripts.package_skill <path/to/skill-folder>
```
After packaging, direct the user to the resulting `.skill` file path so they can install it.
---
## Claude.ai-specific instructions
In Claude.ai, the core workflow is the same (draft → test → review → improve → repeat), but because Claude.ai doesn't have subagents, some mechanics change. Here's what to adapt:
**Running test cases**: No subagents means no parallel execution. For each test case, read the skill's SKILL.md, then follow its instructions to accomplish the test prompt yourself. Do them one at a time. This is less rigorous than independent subagents (you wrote the skill and you're also running it, so you have full context), but it's a useful sanity check — and the human review step compensates. Skip the baseline runs — just use the skill to complete the task as requested.
**Reviewing results**: If you can't open a browser (e.g., Claude.ai's VM has no display, or you're on a remote server), skip the browser reviewer entirely. Instead, present results directly in the conversation. For each test case, show the prompt and the output. If the output is a file the user needs to see (like a .docx or .xlsx), save it to the filesystem and tell them where it is so they can download and inspect it. Ask for feedback inline: "How does this look? Anything you'd change?"
**Benchmarking**: Skip the quantitative benchmarking — it relies on baseline comparisons which aren't meaningful without subagents. Focus on qualitative feedback from the user.
**The iteration loop**: Same as before — improve the skill, rerun the test cases, ask for feedback — just without the browser reviewer in the middle. You can still organize results into iteration directories on the filesystem if you have one.
**Description optimization**: This section requires the `claude` CLI tool (specifically `claude -p`) which is only available in Claude Code. Skip it if you're on Claude.ai.
**Blind comparison**: Requires subagents. Skip it.
**Packaging**: The `package_skill.py` script works anywhere with Python and a filesystem. On Claude.ai, you can run it and the user can download the resulting `.skill` file.
**Updating an existing skill**: The user might be asking you to update an existing skill, not create a new one. In this case:
- **Preserve the original name.** Note the skill's directory name and `name` frontmatter field -- use them unchanged. E.g., if the installed skill is `research-helper`, output `research-helper.skill` (not `research-helper-v2`).
- **Copy to a writeable location before editing.** The installed skill path may be read-only. Copy to `/tmp/skill-name/`, edit there, and package from the copy.
- **If packaging manually, stage in `/tmp/` first**, then copy to the output directory -- direct writes may fail due to permissions.
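The copy-before-editing step might look like the sketch below. `stage_skill_for_editing` is a hypothetical helper and the `/tmp` default simply follows the guidance above. The directory name is deliberately left unchanged so the packaged skill keeps its original name.

```python
import os
import shutil
import stat
from pathlib import Path


def stage_skill_for_editing(installed_path: str, staging_root: str = "/tmp") -> Path:
    """Copy an installed skill to a writeable location, keeping its directory name."""
    src = Path(installed_path)
    dst = Path(staging_root) / src.name  # same name, e.g. research-helper
    if dst.exists():
        shutil.rmtree(dst)  # start from a clean copy each time
    shutil.copytree(src, dst)
    # copytree preserves permission bits, so add user-write in case the source was read-only.
    for path in [dst, *dst.rglob("*")]:
        os.chmod(path, path.stat().st_mode | stat.S_IWUSR)
    return dst
```

Edit and package from the returned path, then copy the resulting `.skill` file to the output directory.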
---
## Cowork-Specific Instructions
If you're in Cowork, the main things to know are:
- You have subagents, so the main workflow (spawn test cases in parallel, run baselines, grade, etc.) all works. (However, if you run into severe problems with timeouts, it's OK to run the test prompts in series rather than parallel.)
- You don't have a browser or display, so when generating the eval viewer, use `--static <output_path>` to write a standalone HTML file instead of starting a server. Then proffer a link that the user can click to open the HTML in their browser.
- For whatever reason, the Cowork setup seems to disincline Claude from generating the eval viewer after running the tests, so just to reiterate: whether you're in Cowork or in Claude Code, after running tests, you should always generate the eval viewer for the human to look at examples before revising the skill yourself and trying to make corrections, using `generate_review.py` (not writing your own boutique HTML code). Sorry in advance but I'm gonna go all caps here: GENERATE THE EVAL VIEWER *BEFORE* evaluating outputs yourself. You want to get them in front of the human ASAP!
- Feedback works differently: since there's no running server, the viewer's "Submit All Reviews" button will download `feedback.json` as a file. You can then read it from there (you may have to request access first).
- Packaging works — `package_skill.py` just needs Python and a filesystem.
- Description optimization (`run_loop.py` / `run_eval.py`) should work in Cowork just fine since it uses `claude -p` via subprocess, not a browser, but please save it until you've fully finished making the skill and the user agrees it's in good shape.
- **Updating an existing skill**: The user might be asking you to update an existing skill, not create a new one. Follow the update guidance in the Claude.ai section above.
---
## Reference files
The agents/ directory contains instructions for specialized subagents. Read them when you need to spawn the relevant subagent.
- `agents/grader.md` — How to evaluate assertions against outputs
- `agents/comparator.md` — How to do blind A/B comparison between two outputs
- `agents/analyzer.md` — How to analyze why one version beat another
The references/ directory has additional documentation:
- `references/schemas.md` — JSON structures for evals.json, grading.json, etc.
---
Repeating one more time the core loop here for emphasis:
- Figure out what the skill is about
- Draft or edit the skill
- Run claude-with-access-to-the-skill on test prompts
- With the user, evaluate the outputs:
- Create benchmark.json and run `eval-viewer/generate_review.py` to help the user review them
- Run quantitative evals
- Repeat until you and the user are satisfied
- Package the final skill and return it to the user.
Please add steps to your TodoList, if you have such a thing, to make sure you don't forget. If you're in Cowork, please specifically put "Create evals JSON and run `eval-viewer/generate_review.py` so human can review test cases" in your TodoList to make sure it happens.
Good luck!


@@ -0,0 +1,274 @@
# Post-hoc Analyzer Agent
Analyze blind comparison results to understand WHY the winner won and generate improvement suggestions.
## Role
After the blind comparator determines a winner, the Post-hoc Analyzer "unblinds" the results by examining the skills and transcripts. The goal is to extract actionable insights: what made the winner better, and how can the loser be improved?
## Inputs
You receive these parameters in your prompt:
- **winner**: "A" or "B" (from blind comparison)
- **winner_skill_path**: Path to the skill that produced the winning output
- **winner_transcript_path**: Path to the execution transcript for the winner
- **loser_skill_path**: Path to the skill that produced the losing output
- **loser_transcript_path**: Path to the execution transcript for the loser
- **comparison_result_path**: Path to the blind comparator's output JSON
- **output_path**: Where to save the analysis results
## Process
### Step 1: Read Comparison Result
1. Read the blind comparator's output at comparison_result_path
2. Note the winning side (A or B), the reasoning, and any scores
3. Understand what the comparator valued in the winning output
### Step 2: Read Both Skills
1. Read the winner skill's SKILL.md and key referenced files
2. Read the loser skill's SKILL.md and key referenced files
3. Identify structural differences:
- Instructions clarity and specificity
- Script/tool usage patterns
- Example coverage
- Edge case handling
### Step 3: Read Both Transcripts
1. Read the winner's transcript
2. Read the loser's transcript
3. Compare execution patterns:
- How closely did each follow their skill's instructions?
- What tools were used differently?
- Where did the loser diverge from optimal behavior?
- Did either encounter errors or make recovery attempts?
### Step 4: Analyze Instruction Following
For each transcript, evaluate:
- Did the agent follow the skill's explicit instructions?
- Did the agent use the skill's provided tools/scripts?
- Were there missed opportunities to leverage skill content?
- Did the agent add unnecessary steps not in the skill?
Score instruction following 1-10 and note specific issues.
### Step 5: Identify Winner Strengths
Determine what made the winner better:
- Clearer instructions that led to better behavior?
- Better scripts/tools that produced better output?
- More comprehensive examples that guided edge cases?
- Better error handling guidance?
Be specific. Quote from skills/transcripts where relevant.
### Step 6: Identify Loser Weaknesses
Determine what held the loser back:
- Ambiguous instructions that led to suboptimal choices?
- Missing tools/scripts that forced workarounds?
- Gaps in edge case coverage?
- Poor error handling that caused failures?
### Step 7: Generate Improvement Suggestions
Based on the analysis, produce actionable suggestions for improving the loser skill:
- Specific instruction changes to make
- Tools/scripts to add or modify
- Examples to include
- Edge cases to address
Prioritize by impact. Focus on changes that would have changed the outcome.
### Step 8: Write Analysis Results
Save structured analysis to `{output_path}`.
## Output Format
Write a JSON file with this structure:
```json
{
"comparison_summary": {
"winner": "A",
"winner_skill": "path/to/winner/skill",
"loser_skill": "path/to/loser/skill",
"comparator_reasoning": "Brief summary of why comparator chose winner"
},
"winner_strengths": [
"Clear step-by-step instructions for handling multi-page documents",
"Included validation script that caught formatting errors",
"Explicit guidance on fallback behavior when OCR fails"
],
"loser_weaknesses": [
"Vague instruction 'process the document appropriately' led to inconsistent behavior",
"No script for validation, agent had to improvise and made errors",
"No guidance on OCR failure, agent gave up instead of trying alternatives"
],
"instruction_following": {
"winner": {
"score": 9,
"issues": [
"Minor: skipped optional logging step"
]
},
"loser": {
"score": 6,
"issues": [
"Did not use the skill's formatting template",
"Invented own approach instead of following step 3",
"Missed the 'always validate output' instruction"
]
}
},
"improvement_suggestions": [
{
"priority": "high",
"category": "instructions",
"suggestion": "Replace 'process the document appropriately' with explicit steps: 1) Extract text, 2) Identify sections, 3) Format per template",
"expected_impact": "Would eliminate ambiguity that caused inconsistent behavior"
},
{
"priority": "high",
"category": "tools",
"suggestion": "Add validate_output.py script similar to winner skill's validation approach",
"expected_impact": "Would catch formatting errors before final output"
},
{
"priority": "medium",
"category": "error_handling",
"suggestion": "Add fallback instructions: 'If OCR fails, try: 1) different resolution, 2) image preprocessing, 3) manual extraction'",
"expected_impact": "Would prevent early failure on difficult documents"
}
],
"transcript_insights": {
"winner_execution_pattern": "Read skill -> Followed 5-step process -> Used validation script -> Fixed 2 issues -> Produced output",
"loser_execution_pattern": "Read skill -> Unclear on approach -> Tried 3 different methods -> No validation -> Output had errors"
}
}
```
## Guidelines
- **Be specific**: Quote from skills and transcripts, don't just say "instructions were unclear"
- **Be actionable**: Suggestions should be concrete changes, not vague advice
- **Focus on skill improvements**: The goal is to improve the losing skill, not critique the agent
- **Prioritize by impact**: Which changes would most likely have changed the outcome?
- **Consider causation**: Did the skill weakness actually cause the worse output, or is it incidental?
- **Stay objective**: Analyze what happened, don't editorialize
- **Think about generalization**: Would this improvement help on other evals too?
## Categories for Suggestions
Use these categories to organize improvement suggestions:
| Category | Description |
|----------|-------------|
| `instructions` | Changes to the skill's prose instructions |
| `tools` | Scripts, templates, or utilities to add/modify |
| `examples` | Example inputs/outputs to include |
| `error_handling` | Guidance for handling failures |
| `structure` | Reorganization of skill content |
| `references` | External docs or resources to add |
## Priority Levels
- **high**: Would likely change the outcome of this comparison
- **medium**: Would improve quality but may not change win/loss
- **low**: Nice to have, marginal improvement
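As an illustration of how a consumer of the analysis might use these two vocabularies, the sketch below orders `improvement_suggestions` by priority and flags unknown categories. The helper names are hypothetical. Only the category and priority values come from the tables above.

```python
PRIORITY_ORDER = {"high": 0, "medium": 1, "low": 2}
VALID_CATEGORIES = {
    "instructions", "tools", "examples",
    "error_handling", "structure", "references",
}


def sort_suggestions(suggestions: list) -> list:
    """Return suggestions ordered high, medium, low; unknown priorities sort last."""
    return sorted(
        suggestions,
        key=lambda s: PRIORITY_ORDER.get(s.get("priority"), len(PRIORITY_ORDER)),
    )


def unknown_categories(suggestions: list) -> list:
    """List any category values that are not in the documented table."""
    return [s.get("category") for s in suggestions if s.get("category") not in VALID_CATEGORIES]
```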
---
# Analyzing Benchmark Results
When analyzing benchmark results, the analyzer's purpose is to **surface patterns and anomalies** across multiple runs, not suggest skill improvements.
## Role
Review all benchmark run results and generate freeform notes that help the user understand skill performance. Focus on patterns that wouldn't be visible from aggregate metrics alone.
## Inputs
You receive these parameters in your prompt:
- **benchmark_data_path**: Path to the in-progress benchmark.json with all run results
- **skill_path**: Path to the skill being benchmarked
- **output_path**: Where to save the notes (as JSON array of strings)
## Process
### Step 1: Read Benchmark Data
1. Read the benchmark.json containing all run results
2. Note the configurations tested (with_skill, without_skill)
3. Understand the run_summary aggregates already calculated
### Step 2: Analyze Per-Assertion Patterns
For each expectation across all runs:
- Does it **always pass** in both configurations? (may not differentiate skill value)
- Does it **always fail** in both configurations? (may be broken or beyond capability)
- Does it **always pass with skill but fail without**? (skill clearly adds value here)
- Does it **always fail with skill but pass without**? (skill may be hurting)
- Is it **highly variable**? (flaky expectation or non-deterministic behavior)
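The five cases above can be expressed as a small classifier. This is a sketch under an assumed input shape: one list of pass/fail booleans per configuration for a single expectation. It does not reflect the real benchmark.json schema, and `classify_expectation` is a hypothetical name.

```python
def classify_expectation(with_skill: list, without_skill: list) -> str:
    """Label one expectation by how it behaves across runs in both configurations."""
    if not with_skill or not without_skill:
        raise ValueError("need at least one run per configuration")
    w_all, w_none = all(with_skill), not any(with_skill)
    wo_all, wo_none = all(without_skill), not any(without_skill)
    if w_all and wo_all:
        return "always passes in both"   # may not differentiate skill value
    if w_none and wo_none:
        return "always fails in both"    # may be broken or beyond capability
    if w_all and wo_none:
        return "skill adds value"
    if w_none and wo_all:
        return "skill may be hurting"
    return "variable"                    # mixed results in at least one configuration
```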
### Step 3: Analyze Cross-Eval Patterns
Look for patterns across evals:
- Are certain eval types consistently harder/easier?
- Do some evals show high variance while others are stable?
- Are there surprising results that contradict expectations?
### Step 4: Analyze Metrics Patterns
Look at time_seconds, tokens, tool_calls:
- Does the skill significantly increase execution time?
- Is there high variance in resource usage?
- Are there outlier runs that skew the aggregates?
### Step 5: Generate Notes
Write freeform observations as a list of strings. Each note should:
- State a specific observation
- Be grounded in the data (not speculation)
- Help the user understand something the aggregate metrics don't show
Examples:
- "Assertion 'Output is a PDF file' passes 100% in both configurations - may not differentiate skill value"
- "Eval 3 shows high variance (50% ± 40%) - run 2 had an unusual failure that may be flaky"
- "Without-skill runs consistently fail on table extraction expectations (0% pass rate)"
- "Skill adds 13s average execution time but improves pass rate by 50%"
- "Token usage is 80% higher with skill, primarily due to script output parsing"
- "All 3 without-skill runs for eval 1 produced empty output"
### Step 6: Write Notes
Save notes to `{output_path}` as a JSON array of strings:
```json
[
"Assertion 'Output is a PDF file' passes 100% in both configurations - may not differentiate skill value",
"Eval 3 shows high variance (50% ± 40%) - run 2 had an unusual failure",
"Without-skill runs consistently fail on table extraction expectations",
"Skill adds 13s average execution time but improves pass rate by 50%"
]
```
## Guidelines
**DO:**
- Report what you observe in the data
- Be specific about which evals, expectations, or runs you're referring to
- Note patterns that aggregate metrics would hide
- Provide context that helps interpret the numbers
**DO NOT:**
- Suggest improvements to the skill (that's for the improvement step, not benchmarking)
- Make subjective quality judgments ("the output was good/bad")
- Speculate about causes without evidence
- Repeat information already in the run_summary aggregates

View File

@@ -0,0 +1,202 @@
# Blind Comparator Agent
Compare two outputs WITHOUT knowing which skill produced them.
## Role
The Blind Comparator judges which output better accomplishes the eval task. You receive two outputs labeled A and B, but you do NOT know which skill produced which. This prevents bias toward a particular skill or approach.
Your judgment is based purely on output quality and task completion.
## Inputs
You receive these parameters in your prompt:
- **output_a_path**: Path to the first output file or directory
- **output_b_path**: Path to the second output file or directory
- **eval_prompt**: The original task/prompt that was executed
- **expectations**: List of expectations to check (optional - may be empty)
## Process
### Step 1: Read Both Outputs
1. Examine output A (file or directory)
2. Examine output B (file or directory)
3. Note the type, structure, and content of each
4. If outputs are directories, examine all relevant files inside
### Step 2: Understand the Task
1. Read the eval_prompt carefully
2. Identify what the task requires:
- What should be produced?
- What qualities matter (accuracy, completeness, format)?
- What would distinguish a good output from a poor one?
### Step 3: Generate Evaluation Rubric
Based on the task, generate a rubric with two dimensions:
**Content Rubric** (what the output contains):
| Criterion | 1 (Poor) | 3 (Acceptable) | 5 (Excellent) |
|-----------|----------|----------------|---------------|
| Correctness | Major errors | Minor errors | Fully correct |
| Completeness | Missing key elements | Mostly complete | All elements present |
| Accuracy | Significant inaccuracies | Minor inaccuracies | Accurate throughout |
**Structure Rubric** (how the output is organized):
| Criterion | 1 (Poor) | 3 (Acceptable) | 5 (Excellent) |
|-----------|----------|----------------|---------------|
| Organization | Disorganized | Reasonably organized | Clear, logical structure |
| Formatting | Inconsistent/broken | Mostly consistent | Professional, polished |
| Usability | Difficult to use | Usable with effort | Easy to use |
Adapt criteria to the specific task. For example:
- PDF form → "Field alignment", "Text readability", "Data placement"
- Document → "Section structure", "Heading hierarchy", "Paragraph flow"
- Data output → "Schema correctness", "Data types", "Completeness"
### Step 4: Evaluate Each Output Against the Rubric
For each output (A and B):
1. **Score each criterion** on the rubric (1-5 scale)
2. **Calculate dimension totals**: Content score, Structure score
3. **Calculate overall score**: Average of dimension scores, scaled to 1-10
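A sketch of that arithmetic, assuming that "scaled to 1-10" means doubling the average of the two dimension scores and that dimension scores are rounded to one decimal before combining. Both assumptions are inferred from the sample output in this file rather than stated by it, and the function names are hypothetical.

```python
def dimension_score(criteria: dict) -> float:
    """Average the 1-5 criterion scores of one dimension, rounded to one decimal."""
    return round(sum(criteria.values()) / len(criteria), 1)


def overall_score(content: dict, structure: dict) -> float:
    """Average the two dimension scores and double the result (assumed scaling)."""
    content_avg = dimension_score(content)
    structure_avg = dimension_score(structure)
    return round((content_avg + structure_avg) / 2 * 2, 1)
```

With the sample rubric for output A (content 5, 5, 4 and structure 4, 5, 4), the dimension scores come out as 4.7 and 4.3, which is consistent with the `overall_score` of 9.0 in the example JSON.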
### Step 5: Check Assertions (if provided)
If expectations are provided:
1. Check each expectation against output A
2. Check each expectation against output B
3. Count pass rates for each output
4. Use expectation scores as secondary evidence (not the primary decision factor)
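The counting in this step could be sketched as below. The input is assumed to be a list of `{"text": ..., "passed": ...}` entries as in the output format, and `summarize_expectations` is a hypothetical helper; deciding whether each expectation passed is still the comparator's judgment.

```python
def summarize_expectations(details: list) -> dict:
    """Build the passed/total/pass_rate summary for one output from per-expectation results."""
    total = len(details)
    passed = sum(1 for d in details if d["passed"])
    return {
        "passed": passed,
        "total": total,
        # Guard against division by zero when no expectations were provided.
        "pass_rate": round(passed / total, 2) if total else 0.0,
        "details": details,
    }
```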
### Step 6: Determine the Winner
Compare A and B based on (in priority order):
1. **Primary**: Overall rubric score (content + structure)
2. **Secondary**: Assertion pass rates (if applicable)
3. **Tiebreaker**: If truly equal, declare a TIE
Be decisive - ties should be rare. One output is usually better, even if marginally.
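The priority order can be written as a short decision function. This is an illustrative sketch: `determine_winner` is a hypothetical name, and exact equality on scores is an assumption (the comparator may reasonably break near-ties by judgment instead).

```python
from typing import Optional


def determine_winner(
    score_a: float,
    score_b: float,
    pass_rate_a: Optional[float] = None,
    pass_rate_b: Optional[float] = None,
) -> str:
    """Return "A", "B", or "TIE" using rubric score first, then assertion pass rate."""
    if score_a != score_b:  # primary: overall rubric score
        return "A" if score_a > score_b else "B"
    both_rates = pass_rate_a is not None and pass_rate_b is not None
    if both_rates and pass_rate_a != pass_rate_b:  # secondary: pass rates, if provided
        return "A" if pass_rate_a > pass_rate_b else "B"
    return "TIE"  # only when everything is genuinely equal
```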
### Step 7: Write Comparison Results
Save results to a JSON file at the path specified (or `comparison.json` if not specified).
## Output Format
Write a JSON file with this structure:
```json
{
"winner": "A",
"reasoning": "Output A provides a complete solution with proper formatting and all required fields. Output B is missing the date field and has formatting inconsistencies.",
"rubric": {
"A": {
"content": {
"correctness": 5,
"completeness": 5,
"accuracy": 4
},
"structure": {
"organization": 4,
"formatting": 5,
"usability": 4
},
"content_score": 4.7,
"structure_score": 4.3,
"overall_score": 9.0
},
"B": {
"content": {
"correctness": 3,
"completeness": 2,
"accuracy": 3
},
"structure": {
"organization": 3,
"formatting": 2,
"usability": 3
},
"content_score": 2.7,
"structure_score": 2.7,
"overall_score": 5.4
}
},
"output_quality": {
"A": {
"score": 9,
"strengths": ["Complete solution", "Well-formatted", "All fields present"],
"weaknesses": ["Minor style inconsistency in header"]
},
"B": {
"score": 5,
"strengths": ["Readable output", "Correct basic structure"],
"weaknesses": ["Missing date field", "Formatting inconsistencies", "Partial data extraction"]
}
},
"expectation_results": {
"A": {
"passed": 4,
"total": 5,
"pass_rate": 0.80,
"details": [
{"text": "Output includes name", "passed": true},
{"text": "Output includes date", "passed": true},
{"text": "Format is PDF", "passed": true},
{"text": "Contains signature", "passed": false},
{"text": "Readable text", "passed": true}
]
},
"B": {
"passed": 3,
"total": 5,
"pass_rate": 0.60,
"details": [
{"text": "Output includes name", "passed": true},
{"text": "Output includes date", "passed": false},
{"text": "Format is PDF", "passed": true},
{"text": "Contains signature", "passed": false},
{"text": "Readable text", "passed": true}
]
}
}
}
```
If no expectations were provided, omit the `expectation_results` field entirely.
## Field Descriptions
- **winner**: "A", "B", or "TIE"
- **reasoning**: Clear explanation of why the winner was chosen (or why it's a tie)
- **rubric**: Structured rubric evaluation for each output
- **content**: Scores for content criteria (correctness, completeness, accuracy)
- **structure**: Scores for structure criteria (organization, formatting, usability)
- **content_score**: Average of content criteria (1-5)
- **structure_score**: Average of structure criteria (1-5)
- **overall_score**: Combined score scaled to 1-10
- **output_quality**: Summary quality assessment
- **score**: 1-10 rating (the rubric overall_score rounded to the nearest integer, as in the example above)
- **strengths**: List of positive aspects
- **weaknesses**: List of issues or shortcomings
- **expectation_results**: (Only if expectations provided)
- **passed**: Number of expectations that passed
- **total**: Total number of expectations
- **pass_rate**: Fraction passed (0.0 to 1.0)
- **details**: Individual expectation results
## Guidelines
- **Stay blind**: DO NOT try to infer which skill produced which output. Judge purely on output quality.
- **Be specific**: Cite specific examples when explaining strengths and weaknesses.
- **Be decisive**: Choose a winner unless outputs are genuinely equivalent.
- **Output quality first**: Assertion scores are secondary to overall task completion.
- **Be objective**: Don't favor outputs based on style preferences; focus on correctness and completeness.
- **Explain your reasoning**: The reasoning field should make it clear why you chose the winner.
- **Handle edge cases**: If both outputs fail, pick the one that fails less badly. If both are excellent, pick the one that's marginally better.

View File

@@ -0,0 +1,223 @@
# Grader Agent
Evaluate expectations against an execution transcript and outputs.
## Role
The Grader reviews a transcript and output files, then determines whether each expectation passes or fails. Provide clear evidence for each judgment.
You have two jobs: grade the outputs, and critique the evals themselves. A passing grade on a weak assertion is worse than useless — it creates false confidence. When you notice an assertion that's trivially satisfied, or an important outcome that no assertion checks, say so.
## Inputs
You receive these parameters in your prompt:
- **expectations**: List of expectations to evaluate (strings)
- **transcript_path**: Path to the execution transcript (markdown file)
- **outputs_dir**: Directory containing output files from execution
## Process
### Step 1: Read the Transcript
1. Read the transcript file completely
2. Note the eval prompt, execution steps, and final result
3. Identify any issues or errors documented
### Step 2: Examine Output Files
1. List files in outputs_dir
2. Read/examine each file relevant to the expectations. If outputs aren't plain text, use the inspection tools provided in your prompt — don't rely solely on what the transcript says the executor produced.
3. Note contents, structure, and quality
### Step 3: Evaluate Each Assertion
For each expectation:
1. **Search for evidence** in the transcript and outputs
2. **Determine verdict**:
   - **PASS**: Clear evidence the expectation is true AND the evidence reflects genuine task completion, not just surface-level compliance
   - **FAIL**: No evidence, or evidence contradicts the expectation, or the evidence is superficial (e.g., correct filename but empty/wrong content)
3. **Cite the evidence**: Quote the specific text or describe what you found
### Step 4: Extract and Verify Claims
Beyond the predefined expectations, extract implicit claims from the outputs and verify them:
1. **Extract claims** from the transcript and outputs:
   - Factual statements ("The form has 12 fields")
   - Process claims ("Used pypdf to fill the form")
   - Quality claims ("All fields were filled correctly")
2. **Verify each claim**:
   - **Factual claims**: Can be checked against the outputs or external sources
   - **Process claims**: Can be verified from the transcript
   - **Quality claims**: Evaluate whether the claim is justified
3. **Flag unverifiable claims**: Note claims that cannot be verified with available information
This catches issues that predefined expectations might miss.
### Step 5: Read User Notes
If `{outputs_dir}/user_notes.md` exists:
1. Read it and note any uncertainties or issues flagged by the executor
2. Include relevant concerns in the grading output
3. These may reveal problems even when expectations pass
### Step 6: Critique the Evals
After grading, consider whether the evals themselves could be improved. Only surface suggestions when there's a clear gap.
Good suggestions test meaningful outcomes — assertions that are hard to satisfy without actually doing the work correctly. Think about what makes an assertion *discriminating*: it passes when the skill genuinely succeeds and fails when it doesn't.
Suggestions worth raising:
- An assertion that passed but would also pass for a clearly wrong output (e.g., checking filename existence but not file content)
- An important outcome you observed — good or bad — that no assertion covers at all
- An assertion that can't actually be verified from the available outputs
Keep the bar high. The goal is to flag things the eval author would say "good catch" about, not to nitpick every assertion.
### Step 7: Write Grading Results
Save results to `{outputs_dir}/../grading.json` (sibling to outputs_dir).
## Grading Criteria
**PASS when**:
- The transcript or outputs clearly demonstrate the expectation is true
- Specific evidence can be cited
- The evidence reflects genuine substance, not just surface compliance (e.g., a file exists AND contains correct content, not just the right filename)
**FAIL when**:
- No evidence found for the expectation
- Evidence contradicts the expectation
- The expectation cannot be verified from available information
- The evidence is superficial — the assertion is technically satisfied but the underlying task outcome is wrong or incomplete
- The output appears to meet the assertion by coincidence rather than by actually doing the work
**When uncertain**: Mark the expectation as FAIL. The burden of proof is on passing, not on failing.
### Step 8: Read Executor Metrics and Timing
1. If `{outputs_dir}/metrics.json` exists, read it and include in grading output
2. If `{outputs_dir}/../timing.json` exists, read it and include timing data
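The two optional reads above can be sketched as follows. This is a minimal illustration, not the grader's actual implementation: the helper names `load_optional_json` and `collect_run_extras` are hypothetical, and the sketch assumes a missing or malformed file should simply be skipped.

```python
import json
from pathlib import Path


def load_optional_json(path: Path):
    """Return the parsed JSON at `path`, or None if it is missing or unreadable."""
    try:
        return json.loads(path.read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError):
        return None


def collect_run_extras(outputs_dir: Path) -> dict:
    """Gather the optional executor metrics and timing data for one run.

    Hypothetical helper: the returned keys mirror the `execution_metrics`
    and `timing` sections of the grading output.
    """
    extras = {}
    metrics = load_optional_json(outputs_dir / "metrics.json")
    if metrics is not None:
        extras["execution_metrics"] = metrics
    # timing.json is a sibling of outputs_dir, not a file inside it.
    timing = load_optional_json(outputs_dir.parent / "timing.json")
    if timing is not None:
        extras["timing"] = timing
    return extras
```

Returning only the keys that were found keeps the "if available" semantics: a run without metrics or timing yields an empty dict rather than placeholder values.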
## Output Format
Write a JSON file with this structure:
```json
{
"expectations": [
{
"text": "The output includes the name 'John Smith'",
"passed": true,
"evidence": "Found in transcript Step 3: 'Extracted names: John Smith, Sarah Johnson'"
},
{
"text": "The spreadsheet has a SUM formula in cell B10",
"passed": false,
"evidence": "No spreadsheet was created. The output was a text file."
},
{
"text": "The assistant used the skill's OCR script",
"passed": true,
"evidence": "Transcript Step 2 shows: 'Tool: Bash - python ocr_script.py image.png'"
}
],
"summary": {
"passed": 2,
"failed": 1,
"total": 3,
"pass_rate": 0.67
},
"execution_metrics": {
"tool_calls": {
"Read": 5,
"Write": 2,
"Bash": 8
},
"total_tool_calls": 15,
"total_steps": 6,
"errors_encountered": 0,
"output_chars": 12450,
"transcript_chars": 3200
},
"timing": {
"executor_duration_seconds": 165.0,
"grader_duration_seconds": 26.0,
"total_duration_seconds": 191.0
},
"claims": [
{
"claim": "The form has 12 fillable fields",
"type": "factual",
"verified": true,
"evidence": "Counted 12 fields in field_info.json"
},
{
"claim": "All required fields were populated",
"type": "quality",
"verified": false,
"evidence": "Reference section was left blank despite data being available"
}
],
"user_notes_summary": {
"uncertainties": ["Used 2023 data, may be stale"],
"needs_review": [],
"workarounds": ["Fell back to text overlay for non-fillable fields"]
},
"eval_feedback": {
"suggestions": [
{
"assertion": "The output includes the name 'John Smith'",
"reason": "A hallucinated document that mentions the name would also pass — consider checking it appears as the primary contact with matching phone and email from the input"
},
{
"reason": "No assertion checks whether the extracted phone numbers match the input — I observed incorrect numbers in the output that went uncaught"
}
],
"overall": "Assertions check presence but not correctness. Consider adding content verification."
}
}
```
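The `summary` block is derived entirely from the `expectations` array, so it can be computed rather than written by hand. A minimal sketch, assuming a two-decimal `pass_rate` (as in the example above) and a rate of 0.0 when there are no expectations; the function name `summarize` is hypothetical:

```python
def summarize(expectations: list[dict]) -> dict:
    """Compute the aggregate pass/fail statistics for graded expectations."""
    total = len(expectations)
    # No partial credit: only a literal True counts as a pass.
    passed = sum(1 for item in expectations if item["passed"] is True)
    return {
        "passed": passed,
        "failed": total - passed,
        "total": total,
        # Assumption: round to two decimals and report 0.0 for an empty list.
        "pass_rate": round(passed / total, 2) if total else 0.0,
    }
```

Deriving the counts this way keeps `summary` consistent with the individual verdicts.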
## Field Descriptions
- **expectations**: Array of graded expectations
  - **text**: The original expectation text
  - **passed**: Boolean - true if expectation passes
  - **evidence**: Specific quote or description supporting the verdict
- **summary**: Aggregate statistics
  - **passed**: Count of passed expectations
  - **failed**: Count of failed expectations
  - **total**: Total expectations evaluated
  - **pass_rate**: Fraction passed (0.0 to 1.0)
- **execution_metrics**: Copied from executor's metrics.json (if available)
  - **output_chars**: Total character count of output files (proxy for tokens)
  - **transcript_chars**: Character count of transcript
- **timing**: Wall clock timing from timing.json (if available)
  - **executor_duration_seconds**: Time spent in executor subagent
  - **total_duration_seconds**: Total elapsed time for the run
- **claims**: Extracted and verified claims from the output
  - **claim**: The statement being verified
  - **type**: "factual", "process", or "quality"
  - **verified**: Boolean - whether the claim holds
  - **evidence**: Supporting or contradicting evidence
- **user_notes_summary**: Issues flagged by the executor
  - **uncertainties**: Things the executor wasn't sure about
  - **needs_review**: Items requiring human attention
  - **workarounds**: Places where the skill didn't work as expected
- **eval_feedback**: Improvement suggestions for the evals (only when warranted)
  - **suggestions**: List of concrete suggestions, each with a `reason` and optionally an `assertion` it relates to
  - **overall**: Brief assessment — can be "No suggestions, evals look solid" if nothing to flag
## Guidelines
- **Be objective**: Base verdicts on evidence, not assumptions
- **Be specific**: Quote the exact text that supports your verdict
- **Be thorough**: Check both transcript and output files
- **Be consistent**: Apply the same standard to each expectation
- **Explain failures**: Make it clear why evidence was insufficient
- **No partial credit**: Each expectation is pass or fail, not partial


@@ -0,0 +1,146 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Eval Set Review - __SKILL_NAME_PLACEHOLDER__</title>
<link rel="preconnect" href="https://fonts.googleapis.com">
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
<link href="https://fonts.googleapis.com/css2?family=Poppins:wght@500;600&family=Lora:wght@400;500&display=swap" rel="stylesheet">
<style>
* { box-sizing: border-box; margin: 0; padding: 0; }
body { font-family: 'Lora', Georgia, serif; background: #faf9f5; padding: 2rem; color: #141413; }
h1 { font-family: 'Poppins', sans-serif; margin-bottom: 0.5rem; font-size: 1.5rem; }
.description { color: #b0aea5; margin-bottom: 1.5rem; font-style: italic; max-width: 900px; }
.controls { margin-bottom: 1rem; display: flex; gap: 0.5rem; }
.btn { font-family: 'Poppins', sans-serif; padding: 0.5rem 1rem; border: none; border-radius: 6px; cursor: pointer; font-size: 0.875rem; font-weight: 500; }
.btn-add { background: #6a9bcc; color: white; }
.btn-add:hover { background: #5889b8; }
.btn-export { background: #d97757; color: white; }
.btn-export:hover { background: #c4613f; }
table { width: 100%; max-width: 1100px; border-collapse: collapse; background: white; border-radius: 6px; overflow: hidden; box-shadow: 0 1px 3px rgba(0,0,0,0.08); }
th { font-family: 'Poppins', sans-serif; background: #141413; color: #faf9f5; padding: 0.75rem 1rem; text-align: left; font-size: 0.875rem; }
td { padding: 0.75rem 1rem; border-bottom: 1px solid #e8e6dc; vertical-align: top; }
tr:nth-child(even) td { background: #faf9f5; }
tr:hover td { background: #f3f1ea; }
.section-header td { background: #e8e6dc; font-family: 'Poppins', sans-serif; font-weight: 500; font-size: 0.8rem; color: #141413; text-transform: uppercase; letter-spacing: 0.05em; }
.query-input { width: 100%; padding: 0.4rem; border: 1px solid #e8e6dc; border-radius: 4px; font-size: 0.875rem; font-family: 'Lora', Georgia, serif; resize: vertical; min-height: 60px; }
.query-input:focus { outline: none; border-color: #d97757; box-shadow: 0 0 0 2px rgba(217,119,87,0.15); }
.toggle { position: relative; display: inline-block; width: 44px; height: 24px; }
.toggle input { opacity: 0; width: 0; height: 0; }
.toggle .slider { position: absolute; inset: 0; background: #b0aea5; border-radius: 24px; cursor: pointer; transition: 0.2s; }
.toggle .slider::before { content: ""; position: absolute; width: 18px; height: 18px; left: 3px; bottom: 3px; background: white; border-radius: 50%; transition: 0.2s; }
.toggle input:checked + .slider { background: #d97757; }
.toggle input:checked + .slider::before { transform: translateX(20px); }
.btn-delete { background: #c44; color: white; padding: 0.3rem 0.6rem; border: none; border-radius: 4px; cursor: pointer; font-size: 0.75rem; font-family: 'Poppins', sans-serif; }
.btn-delete:hover { background: #a33; }
.summary { margin-top: 1rem; color: #b0aea5; font-size: 0.875rem; }
</style>
</head>
<body>
<h1>Eval Set Review: <span id="skill-name">__SKILL_NAME_PLACEHOLDER__</span></h1>
<p class="description">Current description: <span id="skill-desc">__SKILL_DESCRIPTION_PLACEHOLDER__</span></p>
<div class="controls">
<button class="btn btn-add" onclick="addRow()">+ Add Query</button>
<button class="btn btn-export" onclick="exportEvalSet()">Export Eval Set</button>
</div>
<table>
<thead>
<tr>
<th style="width:65%">Query</th>
<th style="width:18%">Should Trigger</th>
<th style="width:10%">Actions</th>
</tr>
</thead>
<tbody id="eval-body"></tbody>
</table>
<p class="summary" id="summary"></p>
<script>
const EVAL_DATA = __EVAL_DATA_PLACEHOLDER__;
let evalItems = [...EVAL_DATA];
function render() {
const tbody = document.getElementById('eval-body');
tbody.innerHTML = '';
// Sort: should-trigger first, then should-not-trigger
const sorted = evalItems
.map((item, origIdx) => ({ ...item, origIdx }))
.sort((a, b) => (b.should_trigger ? 1 : 0) - (a.should_trigger ? 1 : 0));
let lastGroup = null;
sorted.forEach(item => {
const group = item.should_trigger ? 'trigger' : 'no-trigger';
if (group !== lastGroup) {
const headerRow = document.createElement('tr');
headerRow.className = 'section-header';
headerRow.innerHTML = `<td colspan="3">${item.should_trigger ? 'Should Trigger' : 'Should NOT Trigger'}</td>`;
tbody.appendChild(headerRow);
lastGroup = group;
}
const idx = item.origIdx;
const tr = document.createElement('tr');
tr.innerHTML = `
<td><textarea class="query-input" onchange="updateQuery(${idx}, this.value)">${escapeHtml(item.query)}</textarea></td>
<td>
<label class="toggle">
<input type="checkbox" ${item.should_trigger ? 'checked' : ''} onchange="updateTrigger(${idx}, this.checked)">
<span class="slider"></span>
</label>
<span style="margin-left:8px;font-size:0.8rem;color:#b0aea5">${item.should_trigger ? 'Yes' : 'No'}</span>
</td>
<td><button class="btn-delete" onclick="deleteRow(${idx})">Delete</button></td>
`;
tbody.appendChild(tr);
});
updateSummary();
}
function escapeHtml(text) {
const div = document.createElement('div');
div.textContent = text;
return div.innerHTML;
}
function updateQuery(idx, value) { evalItems[idx].query = value; updateSummary(); }
function updateTrigger(idx, value) { evalItems[idx].should_trigger = value; render(); }
function deleteRow(idx) { evalItems.splice(idx, 1); render(); }
function addRow() {
evalItems.push({ query: '', should_trigger: true });
render();
const inputs = document.querySelectorAll('.query-input');
inputs[inputs.length - 1].focus();
}
function updateSummary() {
const trigger = evalItems.filter(i => i.should_trigger).length;
const noTrigger = evalItems.filter(i => !i.should_trigger).length;
document.getElementById('summary').textContent =
`${evalItems.length} queries total: ${trigger} should trigger, ${noTrigger} should not trigger`;
}
function exportEvalSet() {
const valid = evalItems.filter(i => i.query.trim() !== '');
const data = valid.map(i => ({ query: i.query.trim(), should_trigger: i.should_trigger }));
const blob = new Blob([JSON.stringify(data, null, 2)], { type: 'application/json' });
const url = URL.createObjectURL(blob);
const a = document.createElement('a');
a.href = url;
a.download = 'eval_set.json';
document.body.appendChild(a);
a.click();
document.body.removeChild(a);
URL.revokeObjectURL(url);
}
render();
</script>
</body>
</html>


@@ -0,0 +1,471 @@
#!/usr/bin/env python3
"""Generate and serve a review page for eval results.

Reads the workspace directory, discovers runs (directories with outputs/),
embeds all output data into a self-contained HTML page, and serves it via
a tiny HTTP server. Feedback auto-saves to feedback.json in the workspace.

Usage:
    python generate_review.py <workspace-path> [--port PORT] [--skill-name NAME]
    python generate_review.py <workspace-path> --previous-workspace /path/to/previous/workspace

No dependencies beyond the Python stdlib are required.
"""
import argparse
import base64
import json
import mimetypes
import os
import re
import signal
import subprocess
import sys
import time
import webbrowser
from functools import partial
from http.server import HTTPServer, BaseHTTPRequestHandler
from pathlib import Path
# Files to exclude from output listings
METADATA_FILES = {"transcript.md", "user_notes.md", "metrics.json"}
# Extensions we render as inline text
TEXT_EXTENSIONS = {
".txt", ".md", ".json", ".csv", ".py", ".js", ".ts", ".tsx", ".jsx",
".yaml", ".yml", ".xml", ".html", ".css", ".sh", ".rb", ".go", ".rs",
".java", ".c", ".cpp", ".h", ".hpp", ".sql", ".r", ".toml",
}
# Extensions we render as inline images
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif", ".svg", ".webp"}
# MIME type overrides for common types
MIME_OVERRIDES = {
".svg": "image/svg+xml",
".xlsx": "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
".docx": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
".pptx": "application/vnd.openxmlformats-officedocument.presentationml.presentation",
}
def get_mime_type(path: Path) -> str:
    ext = path.suffix.lower()
    if ext in MIME_OVERRIDES:
        return MIME_OVERRIDES[ext]
    mime, _ = mimetypes.guess_type(str(path))
    return mime or "application/octet-stream"


def find_runs(workspace: Path) -> list[dict]:
    """Recursively find directories that contain an outputs/ subdirectory."""
    runs: list[dict] = []
    _find_runs_recursive(workspace, workspace, runs)
    # build_run always sets "eval_id" (possibly to None), so a .get() default
    # never applies; map None to infinity explicitly so those runs sort last
    # instead of raising TypeError when compared with integer ids.
    runs.sort(
        key=lambda r: (
            r["eval_id"] if r.get("eval_id") is not None else float("inf"),
            r["id"],
        )
    )
    return runs


def _find_runs_recursive(root: Path, current: Path, runs: list[dict]) -> None:
    if not current.is_dir():
        return

    outputs_dir = current / "outputs"
    if outputs_dir.is_dir():
        run = build_run(root, current)
        if run:
            runs.append(run)
        return

    skip = {"node_modules", ".git", "__pycache__", "skill", "inputs"}
    for child in sorted(current.iterdir()):
        if child.is_dir() and child.name not in skip:
            _find_runs_recursive(root, child, runs)
def build_run(root: Path, run_dir: Path) -> dict | None:
    """Build a run dict with prompt, outputs, and grading data."""
    prompt = ""
    eval_id = None

    # Try eval_metadata.json
    for candidate in [run_dir / "eval_metadata.json", run_dir.parent / "eval_metadata.json"]:
        if candidate.exists():
            try:
                metadata = json.loads(candidate.read_text())
                prompt = metadata.get("prompt", "")
                eval_id = metadata.get("eval_id")
            except (json.JSONDecodeError, OSError):
                pass
            if prompt:
                break

    # Fall back to transcript.md
    if not prompt:
        for candidate in [run_dir / "transcript.md", run_dir / "outputs" / "transcript.md"]:
            if candidate.exists():
                try:
                    text = candidate.read_text()
                    match = re.search(r"## Eval Prompt\n\n([\s\S]*?)(?=\n##|$)", text)
                    if match:
                        prompt = match.group(1).strip()
                except OSError:
                    pass
                if prompt:
                    break

    if not prompt:
        prompt = "(No prompt found)"

    run_id = str(run_dir.relative_to(root)).replace("/", "-").replace("\\", "-")

    # Collect output files
    outputs_dir = run_dir / "outputs"
    output_files: list[dict] = []
    if outputs_dir.is_dir():
        for f in sorted(outputs_dir.iterdir()):
            if f.is_file() and f.name not in METADATA_FILES:
                output_files.append(embed_file(f))

    # Load grading if present
    grading = None
    for candidate in [run_dir / "grading.json", run_dir.parent / "grading.json"]:
        if candidate.exists():
            try:
                grading = json.loads(candidate.read_text())
            except (json.JSONDecodeError, OSError):
                pass
            if grading:
                break

    return {
        "id": run_id,
        "prompt": prompt,
        "eval_id": eval_id,
        "outputs": output_files,
        "grading": grading,
    }
def embed_file(path: Path) -> dict:
    """Read a file and return an embedded representation."""
    ext = path.suffix.lower()
    mime = get_mime_type(path)

    if ext in TEXT_EXTENSIONS:
        try:
            content = path.read_text(errors="replace")
        except OSError:
            content = "(Error reading file)"
        return {
            "name": path.name,
            "type": "text",
            "content": content,
        }
    elif ext in IMAGE_EXTENSIONS:
        try:
            raw = path.read_bytes()
            b64 = base64.b64encode(raw).decode("ascii")
        except OSError:
            return {"name": path.name, "type": "error", "content": "(Error reading file)"}
        return {
            "name": path.name,
            "type": "image",
            "mime": mime,
            "data_uri": f"data:{mime};base64,{b64}",
        }
    elif ext == ".pdf":
        try:
            raw = path.read_bytes()
            b64 = base64.b64encode(raw).decode("ascii")
        except OSError:
            return {"name": path.name, "type": "error", "content": "(Error reading file)"}
        return {
            "name": path.name,
            "type": "pdf",
            "data_uri": f"data:{mime};base64,{b64}",
        }
    elif ext == ".xlsx":
        try:
            raw = path.read_bytes()
            b64 = base64.b64encode(raw).decode("ascii")
        except OSError:
            return {"name": path.name, "type": "error", "content": "(Error reading file)"}
        return {
            "name": path.name,
            "type": "xlsx",
            "data_b64": b64,
        }
    else:
        # Binary / unknown — base64 download link
        try:
            raw = path.read_bytes()
            b64 = base64.b64encode(raw).decode("ascii")
        except OSError:
            return {"name": path.name, "type": "error", "content": "(Error reading file)"}
        return {
            "name": path.name,
            "type": "binary",
            "mime": mime,
            "data_uri": f"data:{mime};base64,{b64}",
        }
def load_previous_iteration(workspace: Path) -> dict[str, dict]:
    """Load previous iteration's feedback and outputs.

    Returns a map of run_id -> {"feedback": str, "outputs": list[dict]}.
    """
    result: dict[str, dict] = {}

    # Load feedback
    feedback_map: dict[str, str] = {}
    feedback_path = workspace / "feedback.json"
    if feedback_path.exists():
        try:
            data = json.loads(feedback_path.read_text())
            feedback_map = {
                r["run_id"]: r["feedback"]
                for r in data.get("reviews", [])
                if r.get("feedback", "").strip()
            }
        except (json.JSONDecodeError, OSError, KeyError):
            pass

    # Load runs (to get outputs)
    prev_runs = find_runs(workspace)
    for run in prev_runs:
        result[run["id"]] = {
            "feedback": feedback_map.get(run["id"], ""),
            "outputs": run.get("outputs", []),
        }

    # Also add feedback for run_ids that had feedback but no matching run
    for run_id, fb in feedback_map.items():
        if run_id not in result:
            result[run_id] = {"feedback": fb, "outputs": []}

    return result
def generate_html(
    runs: list[dict],
    skill_name: str,
    previous: dict[str, dict] | None = None,
    benchmark: dict | None = None,
) -> str:
    """Generate the complete standalone HTML page with embedded data."""
    template_path = Path(__file__).parent / "viewer.html"
    template = template_path.read_text()

    # Build previous_feedback and previous_outputs maps for the template
    previous_feedback: dict[str, str] = {}
    previous_outputs: dict[str, list[dict]] = {}
    if previous:
        for run_id, data in previous.items():
            if data.get("feedback"):
                previous_feedback[run_id] = data["feedback"]
            if data.get("outputs"):
                previous_outputs[run_id] = data["outputs"]

    embedded = {
        "skill_name": skill_name,
        "runs": runs,
        "previous_feedback": previous_feedback,
        "previous_outputs": previous_outputs,
    }
    if benchmark:
        embedded["benchmark"] = benchmark

    data_json = json.dumps(embedded)
    return template.replace("/*__EMBEDDED_DATA__*/", f"const EMBEDDED_DATA = {data_json};")
# ---------------------------------------------------------------------------
# HTTP server (stdlib only, zero dependencies)
# ---------------------------------------------------------------------------
def _kill_port(port: int) -> None:
    """Kill any process listening on the given port."""
    try:
        result = subprocess.run(
            ["lsof", "-ti", f":{port}"],
            capture_output=True, text=True, timeout=5,
        )
        for pid_str in result.stdout.strip().split("\n"):
            if pid_str.strip():
                try:
                    os.kill(int(pid_str.strip()), signal.SIGTERM)
                except (ProcessLookupError, ValueError):
                    pass
        if result.stdout.strip():
            time.sleep(0.5)
    except subprocess.TimeoutExpired:
        pass
    except FileNotFoundError:
        print("Note: lsof not found, cannot check if port is in use", file=sys.stderr)
class ReviewHandler(BaseHTTPRequestHandler):
    """Serves the review HTML and handles feedback saves.

    Regenerates the HTML on each page load so that refreshing the browser
    picks up new eval outputs without restarting the server.
    """

    def __init__(
        self,
        workspace: Path,
        skill_name: str,
        feedback_path: Path,
        previous: dict[str, dict],
        benchmark_path: Path | None,
        *args,
        **kwargs,
    ):
        self.workspace = workspace
        self.skill_name = skill_name
        self.feedback_path = feedback_path
        self.previous = previous
        self.benchmark_path = benchmark_path
        super().__init__(*args, **kwargs)

    def do_GET(self) -> None:
        if self.path == "/" or self.path == "/index.html":
            # Regenerate HTML on each request (re-scans workspace for new outputs)
            runs = find_runs(self.workspace)
            benchmark = None
            if self.benchmark_path and self.benchmark_path.exists():
                try:
                    benchmark = json.loads(self.benchmark_path.read_text())
                except (json.JSONDecodeError, OSError):
                    pass
            html = generate_html(runs, self.skill_name, self.previous, benchmark)
            content = html.encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "text/html; charset=utf-8")
            self.send_header("Content-Length", str(len(content)))
            self.end_headers()
            self.wfile.write(content)
        elif self.path == "/api/feedback":
            data = b"{}"
            if self.feedback_path.exists():
                data = self.feedback_path.read_bytes()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(data)))
            self.end_headers()
            self.wfile.write(data)
        else:
            self.send_error(404)

    def do_POST(self) -> None:
        if self.path == "/api/feedback":
            length = int(self.headers.get("Content-Length", 0))
            body = self.rfile.read(length)
            try:
                data = json.loads(body)
                if not isinstance(data, dict) or "reviews" not in data:
                    raise ValueError("Expected JSON object with 'reviews' key")
                self.feedback_path.write_text(json.dumps(data, indent=2) + "\n")
                resp = b'{"ok":true}'
                self.send_response(200)
            except (json.JSONDecodeError, OSError, ValueError) as e:
                resp = json.dumps({"error": str(e)}).encode()
                self.send_response(500)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(resp)))
            self.end_headers()
            self.wfile.write(resp)
        else:
            self.send_error(404)

    def log_message(self, format: str, *args: object) -> None:
        # Suppress request logging to keep terminal clean
        pass
def main() -> None:
    parser = argparse.ArgumentParser(description="Generate and serve eval review")
    parser.add_argument("workspace", type=Path, help="Path to workspace directory")
    parser.add_argument("--port", "-p", type=int, default=3117, help="Server port (default: 3117)")
    parser.add_argument("--skill-name", "-n", type=str, default=None, help="Skill name for header")
    parser.add_argument(
        "--previous-workspace", type=Path, default=None,
        help="Path to previous iteration's workspace (shows old outputs and feedback as context)",
    )
    parser.add_argument(
        "--benchmark", type=Path, default=None,
        help="Path to benchmark.json to show in the Benchmark tab",
    )
    parser.add_argument(
        "--static", "-s", type=Path, default=None,
        help="Write standalone HTML to this path instead of starting a server",
    )
    args = parser.parse_args()

    workspace = args.workspace.resolve()
    if not workspace.is_dir():
        print(f"Error: {workspace} is not a directory", file=sys.stderr)
        sys.exit(1)

    runs = find_runs(workspace)
    if not runs:
        print(f"No runs found in {workspace}", file=sys.stderr)
        sys.exit(1)

    skill_name = args.skill_name or workspace.name.replace("-workspace", "")
    feedback_path = workspace / "feedback.json"

    previous: dict[str, dict] = {}
    if args.previous_workspace:
        previous = load_previous_iteration(args.previous_workspace.resolve())

    benchmark_path = args.benchmark.resolve() if args.benchmark else None
    benchmark = None
    if benchmark_path and benchmark_path.exists():
        try:
            benchmark = json.loads(benchmark_path.read_text())
        except (json.JSONDecodeError, OSError):
            pass

    if args.static:
        html = generate_html(runs, skill_name, previous, benchmark)
        args.static.parent.mkdir(parents=True, exist_ok=True)
        args.static.write_text(html)
        print(f"\n Static viewer written to: {args.static}\n")
        sys.exit(0)

    # Kill any existing process on the target port
    port = args.port
    _kill_port(port)
    handler = partial(ReviewHandler, workspace, skill_name, feedback_path, previous, benchmark_path)
    try:
        server = HTTPServer(("127.0.0.1", port), handler)
    except OSError:
        # Port still in use after kill attempt — find a free one
        server = HTTPServer(("127.0.0.1", 0), handler)
        port = server.server_address[1]

    url = f"http://localhost:{port}"
    print(f"\n Eval Viewer")
    print(f" ─────────────────────────────────")
    print(f" URL: {url}")
    print(f" Workspace: {workspace}")
    print(f" Feedback: {feedback_path}")
    if previous:
        print(f" Previous: {args.previous_workspace} ({len(previous)} runs)")
    if benchmark_path:
        print(f" Benchmark: {benchmark_path}")
    print(f"\n Press Ctrl+C to stop.\n")

    webbrowser.open(url)

    try:
        server.serve_forever()
    except KeyboardInterrupt:
        print("\nStopped.")
    server.server_close()


if __name__ == "__main__":
    main()


@@ -0,0 +1,430 @@
# JSON Schemas
This document defines the JSON schemas used by skill-creator.
---
## evals.json
Defines the evals for a skill. Located at `evals/evals.json` within the skill directory.
```json
{
"skill_name": "example-skill",
"evals": [
{
"id": 1,
"prompt": "User's example prompt",
"expected_output": "Description of expected result",
"files": ["evals/files/sample1.pdf"],
"expectations": [
"The output includes X",
"The skill used script Y"
]
}
]
}
```
**Fields:**
- `skill_name`: Name matching the skill's frontmatter
- `evals[].id`: Unique integer identifier
- `evals[].prompt`: The task to execute
- `evals[].expected_output`: Human-readable description of success
- `evals[].files`: Optional list of input file paths (relative to skill root)
- `evals[].expectations`: List of verifiable statements
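The field list above can be checked mechanically before a run. The following is a sketch only: `validate_evals` is a hypothetical helper, and it assumes every field except `files` is required, which the list implies but does not state outright.

```python
def validate_evals(doc: dict) -> list[str]:
    """Return a list of problems found in a parsed evals.json document."""
    problems = []
    if not isinstance(doc.get("skill_name"), str) or not doc["skill_name"]:
        problems.append("skill_name must be a non-empty string")

    seen_ids = set()
    for index, item in enumerate(doc.get("evals", [])):
        label = f"evals[{index}]"
        eval_id = item.get("id")
        # bool is a subclass of int in Python, so exclude it explicitly.
        if not isinstance(eval_id, int) or isinstance(eval_id, bool):
            problems.append(f"{label}.id must be an integer")
        elif eval_id in seen_ids:
            problems.append(f"{label}.id {eval_id} is duplicated")
        else:
            seen_ids.add(eval_id)

        # Assumption: prompt and expected_output are required strings.
        for key in ("prompt", "expected_output"):
            if not isinstance(item.get(key), str):
                problems.append(f"{label}.{key} must be a string")

        # files is optional; when present it must be a list of paths.
        if not isinstance(item.get("files", []), list):
            problems.append(f"{label}.files must be a list")

        expectations = item.get("expectations")
        if not isinstance(expectations, list) or not all(
            isinstance(text, str) for text in expectations
        ):
            problems.append(f"{label}.expectations must be a list of strings")
    return problems
```

An empty return value means the document matches the shape described here; it says nothing about whether the expectations themselves are discriminating.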
---
## history.json
Tracks version progression in Improve mode. Located at workspace root.
```json
{
"started_at": "2026-01-15T10:30:00Z",
"skill_name": "pdf",
"current_best": "v2",
"iterations": [
{
"version": "v0",
"parent": null,
"expectation_pass_rate": 0.65,
"grading_result": "baseline",
"is_current_best": false
},
{
"version": "v1",
"parent": "v0",
"expectation_pass_rate": 0.75,
"grading_result": "won",
"is_current_best": false
},
{
"version": "v2",
"parent": "v1",
"expectation_pass_rate": 0.85,
"grading_result": "won",
"is_current_best": true
}
]
}
```
**Fields:**
- `started_at`: ISO timestamp of when improvement started
- `skill_name`: Name of the skill being improved
- `current_best`: Version identifier of the best performer
- `iterations[].version`: Version identifier (v0, v1, ...)
- `iterations[].parent`: Parent version this was derived from
- `iterations[].expectation_pass_rate`: Pass rate from grading
- `iterations[].grading_result`: "baseline", "won", "lost", or "tie"
- `iterations[].is_current_best`: Whether this is the current best version
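The example suggests two invariants: exactly one iteration is flagged `is_current_best`, and it matches `current_best`; and every non-null `parent` names an existing version. Neither is stated as a rule in this document, so the sketch below treats both as assumptions, and `check_history` is a hypothetical name.

```python
def check_history(history: dict) -> list[str]:
    """Report inconsistencies in a parsed history.json document."""
    problems = []
    iterations = history.get("iterations", [])
    versions = {item.get("version") for item in iterations}

    # Assumed invariant: the single flagged iteration equals current_best.
    flagged = [item.get("version") for item in iterations if item.get("is_current_best")]
    if flagged != [history.get("current_best")]:
        problems.append(
            f"current_best is {history.get('current_best')!r} "
            f"but flagged iterations are {flagged}"
        )

    # Assumed invariant: parents refer to versions recorded in the same file.
    for item in iterations:
        parent = item.get("parent")
        if parent is not None and parent not in versions:
            problems.append(f"{item.get('version')}: unknown parent {parent!r}")
    return problems
```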
---
## grading.json
Output from the grader agent. Located at `<run-dir>/grading.json`.
```json
{
"expectations": [
{
"text": "The output includes the name 'John Smith'",
"passed": true,
"evidence": "Found in transcript Step 3: 'Extracted names: John Smith, Sarah Johnson'"
},
{
"text": "The spreadsheet has a SUM formula in cell B10",
"passed": false,
"evidence": "No spreadsheet was created. The output was a text file."
}
],
"summary": {
"passed": 1,
"failed": 1,
"total": 2,
"pass_rate": 0.5
},
"execution_metrics": {
"tool_calls": {
"Read": 5,
"Write": 2,
"Bash": 8
},
"total_tool_calls": 15,
"total_steps": 6,
"errors_encountered": 0,
"output_chars": 12450,
"transcript_chars": 3200
},
"timing": {
"executor_duration_seconds": 165.0,
"grader_duration_seconds": 26.0,
"total_duration_seconds": 191.0
},
"claims": [
{
"claim": "The form has 12 fillable fields",
"type": "factual",
"verified": true,
"evidence": "Counted 12 fields in field_info.json"
}
],
"user_notes_summary": {
"uncertainties": ["Used 2023 data, may be stale"],
"needs_review": [],
"workarounds": ["Fell back to text overlay for non-fillable fields"]
},
"eval_feedback": {
"suggestions": [
{
"assertion": "The output includes the name 'John Smith'",
"reason": "A hallucinated document that mentions the name would also pass"
}
],
"overall": "Assertions check presence but not correctness."
}
}
```
**Fields:**
- `expectations[]`: Graded expectations with evidence
- `summary`: Aggregate pass/fail counts
- `execution_metrics`: Tool usage and output size (from executor's metrics.json)
- `timing`: Wall clock timing (from timing.json)
- `claims`: Extracted and verified claims from the output
- `user_notes_summary`: Issues flagged by the executor
- `eval_feedback`: (optional) Improvement suggestions for the evals, only present when the grader identifies issues worth raising
---
## metrics.json
Output from the executor agent. Located at `<run-dir>/outputs/metrics.json`.
```json
{
"tool_calls": {
"Read": 5,
"Write": 2,
"Bash": 8,
"Edit": 1,
"Glob": 2,
"Grep": 0
},
"total_tool_calls": 18,
"total_steps": 6,
"files_created": ["filled_form.pdf", "field_values.json"],
"errors_encountered": 0,
"output_chars": 12450,
"transcript_chars": 3200
}
```
**Fields:**
- `tool_calls`: Count per tool type
- `total_tool_calls`: Sum of all tool calls
- `total_steps`: Number of major execution steps
- `files_created`: List of output files created
- `errors_encountered`: Number of errors during execution
- `output_chars`: Total character count of output files
- `transcript_chars`: Character count of transcript
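
Since `total_tool_calls` is described as the sum of all tool calls, a quick consistency check can catch hand-edited files that drift. This is only a sketch; `check_metrics` is a hypothetical helper, not something the executor ships.

```python
def check_metrics(metrics: dict) -> list[str]:
    """Return a list of human-readable problems found in a metrics dict."""
    problems = []
    per_tool = metrics.get("tool_calls", {})
    expected_total = sum(per_tool.values())
    # total_tool_calls should equal the sum of the per-tool counts.
    if metrics.get("total_tool_calls") != expected_total:
        problems.append(
            f"total_tool_calls is {metrics.get('total_tool_calls')}, "
            f"but per-tool counts sum to {expected_total}"
        )
    return problems


example = {
    "tool_calls": {"Read": 5, "Write": 2, "Bash": 8, "Edit": 1, "Glob": 2, "Grep": 0},
    "total_tool_calls": 18,
}
print(check_metrics(example))
```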
---
## timing.json
Wall clock timing for a run. Located at `<run-dir>/timing.json`.

**How to capture:** When a subagent task completes, the task notification includes `total_tokens` and `duration_ms`. Save these immediately: they are not persisted anywhere else and cannot be recovered after the fact.
```json
{
"total_tokens": 84852,
"duration_ms": 23332,
"total_duration_seconds": 23.3,
"executor_start": "2026-01-15T10:30:00Z",
"executor_end": "2026-01-15T10:32:45Z",
"executor_duration_seconds": 165.0,
"grader_start": "2026-01-15T10:32:46Z",
"grader_end": "2026-01-15T10:33:12Z",
"grader_duration_seconds": 26.0
}
```
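
One way to persist the two notification values right away is sketched below. It assumes the caller already has `total_tokens` and `duration_ms` in hand and that `total_duration_seconds` is `duration_ms / 1000` rounded to one decimal, which matches the example above. `save_timing` is a hypothetical helper name.

```python
import json
from pathlib import Path


def save_timing(run_dir: Path, total_tokens: int, duration_ms: int) -> dict:
    """Write timing.json into run_dir and return the dict that was written."""
    timing = {
        "total_tokens": total_tokens,
        "duration_ms": duration_ms,
        # Derived field; assumed to be milliseconds converted to seconds.
        "total_duration_seconds": round(duration_ms / 1000, 1),
    }
    run_dir.mkdir(parents=True, exist_ok=True)
    (run_dir / "timing.json").write_text(json.dumps(timing, indent=2))
    return timing
```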
---
## benchmark.json
Output from Benchmark mode. Located at `benchmarks/<timestamp>/benchmark.json`.
```json
{
"metadata": {
"skill_name": "pdf",
"skill_path": "/path/to/pdf",
"executor_model": "claude-sonnet-4-20250514",
"analyzer_model": "most-capable-model",
"timestamp": "2026-01-15T10:30:00Z",
"evals_run": [1, 2, 3],
"runs_per_configuration": 3
},
"runs": [
{
"eval_id": 1,
"eval_name": "Ocean",
"configuration": "with_skill",
"run_number": 1,
"result": {
"pass_rate": 0.85,
"passed": 6,
"failed": 1,
"total": 7,
"time_seconds": 42.5,
"tokens": 3800,
"tool_calls": 18,
"errors": 0
},
"expectations": [
{"text": "...", "passed": true, "evidence": "..."}
],
"notes": [
"Used 2023 data, may be stale",
"Fell back to text overlay for non-fillable fields"
]
}
],
"run_summary": {
"with_skill": {
"pass_rate": {"mean": 0.85, "stddev": 0.05, "min": 0.80, "max": 0.90},
"time_seconds": {"mean": 45.0, "stddev": 12.0, "min": 32.0, "max": 58.0},
"tokens": {"mean": 3800, "stddev": 400, "min": 3200, "max": 4100}
},
"without_skill": {
"pass_rate": {"mean": 0.35, "stddev": 0.08, "min": 0.28, "max": 0.45},
"time_seconds": {"mean": 32.0, "stddev": 8.0, "min": 24.0, "max": 42.0},
"tokens": {"mean": 2100, "stddev": 300, "min": 1800, "max": 2500}
},
"delta": {
"pass_rate": "+0.50",
"time_seconds": "+13.0",
"tokens": "+1700"
}
},
"notes": [
"Assertion 'Output is a PDF file' passes 100% in both configurations - may not differentiate skill value",
"Eval 3 shows high variance (50% ± 40%) - may be flaky or model-dependent",
"Without-skill runs consistently fail on table extraction expectations",
"Skill adds 13s average execution time but improves pass rate by 50%"
]
}
```
**Fields:**
- `metadata`: Information about the benchmark run
  - `skill_name`: Name of the skill
  - `timestamp`: When the benchmark was run
  - `evals_run`: List of eval names or IDs
  - `runs_per_configuration`: Number of runs per config (e.g. 3)
- `runs[]`: Individual run results
  - `eval_id`: Numeric eval identifier
  - `eval_name`: Human-readable eval name (used as section header in the viewer)
  - `configuration`: Must be `"with_skill"` or `"without_skill"` (the viewer uses this exact string for grouping and color coding)
  - `run_number`: Integer run number (1, 2, 3...)
  - `result`: Nested object with `pass_rate`, `passed`, `failed`, `total`, `time_seconds`, `tokens`, `tool_calls`, `errors`
- `run_summary`: Statistical aggregates per configuration
  - `with_skill` / `without_skill`: Each contains `pass_rate`, `time_seconds`, `tokens` objects with `mean`, `stddev`, `min`, and `max` fields
  - `delta`: Difference strings like `"+0.50"`, `"+13.0"`, `"+1700"`
- `notes`: Freeform observations from the analyzer

**Important:** The viewer reads these field names exactly. Using `config` instead of `configuration`, or putting `pass_rate` at the top level of a run instead of nested under `result`, will cause the viewer to show empty/zero values. Always reference this schema when generating benchmark.json manually.
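
A hand-written benchmark.json can be checked against the two pitfalls named above before it is opened in the viewer. This sketch assumes only the field names listed in this section; `validate_runs` and its constants are hypothetical, not part of the viewer.

```python
REQUIRED_RESULT_KEYS = {"pass_rate", "passed", "total", "time_seconds", "tokens", "errors"}
VALID_CONFIGURATIONS = {"with_skill", "without_skill"}


def validate_runs(benchmark: dict) -> list[str]:
    """Return a list of schema problems found in benchmark['runs']."""
    problems = []
    for i, run in enumerate(benchmark.get("runs", [])):
        # The viewer groups on the exact key 'configuration' (not 'config').
        if run.get("configuration") not in VALID_CONFIGURATIONS:
            problems.append(f"runs[{i}]: missing or invalid 'configuration'")
        # Metrics must be nested under 'result', not placed on the run itself.
        result = run.get("result")
        if not isinstance(result, dict):
            problems.append(f"runs[{i}]: missing nested 'result' object")
            continue
        missing = REQUIRED_RESULT_KEYS - result.keys()
        if missing:
            problems.append(f"runs[{i}]: result is missing {sorted(missing)}")
    return problems
```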
---
## comparison.json
Output from blind comparator. Located at `<grading-dir>/comparison-N.json`.
```json
{
"winner": "A",
"reasoning": "Output A provides a complete solution with proper formatting and all required fields. Output B is missing the date field and has formatting inconsistencies.",
"rubric": {
"A": {
"content": {
"correctness": 5,
"completeness": 5,
"accuracy": 4
},
"structure": {
"organization": 4,
"formatting": 5,
"usability": 4
},
"content_score": 4.7,
"structure_score": 4.3,
"overall_score": 9.0
},
"B": {
"content": {
"correctness": 3,
"completeness": 2,
"accuracy": 3
},
"structure": {
"organization": 3,
"formatting": 2,
"usability": 3
},
"content_score": 2.7,
"structure_score": 2.7,
"overall_score": 5.4
}
},
"output_quality": {
"A": {
"score": 9,
"strengths": ["Complete solution", "Well-formatted", "All fields present"],
"weaknesses": ["Minor style inconsistency in header"]
},
"B": {
"score": 5,
"strengths": ["Readable output", "Correct basic structure"],
"weaknesses": ["Missing date field", "Formatting inconsistencies", "Partial data extraction"]
}
},
"expectation_results": {
"A": {
"passed": 4,
"total": 5,
"pass_rate": 0.80,
"details": [
{"text": "Output includes name", "passed": true}
]
},
"B": {
"passed": 3,
"total": 5,
"pass_rate": 0.60,
"details": [
{"text": "Output includes name", "passed": true}
]
}
}
}
```
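
In the example above, each `content_score` and `structure_score` is consistent with the mean of its three criteria rounded to one decimal, and `overall_score` with the sum of the two. The schema does not state this rule, so the sketch below is an inference from the sample numbers, and `score_rubric` is a hypothetical name.

```python
def score_rubric(rubric: dict) -> dict:
    """Derive content, structure and overall scores for one output's rubric."""

    def mean_1dp(scores: dict) -> float:
        # Mean of the criterion scores, rounded to one decimal place.
        return round(sum(scores.values()) / len(scores), 1)

    content_score = mean_1dp(rubric["content"])
    structure_score = mean_1dp(rubric["structure"])
    return {
        "content_score": content_score,
        "structure_score": structure_score,
        # Assumed: overall is the sum of the two rounded sub-scores.
        "overall_score": round(content_score + structure_score, 1),
    }


rubric_a = {
    "content": {"correctness": 5, "completeness": 5, "accuracy": 4},
    "structure": {"organization": 4, "formatting": 5, "usability": 4},
}
print(score_rubric(rubric_a))
```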
---
## analysis.json
Output from post-hoc analyzer. Located at `<grading-dir>/analysis.json`.
```json
{
"comparison_summary": {
"winner": "A",
"winner_skill": "path/to/winner/skill",
"loser_skill": "path/to/loser/skill",
"comparator_reasoning": "Brief summary of why comparator chose winner"
},
"winner_strengths": [
"Clear step-by-step instructions for handling multi-page documents",
"Included validation script that caught formatting errors"
],
"loser_weaknesses": [
"Vague instruction 'process the document appropriately' led to inconsistent behavior",
"No script for validation, agent had to improvise"
],
"instruction_following": {
"winner": {
"score": 9,
"issues": ["Minor: skipped optional logging step"]
},
"loser": {
"score": 6,
"issues": [
"Did not use the skill's formatting template",
"Invented own approach instead of following step 3"
]
}
},
"improvement_suggestions": [
{
"priority": "high",
"category": "instructions",
"suggestion": "Replace 'process the document appropriately' with explicit steps",
"expected_impact": "Would eliminate ambiguity that caused inconsistent behavior"
}
],
"transcript_insights": {
"winner_execution_pattern": "Read skill -> Followed 5-step process -> Used validation script",
"loser_execution_pattern": "Read skill -> Unclear on approach -> Tried 3 different methods"
}
}
```


@@ -0,0 +1,401 @@
#!/usr/bin/env python3
"""
Aggregate individual run results into benchmark summary statistics.
Reads grading.json files from run directories and produces:
- run_summary with mean, stddev, min, max for each metric
- delta between with_skill and without_skill configurations
Usage:
python aggregate_benchmark.py <benchmark_dir>
Example:
python aggregate_benchmark.py benchmarks/2026-01-15T10-30-00/
The script supports two directory layouts:
Workspace layout (from skill-creator iterations):
<benchmark_dir>/
└── eval-N/
├── with_skill/
│ ├── run-1/grading.json
│ └── run-2/grading.json
└── without_skill/
├── run-1/grading.json
└── run-2/grading.json
Legacy layout (with runs/ subdirectory):
<benchmark_dir>/
└── runs/
└── eval-N/
├── with_skill/
│ └── run-1/grading.json
└── without_skill/
└── run-1/grading.json
"""
import argparse
import json
import math
import sys
from datetime import datetime, timezone
from pathlib import Path
def calculate_stats(values: list[float]) -> dict:
    """Calculate mean, stddev, min, max for a list of values."""
    if not values:
        return {"mean": 0.0, "stddev": 0.0, "min": 0.0, "max": 0.0}

    n = len(values)
    mean = sum(values) / n

    # Sample standard deviation (n - 1 denominator); reported as 0.0 for a single value.
    if n > 1:
        variance = sum((x - mean) ** 2 for x in values) / (n - 1)
        stddev = math.sqrt(variance)
    else:
        stddev = 0.0

    return {
        "mean": round(mean, 4),
        "stddev": round(stddev, 4),
        "min": round(min(values), 4),
        "max": round(max(values), 4)
    }
def load_run_results(benchmark_dir: Path) -> dict:
    """
    Load all run results from a benchmark directory.

    Returns dict keyed by config name (e.g. "with_skill"/"without_skill",
    or "new_skill"/"old_skill"), each containing a list of run results.
    """
    # Support both layouts: eval dirs directly under benchmark_dir, or under runs/
    runs_dir = benchmark_dir / "runs"
    if runs_dir.exists():
        search_dir = runs_dir
    elif list(benchmark_dir.glob("eval-*")):
        search_dir = benchmark_dir
    else:
        print(f"No eval directories found in {benchmark_dir} or {benchmark_dir / 'runs'}")
        return {}

    results: dict[str, list] = {}

    for eval_idx, eval_dir in enumerate(sorted(search_dir.glob("eval-*"))):
        # Prefer the eval_id recorded in eval_metadata.json, then the directory
        # suffix (eval-N), and finally the enumeration index.
        metadata_path = eval_dir / "eval_metadata.json"
        if metadata_path.exists():
            try:
                with open(metadata_path) as mf:
                    eval_id = json.load(mf).get("eval_id", eval_idx)
            except (json.JSONDecodeError, OSError):
                eval_id = eval_idx
        else:
            try:
                eval_id = int(eval_dir.name.split("-")[1])
            except ValueError:
                eval_id = eval_idx

        # Discover config directories dynamically rather than hardcoding names
        for config_dir in sorted(eval_dir.iterdir()):
            if not config_dir.is_dir():
                continue
            # Skip non-config directories (inputs, outputs, etc.)
            if not list(config_dir.glob("run-*")):
                continue
            config = config_dir.name
            if config not in results:
                results[config] = []

            for run_dir in sorted(config_dir.glob("run-*")):
                # Skip directories whose suffix is not an integer instead of crashing
                try:
                    run_number = int(run_dir.name.split("-")[1])
                except ValueError:
                    print(f"Warning: cannot parse run number from {run_dir}")
                    continue

                grading_file = run_dir / "grading.json"
                if not grading_file.exists():
                    print(f"Warning: grading.json not found in {run_dir}")
                    continue

                try:
                    with open(grading_file) as f:
                        grading = json.load(f)
                except json.JSONDecodeError as e:
                    print(f"Warning: Invalid JSON in {grading_file}: {e}")
                    continue

                # Extract metrics
                result = {
                    "eval_id": eval_id,
                    "run_number": run_number,
                    "pass_rate": grading.get("summary", {}).get("pass_rate", 0.0),
                    "passed": grading.get("summary", {}).get("passed", 0),
                    "failed": grading.get("summary", {}).get("failed", 0),
                    "total": grading.get("summary", {}).get("total", 0),
                }

                # Extract timing: check grading.json first, then sibling timing.json.
                # total_tokens only exists in timing.json, so read it whenever the
                # file is present, not only when grading.json lacks a duration.
                timing = grading.get("timing", {})
                result["time_seconds"] = timing.get("total_duration_seconds", 0.0)
                timing_file = run_dir / "timing.json"
                if timing_file.exists():
                    try:
                        with open(timing_file) as tf:
                            timing_data = json.load(tf)
                        if result["time_seconds"] == 0.0:
                            result["time_seconds"] = timing_data.get("total_duration_seconds", 0.0)
                        result["tokens"] = timing_data.get("total_tokens", 0)
                    except (json.JSONDecodeError, OSError):
                        pass

                # Extract metrics if available
                metrics = grading.get("execution_metrics", {})
                result["tool_calls"] = metrics.get("total_tool_calls", 0)
                # No token count available: fall back to output size in characters as a rough proxy
                if not result.get("tokens"):
                    result["tokens"] = metrics.get("output_chars", 0)
                result["errors"] = metrics.get("errors_encountered", 0)

                # Extract expectations; viewer requires fields: text, passed, evidence
                raw_expectations = grading.get("expectations", [])
                for exp in raw_expectations:
                    if not all(key in exp for key in ("text", "passed", "evidence")):
                        print(f"Warning: expectation in {grading_file} missing required fields (text, passed, evidence): {exp}")
                result["expectations"] = raw_expectations

                # Extract notes from user_notes_summary
                notes_summary = grading.get("user_notes_summary", {})
                notes = []
                notes.extend(notes_summary.get("uncertainties", []))
                notes.extend(notes_summary.get("needs_review", []))
                notes.extend(notes_summary.get("workarounds", []))
                result["notes"] = notes

                results[config].append(result)

    return results
def aggregate_results(results: dict) -> dict:
    """
    Aggregate run results into summary statistics.

    Returns run_summary with stats for each configuration and delta.
    """
    run_summary = {}
    configs = list(results.keys())

    for config in configs:
        runs = results.get(config, [])

        if not runs:
            run_summary[config] = {
                "pass_rate": {"mean": 0.0, "stddev": 0.0, "min": 0.0, "max": 0.0},
                "time_seconds": {"mean": 0.0, "stddev": 0.0, "min": 0.0, "max": 0.0},
                "tokens": {"mean": 0, "stddev": 0, "min": 0, "max": 0}
            }
            continue

        pass_rates = [r["pass_rate"] for r in runs]
        times = [r["time_seconds"] for r in runs]
        tokens = [r.get("tokens", 0) for r in runs]

        run_summary[config] = {
            "pass_rate": calculate_stats(pass_rates),
            "time_seconds": calculate_stats(times),
            "tokens": calculate_stats(tokens)
        }

    # Calculate delta between the first two configs (if two exist).
    # Config order follows directory discovery order in load_run_results.
    if len(configs) >= 2:
        primary = run_summary.get(configs[0], {})
        baseline = run_summary.get(configs[1], {})
    else:
        primary = run_summary.get(configs[0], {}) if configs else {}
        baseline = {}

    delta_pass_rate = primary.get("pass_rate", {}).get("mean", 0) - baseline.get("pass_rate", {}).get("mean", 0)
    delta_time = primary.get("time_seconds", {}).get("mean", 0) - baseline.get("time_seconds", {}).get("mean", 0)
    delta_tokens = primary.get("tokens", {}).get("mean", 0) - baseline.get("tokens", {}).get("mean", 0)

    run_summary["delta"] = {
        "pass_rate": f"{delta_pass_rate:+.2f}",
        "time_seconds": f"{delta_time:+.1f}",
        "tokens": f"{delta_tokens:+.0f}"
    }

    return run_summary
def generate_benchmark(benchmark_dir: Path, skill_name: str = "", skill_path: str = "") -> dict:
    """
    Generate complete benchmark.json from run results.
    """
    results = load_run_results(benchmark_dir)
    run_summary = aggregate_results(results)

    # Build runs array for benchmark.json
    runs = []
    for config in results:
        for result in results[config]:
            runs.append({
                "eval_id": result["eval_id"],
                "configuration": config,
                "run_number": result["run_number"],
                "result": {
                    "pass_rate": result["pass_rate"],
                    "passed": result["passed"],
                    "failed": result["failed"],
                    "total": result["total"],
                    "time_seconds": result["time_seconds"],
                    "tokens": result.get("tokens", 0),
                    "tool_calls": result.get("tool_calls", 0),
                    "errors": result.get("errors", 0)
                },
                "expectations": result["expectations"],
                "notes": result["notes"]
            })

    # Determine eval IDs from results
    eval_ids = sorted(set(
        r["eval_id"]
        for config in results.values()
        for r in config
    ))

    benchmark = {
        "metadata": {
            "skill_name": skill_name or "<skill-name>",
            "skill_path": skill_path or "<path/to/skill>",
            "executor_model": "<model-name>",
            "analyzer_model": "<model-name>",
            "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
            "evals_run": eval_ids,
            # Hardcoded; not derived from the run directories
            "runs_per_configuration": 3
        },
        "runs": runs,
        "run_summary": run_summary,
        "notes": []  # To be filled by analyzer
    }

    return benchmark
def generate_markdown(benchmark: dict) -> str:
    """Generate human-readable benchmark.md from benchmark data."""
    metadata = benchmark["metadata"]
    run_summary = benchmark["run_summary"]

    # Determine config names (excluding "delta")
    configs = [k for k in run_summary if k != "delta"]
    config_a = configs[0] if len(configs) >= 1 else "config_a"
    config_b = configs[1] if len(configs) >= 2 else "config_b"
    label_a = config_a.replace("_", " ").title()
    label_b = config_b.replace("_", " ").title()

    lines = [
        f"# Skill Benchmark: {metadata['skill_name']}",
        "",
        f"**Model**: {metadata['executor_model']}",
        f"**Date**: {metadata['timestamp']}",
        f"**Evals**: {', '.join(map(str, metadata['evals_run']))} ({metadata['runs_per_configuration']} runs each per configuration)",
        "",
        "## Summary",
        "",
        f"| Metric | {label_a} | {label_b} | Delta |",
        "|--------|------------|---------------|-------|",
    ]

    a_summary = run_summary.get(config_a, {})
    b_summary = run_summary.get(config_b, {})
    delta = run_summary.get("delta", {})

    # Format pass rate
    a_pr = a_summary.get("pass_rate", {})
    b_pr = b_summary.get("pass_rate", {})
    lines.append(f"| Pass Rate | {a_pr.get('mean', 0)*100:.0f}% ± {a_pr.get('stddev', 0)*100:.0f}% | {b_pr.get('mean', 0)*100:.0f}% ± {b_pr.get('stddev', 0)*100:.0f}% | {delta.get('pass_rate', '')} |")

    # Format time
    a_time = a_summary.get("time_seconds", {})
    b_time = b_summary.get("time_seconds", {})
    lines.append(f"| Time | {a_time.get('mean', 0):.1f}s ± {a_time.get('stddev', 0):.1f}s | {b_time.get('mean', 0):.1f}s ± {b_time.get('stddev', 0):.1f}s | {delta.get('time_seconds', '')}s |")

    # Format tokens
    a_tokens = a_summary.get("tokens", {})
    b_tokens = b_summary.get("tokens", {})
    lines.append(f"| Tokens | {a_tokens.get('mean', 0):.0f} ± {a_tokens.get('stddev', 0):.0f} | {b_tokens.get('mean', 0):.0f} ± {b_tokens.get('stddev', 0):.0f} | {delta.get('tokens', '')} |")

    # Notes section
    if benchmark.get("notes"):
        lines.extend([
            "",
            "## Notes",
            ""
        ])
        for note in benchmark["notes"]:
            lines.append(f"- {note}")

    return "\n".join(lines)
def main():
    parser = argparse.ArgumentParser(
        description="Aggregate benchmark run results into summary statistics"
    )
    parser.add_argument(
        "benchmark_dir",
        type=Path,
        help="Path to the benchmark directory"
    )
    parser.add_argument(
        "--skill-name",
        default="",
        help="Name of the skill being benchmarked"
    )
    parser.add_argument(
        "--skill-path",
        default="",
        help="Path to the skill being benchmarked"
    )
    parser.add_argument(
        "--output", "-o",
        type=Path,
        help="Output path for benchmark.json (default: <benchmark_dir>/benchmark.json)"
    )

    args = parser.parse_args()

    if not args.benchmark_dir.exists():
        print(f"Directory not found: {args.benchmark_dir}")
        sys.exit(1)

    # Generate benchmark
    benchmark = generate_benchmark(args.benchmark_dir, args.skill_name, args.skill_path)

    # Determine output paths
    output_json = args.output or (args.benchmark_dir / "benchmark.json")
    output_md = output_json.with_suffix(".md")

    # Write benchmark.json
    with open(output_json, "w") as f:
        json.dump(benchmark, f, indent=2)
    print(f"Generated: {output_json}")

    # Write benchmark.md
    markdown = generate_markdown(benchmark)
    with open(output_md, "w") as f:
        f.write(markdown)
    print(f"Generated: {output_md}")

    # Print summary
    run_summary = benchmark["run_summary"]
    configs = [k for k in run_summary if k != "delta"]
    delta = run_summary.get("delta", {})

    print("\nSummary:")
    for config in configs:
        pr = run_summary[config]["pass_rate"]["mean"]
        label = config.replace("_", " ").title()
        print(f"  {label}: {pr*100:.1f}% pass rate")
    print(f"  Delta: {delta.get('pass_rate', '')}")


if __name__ == "__main__":
    main()


@@ -0,0 +1,326 @@
#!/usr/bin/env python3
"""Generate an HTML report from run_loop.py output.
Takes the JSON output from run_loop.py and generates a visual HTML report
showing each description attempt with check/x for each test case.
Distinguishes between train and test queries.
"""
import argparse
import html
import json
import sys
from pathlib import Path
def generate_html(data: dict, auto_refresh: bool = False, skill_name: str = "") -> str:
    """Generate HTML report from loop output data. If auto_refresh is True, adds a meta refresh tag."""
    history = data.get("history", [])
    holdout = data.get("holdout", 0)
    title_prefix = html.escape(skill_name + " \u2014 ") if skill_name else ""

    # Get all unique queries from train and test sets, with should_trigger info
    train_queries: list[dict] = []
    test_queries: list[dict] = []
    if history:
        for r in history[0].get("train_results", history[0].get("results", [])):
            train_queries.append({"query": r["query"], "should_trigger": r.get("should_trigger", True)})
        if history[0].get("test_results"):
            for r in history[0].get("test_results", []):
                test_queries.append({"query": r["query"], "should_trigger": r.get("should_trigger", True)})

    refresh_tag = '    <meta http-equiv="refresh" content="5">\n' if auto_refresh else ""

    html_parts = ["""<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
""" + refresh_tag + """ <title>""" + title_prefix + """Skill Description Optimization</title>
<link rel="preconnect" href="https://fonts.googleapis.com">
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
<link href="https://fonts.googleapis.com/css2?family=Poppins:wght@500;600&family=Lora:wght@400;500&display=swap" rel="stylesheet">
<style>
body {
font-family: 'Lora', Georgia, serif;
max-width: 100%;
margin: 0 auto;
padding: 20px;
background: #faf9f5;
color: #141413;
}
h1 { font-family: 'Poppins', sans-serif; color: #141413; }
.explainer {
background: white;
padding: 15px;
border-radius: 6px;
margin-bottom: 20px;
border: 1px solid #e8e6dc;
color: #b0aea5;
font-size: 0.875rem;
line-height: 1.6;
}
.summary {
background: white;
padding: 15px;
border-radius: 6px;
margin-bottom: 20px;
border: 1px solid #e8e6dc;
}
.summary p { margin: 5px 0; }
.best { color: #788c5d; font-weight: bold; }
.table-container {
overflow-x: auto;
width: 100%;
}
table {
border-collapse: collapse;
background: white;
border: 1px solid #e8e6dc;
border-radius: 6px;
font-size: 12px;
min-width: 100%;
}
th, td {
padding: 8px;
text-align: left;
border: 1px solid #e8e6dc;
white-space: normal;
word-wrap: break-word;
}
th {
font-family: 'Poppins', sans-serif;
background: #141413;
color: #faf9f5;
font-weight: 500;
}
th.test-col {
background: #6a9bcc;
}
th.query-col { min-width: 200px; }
td.description {
font-family: monospace;
font-size: 11px;
word-wrap: break-word;
max-width: 400px;
}
td.result {
text-align: center;
font-size: 16px;
min-width: 40px;
}
td.test-result {
background: #f0f6fc;
}
.pass { color: #788c5d; }
.fail { color: #c44; }
.rate {
font-size: 9px;
color: #b0aea5;
display: block;
}
tr:hover { background: #faf9f5; }
.score {
display: inline-block;
padding: 2px 6px;
border-radius: 4px;
font-weight: bold;
font-size: 11px;
}
.score-good { background: #eef2e8; color: #788c5d; }
.score-ok { background: #fef3c7; color: #d97706; }
.score-bad { background: #fceaea; color: #c44; }
.train-label { color: #b0aea5; font-size: 10px; }
.test-label { color: #6a9bcc; font-size: 10px; font-weight: bold; }
.best-row { background: #f5f8f2; }
th.positive-col { border-bottom: 3px solid #788c5d; }
th.negative-col { border-bottom: 3px solid #c44; }
th.test-col.positive-col { border-bottom: 3px solid #788c5d; }
th.test-col.negative-col { border-bottom: 3px solid #c44; }
.legend { font-family: 'Poppins', sans-serif; display: flex; gap: 20px; margin-bottom: 10px; font-size: 13px; align-items: center; }
.legend-item { display: flex; align-items: center; gap: 6px; }
.legend-swatch { width: 16px; height: 16px; border-radius: 3px; display: inline-block; }
.swatch-positive { background: #141413; border-bottom: 3px solid #788c5d; }
.swatch-negative { background: #141413; border-bottom: 3px solid #c44; }
.swatch-test { background: #6a9bcc; }
.swatch-train { background: #141413; }
</style>
</head>
<body>
<h1>""" + title_prefix + """Skill Description Optimization</h1>
<div class="explainer">
<strong>Optimizing your skill's description.</strong> This page updates automatically as Claude tests different versions of your skill's description. Each row is an iteration — a new description attempt. The columns show test queries: green checkmarks mean the skill triggered correctly (or correctly didn't trigger), red crosses mean it got it wrong. The "Train" score shows performance on queries used to improve the description; the "Test" score shows performance on held-out queries the optimizer hasn't seen. When it's done, Claude will apply the best-performing description to your skill.
</div>
"""]
    # Summary section
    best_test_score = data.get('best_test_score')
    best_train_score = data.get('best_train_score')
    html_parts.append(f"""
    <div class="summary">
        <p><strong>Original:</strong> {html.escape(data.get('original_description', 'N/A'))}</p>
        <p class="best"><strong>Best:</strong> {html.escape(data.get('best_description', 'N/A'))}</p>
        <p><strong>Best Score:</strong> {data.get('best_score', 'N/A')} {'(test)' if best_test_score is not None else '(train)'}</p>
        <p><strong>Iterations:</strong> {data.get('iterations_run', 0)} | <strong>Train:</strong> {data.get('train_size', '?')} | <strong>Test:</strong> {data.get('test_size', '?')}</p>
    </div>
""")

    # Legend
    html_parts.append("""
    <div class="legend">
        <span style="font-weight:600">Query columns:</span>
        <span class="legend-item"><span class="legend-swatch swatch-positive"></span> Should trigger</span>
        <span class="legend-item"><span class="legend-swatch swatch-negative"></span> Should NOT trigger</span>
        <span class="legend-item"><span class="legend-swatch swatch-train"></span> Train</span>
        <span class="legend-item"><span class="legend-swatch swatch-test"></span> Test</span>
    </div>
""")

    # Table header
    html_parts.append("""
    <div class="table-container">
    <table>
        <thead>
            <tr>
                <th>Iter</th>
                <th>Train</th>
                <th>Test</th>
                <th class="query-col">Description</th>
""")

    # Add column headers for train queries
    for qinfo in train_queries:
        polarity = "positive-col" if qinfo["should_trigger"] else "negative-col"
        html_parts.append(f'                <th class="{polarity}">{html.escape(qinfo["query"])}</th>\n')

    # Add column headers for test queries (different color)
    for qinfo in test_queries:
        polarity = "positive-col" if qinfo["should_trigger"] else "negative-col"
        html_parts.append(f'                <th class="test-col {polarity}">{html.escape(qinfo["query"])}</th>\n')

    html_parts.append("""            </tr>
        </thead>
        <tbody>
""")

    # Find best iteration for highlighting (max() raises on an empty history, so guard it)
    if not history:
        best_iter = None
    elif test_queries:
        best_iter = max(history, key=lambda h: h.get("test_passed") or 0).get("iteration")
    else:
        best_iter = max(history, key=lambda h: h.get("train_passed", h.get("passed", 0))).get("iteration")

    # Add rows for each iteration
    for h in history:
        iteration = h.get("iteration", "?")
        train_passed = h.get("train_passed", h.get("passed", 0))
        train_total = h.get("train_total", h.get("total", 0))
        test_passed = h.get("test_passed")
        test_total = h.get("test_total")
        description = h.get("description", "")
        train_results = h.get("train_results", h.get("results", []))
        test_results = h.get("test_results", [])

        # Create lookups for results by query
        train_by_query = {r["query"]: r for r in train_results}
        test_by_query = {r["query"]: r for r in test_results} if test_results else {}

        # Compute aggregate correct/total runs across all retries
        def aggregate_runs(results: list[dict]) -> tuple[int, int]:
            correct = 0
            total = 0
            for r in results:
                runs = r.get("runs", 0)
                triggers = r.get("triggers", 0)
                total += runs
                if r.get("should_trigger", True):
                    correct += triggers
                else:
                    correct += runs - triggers
            return correct, total

        train_correct, train_runs = aggregate_runs(train_results)
        test_correct, test_runs = aggregate_runs(test_results)

        # Determine score classes
        def score_class(correct: int, total: int) -> str:
            if total > 0:
                ratio = correct / total
                if ratio >= 0.8:
                    return "score-good"
                elif ratio >= 0.5:
                    return "score-ok"
            return "score-bad"

        train_class = score_class(train_correct, train_runs)
        test_class = score_class(test_correct, test_runs)

        row_class = "best-row" if iteration == best_iter else ""

        html_parts.append(f"""            <tr class="{row_class}">
                <td>{iteration}</td>
                <td><span class="score {train_class}">{train_correct}/{train_runs}</span></td>
                <td><span class="score {test_class}">{test_correct}/{test_runs}</span></td>
                <td class="description">{html.escape(description)}</td>
""")

        # Add result for each train query
        for qinfo in train_queries:
            r = train_by_query.get(qinfo["query"], {})
            did_pass = r.get("pass", False)
            triggers = r.get("triggers", 0)
            runs = r.get("runs", 0)

            # Check mark for a pass, cross for a fail
            icon = "\u2713" if did_pass else "\u2717"
            css_class = "pass" if did_pass else "fail"

            html_parts.append(f'                <td class="result {css_class}">{icon}<span class="rate">{triggers}/{runs}</span></td>\n')

        # Add result for each test query (with different background)
        for qinfo in test_queries:
            r = test_by_query.get(qinfo["query"], {})
            did_pass = r.get("pass", False)
            triggers = r.get("triggers", 0)
            runs = r.get("runs", 0)

            # Check mark for a pass, cross for a fail
            icon = "\u2713" if did_pass else "\u2717"
            css_class = "pass" if did_pass else "fail"

            html_parts.append(f'                <td class="result test-result {css_class}">{icon}<span class="rate">{triggers}/{runs}</span></td>\n')

        html_parts.append("            </tr>\n")

    html_parts.append("""        </tbody>
    </table>
    </div>
""")

    html_parts.append("""
</body>
</html>
""")

    return "".join(html_parts)
def main():
    parser = argparse.ArgumentParser(description="Generate HTML report from run_loop output")
    parser.add_argument("input", help="Path to JSON output from run_loop.py (or - for stdin)")
    parser.add_argument("-o", "--output", default=None, help="Output HTML file (default: stdout)")
    parser.add_argument("--skill-name", default="", help="Skill name to include in the report title")
    args = parser.parse_args()

    if args.input == "-":
        data = json.load(sys.stdin)
    else:
        data = json.loads(Path(args.input).read_text())

    html_output = generate_html(data, skill_name=args.skill_name)

    if args.output:
        Path(args.output).write_text(html_output)
        print(f"Report written to {args.output}", file=sys.stderr)
    else:
        print(html_output)


if __name__ == "__main__":
    main()


@@ -0,0 +1,247 @@
#!/usr/bin/env python3
"""Improve a skill description based on eval results.
Takes eval results (from run_eval.py) and generates an improved description
by calling `claude -p` as a subprocess (same auth pattern as run_eval.py —
uses the session's Claude Code auth, no separate ANTHROPIC_API_KEY needed).
"""
import argparse
import json
import os
import re
import subprocess
import sys
from pathlib import Path
from scripts.utils import parse_skill_md
def _call_claude(prompt: str, model: str | None, timeout: int = 300) -> str:
    """Run `claude -p` with the prompt on stdin and return the text response.

    Prompt goes over stdin (not argv) because it embeds the full SKILL.md
    body and can easily exceed comfortable argv length.
    """
    cmd = ["claude", "-p", "--output-format", "text"]
    if model:
        cmd.extend(["--model", model])

    # Remove CLAUDECODE env var to allow nesting claude -p inside a
    # Claude Code session. The guard is for interactive terminal conflicts;
    # programmatic subprocess usage is safe. Same pattern as run_eval.py.
    env = {k: v for k, v in os.environ.items() if k != "CLAUDECODE"}

    result = subprocess.run(
        cmd,
        input=prompt,
        capture_output=True,
        text=True,
        env=env,
        timeout=timeout,
    )
    if result.returncode != 0:
        raise RuntimeError(
            f"claude -p exited {result.returncode}\nstderr: {result.stderr}"
        )
    return result.stdout
def improve_description(
    skill_name: str,
    skill_content: str,
    current_description: str,
    eval_results: dict,
    history: list[dict],
    model: str,
    test_results: dict | None = None,
    log_dir: Path | None = None,
    iteration: int | None = None,
) -> str:
    """Call Claude to improve the description based on eval results."""
    # Queries that should have triggered the skill but did not
    failed_triggers = [
        r for r in eval_results["results"]
        if r["should_trigger"] and not r["pass"]
    ]
    # Queries that triggered the skill but should not have
    false_triggers = [
        r for r in eval_results["results"]
        if not r["should_trigger"] and not r["pass"]
    ]

    # Build scores summary
    train_score = f"{eval_results['summary']['passed']}/{eval_results['summary']['total']}"
    if test_results:
        test_score = f"{test_results['summary']['passed']}/{test_results['summary']['total']}"
        scores_summary = f"Train: {train_score}, Test: {test_score}"
    else:
        scores_summary = f"Train: {train_score}"

    prompt = f"""You are optimizing a skill description for a Claude Code skill called "{skill_name}". A "skill" is sort of like a prompt, but with progressive disclosure -- there's a title and description that Claude sees when deciding whether to use the skill, and then if it does use the skill, it reads the .md file which has lots more details and potentially links to other resources in the skill folder like helper files and scripts and additional documentation or examples.
The description appears in Claude's "available_skills" list. When a user sends a query, Claude decides whether to invoke the skill based solely on the title and on this description. Your goal is to write a description that triggers for relevant queries, and doesn't trigger for irrelevant ones.
Here's the current description:
<current_description>
"{current_description}"
</current_description>
Current scores ({scores_summary}):
<scores_summary>
"""
    if failed_triggers:
        prompt += "FAILED TO TRIGGER (should have triggered but didn't):\n"
        for r in failed_triggers:
            prompt += f'  - "{r["query"]}" (triggered {r["triggers"]}/{r["runs"]} times)\n'
        prompt += "\n"

    if false_triggers:
        prompt += "FALSE TRIGGERS (triggered but shouldn't have):\n"
        for r in false_triggers:
            prompt += f'  - "{r["query"]}" (triggered {r["triggers"]}/{r["runs"]} times)\n'
        prompt += "\n"

    if history:
        prompt += "PREVIOUS ATTEMPTS (do NOT repeat these — try something structurally different):\n\n"
        for h in history:
            train_s = f"{h.get('train_passed', h.get('passed', 0))}/{h.get('train_total', h.get('total', 0))}"
            test_s = f"{h.get('test_passed', '?')}/{h.get('test_total', '?')}" if h.get('test_passed') is not None else None
            score_str = f"train={train_s}" + (f", test={test_s}" if test_s else "")
            prompt += f'<attempt {score_str}>\n'
            prompt += f'Description: "{h["description"]}"\n'
            if "results" in h:
                prompt += "Train results:\n"
                for r in h["results"]:
                    status = "PASS" if r["pass"] else "FAIL"
                    # Queries are truncated to 80 characters to keep the prompt compact
                    prompt += f'  [{status}] "{r["query"][:80]}" (triggered {r["triggers"]}/{r["runs"]})\n'
            if h.get("note"):
                prompt += f'Note: {h["note"]}\n'
            prompt += "</attempt>\n\n"

    prompt += f"""</scores_summary>
Skill content (for context on what the skill does):
<skill_content>
{skill_content}
</skill_content>
Based on the failures, write a new and improved description that is more likely to trigger correctly. When I say "based on the failures", it's a bit of a tricky line to walk because we don't want to overfit to the specific cases you're seeing. So what I DON'T want you to do is produce an ever-expanding list of specific queries that this skill should or shouldn't trigger for. Instead, try to generalize from the failures to broader categories of user intent and situations where this skill would be useful or not useful. The reason for this is twofold:
1. Avoid overfitting
2. The list might get loooong and it's injected into ALL queries and there might be a lot of skills, so we don't want to blow too much space on any given description.
Concretely, your description should not be more than about 100-200 words, even if that comes at the cost of accuracy. There is a hard limit of 1024 characters — descriptions over that will be truncated, so stay comfortably under it.
Here are some tips that we've found to work well in writing these descriptions:
- The skill should be phrased in the imperative -- "Use this skill for" rather than "this skill does"
- The skill description should focus on the user's intent, what they are trying to achieve, vs. the implementation details of how the skill works.
- The description competes with other skills for Claude's attention — make it distinctive and immediately recognizable.
- If you're getting lots of failures after repeated attempts, change things up. Try different sentence structures or wordings.
I'd encourage you to be creative and mix up the style in different iterations since you'll have multiple opportunities to try different approaches and we'll just grab the highest-scoring one at the end.
Please respond with only the new description text in <new_description> tags, nothing else."""
    text = _call_claude(prompt, model)

    match = re.search(r"<new_description>(.*?)</new_description>", text, re.DOTALL)
    description = match.group(1).strip().strip('"') if match else text.strip().strip('"')

    transcript: dict = {
        "iteration": iteration,
        "prompt": prompt,
        "response": text,
        "parsed_description": description,
        "char_count": len(description),
        "over_limit": len(description) > 1024,
    }

    # Safety net: the prompt already states the 1024-char hard limit, but if
    # the model blew past it anyway, make one fresh single-turn call that
    # quotes the too-long version and asks for a shorter rewrite. (The old
    # SDK path did this as a true multi-turn; `claude -p` is one-shot, so we
    # inline the prior output into the new prompt instead.)
    if len(description) > 1024:
        shorten_prompt = (
            f"{prompt}\n\n"
            f"---\n\n"
            f"A previous attempt produced this description, which at "
            f"{len(description)} characters is over the 1024-character hard limit:\n\n"
            f'"{description}"\n\n'
            f"Rewrite it to be under 1024 characters while keeping the most "
            f"important trigger words and intent coverage. Respond with only "
            f"the new description in <new_description> tags."
        )
        shorten_text = _call_claude(shorten_prompt, model)
        match = re.search(r"<new_description>(.*?)</new_description>", shorten_text, re.DOTALL)
        shortened = match.group(1).strip().strip('"') if match else shorten_text.strip().strip('"')

        transcript["rewrite_prompt"] = shorten_prompt
        transcript["rewrite_response"] = shorten_text
        transcript["rewrite_description"] = shortened
        transcript["rewrite_char_count"] = len(shortened)
        description = shortened

    transcript["final_description"] = description

    if log_dir:
        log_dir.mkdir(parents=True, exist_ok=True)
        log_file = log_dir / f"improve_iter_{iteration or 'unknown'}.json"
        log_file.write_text(json.dumps(transcript, indent=2))

    return description
def main():
    parser = argparse.ArgumentParser(description="Improve a skill description based on eval results")
    parser.add_argument("--eval-results", required=True, help="Path to eval results JSON (from run_eval.py)")
    parser.add_argument("--skill-path", required=True, help="Path to skill directory")
    parser.add_argument("--history", default=None, help="Path to history JSON (previous attempts)")
    parser.add_argument("--model", required=True, help="Model for improvement")
    parser.add_argument("--verbose", action="store_true", help="Print the current and improved descriptions to stderr")
    args = parser.parse_args()

    skill_path = Path(args.skill_path)
    if not (skill_path / "SKILL.md").exists():
        print(f"Error: No SKILL.md found at {skill_path}", file=sys.stderr)
        sys.exit(1)

    eval_results = json.loads(Path(args.eval_results).read_text())
    history = []
    if args.history:
        history = json.loads(Path(args.history).read_text())

    name, _, content = parse_skill_md(skill_path)
    current_description = eval_results["description"]

    if args.verbose:
        print(f"Current: {current_description}", file=sys.stderr)
        print(f"Score: {eval_results['summary']['passed']}/{eval_results['summary']['total']}", file=sys.stderr)

    new_description = improve_description(
        skill_name=name,
        skill_content=content,
        current_description=current_description,
        eval_results=eval_results,
        history=history,
        model=args.model,
    )

    if args.verbose:
        print(f"Improved: {new_description}", file=sys.stderr)

    # Output as JSON with both the new description and updated history
    output = {
        "description": new_description,
        "history": history + [{
            "description": current_description,
            "passed": eval_results["summary"]["passed"],
            "failed": eval_results["summary"]["failed"],
            "total": eval_results["summary"]["total"],
            "results": eval_results["results"],
        }],
    }
    print(json.dumps(output, indent=2))


if __name__ == "__main__":
    main()
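
Review note: the response handling in the file above comes down to two steps, pulling the text out of the `<new_description>` tags (falling back to the whole reply) and checking it against the 1024-character limit. A minimal sketch of just that logic follows, assuming nothing about `_call_claude`; the helper names `extract_description` and `needs_shortening` and the sample replies are hypothetical and do not appear in the commit.

```python
import re

# Hard limit quoted in the prompt text of the script above.
MAX_DESCRIPTION_CHARS = 1024


def extract_description(response_text: str) -> str:
    """Return the text inside <new_description> tags, or the whole reply as a fallback."""
    match = re.search(r"<new_description>(.*?)</new_description>", response_text, re.DOTALL)
    raw = match.group(1) if match else response_text
    # Trim whitespace first, then any wrapping double quotes.
    return raw.strip().strip('"')


def needs_shortening(description: str) -> bool:
    """True when the description would be truncated by the hard limit."""
    return len(description) > MAX_DESCRIPTION_CHARS


# Hypothetical model replies, made up for illustration only.
tagged = 'Sure.\n<new_description>\n"Use this skill for Go backend work."\n</new_description>'
untagged = '"Use this skill for Go backend work."'

print(extract_description(tagged))
print(extract_description(untagged))
print(needs_shortening("x" * 1025))
```

Both sample replies reduce to the same string, which is why the script can treat a missing tag as a soft failure rather than an error.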


@@ -0,0 +1,136 @@
#!/usr/bin/env python3
"""
Skill Packager - Creates a distributable .skill file of a skill folder

Usage:
    python utils/package_skill.py <path/to/skill-folder> [output-directory]

Example:
    python utils/package_skill.py skills/public/my-skill
    python utils/package_skill.py skills/public/my-skill ./dist
"""

import fnmatch
import sys
import zipfile
from pathlib import Path

from scripts.quick_validate import validate_skill

# Patterns to exclude when packaging skills.
EXCLUDE_DIRS = {"__pycache__", "node_modules"}
EXCLUDE_GLOBS = {"*.pyc"}
EXCLUDE_FILES = {".DS_Store"}
# Directories excluded only at the skill root (not when nested deeper).
ROOT_EXCLUDE_DIRS = {"evals"}


def should_exclude(rel_path: Path) -> bool:
    """Check if a path should be excluded from packaging."""
    parts = rel_path.parts
    if any(part in EXCLUDE_DIRS for part in parts):
        return True
    # rel_path is relative to skill_path.parent, so parts[0] is the skill
    # folder name and parts[1] (if present) is the first subdir.
    if len(parts) > 1 and parts[1] in ROOT_EXCLUDE_DIRS:
        return True
    name = rel_path.name
    if name in EXCLUDE_FILES:
        return True
    return any(fnmatch.fnmatch(name, pat) for pat in EXCLUDE_GLOBS)
def package_skill(skill_path, output_dir=None):
    """
    Package a skill folder into a .skill file.

    Args:
        skill_path: Path to the skill folder
        output_dir: Optional output directory for the .skill file (defaults to current directory)

    Returns:
        Path to the created .skill file, or None if error
    """
    skill_path = Path(skill_path).resolve()

    # Validate skill folder exists
    if not skill_path.exists():
        print(f"❌ Error: Skill folder not found: {skill_path}")
        return None

    if not skill_path.is_dir():
        print(f"❌ Error: Path is not a directory: {skill_path}")
        return None

    # Validate SKILL.md exists
    skill_md = skill_path / "SKILL.md"
    if not skill_md.exists():
        print(f"❌ Error: SKILL.md not found in {skill_path}")
        return None

    # Run validation before packaging
    print("🔍 Validating skill...")
    valid, message = validate_skill(skill_path)
    if not valid:
        print(f"❌ Validation failed: {message}")
        print(" Please fix the validation errors before packaging.")
        return None
    print(f"{message}\n")

    # Determine output location
    skill_name = skill_path.name
    if output_dir:
        output_path = Path(output_dir).resolve()
        output_path.mkdir(parents=True, exist_ok=True)
    else:
        output_path = Path.cwd()

    skill_filename = output_path / f"{skill_name}.skill"

    # Create the .skill file (zip format)
    try:
        with zipfile.ZipFile(skill_filename, 'w', zipfile.ZIP_DEFLATED) as zipf:
            # Walk through the skill directory, excluding build artifacts
            for file_path in skill_path.rglob('*'):
                if not file_path.is_file():
                    continue
                arcname = file_path.relative_to(skill_path.parent)
                if should_exclude(arcname):
                    print(f" Skipped: {arcname}")
                    continue
                zipf.write(file_path, arcname)
                print(f" Added: {arcname}")

        print(f"\n✅ Successfully packaged skill to: {skill_filename}")
        return skill_filename

    except Exception as e:
        print(f"❌ Error creating .skill file: {e}")
        return None
def main():
    if len(sys.argv) < 2:
        print("Usage: python utils/package_skill.py <path/to/skill-folder> [output-directory]")
        print("\nExample:")
        print(" python utils/package_skill.py skills/public/my-skill")
        print(" python utils/package_skill.py skills/public/my-skill ./dist")
        sys.exit(1)

    skill_path = sys.argv[1]
    output_dir = sys.argv[2] if len(sys.argv) > 2 else None

    print(f"📦 Packaging skill: {skill_path}")
    if output_dir:
        print(f" Output directory: {output_dir}")
    print()

    result = package_skill(skill_path, output_dir)

    if result:
        sys.exit(0)
    else:
        sys.exit(1)


if __name__ == "__main__":
    main()
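
Review note: the exclusion rules in the packager are easiest to check against a handful of archive paths. The sketch below restates `should_exclude` with the same constants so that it runs on its own; the candidate paths are invented for illustration, and `PurePosixPath` is used only so the example behaves the same on any platform.

```python
import fnmatch
from pathlib import PurePosixPath

# Same constants as the packager above.
EXCLUDE_DIRS = {"__pycache__", "node_modules"}
EXCLUDE_GLOBS = {"*.pyc"}
EXCLUDE_FILES = {".DS_Store"}
ROOT_EXCLUDE_DIRS = {"evals"}


def should_exclude(rel_path: PurePosixPath) -> bool:
    """Return True when an archive path should be left out of the package."""
    parts = rel_path.parts
    if any(part in EXCLUDE_DIRS for part in parts):
        return True
    # parts[0] is the skill folder, parts[1] the first directory below it.
    if len(parts) > 1 and parts[1] in ROOT_EXCLUDE_DIRS:
        return True
    if rel_path.name in EXCLUDE_FILES:
        return True
    return any(fnmatch.fnmatch(rel_path.name, pat) for pat in EXCLUDE_GLOBS)


# Hypothetical archive paths, relative to the parent of the skill folder.
candidates = [
    "my-skill/SKILL.md",
    "my-skill/evals/evals.json",
    "my-skill/references/evals/notes.md",
    "my-skill/scripts/__pycache__/run.cpython-312.pyc",
    "my-skill/.DS_Store",
]
kept = [p for p in candidates if not should_exclude(PurePosixPath(p))]
print(kept)
```

Only the root-level `evals` directory is dropped; a directory of the same name nested under `references/` is kept, which matches the comment on `ROOT_EXCLUDE_DIRS`.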


@@ -0,0 +1,103 @@
#!/usr/bin/env python3
"""
Quick validation script for skills - minimal version
"""

import sys
import os
import re
import yaml
from pathlib import Path


def validate_skill(skill_path):
    """Basic validation of a skill"""
    skill_path = Path(skill_path)

    # Check SKILL.md exists
    skill_md = skill_path / 'SKILL.md'
    if not skill_md.exists():
        return False, "SKILL.md not found"

    # Read and validate frontmatter
    content = skill_md.read_text()
    if not content.startswith('---'):
        return False, "No YAML frontmatter found"

    # Extract frontmatter
    match = re.match(r'^---\n(.*?)\n---', content, re.DOTALL)
    if not match:
        return False, "Invalid frontmatter format"

    frontmatter_text = match.group(1)

    # Parse YAML frontmatter
    try:
        frontmatter = yaml.safe_load(frontmatter_text)
        if not isinstance(frontmatter, dict):
            return False, "Frontmatter must be a YAML dictionary"
    except yaml.YAMLError as e:
        return False, f"Invalid YAML in frontmatter: {e}"

    # Define allowed properties
    ALLOWED_PROPERTIES = {'name', 'description', 'license', 'allowed-tools', 'metadata', 'compatibility'}

    # Check for unexpected properties (excluding nested keys under metadata)
    unexpected_keys = set(frontmatter.keys()) - ALLOWED_PROPERTIES
    if unexpected_keys:
        return False, (
            f"Unexpected key(s) in SKILL.md frontmatter: {', '.join(sorted(unexpected_keys))}. "
            f"Allowed properties are: {', '.join(sorted(ALLOWED_PROPERTIES))}"
        )

    # Check required fields
    if 'name' not in frontmatter:
        return False, "Missing 'name' in frontmatter"
    if 'description' not in frontmatter:
        return False, "Missing 'description' in frontmatter"

    # Extract name for validation
    name = frontmatter.get('name', '')
    if not isinstance(name, str):
        return False, f"Name must be a string, got {type(name).__name__}"
    name = name.strip()
    if name:
        # Check naming convention (kebab-case: lowercase with hyphens)
        if not re.match(r'^[a-z0-9-]+$', name):
            return False, f"Name '{name}' should be kebab-case (lowercase letters, digits, and hyphens only)"
        if name.startswith('-') or name.endswith('-') or '--' in name:
            return False, f"Name '{name}' cannot start/end with hyphen or contain consecutive hyphens"
        # Check name length (max 64 characters per spec)
        if len(name) > 64:
            return False, f"Name is too long ({len(name)} characters). Maximum is 64 characters."

    # Extract and validate description
    description = frontmatter.get('description', '')
    if not isinstance(description, str):
        return False, f"Description must be a string, got {type(description).__name__}"
    description = description.strip()
    if description:
        # Check for angle brackets
        if '<' in description or '>' in description:
            return False, "Description cannot contain angle brackets (< or >)"
        # Check description length (max 1024 characters per spec)
        if len(description) > 1024:
            return False, f"Description is too long ({len(description)} characters). Maximum is 1024 characters."

    # Validate compatibility field if present (optional)
    compatibility = frontmatter.get('compatibility', '')
    if compatibility:
        if not isinstance(compatibility, str):
            return False, f"Compatibility must be a string, got {type(compatibility).__name__}"
        if len(compatibility) > 500:
            return False, f"Compatibility is too long ({len(compatibility)} characters). Maximum is 500 characters."

    return True, "Skill is valid!"


if __name__ == "__main__":
    if len(sys.argv) != 2:
        print("Usage: python quick_validate.py <skill_directory>")
        sys.exit(1)

    valid, message = validate_skill(sys.argv[1])
    print(message)
    sys.exit(0 if valid else 1)
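
Review note: the name rules in the validator (kebab-case, no leading, trailing or doubled hyphens, at most 64 characters) can be exercised without a SKILL.md or PyYAML. The sketch below isolates those checks; the helper `check_name` is hypothetical and its messages are abbreviated, so treat it as an illustration of the rules rather than the validator itself.

```python
import re
from typing import Optional

MAX_NAME_CHARS = 64  # limit used by the validator above


def check_name(name: str) -> Optional[str]:
    """Return an error message, or None when the name passes every rule."""
    if not re.match(r"^[a-z0-9-]+$", name):
        return "should be kebab-case"
    if name.startswith("-") or name.endswith("-") or "--" in name:
        return "cannot start/end with hyphen or contain consecutive hyphens"
    if len(name) > MAX_NAME_CHARS:
        return "too long"
    return None


# The first name comes from the skill shown at the top of this diff;
# the others are invented counter-examples.
for candidate in ["backend-go-gin-gorm", "Backend_Go", "-bad", "a--b"]:
    print(candidate, "->", check_name(candidate) or "ok")
```

Note that the validator only applies these rules when the stripped name is non-empty, so an empty `name:` value passes there; the sketch rejects it because the regular expression requires at least one character.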


@@ -0,0 +1,310 @@
#!/usr/bin/env python3
"""Run trigger evaluation for a skill description.

Tests whether a skill's description causes Claude to trigger (read the skill)
for a set of queries. Outputs results as JSON.
"""

import argparse
import json
import os
import select
import subprocess
import sys
import time
import uuid
from concurrent.futures import ProcessPoolExecutor, as_completed
from pathlib import Path

from scripts.utils import parse_skill_md


def find_project_root() -> Path:
    """Find the project root by walking up from cwd looking for .claude/.

    Mimics how Claude Code discovers its project root, so the command file
    we create ends up where claude -p will look for it.
    """
    current = Path.cwd()
    for parent in [current, *current.parents]:
        if (parent / ".claude").is_dir():
            return parent
    return current
def run_single_query(
    query: str,
    skill_name: str,
    skill_description: str,
    timeout: int,
    project_root: str,
    model: str | None = None,
) -> bool:
    """Run a single query and return whether the skill was triggered.

    Creates a command file in .claude/commands/ so it appears in Claude's
    available_skills list, then runs `claude -p` with the raw query.
    Uses --include-partial-messages to detect triggering early from
    stream events (content_block_start) rather than waiting for the
    full assistant message, which only arrives after tool execution.
    """
    unique_id = uuid.uuid4().hex[:8]
    clean_name = f"{skill_name}-skill-{unique_id}"
    project_commands_dir = Path(project_root) / ".claude" / "commands"
    command_file = project_commands_dir / f"{clean_name}.md"

    try:
        project_commands_dir.mkdir(parents=True, exist_ok=True)
        # Use YAML block scalar to avoid breaking on quotes in description
        indented_desc = "\n ".join(skill_description.split("\n"))
        command_content = (
            f"---\n"
            f"description: |\n"
            f" {indented_desc}\n"
            f"---\n\n"
            f"# {skill_name}\n\n"
            f"This skill handles: {skill_description}\n"
        )
        command_file.write_text(command_content)

        cmd = [
            "claude",
            "-p", query,
            "--output-format", "stream-json",
            "--verbose",
            "--include-partial-messages",
        ]
        if model:
            cmd.extend(["--model", model])

        # Remove CLAUDECODE env var to allow nesting claude -p inside a
        # Claude Code session. The guard is for interactive terminal conflicts;
        # programmatic subprocess usage is safe.
        env = {k: v for k, v in os.environ.items() if k != "CLAUDECODE"}

        process = subprocess.Popen(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,
            cwd=project_root,
            env=env,
        )

        triggered = False
        start_time = time.time()
        buffer = ""
        # Track state for stream event detection
        pending_tool_name = None
        accumulated_json = ""

        try:
            while time.time() - start_time < timeout:
                if process.poll() is not None:
                    remaining = process.stdout.read()
                    if remaining:
                        buffer += remaining.decode("utf-8", errors="replace")
                    break

                ready, _, _ = select.select([process.stdout], [], [], 1.0)
                if not ready:
                    continue

                chunk = os.read(process.stdout.fileno(), 8192)
                if not chunk:
                    break
                buffer += chunk.decode("utf-8", errors="replace")

                while "\n" in buffer:
                    line, buffer = buffer.split("\n", 1)
                    line = line.strip()
                    if not line:
                        continue

                    try:
                        event = json.loads(line)
                    except json.JSONDecodeError:
                        continue

                    # Early detection via stream events
                    if event.get("type") == "stream_event":
                        se = event.get("event", {})
                        se_type = se.get("type", "")

                        if se_type == "content_block_start":
                            cb = se.get("content_block", {})
                            if cb.get("type") == "tool_use":
                                tool_name = cb.get("name", "")
                                if tool_name in ("Skill", "Read"):
                                    pending_tool_name = tool_name
                                    accumulated_json = ""
                                else:
                                    return False

                        elif se_type == "content_block_delta" and pending_tool_name:
                            delta = se.get("delta", {})
                            if delta.get("type") == "input_json_delta":
                                accumulated_json += delta.get("partial_json", "")
                                if clean_name in accumulated_json:
                                    return True

                        elif se_type in ("content_block_stop", "message_stop"):
                            if pending_tool_name:
                                return clean_name in accumulated_json
                            if se_type == "message_stop":
                                return False

                    # Fallback: full assistant message
                    elif event.get("type") == "assistant":
                        message = event.get("message", {})
                        for content_item in message.get("content", []):
                            if content_item.get("type") != "tool_use":
                                continue
                            tool_name = content_item.get("name", "")
                            tool_input = content_item.get("input", {})
                            if tool_name == "Skill" and clean_name in tool_input.get("skill", ""):
                                triggered = True
                            elif tool_name == "Read" and clean_name in tool_input.get("file_path", ""):
                                triggered = True
                            return triggered

                    elif event.get("type") == "result":
                        return triggered
        finally:
            # Clean up process on any exit path (return, exception, timeout)
            if process.poll() is None:
                process.kill()
                process.wait()

        return triggered
    finally:
        if command_file.exists():
            command_file.unlink()
def run_eval(
    eval_set: list[dict],
    skill_name: str,
    description: str,
    num_workers: int,
    timeout: int,
    project_root: Path,
    runs_per_query: int = 1,
    trigger_threshold: float = 0.5,
    model: str | None = None,
) -> dict:
    """Run the full eval set and return results."""
    results = []

    with ProcessPoolExecutor(max_workers=num_workers) as executor:
        future_to_info = {}
        for item in eval_set:
            for run_idx in range(runs_per_query):
                future = executor.submit(
                    run_single_query,
                    item["query"],
                    skill_name,
                    description,
                    timeout,
                    str(project_root),
                    model,
                )
                future_to_info[future] = (item, run_idx)

        query_triggers: dict[str, list[bool]] = {}
        query_items: dict[str, dict] = {}
        for future in as_completed(future_to_info):
            item, _ = future_to_info[future]
            query = item["query"]
            query_items[query] = item
            if query not in query_triggers:
                query_triggers[query] = []
            try:
                query_triggers[query].append(future.result())
            except Exception as e:
                print(f"Warning: query failed: {e}", file=sys.stderr)
                query_triggers[query].append(False)

    for query, triggers in query_triggers.items():
        item = query_items[query]
        trigger_rate = sum(triggers) / len(triggers)
        should_trigger = item["should_trigger"]
        if should_trigger:
            did_pass = trigger_rate >= trigger_threshold
        else:
            did_pass = trigger_rate < trigger_threshold
        results.append({
            "query": query,
            "should_trigger": should_trigger,
            "trigger_rate": trigger_rate,
            "triggers": sum(triggers),
            "runs": len(triggers),
            "pass": did_pass,
        })

    passed = sum(1 for r in results if r["pass"])
    total = len(results)

    return {
        "skill_name": skill_name,
        "description": description,
        "results": results,
        "summary": {
            "total": total,
            "passed": passed,
            "failed": total - passed,
        },
    }
def main():
    parser = argparse.ArgumentParser(description="Run trigger evaluation for a skill description")
    parser.add_argument("--eval-set", required=True, help="Path to eval set JSON file")
    parser.add_argument("--skill-path", required=True, help="Path to skill directory")
    parser.add_argument("--description", default=None, help="Override description to test")
    parser.add_argument("--num-workers", type=int, default=10, help="Number of parallel workers")
    parser.add_argument("--timeout", type=int, default=30, help="Timeout per query in seconds")
    parser.add_argument("--runs-per-query", type=int, default=3, help="Number of runs per query")
    parser.add_argument("--trigger-threshold", type=float, default=0.5, help="Trigger rate threshold")
    parser.add_argument("--model", default=None, help="Model to use for claude -p (default: user's configured model)")
    parser.add_argument("--verbose", action="store_true", help="Print progress to stderr")
    args = parser.parse_args()

    eval_set = json.loads(Path(args.eval_set).read_text())
    skill_path = Path(args.skill_path)

    if not (skill_path / "SKILL.md").exists():
        print(f"Error: No SKILL.md found at {skill_path}", file=sys.stderr)
        sys.exit(1)

    name, original_description, content = parse_skill_md(skill_path)
    description = args.description or original_description
    project_root = find_project_root()

    if args.verbose:
        print(f"Evaluating: {description}", file=sys.stderr)

    output = run_eval(
        eval_set=eval_set,
        skill_name=name,
        description=description,
        num_workers=args.num_workers,
        timeout=args.timeout,
        project_root=project_root,
        runs_per_query=args.runs_per_query,
        trigger_threshold=args.trigger_threshold,
        model=args.model,
    )

    if args.verbose:
        summary = output["summary"]
        print(f"Results: {summary['passed']}/{summary['total']} passed", file=sys.stderr)
        for r in output["results"]:
            status = "PASS" if r["pass"] else "FAIL"
            rate_str = f"{r['triggers']}/{r['runs']}"
            print(f" [{status}] rate={rate_str} expected={r['should_trigger']}: {r['query'][:70]}", file=sys.stderr)

    print(json.dumps(output, indent=2))


if __name__ == "__main__":
    main()
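
Review note: the pass/fail rule in `run_eval` is asymmetric at the threshold. A query that should trigger passes when its trigger rate is at or above the threshold, while a query that should not trigger passes only when the rate is strictly below it. The sketch below isolates that rule; `score_query` is a hypothetical helper and the run outcomes are invented.

```python
def score_query(triggers: list, should_trigger: bool, threshold: float = 0.5) -> dict:
    """Score one query from the boolean outcomes of its repeated runs."""
    rate = sum(triggers) / len(triggers)
    if should_trigger:
        passed = rate >= threshold
    else:
        passed = rate < threshold
    return {
        "trigger_rate": rate,
        "triggers": sum(triggers),
        "runs": len(triggers),
        "pass": passed,
    }


# Invented outcomes for three queries, each run a few times.
print(score_query([True, False, True], should_trigger=True))
print(score_query([True, False], should_trigger=False))
print(score_query([False, False, True], should_trigger=False))
```

With an even number of runs a negative query that triggers exactly half the time fails, which is one reason an odd `--runs-per-query` such as the default of 3 gives cleaner results.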


@@ -0,0 +1,328 @@
#!/usr/bin/env python3
"""Run the eval + improve loop until all pass or max iterations reached.

Combines run_eval.py and improve_description.py in a loop, tracking history
and returning the best description found. Supports train/test split to prevent
overfitting.
"""

import argparse
import json
import random
import sys
import tempfile
import time
import webbrowser
from pathlib import Path

from scripts.generate_report import generate_html
from scripts.improve_description import improve_description
from scripts.run_eval import find_project_root, run_eval
from scripts.utils import parse_skill_md


def split_eval_set(eval_set: list[dict], holdout: float, seed: int = 42) -> tuple[list[dict], list[dict]]:
    """Split eval set into train and test sets, stratified by should_trigger."""
    random.seed(seed)

    # Separate by should_trigger
    trigger = [e for e in eval_set if e["should_trigger"]]
    no_trigger = [e for e in eval_set if not e["should_trigger"]]

    # Shuffle each group
    random.shuffle(trigger)
    random.shuffle(no_trigger)

    # Calculate split points
    n_trigger_test = max(1, int(len(trigger) * holdout))
    n_no_trigger_test = max(1, int(len(no_trigger) * holdout))

    # Split
    test_set = trigger[:n_trigger_test] + no_trigger[:n_no_trigger_test]
    train_set = trigger[n_trigger_test:] + no_trigger[n_no_trigger_test:]

    return train_set, test_set
def run_loop(
    eval_set: list[dict],
    skill_path: Path,
    description_override: str | None,
    num_workers: int,
    timeout: int,
    max_iterations: int,
    runs_per_query: int,
    trigger_threshold: float,
    holdout: float,
    model: str,
    verbose: bool,
    live_report_path: Path | None = None,
    log_dir: Path | None = None,
) -> dict:
    """Run the eval + improvement loop."""
    project_root = find_project_root()
    name, original_description, content = parse_skill_md(skill_path)
    current_description = description_override or original_description

    # Split into train/test if holdout > 0
    if holdout > 0:
        train_set, test_set = split_eval_set(eval_set, holdout)
        if verbose:
            print(f"Split: {len(train_set)} train, {len(test_set)} test (holdout={holdout})", file=sys.stderr)
    else:
        train_set = eval_set
        test_set = []

    history = []
    exit_reason = "unknown"

    for iteration in range(1, max_iterations + 1):
        if verbose:
            print(f"\n{'='*60}", file=sys.stderr)
            print(f"Iteration {iteration}/{max_iterations}", file=sys.stderr)
            print(f"Description: {current_description}", file=sys.stderr)
            print(f"{'='*60}", file=sys.stderr)

        # Evaluate train + test together in one batch for parallelism
        all_queries = train_set + test_set
        t0 = time.time()
        all_results = run_eval(
            eval_set=all_queries,
            skill_name=name,
            description=current_description,
            num_workers=num_workers,
            timeout=timeout,
            project_root=project_root,
            runs_per_query=runs_per_query,
            trigger_threshold=trigger_threshold,
            model=model,
        )
        eval_elapsed = time.time() - t0

        # Split results back into train/test by matching queries
        train_queries_set = {q["query"] for q in train_set}
        train_result_list = [r for r in all_results["results"] if r["query"] in train_queries_set]
        test_result_list = [r for r in all_results["results"] if r["query"] not in train_queries_set]

        train_passed = sum(1 for r in train_result_list if r["pass"])
        train_total = len(train_result_list)
        train_summary = {"passed": train_passed, "failed": train_total - train_passed, "total": train_total}
        train_results = {"results": train_result_list, "summary": train_summary}

        if test_set:
            test_passed = sum(1 for r in test_result_list if r["pass"])
            test_total = len(test_result_list)
            test_summary = {"passed": test_passed, "failed": test_total - test_passed, "total": test_total}
            test_results = {"results": test_result_list, "summary": test_summary}
        else:
            test_results = None
            test_summary = None

        history.append({
            "iteration": iteration,
            "description": current_description,
            "train_passed": train_summary["passed"],
            "train_failed": train_summary["failed"],
            "train_total": train_summary["total"],
            "train_results": train_results["results"],
            "test_passed": test_summary["passed"] if test_summary else None,
            "test_failed": test_summary["failed"] if test_summary else None,
            "test_total": test_summary["total"] if test_summary else None,
            "test_results": test_results["results"] if test_results else None,
            # For backward compat with report generator
            "passed": train_summary["passed"],
            "failed": train_summary["failed"],
            "total": train_summary["total"],
            "results": train_results["results"],
        })

        # Write live report if path provided
        if live_report_path:
            partial_output = {
                "original_description": original_description,
                "best_description": current_description,
                "best_score": "in progress",
                "iterations_run": len(history),
                "holdout": holdout,
                "train_size": len(train_set),
                "test_size": len(test_set),
                "history": history,
            }
            live_report_path.write_text(generate_html(partial_output, auto_refresh=True, skill_name=name))

        if verbose:
            def print_eval_stats(label, results, elapsed):
                pos = [r for r in results if r["should_trigger"]]
                neg = [r for r in results if not r["should_trigger"]]
                tp = sum(r["triggers"] for r in pos)
                pos_runs = sum(r["runs"] for r in pos)
                fn = pos_runs - tp
                fp = sum(r["triggers"] for r in neg)
                neg_runs = sum(r["runs"] for r in neg)
                tn = neg_runs - fp
                total = tp + tn + fp + fn
                precision = tp / (tp + fp) if (tp + fp) > 0 else 1.0
                recall = tp / (tp + fn) if (tp + fn) > 0 else 1.0
                accuracy = (tp + tn) / total if total > 0 else 0.0
                print(f"{label}: {tp+tn}/{total} correct, precision={precision:.0%} recall={recall:.0%} accuracy={accuracy:.0%} ({elapsed:.1f}s)", file=sys.stderr)
                for r in results:
                    status = "PASS" if r["pass"] else "FAIL"
                    rate_str = f"{r['triggers']}/{r['runs']}"
                    print(f" [{status}] rate={rate_str} expected={r['should_trigger']}: {r['query'][:60]}", file=sys.stderr)

            print_eval_stats("Train", train_results["results"], eval_elapsed)
            if test_summary:
                print_eval_stats("Test ", test_results["results"], 0)

        if train_summary["failed"] == 0:
            exit_reason = f"all_passed (iteration {iteration})"
            if verbose:
                print(f"\nAll train queries passed on iteration {iteration}!", file=sys.stderr)
            break

        if iteration == max_iterations:
            exit_reason = f"max_iterations ({max_iterations})"
            if verbose:
                print(f"\nMax iterations reached ({max_iterations}).", file=sys.stderr)
            break

        # Improve the description based on train results
        if verbose:
            print(f"\nImproving description...", file=sys.stderr)

        t0 = time.time()
        # Strip test scores from history so improvement model can't see them
        blinded_history = [
            {k: v for k, v in h.items() if not k.startswith("test_")}
            for h in history
        ]
        new_description = improve_description(
            skill_name=name,
            skill_content=content,
            current_description=current_description,
            eval_results=train_results,
            history=blinded_history,
            model=model,
            log_dir=log_dir,
            iteration=iteration,
        )
        improve_elapsed = time.time() - t0

        if verbose:
            print(f"Proposed ({improve_elapsed:.1f}s): {new_description}", file=sys.stderr)

        current_description = new_description

    # Find the best iteration by TEST score (or train if no test set)
    if test_set:
        best = max(history, key=lambda h: h["test_passed"] or 0)
        best_score = f"{best['test_passed']}/{best['test_total']}"
    else:
        best = max(history, key=lambda h: h["train_passed"])
        best_score = f"{best['train_passed']}/{best['train_total']}"

    if verbose:
        print(f"\nExit reason: {exit_reason}", file=sys.stderr)
        print(f"Best score: {best_score} (iteration {best['iteration']})", file=sys.stderr)

    return {
        "exit_reason": exit_reason,
        "original_description": original_description,
        "best_description": best["description"],
        "best_score": best_score,
        "best_train_score": f"{best['train_passed']}/{best['train_total']}",
        "best_test_score": f"{best['test_passed']}/{best['test_total']}" if test_set else None,
        "final_description": current_description,
        "iterations_run": len(history),
        "holdout": holdout,
        "train_size": len(train_set),
        "test_size": len(test_set),
        "history": history,
    }
def main():
    parser = argparse.ArgumentParser(description="Run eval + improve loop")
    parser.add_argument("--eval-set", required=True, help="Path to eval set JSON file")
    parser.add_argument("--skill-path", required=True, help="Path to skill directory")
    parser.add_argument("--description", default=None, help="Override starting description")
    parser.add_argument("--num-workers", type=int, default=10, help="Number of parallel workers")
    parser.add_argument("--timeout", type=int, default=30, help="Timeout per query in seconds")
    parser.add_argument("--max-iterations", type=int, default=5, help="Max improvement iterations")
    parser.add_argument("--runs-per-query", type=int, default=3, help="Number of runs per query")
    parser.add_argument("--trigger-threshold", type=float, default=0.5, help="Trigger rate threshold")
    parser.add_argument("--holdout", type=float, default=0.4, help="Fraction of eval set to hold out for testing (0 to disable)")
    parser.add_argument("--model", required=True, help="Model for improvement")
    parser.add_argument("--verbose", action="store_true", help="Print progress to stderr")
    parser.add_argument("--report", default="auto", help="Generate HTML report at this path (default: 'auto' for temp file, 'none' to disable)")
    parser.add_argument("--results-dir", default=None, help="Save all outputs (results.json, report.html, logs/) to a timestamped subdirectory here")
    args = parser.parse_args()

    eval_set = json.loads(Path(args.eval_set).read_text())
    skill_path = Path(args.skill_path)

    if not (skill_path / "SKILL.md").exists():
        print(f"Error: No SKILL.md found at {skill_path}", file=sys.stderr)
        sys.exit(1)

    name, _, _ = parse_skill_md(skill_path)

    # Set up live report path
    if args.report != "none":
        if args.report == "auto":
            timestamp = time.strftime("%Y%m%d_%H%M%S")
            live_report_path = Path(tempfile.gettempdir()) / f"skill_description_report_{skill_path.name}_{timestamp}.html"
        else:
            live_report_path = Path(args.report)
        # Open the report immediately so the user can watch
        live_report_path.write_text("<html><body><h1>Starting optimization loop...</h1><meta http-equiv='refresh' content='5'></body></html>")
        webbrowser.open(str(live_report_path))
    else:
        live_report_path = None

    # Determine output directory (create before run_loop so logs can be written)
    if args.results_dir:
        timestamp = time.strftime("%Y-%m-%d_%H%M%S")
        results_dir = Path(args.results_dir) / timestamp
        results_dir.mkdir(parents=True, exist_ok=True)
    else:
        results_dir = None

    log_dir = results_dir / "logs" if results_dir else None

    output = run_loop(
        eval_set=eval_set,
        skill_path=skill_path,
        description_override=args.description,
        num_workers=args.num_workers,
        timeout=args.timeout,
        max_iterations=args.max_iterations,
        runs_per_query=args.runs_per_query,
        trigger_threshold=args.trigger_threshold,
        holdout=args.holdout,
        model=args.model,
        verbose=args.verbose,
        live_report_path=live_report_path,
        log_dir=log_dir,
    )

    # Save JSON output
    json_output = json.dumps(output, indent=2)
    print(json_output)
    if results_dir:
        (results_dir / "results.json").write_text(json_output)

    # Write final HTML report (without auto-refresh)
    if live_report_path:
        live_report_path.write_text(generate_html(output, auto_refresh=False, skill_name=name))
        print(f"\nReport: {live_report_path}", file=sys.stderr)

    if results_dir and live_report_path:
        (results_dir / "report.html").write_text(generate_html(output, auto_refresh=False, skill_name=name))

    if results_dir:
        print(f"Results saved to: {results_dir}", file=sys.stderr)


if __name__ == "__main__":
    main()

View File

@@ -0,0 +1,47 @@
"""Shared utilities for skill-creator scripts."""
from pathlib import Path
def parse_skill_md(skill_path: Path) -> tuple[str, str, str]:
    """Parse a SKILL.md file, returning (name, description, full_content)."""
    content = (skill_path / "SKILL.md").read_text()
    lines = content.split("\n")
    if lines[0].strip() != "---":
        raise ValueError("SKILL.md missing frontmatter (no opening ---)")
    end_idx = None
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            end_idx = i
            break
    if end_idx is None:
        raise ValueError("SKILL.md missing frontmatter (no closing ---)")
    name = ""
    description = ""
    frontmatter_lines = lines[1:end_idx]
    i = 0
    while i < len(frontmatter_lines):
        line = frontmatter_lines[i]
        if line.startswith("name:"):
            name = line[len("name:"):].strip().strip('"').strip("'")
        elif line.startswith("description:"):
            value = line[len("description:"):].strip()
            # Handle YAML multiline indicators (>, |, >-, |-)
            if value in (">", "|", ">-", "|-"):
                continuation_lines: list[str] = []
                i += 1
                while i < len(frontmatter_lines) and (frontmatter_lines[i].startswith(" ") or frontmatter_lines[i].startswith("\t")):
                    continuation_lines.append(frontmatter_lines[i].strip())
                    i += 1
                description = " ".join(continuation_lines)
                continue
            else:
                description = value.strip('"').strip("'")
        i += 1
    return name, description, content

View File

@@ -0,0 +1,154 @@
---
name: vue-best-practices
description: MUST be used for Vue.js tasks. Strongly recommends Composition API with `<script setup>` and TypeScript as the standard approach. Covers Vue 3, SSR, Volar, vue-tsc. Load for any Vue, .vue files, Vue Router, Pinia, or Vite with Vue work. ALWAYS use Composition API unless the project explicitly requires Options API.
license: MIT
metadata:
author: github.com/vuejs-ai
version: "18.0.0"
---
# Vue Best Practices Workflow
Use this skill as an instruction set. Follow the workflow in order unless the user explicitly asks for a different order.
## Core Principles
- **Keep state predictable:** one source of truth, derive everything else.
- **Make data flow explicit:** Props down, Events up for most cases.
- **Favor small, focused components:** easier to test, reuse, and maintain.
- **Avoid unnecessary re-renders:** use computed properties and watchers wisely.
- **Readability counts:** write clear, self-documenting code.
## 1) Confirm architecture before coding (required)
- Default stack: Vue 3 + Composition API + `<script setup lang="ts">`.
- If the project explicitly uses Options API, load `vue-options-api-best-practices` skill if available.
- If the project explicitly uses JSX, load `vue-jsx-best-practices` skill if available.
### 1.1 Must-read core references (required)
- Before implementing any Vue task, make sure to read and apply these core references:
- `references/reactivity.md`
- `references/sfc.md`
- `references/component-data-flow.md`
- `references/composables.md`
- Keep these references in active working context for the entire task, not only when a specific issue appears.
### 1.2 Plan component boundaries before coding (required)
Create a brief component map before implementation for any non-trivial feature.
- Define each component's single responsibility in one sentence.
- Keep entry/root and route-level view components as composition surfaces by default.
- Move feature UI and feature logic out of entry/root/view components unless the task is intentionally a tiny single-file demo.
- Define props/emits contracts for each child component in the map.
- Prefer a feature folder layout (`components/<feature>/...`, `composables/use<Feature>.ts`) when adding more than one component.
## 2) Apply essential Vue foundations (required)
These are essential, must-know foundations. Apply all of them in every Vue task using the core references already loaded in section `1.1`.
### Reactivity
- Must-read reference from `1.1`: [reactivity](references/reactivity.md)
- Keep source state minimal (`ref`/`reactive`), derive everything possible with `computed`.
- Use watchers for side effects if needed.
- Avoid recomputing expensive logic in templates.
### SFC structure and template safety
- Must-read reference from `1.1`: [sfc](references/sfc.md)
- Keep SFC sections in this order: `<script>` → `<template>` → `<style>`.
- Keep SFC responsibilities focused; split large components.
- Keep templates declarative; move branching/derivation to script.
- Apply Vue template safety rules (`v-html`, list rendering, conditional rendering choices).
### Keep components focused
Split a component when it has **more than one clear responsibility** (e.g. data orchestration + UI, or multiple independent UI sections).
- Prefer **smaller components + composables** over one “mega component”.
- Move **UI sections** into child components (props in, events out).
- Move **state/side effects** into composables (`useXxx()`).
Apply objective split triggers. Split the component if **any** condition is true:
- It owns both orchestration/state and substantial presentational markup for multiple sections.
- It has 3+ distinct UI sections (for example: form, filters, list, footer/status).
- A template block is repeated or could become reusable (item rows, cards, list entries).
Entry/root and route view rule:
- Keep entry/root and route view components thin: app shell/layout, provider wiring, and feature composition.
- Do not place full feature implementations in entry/root/view components when those features contain independent parts.
- For CRUD/list features (todo, table, catalog, inbox), split at least into:
- feature container component
- input/form component
- list (and/or item) component
- footer/actions or filter/status component
- Allow a single-file implementation only for very small throwaway demos; if chosen, explicitly justify why splitting is unnecessary.
### Component data flow
- Must-read reference from `1.1`: [component-data-flow](references/component-data-flow.md)
- Use props down, events up as the primary model.
- Use `v-model` only for true two-way component contracts.
- Use provide/inject only for deep-tree dependencies or shared context.
- Keep contracts explicit and typed with `defineProps`, `defineEmits`, and `InjectionKey` as needed.
### Composables
- Must-read reference from `1.1`: [composables](references/composables.md)
- Extract logic into composables when it is reused, stateful, or side-effect heavy.
- Keep composable APIs small, typed, and predictable.
- Separate feature logic from presentational components.
## 3) Consider optional features only when requirements call for them
### 3.1 Standard optional features
Do not add these by default. Load the matching reference only when the requirement exists.
- Slots: parent needs to control child content/layout -> [component-slots](references/component-slots.md)
- Fallthrough attributes: wrapper/base components must forward attrs/events safely -> [component-fallthrough-attrs](references/component-fallthrough-attrs.md)
- Built-in component `<KeepAlive>` for stateful view caching -> [component-keep-alive](references/component-keep-alive.md)
- Built-in component `<Teleport>` for overlays/portals -> [component-teleport](references/component-teleport.md)
- Built-in component `<Suspense>` for async subtree fallback boundaries -> [component-suspense](references/component-suspense.md)
- Animation-related features: pick the simplest approach that matches the required motion behavior.
- Built-in component `<Transition>` for enter/leave effects -> [transition](references/component-transition.md)
- Built-in component `<TransitionGroup>` for animated list mutations -> [transition-group](references/component-transition-group.md)
- Class-based animation for non-enter/leave effects -> [animation-class-based-technique](references/animation-class-based-technique.md)
- State-driven animation for user-input-driven animation -> [animation-state-driven-technique](references/animation-state-driven-technique.md)
### 3.2 Less-common optional features
Use these only when there is explicit product or technical need.
- Directives: behavior is DOM-specific and not a good composable/component fit -> [directives](references/directives.md)
- Async components: heavy/rarely-used UI should be lazy loaded -> [component-async](references/component-async.md)
- Render functions only when templates cannot express the requirement -> [render-functions](references/render-functions.md)
- Plugins when behavior must be installed app-wide -> [plugins](references/plugins.md)
- State management patterns: app-wide shared state crosses feature boundaries -> [state-management](references/state-management.md)
## 4) Run performance optimization after behavior is correct
Performance work is a post-functionality pass. Do not optimize before core behavior is implemented and verified.
- Large list rendering bottlenecks -> [perf-virtualize-large-lists](references/perf-virtualize-large-lists.md)
- Static subtrees re-rendering unnecessarily -> [perf-v-once-v-memo-directives](references/perf-v-once-v-memo-directives.md)
- Over-abstraction in hot list paths -> [perf-avoid-component-abstraction-in-lists](references/perf-avoid-component-abstraction-in-lists.md)
- Expensive updates triggered too often -> [updated-hook-performance](references/updated-hook-performance.md)
## 5) Final self-check before finishing
- Core behavior works and matches requirements.
- All must-read references were read and applied.
- Reactivity model is minimal and predictable.
- SFC structure and template rules are followed.
- Components are focused and well-factored, splitting when needed.
- Entry/root and route view components remain composition surfaces unless there is an explicit small-demo exception.
- Component split decisions are explicit and defensible (responsibility boundaries are clear).
- Data flow contracts are explicit and typed.
- Composables are used where reuse/complexity justifies them.
- State and side effects are moved into composables where applicable.
- Optional features are used only when requirements demand them.
- Performance changes were applied only after functionality was complete.

View File

@@ -0,0 +1,254 @@
---
title: Use Class-based Animations for Non-Enter/Leave Effects
impact: LOW
impactDescription: Class-based animations are simpler and more performant for elements that remain in the DOM
type: best-practice
tags: [vue3, animation, css, class-binding, state]
---
# Use Class-based Animations for Non-Enter/Leave Effects
**Impact: LOW** - For animations on elements that are not entering or leaving the DOM, use CSS class-based animations triggered by Vue's reactive state. This is simpler than `<Transition>` and more appropriate for feedback animations like shake, pulse, or highlight effects.
## Task List
- Use class-based animations for elements staying in the DOM
- Use `<Transition>` only for enter/leave animations
- Combine CSS animations with Vue's class bindings (`:class`)
- Consider using `setTimeout` to auto-remove animation classes
**When to Use Class-based Animations:**
- User feedback (shake on error, pulse on success)
- Attention-grabbing effects (highlight changes)
- Hover/focus states that need more than CSS transitions
- Any animation where the element stays mounted
**When to Use Transition Component:**
- Elements toggled with `v-if` (entering/leaving the DOM) or `v-show` (toggling visibility)
- Route transitions
- List item additions/removals
## Basic Pattern
```vue
<template>
<div :class="{ shake: showError }">
<button @click="submitForm">Submit</button>
<span v-if="showError">This feature is disabled!</span>
</div>
</template>
<script setup>
import { ref } from 'vue'
const showError = ref(false)
function submitForm() {
if (!isValid()) {
// Trigger shake animation
showError.value = true
// Auto-remove class after animation completes
setTimeout(() => {
showError.value = false
}, 820) // Match animation duration
}
}
</script>
<style>
.shake {
animation: shake 0.82s cubic-bezier(0.36, 0.07, 0.19, 0.97) both;
transform: translate3d(0, 0, 0); /* Enable GPU acceleration */
}
@keyframes shake {
10%, 90% { transform: translate3d(-1px, 0, 0); }
20%, 80% { transform: translate3d(2px, 0, 0); }
30%, 50%, 70% { transform: translate3d(-4px, 0, 0); }
40%, 60% { transform: translate3d(4px, 0, 0); }
}
</style>
```
## Common Animation Patterns
### Pulse on Success
```vue
<template>
<button
@click="save"
:class="{ pulse: saved }"
>
{{ saved ? 'Saved!' : 'Save' }}
</button>
</template>
<script setup>
import { ref } from 'vue'
const saved = ref(false)
async function save() {
await saveData()
saved.value = true
setTimeout(() => saved.value = false, 1000)
}
</script>
<style>
.pulse {
animation: pulse 0.5s ease-in-out;
}
@keyframes pulse {
0%, 100% { transform: scale(1); }
50% { transform: scale(1.05); }
}
</style>
```
### Highlight on Change
```vue
<template>
<div
:class="{ highlight: justUpdated }"
>
Value: {{ value }}
</div>
</template>
<script setup>
import { ref, watch } from 'vue'
const value = ref(0)
const justUpdated = ref(false)
watch(value, () => {
justUpdated.value = true
setTimeout(() => justUpdated.value = false, 1000)
})
</script>
<style>
.highlight {
animation: highlight 1s ease-out;
}
@keyframes highlight {
0% { background-color: yellow; }
100% { background-color: transparent; }
}
</style>
```
### Bounce Attention
```vue
<template>
<div
:class="{ bounce: needsAttention }"
@animationend="needsAttention = false"
>
<BellIcon />
</div>
</template>
<script setup>
import { ref } from 'vue'
const needsAttention = ref(false)
function notifyUser() {
needsAttention.value = true
// No setTimeout needed - using animationend event
}
</script>
<style>
.bounce {
animation: bounce 0.5s ease;
}
@keyframes bounce {
0%, 100% { transform: translateY(0); }
50% { transform: translateY(-10px); }
}
</style>
```
## Using animationend Event
Instead of `setTimeout`, use the `animationend` event for cleaner code:
```vue
<template>
<div
:class="{ animate: isAnimating }"
@animationend="isAnimating = false"
>
Content
</div>
</template>
<script setup>
import { ref } from 'vue'
const isAnimating = ref(false)
function triggerAnimation() {
isAnimating.value = true
// Class is automatically removed when animation ends
}
</script>
```
## Composable for Reusable Animations
```javascript
// composables/useAnimation.js
import { ref } from 'vue'
export function useAnimation(duration = 500) {
const isAnimating = ref(false)
function trigger() {
isAnimating.value = true
setTimeout(() => {
isAnimating.value = false
}, duration)
}
return {
isAnimating,
trigger
}
}
```
```vue
<script setup>
import { useAnimation } from '@/composables/useAnimation'
const shake = useAnimation(820)
const pulse = useAnimation(500)
</script>
<template>
<button
:class="{ shake: shake.isAnimating.value }"
@click="shake.trigger()"
>
Shake me
</button>
<button
:class="{ pulse: pulse.isAnimating.value }"
@click="pulse.trigger()"
>
Pulse me
</button>
</template>
```
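One gap in the timer-based composable above: if `trigger()` is called again while a timer is still pending, the older timer can clear the flag before the newest animation has finished. The sketch below shows one way to guard against that, written framework-free so it runs on its own. `createAnimationFlag` and the `Scheduler` interface are hypothetical names for illustration, not Vue APIs; in a real composable the flag would be a `ref` and the scheduler would wrap `setTimeout`/`clearTimeout`.

```typescript
// Hypothetical, framework-free sketch of a restart-safe animation flag.
type TimerHandle = unknown

interface Scheduler {
  set(callback: () => void, ms: number): TimerHandle
  clear(handle: TimerHandle): void
}

function createAnimationFlag(durationMs: number, scheduler: Scheduler) {
  let active = false
  let pending: TimerHandle = undefined
  let hasPending = false

  function trigger(): void {
    // Cancel the previous timer so it cannot clear the flag too early.
    if (hasPending) scheduler.clear(pending)
    active = true
    hasPending = true
    pending = scheduler.set(() => {
      active = false
      hasPending = false
    }, durationMs)
  }

  return { trigger, isActive: () => active }
}
```

Injecting the scheduler also makes the timing logic easy to test without waiting for real timers.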

View File

@@ -0,0 +1,291 @@
---
title: State-driven Animations with CSS Transitions and Style Bindings
impact: LOW
impactDescription: Combining Vue's reactive style bindings with CSS transitions creates smooth, interactive animations
type: best-practice
tags: [vue3, animation, css, transition, style-binding, state, interactive]
---
# State-driven Animations with CSS Transitions and Style Bindings
**Impact: LOW** - For responsive, interactive animations that react to user input or state changes, combine Vue's dynamic style bindings with CSS transitions. This creates smooth animations that interpolate values in real-time based on state.
## Task List
- Use `:style` binding for dynamic properties that change frequently
- Add CSS `transition` property to smoothly animate between values
- Consider using `transform` and `opacity` for GPU-accelerated animations
- For complex value interpolation, use watchers with animation libraries
## Basic Pattern
```vue
<template>
<div
@mousemove="onMousemove"
:style="{ backgroundColor: `hsl(${hue}, 80%, 50%)` }"
class="interactive-area"
>
<p>Move your mouse across this div...</p>
<p>Hue: {{ hue }}</p>
</div>
</template>
<script setup>
import { ref } from 'vue'
const hue = ref(0)
function onMousemove(e) {
// Map mouse X position to hue (0-360)
const rect = e.currentTarget.getBoundingClientRect()
hue.value = Math.round((e.clientX - rect.left) / rect.width * 360)
}
</script>
<style>
.interactive-area {
transition: background-color 0.3s ease;
height: 200px;
display: flex;
flex-direction: column;
align-items: center;
justify-content: center;
}
</style>
```
## Common Use Cases
### Following Mouse Position
```vue
<template>
<div
class="container"
@mousemove="onMousemove"
>
<div
class="follower"
:style="{
transform: `translate(${x}px, ${y}px)`
}"
/>
</div>
</template>
<script setup>
import { ref } from 'vue'
const x = ref(0)
const y = ref(0)
function onMousemove(e) {
const rect = e.currentTarget.getBoundingClientRect()
x.value = e.clientX - rect.left
y.value = e.clientY - rect.top
}
</script>
<style>
.container {
position: relative;
height: 300px;
}
.follower {
position: absolute;
width: 20px;
height: 20px;
background: blue;
border-radius: 50%;
/* Smooth following with transition */
transition: transform 0.1s ease-out;
/* Prevent the follower from triggering mousemove */
pointer-events: none;
}
</style>
```
### Progress Animation
```vue
<template>
<div class="progress-container">
<div
class="progress-bar"
:style="{ width: `${progress}%` }"
/>
</div>
<input
type="range"
v-model.number="progress"
min="0"
max="100"
/>
</template>
<script setup>
import { ref } from 'vue'
const progress = ref(0)
</script>
<style>
.progress-container {
height: 20px;
background: #e0e0e0;
border-radius: 10px;
overflow: hidden;
}
.progress-bar {
height: 100%;
background: linear-gradient(90deg, #4CAF50, #8BC34A);
transition: width 0.3s ease;
}
</style>
```
### Scroll-based Animation
```vue
<template>
<div
class="hero"
:style="{
opacity: heroOpacity,
transform: `translateY(${scrollOffset}px)`
}"
>
<h1>Scroll Down</h1>
</div>
</template>
<script setup>
import { ref, computed, onMounted, onUnmounted } from 'vue'
const scrollY = ref(0)
const heroOpacity = computed(() => {
return Math.max(0, 1 - scrollY.value / 300)
})
const scrollOffset = computed(() => {
return scrollY.value * 0.5 // Parallax effect
})
function handleScroll() {
scrollY.value = window.scrollY
}
onMounted(() => {
window.addEventListener('scroll', handleScroll, { passive: true })
})
onUnmounted(() => {
window.removeEventListener('scroll', handleScroll)
})
</script>
<style>
.hero {
height: 100vh;
display: flex;
align-items: center;
justify-content: center;
/* Note: No transition for scroll-based animations - they should be instant */
}
</style>
```
### Color Theme Transition
```vue
<template>
<div
class="app"
:style="themeStyles"
>
<button @click="toggleTheme">Toggle Theme</button>
<p>Current theme: {{ isDark ? 'Dark' : 'Light' }}</p>
</div>
</template>
<script setup>
import { ref, computed } from 'vue'
const isDark = ref(false)
const themeStyles = computed(() => ({
'--bg-color': isDark.value ? '#1a1a1a' : '#ffffff',
'--text-color': isDark.value ? '#ffffff' : '#1a1a1a',
backgroundColor: 'var(--bg-color)',
color: 'var(--text-color)'
}))
function toggleTheme() {
isDark.value = !isDark.value
}
</script>
<style>
.app {
min-height: 100vh;
transition: background-color 0.5s ease, color 0.5s ease;
}
</style>
```
## Advanced: Numerical Tweening with Watchers
For smooth number animations (counters, stats), use watchers with animation libraries:
```vue
<template>
<div>
<input v-model.number="targetNumber" type="number" />
<p class="counter">{{ displayNumber.toFixed(0) }}</p>
</div>
</template>
<script setup>
import { computed, ref, reactive, watch } from 'vue'
import gsap from 'gsap'
const targetNumber = ref(0)
const tweened = reactive({ value: 0 })
// Computed for display
const displayNumber = computed(() => tweened.value)
watch(targetNumber, (newValue) => {
gsap.to(tweened, {
duration: 0.5,
value: Number(newValue) || 0,
ease: 'power2.out'
})
})
</script>
```
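To make the tweening step less opaque, here is a rough sketch of the arithmetic an animation library performs on each frame: it maps elapsed time to an eased progress value and interpolates between the start and target numbers. `easeOutCubic` and `tweenValue` are hypothetical helpers, and the cubic ease-out curve is an assumed stand-in for illustration, not GSAP's implementation.

```typescript
// Hypothetical helpers showing the interpolation behind a number tween.

// Cubic ease-out: fast start, gentle finish. `t` is progress in [0, 1].
function easeOutCubic(t: number): number {
  const clamped = Math.min(1, Math.max(0, t))
  return 1 - Math.pow(1 - clamped, 3)
}

// Value to display after `elapsedMs` of a tween lasting `durationMs`.
function tweenValue(
  from: number,
  to: number,
  elapsedMs: number,
  durationMs: number
): number {
  if (durationMs <= 0) return to
  return from + (to - from) * easeOutCubic(elapsedMs / durationMs)
}
```

Halfway through a 500 ms tween from 0 to 100, this curve has already covered 87.5 of the distance, which is why eased counters appear to slow down as they approach the target.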
## Performance Considerations
```vue
<style>
/* GOOD: GPU-accelerated properties */
.element {
transition: transform 0.3s ease, opacity 0.3s ease;
}
/* AVOID: Properties that trigger layout recalculation */
.element {
transition: width 0.3s ease, height 0.3s ease, margin 0.3s ease;
}
/* For high-frequency updates, consider will-change */
.frequently-animated {
will-change: transform;
}
</style>
```

View File

@@ -0,0 +1,97 @@
---
title: Async Component Best Practices
impact: MEDIUM
impactDescription: Poor async component strategy can delay interactivity in SSR apps and create loading UI flicker
type: best-practice
tags: [vue3, async-components, ssr, hydration, performance, ux]
---
# Async Component Best Practices
**Impact: MEDIUM** - Async components should reduce JavaScript cost without degrading perceived performance. Focus on hydration timing in SSR and stable loading UX.
## Task List
- Use lazy hydration strategies for non-critical SSR component trees
- Import only the hydration helpers you actually use
- Keep `loadingComponent` delay near the default `200ms` unless real UX data suggests otherwise
- Configure `delay` and `timeout` together for predictable loading behavior
## Use Lazy Hydration Strategies in SSR
In Vue 3.5+, async components can delay hydration until idle time, visibility, media query match, or user interaction.
**BAD:**
```vue
<script setup lang="ts">
import { defineAsyncComponent } from 'vue'
const AsyncComments = defineAsyncComponent({
loader: () => import('./Comments.vue')
})
</script>
```
**GOOD:**
```vue
<script setup lang="ts">
import {
defineAsyncComponent,
hydrateOnVisible,
hydrateOnIdle
} from 'vue'
const AsyncComments = defineAsyncComponent({
loader: () => import('./Comments.vue'),
hydrate: hydrateOnVisible({ rootMargin: '100px' })
})
const AsyncFooter = defineAsyncComponent({
loader: () => import('./Footer.vue'),
hydrate: hydrateOnIdle(5000)
})
</script>
```
## Prevent Loading Spinner Flicker
Avoid showing loading UI immediately for components that usually resolve quickly.
**BAD:**
```vue
<script setup lang="ts">
import { defineAsyncComponent } from 'vue'
import LoadingSpinner from './LoadingSpinner.vue'
const AsyncDashboard = defineAsyncComponent({
loader: () => import('./Dashboard.vue'),
loadingComponent: LoadingSpinner,
delay: 0
})
</script>
```
**GOOD:**
```vue
<script setup lang="ts">
import { defineAsyncComponent } from 'vue'
import LoadingSpinner from './LoadingSpinner.vue'
import ErrorDisplay from './ErrorDisplay.vue'
const AsyncDashboard = defineAsyncComponent({
loader: () => import('./Dashboard.vue'),
loadingComponent: LoadingSpinner,
errorComponent: ErrorDisplay,
delay: 200,
timeout: 30000
})
</script>
```
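The interaction between `delay` and `timeout` can be reasoned about with a small model. The sketch below is a simplified approximation of the behavior described above, not Vue's implementation: the loading component appears only if the load is still pending when the delay elapses, and the load counts as failed once the timeout passes. `loadingOutcome` and its types are hypothetical names.

```typescript
// Simplified, hypothetical model of async component loading UI.
interface LoadingOptions {
  delayMs: number
  timeoutMs: number
}

interface LoadingOutcome {
  spinnerShown: boolean
  result: 'resolved' | 'timed-out'
}

function loadingOutcome(loadMs: number, options: LoadingOptions): LoadingOutcome {
  const timedOut = loadMs > options.timeoutMs
  // The spinner is visible only while the load is pending past the delay.
  const pendingUntil = Math.min(loadMs, options.timeoutMs)
  return {
    spinnerShown: pendingUntil > options.delayMs,
    result: timedOut ? 'timed-out' : 'resolved',
  }
}
```

Under this model, a component that loads in 120 ms never shows a spinner with `delay: 200`, but always flashes one with `delay: 0`.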
## Delay Guidelines
| Scenario | Recommended Delay |
|----------|-------------------|
| Small component, fast network | `200ms` |
| Known heavy component | `100ms` |
| Background or non-critical UI | `300-500ms` |

View File

@@ -0,0 +1,307 @@
---
title: Component Data Flow Best Practices
impact: HIGH
impactDescription: Clear data flow between components prevents state bugs, stale UI, and brittle coupling
type: best-practice
tags: [vue3, props, emits, v-model, provide-inject, data-flow, typescript]
---
# Component Data Flow Best Practices
**Impact: HIGH** - Vue components stay reliable when data flow is explicit: props go down, events go up, `v-model` handles two-way bindings, and provide/inject supports cross-tree dependencies. Blurring these boundaries leads to stale state, hidden coupling, and hard-to-debug UI.
The main principle of data flow in Vue.js is **Props Down / Events Up**. This is the most maintainable default, and one-way flow scales well.
## Task List
- Treat props as read-only inputs
- Use props/emit for component communication; reserve refs for imperative actions
- When refs are required for imperative APIs, type them with template refs
- Emit events instead of mutating parent state directly
- Use `defineModel` for v-model in modern Vue (3.4+)
- Handle v-model modifiers deliberately in child components
- Use provide/inject to avoid prop drilling (over ~3 layers), with symbols as keys to prevent collisions
- Keep mutations in the provider or expose explicit actions
- In TypeScript projects, prefer type-based `defineProps`, `defineEmits`, and `InjectionKey`
## Props: One-Way Data Down
Props are inputs. Do not mutate them in the child.
**BAD:**
```vue
<script setup>
const props = defineProps({ count: Number })
function increment() {
props.count++
}
</script>
```
**GOOD:**
If state needs to change, emit an event, use `v-model` or create a local copy.
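As a framework-free sketch of the "local copy plus event" option: the child keeps its own draft of the incoming value, changes only the draft, and reports each change upward instead of writing to the prop. `createCounterDraft` and `EmitUpdate` are hypothetical names; in a real component the draft would be a `ref` and `emit` would come from `defineEmits`.

```typescript
// Hypothetical sketch: never write to props, keep a local draft and emit.
type EmitUpdate = (eventName: 'update:count', value: number) => void

function createCounterDraft(props: Readonly<{ count: number }>, emit: EmitUpdate) {
  // Local copy: the prop itself is never mutated.
  let draft = props.count

  return {
    increment(): number {
      draft += 1
      // Tell the parent about the change; the parent owns the real state.
      emit('update:count', draft)
      return draft
    },
  }
}
```

The parent decides whether to accept the update, which keeps the single source of truth in one place.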
## Prefer props/emit over component refs
**BAD:**
```vue
<script setup>
import { ref } from 'vue'
import UserForm from './UserForm.vue'
const formRef = ref(null)
function submitForm() {
if (formRef.value.isValid) {
formRef.value.submit()
}
}
</script>
<template>
<UserForm ref="formRef" />
<button @click="submitForm">Submit</button>
</template>
```
**GOOD:**
```vue
<script setup>
import UserForm from './UserForm.vue'
function handleSubmit(formData) {
api.submit(formData)
}
</script>
<template>
<UserForm @submit="handleSubmit" />
</template>
```
## Type component refs when imperative access is required
Prefer props/emits by default. When a parent must call an exposed child method, type the ref explicitly and expose only the intended API from the child with `defineExpose`.
**BAD:**
```vue
<script setup lang="ts">
import { ref, onMounted } from 'vue'
import DialogPanel from './DialogPanel.vue'
const panelRef = ref(null)
onMounted(() => {
panelRef.value.open()
})
</script>
<template>
<DialogPanel ref="panelRef" />
</template>
```
**GOOD:**
```vue
<!-- DialogPanel.vue -->
<script setup lang="ts">
function open() {}
defineExpose({ open })
</script>
```
```vue
<!-- Parent.vue -->
<script setup lang="ts">
import { onMounted, useTemplateRef } from 'vue'
import DialogPanel from './DialogPanel.vue'
// Vue 3.5+ with useTemplateRef
const panelRef = useTemplateRef('panelRef')
// Before Vue 3.5 with manual typing and ref
// const panelRef = ref<InstanceType<typeof DialogPanel> | null>(null)
onMounted(() => {
panelRef.value?.open()
})
</script>
<template>
<DialogPanel ref="panelRef" />
</template>
```
## Emits: Explicit Events Up
Component events do not bubble. If a parent needs to know about an event, re-emit it explicitly.
**BAD:**
```vue
<!-- Parent expects "saved" from grandchild, but it won't bubble -->
<Child @saved="onSaved" />
```
**GOOD:**
```vue
<!-- Child.vue -->
<script setup>
const emit = defineEmits(['saved'])
function onGrandchildSaved(payload) {
emit('saved', payload)
}
</script>
<template>
<Grandchild @saved="onGrandchildSaved" />
</template>
```
**Event naming:** use kebab-case in templates and camelCase in script:
```vue
<script setup>
const emit = defineEmits(['updateUser'])
</script>
<template>
<ProfileForm @update-user="emit('updateUser', $event)" />
</template>
```
## `v-model`: Predictable Two-Way Bindings
Use `defineModel` by default for component bindings and emit updates on input. Only use the `modelValue` + `update:modelValue` pattern if you are on Vue < 3.4.
**BAD:**
```vue
<script setup>
const props = defineProps({ value: String })
</script>
<template>
<input :value="props.value" @input="$emit('input', $event.target.value)" />
</template>
```
**GOOD (Vue 3.4+):**
```vue
<script setup>
const model = defineModel({ type: String })
</script>
<template>
<input v-model="model" />
</template>
```
**GOOD (Vue < 3.4):**
```vue
<script setup>
const props = defineProps({ modelValue: String })
const emit = defineEmits(['update:modelValue'])
</script>
<template>
<input
:value="props.modelValue"
@input="emit('update:modelValue', $event.target.value)"
/>
</template>
```
If you need the updated value immediately after a change, use the input event value or `nextTick` in the parent.
## Provide/Inject: Shared Context Without Prop Drilling
Use provide/inject for cross-tree state, but keep mutations centralized in the provider and expose explicit actions.
**BAD:**
```vue
// Provider.vue
provide('theme', reactive({ dark: false }))
// Consumer.vue
const theme = inject('theme')
// Mutating shared state from any depth becomes hard to track
theme.dark = true
```
**GOOD:**
```vue
// Provider.vue
const theme = reactive({ dark: false })
const toggleTheme = () => { theme.dark = !theme.dark }
provide(themeKey, readonly(theme))
provide(themeActionsKey, { toggleTheme })
// Consumer.vue
const theme = inject(themeKey)
const { toggleTheme } = inject(themeActionsKey)
```
Use symbols for keys to avoid collisions in large apps:
```ts
export const themeKey = Symbol('theme')
export const themeActionsKey = Symbol('theme-actions')
```
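Why symbols help can be shown without Vue at all. In the sketch below, a plain `Map` stands in for the provide/inject registry (an assumption for illustration only): two features that both pick the string `'theme'` overwrite each other, while two symbols with the same description remain distinct keys.

```typescript
// A plain Map stands in for the provide/inject registry (illustrative assumption).
const stringKeyed = new Map<string, unknown>()
stringKeyed.set('theme', { dark: false })
stringKeyed.set('theme', { locale: 'en' }) // silently overwrites the first entry

const symbolKeyed = new Map<symbol, unknown>()
const themeKey = Symbol('theme')
const otherThemeKey = Symbol('theme') // same description, different identity
symbolKeyed.set(themeKey, { dark: false })
symbolKeyed.set(otherThemeKey, { locale: 'en' }) // both entries survive
```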
## Use TypeScript Contracts for Public Component APIs
In TypeScript projects, type component boundaries directly with `defineProps`, `defineEmits`, and `InjectionKey` so invalid payloads and mismatched injections fail at compile time.
**BAD:**
```vue
<script setup lang="ts">
import { inject } from 'vue'
const props = defineProps({
userId: String
})
const emit = defineEmits(['save'])
const settings = inject('settings')
// Payload shape is not checked here
emit('save', 123)
// Key is string-based, so `settings` is untyped and needs an unsafe cast
const untypedSettings = settings as { theme: string }
untypedSettings.theme = 'dark'
</script>
```
**GOOD:**
```vue
<script setup lang="ts">
import { inject, provide } from 'vue'
import type { InjectionKey } from 'vue'
interface Props {
userId: string
}
interface Emits {
save: [payload: { id: string; draft: boolean }]
}
interface Settings {
theme: 'light' | 'dark'
}
const settingsKey: InjectionKey<Settings> = Symbol('settings')
const props = defineProps<Props>()
const emit = defineEmits<Emits>()
provide(settingsKey, { theme: 'light' })
const settings = inject(settingsKey)
if (settings) {
emit('save', { id: props.userId, draft: false })
}
</script>
```

View File

@@ -0,0 +1,174 @@
---
title: Component Fallthrough Attributes Best Practices
impact: MEDIUM
impactDescription: Incorrect $attrs access and reactivity assumptions can cause undefined values and watchers that never run
type: best-practice
tags: [vue3, attrs, fallthrough-attributes, composition-api, reactivity]
---
# Component Fallthrough Attributes Best Practices
**Impact: MEDIUM** - Fallthrough attributes are straightforward once you follow Vue's conventions: hyphenated names use bracket notation, listener keys are camelCase `onX`, and `useAttrs()` is current-but-not-reactive.
## Task List
- Access hyphenated attribute names with bracket notation (for example `attrs['data-testid']`)
- Access event listeners with camelCase `onX` keys (for example `attrs.onClick`)
- Do not `watch()` values returned from `useAttrs()`; those watchers do not trigger on attr changes
- Use `onUpdated()` for attr-driven side effects
- Promote frequently observed attrs to props when reactive observation is required
## Access Attribute and Listener Keys Correctly
Hyphenated attribute names preserve their original casing in JavaScript, so dot notation does not work for keys that include `-`.
**BAD:**
```vue
<script setup>
import { useAttrs } from 'vue'
const attrs = useAttrs()
console.log(attrs.data-testid) // Syntax error
console.log(attrs.dataTestid) // undefined for data-testid
console.log(attrs['on-click']) // undefined
console.log(attrs['@click']) // undefined
</script>
```
**GOOD:**
```vue
<script setup>
import { useAttrs } from 'vue'
const attrs = useAttrs()
console.log(attrs['data-testid'])
console.log(attrs['aria-label'])
console.log(attrs['foo-bar'])
console.log(attrs.onClick)
console.log(attrs.onCustomEvent)
console.log(attrs.onMouseEnter)
</script>
```
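As a rough illustration of the key naming used above, the mapping from what a parent writes to the key found on `attrs` can be sketched as a plain TypeScript function. `camelize` and `toAttrsKey` are hypothetical helper names, not Vue APIs, and the sketch only assumes static attribute names and `@event` listeners on components.

```ts
// Hypothetical helpers that mirror the naming convention; not part of Vue's public API.

// Convert kebab-case to camelCase: "custom-event" -> "customEvent".
function camelize(name: string): string {
  return name.replace(/-(\w)/g, (_match: string, char: string) => char.toUpperCase())
}

// Map a parent-side attribute or listener name to the key expected on `attrs`.
function toAttrsKey(parentUsage: string): string {
  if (parentUsage.startsWith('@')) {
    // Listeners: drop "@", camelize, then prefix with "on" and capitalize.
    const event = camelize(parentUsage.slice(1))
    return 'on' + event.charAt(0).toUpperCase() + event.slice(1)
  }
  // Plain attributes keep their original spelling, hyphens included.
  return parentUsage
}
```

Under these assumptions, `toAttrsKey('@custom-event')` gives `onCustomEvent`, while `toAttrsKey('data-id')` stays `data-id` and therefore needs bracket notation.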
### Naming Reference
| Parent Usage | Access in `attrs` |
|--------------|-------------------|
| `class="foo"` | `attrs.class` |
| `data-id="123"` | `attrs['data-id']` |
| `aria-label="..."` | `attrs['aria-label']` |
| `foo-bar="baz"` | `attrs['foo-bar']` |
| `@click="fn"` | `attrs.onClick` |
| `@custom-event="fn"` | `attrs.onCustomEvent` |
| `@update:modelValue="fn"` | `attrs['onUpdate:modelValue']` |
## `useAttrs()` Is Not Reactive
`useAttrs()` always reflects the latest values, but it is intentionally not reactive for watcher tracking.
**BAD:**
```vue
<script setup>
import { watch, watchEffect, useAttrs } from 'vue'
const attrs = useAttrs()
watch(
() => attrs.someAttr,
(newValue) => {
console.log('Changed:', newValue) // Never runs on attr changes
}
)
watchEffect(() => {
console.log(attrs.class) // Runs on setup, not on attr updates
})
</script>
```
**GOOD:**
```vue
<script setup>
import { onUpdated, useAttrs } from 'vue'
const attrs = useAttrs()
onUpdated(() => {
console.log('Latest attrs:', attrs)
})
</script>
```
**GOOD:**
```vue
<script setup>
import { watch } from 'vue'
const props = defineProps({
someAttr: String
})
watch(
() => props.someAttr,
(newValue) => {
console.log('Changed:', newValue)
}
)
</script>
```
## Common Patterns
### Check for optional attrs safely
```vue
<script setup>
import { computed, useAttrs } from 'vue'
const attrs = useAttrs()
const hasTestId = computed(() => 'data-testid' in attrs)
const ariaLabel = computed(() => attrs['aria-label'] ?? 'Default label')
</script>
```
### Forward listeners after internal logic
```vue
<script setup>
import { useAttrs } from 'vue'
defineOptions({ inheritAttrs: false })
const attrs = useAttrs()
function handleClick(event) {
console.log('Internal handling first')
attrs.onClick?.(event)
}
</script>
<template>
<button @click="handleClick">
<slot />
</button>
</template>
```
## TypeScript Notes
`useAttrs()` is typed as `Record<string, unknown>`, so cast individual keys when needed.
```vue
<script setup lang="ts">
import { useAttrs } from 'vue'
const attrs = useAttrs()
const testId = attrs['data-testid'] as string | undefined
const onClick = attrs.onClick as ((event: MouseEvent) => void) | undefined
</script>
```
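A cast only silences the compiler; it does not check what the parent actually passed. As a hedged alternative, the sketch below narrows values at runtime. `Attrs`, `stringAttr`, and `listenerAttr` are hypothetical names introduced for illustration, and the sketch assumes a single function per listener key (it ignores the case where several handlers are merged).

```ts
// Same shape as the object returned by useAttrs().
type Attrs = Record<string, unknown>

// Return the attribute only if it is actually a string at runtime.
function stringAttr(attrs: Attrs, key: string): string | undefined {
  const value = attrs[key]
  return typeof value === 'string' ? value : undefined
}

// Return the listener only if it is callable.
// The event type E is the caller's assumption; it cannot be verified at runtime.
function listenerAttr<E>(attrs: Attrs, key: string): ((event: E) => void) | undefined {
  const value = attrs[key]
  return typeof value === 'function' ? (value as (event: E) => void) : undefined
}
```

In a component this would be called as `stringAttr(useAttrs(), 'data-testid')`, so a non-string value falls back to `undefined` instead of being mistyped.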

@@ -0,0 +1,137 @@
---
title: KeepAlive Component Best Practices
impact: HIGH
impactDescription: KeepAlive caches component instances; misuse causes stale data, memory growth, or unexpected lifecycle behavior
type: best-practice
tags: [vue3, keepalive, cache, performance, router, dynamic-components]
---
# KeepAlive Component Best Practices
**Impact: HIGH** - `<KeepAlive>` caches component instances instead of destroying them. Use it to preserve state across switches, but manage cache size and freshness explicitly to avoid memory growth or stale UI.
## Task List
- Use KeepAlive only where state preservation improves UX
- Set a reasonable `max` to cap cache size
- Declare component names for include/exclude matching
- Use `onActivated`/`onDeactivated` for cache-aware logic
- Decide how and when cached views refresh their data
- Avoid caching memory-heavy or security-sensitive views
## When to Use KeepAlive
Use KeepAlive when switching between views where state should persist (tabs, multi-step forms, dashboards). Avoid it when each visit should start fresh.
**BAD:**
```vue
<template>
<!-- State resets on every switch -->
<component :is="currentTab" />
</template>
```
**GOOD:**
```vue
<template>
<!-- State preserved between switches -->
<KeepAlive>
<component :is="currentTab" />
</KeepAlive>
</template>
```
## When NOT to Use KeepAlive
- Search or filter pages where users expect fresh results
- Memory-heavy components (maps, large tables, media players)
- Sensitive flows where data must be cleared on exit
- Components with heavy background activity you cannot pause
## Limit and Control the Cache
Always cap cache size with `max` and restrict caching to specific components when possible.
```vue
<template>
<KeepAlive :max="5" include="Dashboard,Settings">
<component :is="currentView" />
</KeepAlive>
</template>
```
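To make the effect of `max` concrete: Vue's documentation describes the cache as evicting the least recently accessed instance once the limit is reached. The sketch below models that policy with a plain `Map`. `LruCache` is a hypothetical class for illustration only and is not Vue's internal implementation.

```ts
// Minimal least-recently-used cache sketch (illustrative, not Vue internals).
class LruCache<V> {
  private readonly max: number
  private readonly entries = new Map<string, V>()

  constructor(max: number) {
    this.max = max
  }

  // Reading an entry marks it as most recently used.
  get(key: string): V | undefined {
    if (!this.entries.has(key)) return undefined
    const value = this.entries.get(key) as V
    this.entries.delete(key)
    this.entries.set(key, value)
    return value
  }

  // Writing beyond `max` evicts the least recently used entry.
  set(key: string, value: V): void {
    this.entries.delete(key)
    this.entries.set(key, value)
    if (this.entries.size > this.max) {
      const oldest = this.entries.keys().next().value
      if (oldest !== undefined) this.entries.delete(oldest)
    }
  }

  // Keys ordered from least to most recently used.
  keys(): string[] {
    return Array.from(this.entries.keys())
  }
}
```

With `max` set to 2, visiting Dashboard, Settings, Dashboard again, and then Profile leaves Dashboard and Profile cached, because Settings was the least recently used entry.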
## Ensure Component Names Match include/exclude
`include` and `exclude` match the component `name` option. Explicitly set names for reliable caching.
```vue
<!-- TabA.vue -->
<script setup>
defineOptions({ name: 'TabA' })
</script>
```
```vue
<template>
<KeepAlive include="TabA,TabB">
<component :is="currentTab" />
</KeepAlive>
</template>
```
## Cache Invalidation Strategies
Vue 3 has no direct API to remove a specific cached instance. Use keys or dynamic include/exclude to force refreshes.
```vue
<script setup>
import { ref, reactive } from 'vue'
const currentView = ref('Dashboard')
const viewKeys = reactive({ Dashboard: 0, Settings: 0 })
function invalidateCache(view) {
viewKeys[view]++
}
</script>
<template>
<KeepAlive>
<component :is="currentView" :key="`${currentView}-${viewKeys[currentView]}`" />
</KeepAlive>
</template>
```
## Lifecycle Hooks for Cached Components
Cached components are not destroyed on switch. Use activation hooks for refresh and cleanup.
```vue
<script setup>
import { onActivated, onDeactivated } from 'vue'
// refreshData and pauseTimers are placeholders for this component's own functions
onActivated(() => {
refreshData()
})
onDeactivated(() => {
pauseTimers()
})
</script>
```
## Router Caching and Freshness
Decide whether navigation should show cached state or a fresh view. A common pattern is to key by route when params change.
```vue
<template>
<router-view v-slot="{ Component, route }">
<KeepAlive>
<component :is="Component" :key="route.fullPath" />
</KeepAlive>
</router-view>
</template>
```
If you want cache reuse but fresh data, refresh in `onActivated` and compare query/params before fetching.
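
A minimal sketch of that comparison, assuming route params and query values are plain strings (real router values can also be arrays). `RouteSnapshot`, `sameRecord`, and `needsRefresh` are hypothetical names, not vue-router APIs.

```ts
// Hypothetical shape for the parts of a route that drive data loading.
interface RouteSnapshot {
  params: Record<string, string>
  query: Record<string, string>
}

// Shallow equality for flat string records.
function sameRecord(a: Record<string, string>, b: Record<string, string>): boolean {
  const aKeys = Object.keys(a)
  const bKeys = Object.keys(b)
  return aKeys.length === bKeys.length && aKeys.every((key) => a[key] === b[key])
}

// True when the cached view was last loaded for different params or query.
function needsRefresh(lastLoaded: RouteSnapshot | null, current: RouteSnapshot): boolean {
  if (lastLoaded === null) return true
  return !sameRecord(lastLoaded.params, current.params) ||
    !sameRecord(lastLoaded.query, current.query)
}
```

Inside `onActivated`, the component would call `needsRefresh` with the snapshot it stored after its last fetch and only refetch when the result is `true`.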

@@ -0,0 +1,216 @@
---
title: Component Slots Best Practices
impact: MEDIUM
impactDescription: Poor slot API design causes empty DOM wrappers, weak TypeScript safety, brittle defaults, and unnecessary component overhead
type: best-practice
tags: [vue3, slots, components, typescript, composables]
---
# Component Slots Best Practices
**Impact: MEDIUM** - Slots are a core component API surface in Vue. Structure them intentionally so templates stay predictable, typed, and performant.
## Task List
- Use shorthand syntax for named slots (`#` instead of `v-slot:`)
- Render optional slot wrapper elements only when slot content exists (`$slots` checks)
- Type scoped slot contracts with `defineSlots` in TypeScript components
- Provide fallback content for optional slots
- Prefer composables over renderless components for pure logic reuse
## Shorthand syntax for named slots
**BAD:**
```vue
<MyComponent>
<template v-slot:header> ... </template>
</MyComponent>
```
**GOOD:**
```vue
<MyComponent>
<template #header> ... </template>
</MyComponent>
```
## Conditionally Render Optional Slot Wrappers
Use `$slots` checks when wrapper elements add spacing, borders, or layout constraints.
**BAD:**
```vue
<!-- Card.vue -->
<template>
<article class="card">
<header class="card-header">
<slot name="header" />
</header>
<section class="card-body">
<slot />
</section>
<footer class="card-footer">
<slot name="footer" />
</footer>
</article>
</template>
```
**GOOD:**
```vue
<!-- Card.vue -->
<template>
<article class="card">
<header v-if="$slots.header" class="card-header">
<slot name="header" />
</header>
<section v-if="$slots.default" class="card-body">
<slot />
</section>
<footer v-if="$slots.footer" class="card-footer">
<slot name="footer" />
</footer>
</article>
</template>
```
## Type Scoped Slot Props with defineSlots
In `<script setup lang="ts">`, use `defineSlots` so slot consumers get autocomplete and static checks.
**BAD:**
```vue
<!-- ProductList.vue -->
<script setup lang="ts">
interface Product {
id: number
name: string
}
defineProps<{ products: Product[] }>()
</script>
<template>
<ul>
<li v-for="(product, index) in products" :key="product.id">
<slot :product="product" :index="index" />
</li>
</ul>
</template>
```
**GOOD:**
```vue
<!-- ProductList.vue -->
<script setup lang="ts">
interface Product {
id: number
name: string
}
defineProps<{ products: Product[] }>()
defineSlots<{
default(props: { product: Product; index: number }): any
empty(): any
}>()
</script>
<template>
<ul v-if="products.length">
<li v-for="(product, index) in products" :key="product.id">
<slot :product="product" :index="index" />
</li>
</ul>
<slot v-else name="empty" />
</template>
```
## Provide Slot Fallback Content
Fallback content makes components resilient when parents omit optional slots.
**BAD:**
```vue
<!-- SubmitButton.vue -->
<template>
<button type="submit" class="btn-primary">
<slot />
</button>
</template>
```
**GOOD:**
```vue
<!-- SubmitButton.vue -->
<template>
<button type="submit" class="btn-primary">
<slot>Submit</slot>
</button>
</template>
```
## Prefer Composables for Pure Logic Reuse
Renderless components are still useful for slot-driven composition, but composables are usually cleaner for logic-only reuse.
**BAD:**
```vue
<!-- MouseTracker.vue -->
<script setup lang="ts">
import { ref, onMounted, onUnmounted } from 'vue'
const x = ref(0)
const y = ref(0)
function onMove(event: MouseEvent) {
x.value = event.pageX
y.value = event.pageY
}
onMounted(() => window.addEventListener('mousemove', onMove))
onUnmounted(() => window.removeEventListener('mousemove', onMove))
</script>
<template>
<slot :x="x" :y="y" />
</template>
```
**GOOD:**
```ts
// composables/useMouse.ts
import { ref, onMounted, onUnmounted } from 'vue'
export function useMouse() {
const x = ref(0)
const y = ref(0)
function onMove(event: MouseEvent) {
x.value = event.pageX
y.value = event.pageY
}
onMounted(() => window.addEventListener('mousemove', onMove))
onUnmounted(() => window.removeEventListener('mousemove', onMove))
return { x, y }
}
```
```vue
<!-- MousePosition.vue -->
<script setup lang="ts">
import { useMouse } from '@/composables/useMouse'
const { x, y } = useMouse()
</script>
<template>
<p>{{ x }}, {{ y }}</p>
</template>
```

@@ -0,0 +1,228 @@
---
title: Suspense Component Best Practices
impact: MEDIUM
impactDescription: Suspense coordinates async dependencies with fallback UI; misconfiguration leads to missing loading states or confusing UX
type: best-practice
tags: [vue3, suspense, async-components, async-setup, loading, fallback, router, transition, keepalive]
---
# Suspense Component Best Practices
**Impact: MEDIUM** - `<Suspense>` coordinates async dependencies (async components or async setup) and renders a fallback while they resolve. Misconfiguration leads to missing loading states, empty renders, or subtle UX bugs.
## Task List
- Wrap default and fallback slot content in a single root node
- Use `timeout` when you need the fallback to appear on reverts
- Force root replacement with `:key` when you need Suspense to re-trigger
- Add `suspensible` to nested Suspense boundaries (Vue 3.3+)
- Use `@pending`, `@resolve`, and `@fallback` for programmatic loading state
- Nest `RouterView` -> `Transition` -> `KeepAlive` -> `Suspense` in that order
- Keep Suspense usage centralized and documented in production
## Single Root in Default and Fallback Slots
Suspense tracks a single immediate child in both slots. Wrap multiple elements in a single element or component.
**BAD:**
```vue
<template>
<Suspense>
<AsyncHeader />
<AsyncList />
<template #fallback>
<LoadingSpinner />
<LoadingHint />
</template>
</Suspense>
</template>
```
**GOOD:**
```vue
<template>
<Suspense>
<div>
<AsyncHeader />
<AsyncList />
</div>
<template #fallback>
<div>
<LoadingSpinner />
<LoadingHint />
</div>
</template>
</Suspense>
</template>
```
## Fallback Timing on Reverts (`timeout`)
When Suspense is already resolved and new async work starts, the previous content remains visible until the timeout elapses. Use `timeout="0"` for immediate fallback or a short delay to avoid flicker.
**BAD:**
```vue
<template>
<Suspense>
<component :is="currentView" :key="viewKey" />
<template #fallback>
Loading...
</template>
</Suspense>
</template>
```
**GOOD:**
```vue
<template>
<Suspense :timeout="200">
<component :is="currentView" :key="viewKey" />
<template #fallback>
Loading...
</template>
</Suspense>
</template>
```
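The timing rule can be summarized as a small decision function. This is a simplified model for reasoning about reverts, not Suspense's actual implementation; `RevertState` and `visibleDuringRevert` are hypothetical names, and the sketch assumes the fallback appears once the elapsed time reaches the timeout.

```ts
type Visible = 'previous' | 'fallback' | 'next'

interface RevertState {
  resolved: boolean   // has the new default content finished loading?
  elapsedMs: number   // time since the revert started
  timeoutMs?: number  // value of the `timeout` prop, if one is set
}

// Decide what the user sees while new async content is loading.
function visibleDuringRevert(state: RevertState): Visible {
  if (state.resolved) return 'next'
  // Without a timeout, the previous content stays until the new content resolves.
  if (state.timeoutMs !== undefined && state.elapsedMs >= state.timeoutMs) {
    return 'fallback'
  }
  return 'previous'
}
```

With `timeoutMs: 0` the fallback shows immediately; with `timeoutMs: 200` fast loads never show the fallback, which avoids flicker.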
## Pending State Only Re-triggers on Root Replacement
Once resolved, Suspense only re-enters pending when the root node of the default slot changes. If async work happens deeper in the tree, no fallback appears.
**BAD:**
```vue
<template>
<Suspense>
<TabContainer>
<AsyncDashboard v-if="tab === 'dashboard'" />
<AsyncSettings v-else />
</TabContainer>
<template #fallback>
Loading...
</template>
</Suspense>
</template>
```
**GOOD:**
```vue
<template>
<Suspense>
<component :is="tabs[tab]" :key="tab" />
<template #fallback>
Loading...
</template>
</Suspense>
</template>
```
## Use `suspensible` for Nested Suspense (Vue 3.3+)
Nested Suspense boundaries need `suspensible` on the inner boundary so the parent can coordinate loading state. Without it, inner async content may render empty nodes until resolved.
**BAD:**
```vue
<template>
<Suspense>
<LayoutShell>
<Suspense>
<AsyncWidget />
<template #fallback>Loading widget...</template>
</Suspense>
</LayoutShell>
<template #fallback>Loading layout...</template>
</Suspense>
</template>
```
**GOOD:**
```vue
<template>
<Suspense>
<LayoutShell>
<Suspense suspensible>
<AsyncWidget />
<template #fallback>Loading widget...</template>
</Suspense>
</LayoutShell>
<template #fallback>Loading layout...</template>
</Suspense>
</template>
```
## Track Loading with Suspense Events
Use `@pending`, `@resolve`, and `@fallback` for analytics, global loading indicators, or coordinating UI outside the Suspense boundary.
```vue
<script setup>
import { ref } from 'vue'
const isLoading = ref(false)
const onPending = () => {
isLoading.value = true
}
const onResolve = () => {
isLoading.value = false
}
</script>
<template>
<LoadingBar v-if="isLoading" />
<Suspense @pending="onPending" @resolve="onResolve">
<AsyncPage />
<template #fallback>
<PageSkeleton />
</template>
</Suspense>
</template>
```
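When several Suspense boundaries feed one global indicator, a single boolean can switch off while another boundary is still pending. One possible approach is to count pending boundaries instead. `createLoadingTracker` is a hypothetical helper; in a real component the counter would likely be a `ref` so the template updates.

```ts
// Counts pending boundaries so the indicator stays on until all have resolved.
function createLoadingTracker() {
  let pending = 0
  return {
    // Wire to @pending on each Suspense boundary.
    onPending(): void {
      pending += 1
    },
    // Wire to @resolve on each Suspense boundary.
    onResolve(): void {
      pending = Math.max(0, pending - 1)
    },
    isLoading(): boolean {
      return pending > 0
    }
  }
}
```

Each boundary would use the same tracker's `onPending` and `onResolve` handlers, and the loading bar would render while `isLoading()` is true.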
## Recommended Nesting with RouterView, Transition, KeepAlive
When combining these components, the nesting order should be `RouterView` -> `Transition` -> `KeepAlive` -> `Suspense` so each wrapper works correctly.
**BAD:**
```vue
<template>
<RouterView v-slot="{ Component }">
<Suspense>
<KeepAlive>
<Transition mode="out-in">
<component :is="Component" />
</Transition>
</KeepAlive>
</Suspense>
</RouterView>
</template>
```
**GOOD:**
```vue
<template>
<RouterView v-slot="{ Component }">
<Transition mode="out-in">
<KeepAlive>
<Suspense>
<component :is="Component" />
<template #fallback>Loading...</template>
</Suspense>
</KeepAlive>
</Transition>
</RouterView>
</template>
```
## Treat Suspense Cautiously in Production
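`<Suspense>` is documented by Vue as an experimental feature whose API may still change. In production code, keep Suspense boundaries minimal, document where they are used, and keep an alternative loading strategy (for example explicit loading flags) available in case you need to replace or refactor them.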

@@ -0,0 +1,108 @@
---
title: Teleport Component Best Practices
impact: MEDIUM
impactDescription: Teleport renders content outside the component's DOM position, which is essential for overlays but affects styling and layout
type: best-practice
tags: [vue3, teleport, modal, overlay, positioning, responsive]
---
# Teleport Component Best Practices
**Impact: MEDIUM** - `<Teleport>` renders part of a component's template in a different place in the DOM while preserving the Vue component hierarchy. Use it for overlays (modals, toasts, tooltips) or any UI that must escape stacking contexts, overflow, or fixed positioning constraints.
## Task List
- Teleport overlays to `body` or a dedicated container outside the app root
- Keep a shared target for similar UI (`#modals`, `#notifications`) and control layering with order or z-index
- Use `:disabled` for responsive layouts that should render inline on small screens
- Remember props, emits, and provide/inject still work through teleport
- Avoid relying on parent stacking contexts or transforms for teleported UI
## Teleport Overlays Out of Transformed Containers
When an ancestor has `transform`, `filter`, or `perspective`, fixed-position overlays can behave like they are locally positioned. Teleport escapes that context.
**BAD:**
```vue
<template>
<div class="animated-container">
<button @click="open = true">Open</button>
<!-- Broken: fixed positioning is scoped to the transformed parent -->
<div v-if="open" class="modal">Modal</div>
</div>
</template>
<style>
.animated-container {
transform: translateZ(0);
}
.modal {
position: fixed;
inset: 0;
z-index: 9999;
}
</style>
```
**GOOD:**
```vue
<template>
<div class="animated-container">
<button @click="open = true">Open</button>
<Teleport to="body">
<div v-if="open" class="modal">Modal</div>
</Teleport>
</div>
</template>
```
## Responsive Layouts with `disabled`
Use `:disabled` to render inline on mobile and teleport on larger screens:
```vue
<script setup>
import { useMediaQuery } from '@vueuse/core'
const isMobile = useMediaQuery('(max-width: 768px)')
</script>
<template>
<Teleport to="body" :disabled="isMobile">
<nav class="sidebar">Navigation</nav>
</Teleport>
</template>
```
## Logical Hierarchy Is Preserved
Teleport changes DOM position, not the Vue component tree. Props, emits, slots, and provide/inject still work:
```vue
<template>
<Teleport to="body">
<ChildPanel :message="message" @close="open = false" />
</Teleport>
</template>
```
## Multiple Teleports to the Same Target
Teleports to the same target append in declaration order:
```vue
<template>
<Teleport to="#notifications">
<div>First</div>
</Teleport>
<Teleport to="#notifications">
<div>Second</div>
</Teleport>
</template>
```
Use a shared container to keep stacking predictable, and apply z-index only when you need explicit layering.

@@ -0,0 +1,128 @@
---
title: TransitionGroup Component Best Practices
impact: MEDIUM
impactDescription: TransitionGroup animates list items; missing keys or misuse leads to broken list transitions
type: best-practice
tags: [vue3, transition-group, animation, lists, keys]
---
# TransitionGroup Component Best Practices
**Impact: MEDIUM** - `<TransitionGroup>` animates lists of items entering, leaving, and moving. Use it for `v-for` lists or dynamic collections where individual items change over time.
## Task List
- Use `<TransitionGroup>` only for lists and repeated items
- Provide unique, stable keys for every direct child
- Use `tag` when you need semantic or layout wrappers
- Avoid the `mode` prop (not supported)
- Use JavaScript hooks for staggered effects
## Use TransitionGroup for Lists
`<TransitionGroup>` is designed for list items. Use `tag` to control the wrapper element when needed.
**BAD:**
```vue
<template>
<TransitionGroup name="fade">
<ComponentA />
<ComponentB />
</TransitionGroup>
</template>
```
**GOOD:**
```vue
<template>
<TransitionGroup name="list" tag="ul">
<li v-for="item in items" :key="item.id">
{{ item.name }}
</li>
</TransitionGroup>
</template>
```
## Always Provide Stable Keys
Keys are required. Without stable keys, Vue cannot track item positions and animations break.
**BAD:**
```vue
<template>
<TransitionGroup name="list" tag="ul">
<li v-for="(item, index) in items" :key="index">
{{ item.name }}
</li>
</TransitionGroup>
</template>
```
**GOOD:**
```vue
<template>
<TransitionGroup name="list" tag="ul">
<li v-for="item in items" :key="item.id">
{{ item.name }}
</li>
</TransitionGroup>
</template>
```
## Do Not Use `mode` on TransitionGroup
`mode` is only for `<Transition>` because it swaps a single element. Use `<Transition>` if you need in/out sequencing.
**BAD:**
```vue
<template>
<TransitionGroup name="list" tag="div" mode="out-in">
<div v-for="item in items" :key="item.id">{{ item.name }}</div>
</TransitionGroup>
</template>
```
**GOOD:**
```vue
<template>
<Transition name="fade" mode="out-in">
<component :is="currentView" :key="currentView" />
</Transition>
</template>
```
## Stagger List Animations with Data Attributes
For cascading list animations, expose each item's index as a `data-index` attribute, read it in the JavaScript hooks, and compute the delay per item.
```vue
<template>
<TransitionGroup
tag="ul"
:css="false"
@before-enter="onBeforeEnter"
@enter="onEnter"
>
<li v-for="(item, index) in items" :key="item.id" :data-index="index">
{{ item.name }}
</li>
</TransitionGroup>
</template>
<script setup>
function onBeforeEnter(el) {
el.style.opacity = 0
el.style.transform = 'translateY(12px)'
}
function onEnter(el, done) {
const delay = Number(el.dataset.index) * 80
setTimeout(() => {
el.style.transition = 'all 0.25s ease'
el.style.opacity = 1
el.style.transform = 'translateY(0)'
setTimeout(done, 250)
}, delay)
}
</script>
```
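The delay calculation in `onEnter` can be pulled into a small pure function. The sketch below adds two things the snippet above does not have, both labeled assumptions: a fallback when `data-index` is missing or not numeric, and a cap so long lists do not wait seconds to appear. `staggerDelay`, `stepMs`, and `maxMs` are hypothetical names.

```ts
// Compute a per-item delay in milliseconds from the item's index.
function staggerDelay(index: number, stepMs = 80, maxMs = 400): number {
  // Number(undefined) is NaN, so treat invalid or negative indexes as 0.
  const safeIndex = Number.isFinite(index) && index > 0 ? Math.floor(index) : 0
  // Cap the delay so items far down the list still appear promptly.
  return Math.min(safeIndex * stepMs, maxMs)
}
```

In the hook this would replace the inline multiplication: `const delay = staggerDelay(Number(el.dataset.index))`.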

@@ -0,0 +1,125 @@
---
title: Transition Component Best Practices
impact: MEDIUM
impactDescription: Transition animates a single element or component; incorrect structure or keys prevent animations
type: best-practice
tags: [vue3, transition, animation, performance, keys]
---
# Transition Component Best Practices
**Impact: MEDIUM** - `<Transition>` animates entering/leaving of a single element or component. It is ideal for toggling UI states, swapping views, or animating one component at a time.
## Task List
- Wrap a single element or component inside `<Transition>`
- Provide a `key` when switching between same element types
- Use `mode="out-in"` when you need sequential swaps
- Prefer `transform` and `opacity` for smooth animations
## Use Transition for a Single Root Element
`<Transition>` only supports one direct child. Wrap multiple nodes in a single element or component.
**BAD:**
```vue
<template>
<Transition name="fade">
<h3>Title</h3>
<p>Description</p>
</Transition>
</template>
```
**GOOD:**
```vue
<template>
<Transition name="fade">
<div>
<h3>Title</h3>
<p>Description</p>
</div>
</Transition>
</template>
```
## Force Transitions Between Same Element Types
Vue reuses the same DOM element when the tag type does not change. Add `key` so Vue treats it as a new element and triggers enter/leave.
**BAD:**
```vue
<template>
<Transition name="fade">
<p v-if="isActive">Active</p>
<p v-else>Inactive</p>
</Transition>
</template>
```
**GOOD:**
```vue
<template>
<Transition name="fade" mode="out-in">
<p v-if="isActive" key="active">Active</p>
<p v-else key="inactive">Inactive</p>
</Transition>
</template>
```
## Use `mode` to Avoid Overlap During Swaps
When swapping components or views, use `mode="out-in"` to prevent both from being visible at the same time.
**BAD:**
```vue
<template>
<Transition name="fade">
<component :is="currentView" />
</Transition>
</template>
```
**GOOD:**
```vue
<template>
<Transition name="fade" mode="out-in">
<component :is="currentView" :key="currentView" />
</Transition>
</template>
```
## Animate `transform` and `opacity` for Performance
Avoid layout-triggering properties such as `height`, `margin`, or `top`. Use `transform` and `opacity` for smooth, GPU-friendly transitions.
**BAD:**
```css
.slide-enter-active,
.slide-leave-active {
transition: height 0.3s ease;
}
.slide-enter-from,
.slide-leave-to {
height: 0;
}
```
**GOOD:**
```css
.slide-enter-active,
.slide-leave-active {
transition: transform 0.3s ease, opacity 0.3s ease;
}
.slide-enter-from {
transform: translateX(-12px);
opacity: 0;
}
.slide-leave-to {
transform: translateX(12px);
opacity: 0;
}
```

@@ -0,0 +1,290 @@
---
title: Composable Organization Patterns
impact: MEDIUM
impactDescription: Well-structured composables improve maintainability, reusability, and update performance
type: best-practice
tags: [vue3, composables, composition-api, code-organization, api-design, readonly, utilities]
---
# Composable Organization Patterns
**Impact: MEDIUM** - Treat composables as reusable, stateful building blocks and keep their code organized by feature concern. This keeps large components maintainable and prevents hard-to-debug mutation and API design issues.
## Task List
- Compose complex behavior from small, focused composables
- Use options objects for composables with multiple optional parameters
- Return readonly state when updates must flow through explicit actions
- Keep pure utility functions as plain utilities, not composables
- Organize composable and component code by feature concern, and extract composables when components grow
## Compose Composables from Smaller Primitives
**BAD:**
```vue
<script setup>
import { ref, computed, onMounted, onUnmounted } from 'vue'
const x = ref(0)
const y = ref(0)
const inside = ref(false)
const el = ref(null)
function onMove(e) {
x.value = e.pageX
y.value = e.pageY
if (!el.value) return
const r = el.value.getBoundingClientRect()
inside.value = x.value >= r.left && x.value <= r.right &&
y.value >= r.top && y.value <= r.bottom
}
onMounted(() => window.addEventListener('mousemove', onMove))
onUnmounted(() => window.removeEventListener('mousemove', onMove))
</script>
```
**GOOD:**
```javascript
// composables/useEventListener.js
import { onMounted, onUnmounted, toValue } from 'vue'
export function useEventListener(target, event, callback) {
onMounted(() => toValue(target).addEventListener(event, callback))
onUnmounted(() => toValue(target).removeEventListener(event, callback))
}
```
```javascript
// composables/useMouse.js
import { ref } from 'vue'
import { useEventListener } from './useEventListener'
export function useMouse() {
const x = ref(0)
const y = ref(0)
useEventListener(window, 'mousemove', (e) => {
x.value = e.pageX
y.value = e.pageY
})
return { x, y }
}
```
```javascript
// composables/useMouseInElement.js
import { computed } from 'vue'
import { useMouse } from './useMouse'
export function useMouseInElement(elementRef) {
const { x, y } = useMouse()
const isOutside = computed(() => {
if (!elementRef.value) return true
const rect = elementRef.value.getBoundingClientRect()
return x.value < rect.left || x.value > rect.right ||
y.value < rect.top || y.value > rect.bottom
})
return { x, y, isOutside }
}
```
## Use Options Object Pattern for Composable Parameters
**BAD:**
```javascript
export function useFetch(url, method, headers, timeout, retries, immediate) {
// hard to read and easy to misorder
}
useFetch('/api/users', 'GET', null, 5000, 3, true)
```
**GOOD:**
```javascript
export function useFetch(url, options = {}) {
const {
method = 'GET',
headers = {},
timeout = 30000,
retries = 0,
immediate = true
} = options
// implementation
return { method, headers, timeout, retries, immediate }
}
useFetch('/api/users', {
method: 'POST',
timeout: 5000,
retries: 3
})
```
```typescript
interface UseCounterOptions {
initial?: number
min?: number
max?: number
step?: number
}
export function useCounter(options: UseCounterOptions = {}) {
const { initial = 0, min = -Infinity, max = Infinity, step = 1 } = options
// implementation
}
```
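One way the `// implementation` placeholder might be filled in is sketched below. To stay runnable without Vue, it uses a plain closure variable where a real composable would use `ref`; `createCounter` and `clamp` are hypothetical names, and clamping to `min`/`max` is an assumed behavior.

```ts
interface UseCounterOptions {
  initial?: number
  min?: number
  max?: number
  step?: number
}

// Plain-closure stand-in for a useCounter composable.
function createCounter(options: UseCounterOptions = {}) {
  const { initial = 0, min = -Infinity, max = Infinity, step = 1 } = options
  // Keep every value inside [min, max].
  const clamp = (value: number): number => Math.min(max, Math.max(min, value))
  let current = clamp(initial)
  return {
    get count(): number {
      return current
    },
    increment(): void {
      current = clamp(current + step)
    },
    decrement(): void {
      current = clamp(current - step)
    },
    reset(): void {
      current = clamp(initial)
    }
  }
}
```

Callers pass only the options they care about, for example `createCounter({ max: 10 })`, and every other option falls back to its default.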
## Return Readonly State with Explicit Actions
**BAD:**
```javascript
export function useCart() {
const items = ref([])
const total = computed(() => items.value.reduce((sum, item) => sum + item.price, 0))
return { items, total } // any consumer can mutate directly
}
const { items } = useCart()
items.value.push({ id: 1, price: 10 })
```
**GOOD:**
```javascript
import { ref, computed, readonly } from 'vue'
export function useCart() {
const _items = ref([])
const total = computed(() =>
_items.value.reduce((sum, item) => sum + item.price * item.quantity, 0)
)
function addItem(product, quantity = 1) {
const existing = _items.value.find(item => item.id === product.id)
if (existing) {
existing.quantity += quantity
return
}
_items.value.push({ ...product, quantity })
}
function removeItem(productId) {
_items.value = _items.value.filter(item => item.id !== productId)
}
return {
items: readonly(_items),
total,
addItem,
removeItem
}
}
```
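In TypeScript projects the same contract can also be expressed in the types, so direct mutation fails at compile time. The sketch below is a plain-closure model without Vue reactivity; `CartItem` and `createCart` are hypothetical names, and the read-only guarantee here is type-level only.

```ts
interface CartItem {
  id: number
  price: number
  quantity: number
}

function createCart() {
  let items: CartItem[] = []
  return {
    // Read-only view: push() or property assignment on the result does not type-check.
    getItems(): ReadonlyArray<Readonly<CartItem>> {
      return items
    },
    getTotal(): number {
      return items.reduce((sum, item) => sum + item.price * item.quantity, 0)
    },
    // All writes go through explicit actions.
    addItem(product: { id: number; price: number }, quantity = 1): void {
      const existing = items.find((item) => item.id === product.id)
      if (existing) {
        existing.quantity += quantity
        return
      }
      items = [...items, { id: product.id, price: product.price, quantity }]
    },
    removeItem(productId: number): void {
      items = items.filter((item) => item.id !== productId)
    }
  }
}
```

A consumer can read `getItems()` freely, but a line such as `cart.getItems().push(...)` is rejected by the compiler, which pushes all changes through `addItem` and `removeItem`.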
## Keep Utilities as Utilities
**BAD:**
```javascript
export function useFormatters() {
const formatDate = (date) => new Intl.DateTimeFormat('en-US').format(date)
const formatCurrency = (amount) =>
new Intl.NumberFormat('en-US', { style: 'currency', currency: 'USD' }).format(amount)
return { formatDate, formatCurrency }
}
const { formatDate } = useFormatters()
```
**GOOD:**
```javascript
// utils/formatters.js
export function formatDate(date) {
return new Intl.DateTimeFormat('en-US').format(date)
}
export function formatCurrency(amount) {
return new Intl.NumberFormat('en-US', {
style: 'currency',
currency: 'USD'
}).format(amount)
}
```
```javascript
// composables/useInvoiceSummary.js
import { computed } from 'vue'
import { formatCurrency } from '@/utils/formatters'
export function useInvoiceSummary(invoiceRef) {
const totalLabel = computed(() => formatCurrency(invoiceRef.value.total))
return { totalLabel }
}
```
## Organize Composable and Component Code by Feature Concern
**BAD:**
```vue
<script setup>
import { ref, computed, watch, onMounted } from 'vue'
const searchQuery = ref('')
const items = ref([])
const selected = ref(null)
const showModal = ref(false)
const sortBy = ref('name')
const filter = ref('all')
const loading = ref(false)
const filtered = computed(() => items.value.filter(i => i.category === filter.value))
function openModal() { showModal.value = true }
const sorted = computed(() => [...filtered.value].sort(/* ... */))
watch(searchQuery, () => { /* ... */ })
onMounted(() => { /* ... */ })
</script>
```
**GOOD:**
```vue
<script setup>
import { useItems } from '@/composables/useItems'
import { useSearch } from '@/composables/useSearch'
import { useSelectionModal } from '@/composables/useSelectionModal'
// Data
const { items, loading, fetchItems } = useItems()
// Search/filter/sort
const { query, visibleItems } = useSearch(items)
// Selection + modal
const { selectedItem, isModalOpen, selectItem, closeModal } = useSelectionModal()
</script>
```
```javascript
// composables/useItems.js
import { ref, onMounted } from 'vue'
// `api` is a placeholder for the project's own HTTP client module
export function useItems() {
const items = ref([])
const loading = ref(false)
async function fetchItems() {
loading.value = true
try {
items.value = await api.getItems()
} finally {
loading.value = false
}
}
onMounted(fetchItems)
return { items, loading, fetchItems }
}
```

@@ -0,0 +1,162 @@
---
title: Directive Best Practices
impact: MEDIUM
impactDescription: Custom directives are powerful but easy to misuse; following patterns prevents leaks, invalid usage, and unclear abstractions
type: best-practice
tags: [vue3, directives, custom-directives, composition, typescript]
---
# Directive Best Practices
**Impact: MEDIUM** - Directives are for low-level DOM access. Use them sparingly, keep them side-effect safe, and prefer components or composables when you need stateful or reusable UI behavior.
## Task List
- Use directives only when you need direct DOM access
- Do not mutate directive arguments or binding objects
- Clean up timers, listeners, and observers in `unmounted`
- Register local directives in `<script setup>` as camelCase variables with a `v` prefix (`vFocus` is used as `v-focus`)
- In TypeScript projects, type directive values and augment template directive types
- Prefer components or composables for complex behavior
## Treat Directive Arguments as Read-Only
Directive bindings are not reactive storage. Don't write to them.
```ts
const vFocus = {
mounted(el, binding) {
// binding.value is read-only
el.focus()
}
}
```
## Avoid Directives on Components
Directives apply to DOM elements. When used on components, they attach to the root element and can break if the root changes.
**BAD:**
```vue
<MyInput v-focus />
```
**GOOD:**
```vue
<!-- MyInput.vue -->
<script setup>
const vFocus = (el) => el.focus()
</script>
<template>
<input v-focus />
</template>
```
## Clean Up Side Effects in `unmounted`
Any timers, listeners, or observers must be removed to avoid leaks.
```ts
const vResize = {
  mounted(el) {
    const observer = new ResizeObserver(() => {})
    observer.observe(el)
    el._observer = observer
  },
  unmounted(el) {
    el._observer?.disconnect()
  }
}
```
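
Storing the observer on the element (`el._observer`) works, but it adds an ad-hoc property that TypeScript will reject on `HTMLElement`. One alternative, sketched here under the assumption that one cleanup callback per element is enough, keeps the cleanup in a `WeakMap`. The `ListenerTarget` interface and `createClickDirective` name are hypothetical stand-ins so the snippet runs without a DOM; in a real directive `el` would be an `HTMLElement`.

```ts
// Minimal structural type so the sketch runs without a DOM (assumption for illustration).
interface ListenerTarget {
  addEventListener(type: string, handler: () => void): void
  removeEventListener(type: string, handler: () => void): void
}

// Cleanup callbacks keyed by element; entries disappear when the element is garbage collected.
const cleanups = new WeakMap<ListenerTarget, () => void>()

// Hypothetical directive: calls `onClick` on each click and detaches on unmount.
function createClickDirective(onClick: () => void) {
  return {
    mounted(el: ListenerTarget) {
      el.addEventListener('click', onClick)
      cleanups.set(el, () => el.removeEventListener('click', onClick))
    },
    unmounted(el: ListenerTarget) {
      // Run and forget the stored cleanup, if any.
      cleanups.get(el)?.()
      cleanups.delete(el)
    }
  }
}
```

The same shape should carry over to timers and observers: store a function that undoes the side effect in `mounted`, then call it in `unmounted`.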
## Prefer Function Shorthand for Single-Hook Directives
If a directive needs the same behavior on `mounted` and `updated` and no other hooks, use the function form; the function runs for both hooks.
```ts
const vAutofocus = (el) => el.focus()
```
## Use the `v-` Prefix and Script Setup Registration
```vue
<script setup>
const vFocus = (el) => el.focus()
</script>
<template>
<input v-focus />
</template>
```
## Type Custom Directives in TypeScript Projects
Use `Directive<Element, ValueType>` so `binding.value` is typed, and augment Vue's template types so directives are recognized in SFC templates.
**BAD:**
```ts
// Untyped directive value and no template type augmentation
export const vHighlight = {
mounted(el, binding) {
el.style.backgroundColor = binding.value
}
}
```
**GOOD:**
```ts
import type { Directive } from 'vue'
type HighlightValue = string
export const vHighlight = {
mounted(el, binding) {
el.style.backgroundColor = binding.value
}
} satisfies Directive<HTMLElement, HighlightValue>
declare module 'vue' {
interface ComponentCustomProperties {
vHighlight: typeof vHighlight
}
}
```
## Handle SSR with `getSSRProps`
Directive hooks such as `mounted` and `updated` do not run during SSR. If a directive sets attributes/classes that affect rendered HTML, provide an SSR equivalent via `getSSRProps` to avoid hydration mismatches.
**BAD:**
```ts
const vTooltip = {
mounted(el, binding) {
el.setAttribute('data-tooltip', binding.value)
el.classList.add('has-tooltip')
}
}
```
**GOOD:**
```ts
const vTooltip = {
mounted(el, binding) {
el.setAttribute('data-tooltip', binding.value)
el.classList.add('has-tooltip')
},
getSSRProps(binding) {
return {
'data-tooltip': binding.value,
class: 'has-tooltip'
}
}
}
```
## Prefer Declarative Templates When Possible
If a standard attribute or binding works, use it instead of a directive.
## Decide Between Directives and Components
Use a directive for DOM-level behavior. Use a component when behavior affects structure, state, or rendering.

View File

@@ -0,0 +1,159 @@
---
title: Avoid Excessive Component Abstraction in Large Lists
impact: MEDIUM
impactDescription: Each component instance has memory and render overhead - abstractions multiply this in lists
type: efficiency
tags: [vue3, performance, components, abstraction, lists, optimization]
---
# Avoid Excessive Component Abstraction in Large Lists
**Impact: MEDIUM** - Component instances are more expensive than plain DOM nodes. While abstractions improve code organization, unnecessary nesting creates overhead. In large lists, this overhead multiplies - 100 items with 3 levels of abstraction means 300+ component instances instead of 100.
Don't avoid abstraction entirely, but be mindful of component depth in frequently-rendered elements like list items.
## Task List
- Review list item components for unnecessary wrapper components
- Consider flattening component hierarchies in hot paths
- Use native elements when a component adds no value
- Profile component counts using Vue DevTools
- Focus optimization efforts on the most-rendered components
**BAD:**
```vue
<!-- BAD: Deep abstraction in list items -->
<template>
  <div class="user-list">
    <!-- For 100 users: creates about 600 component instances -->
    <UserCard v-for="user in users" :key="user.id" :user="user" />
  </div>
</template>
<!-- UserCard.vue -->
<template>
  <Card> <!-- Wrapper component #1 -->
    <CardHeader> <!-- Wrapper component #2 -->
      <UserAvatar :src="user.avatar" /> <!-- Wrapper component #3 -->
    </CardHeader>
    <CardBody> <!-- Wrapper component #4 -->
      <Text>{{ user.name }}</Text> <!-- Wrapper component #5 -->
    </CardBody>
  </Card>
</template>
<!-- Each UserCard renders Card + CardHeader + UserAvatar + CardBody + Text, plus UserCard itself:
     100 users = 600 component instances -->
```
**GOOD:**
```vue
<!-- GOOD: Flattened structure in list items -->
<template>
<div class="user-list">
<!-- For 100 users: Creates 100 component instances -->
<UserCard v-for="user in users" :key="user.id" :user="user" />
</div>
</template>
<!-- UserCard.vue - Flattened, uses native elements -->
<template>
<div class="card">
<div class="card-header">
<img :src="user.avatar" :alt="user.name" class="avatar" />
</div>
<div class="card-body">
<span class="user-name">{{ user.name }}</span>
</div>
</div>
</template>
<script setup>
defineProps({
user: Object
})
</script>
<style scoped>
/* Styles that would have been in Card, CardHeader, etc. */
.card { /* ... */ }
.card-header { /* ... */ }
.card-body { /* ... */ }
.avatar { /* ... */ }
</style>
```
## When Abstraction Is Still Worth It
```vue
<!-- Component abstraction is valuable when: -->
<!-- 1. Complex behavior is encapsulated -->
<UserStatusIndicator :user="user" /> <!-- Has logic, tooltips, etc. -->
<!-- 2. Reused outside of the hot path -->
<Card> <!-- OK to use in one-off places, not in 100-item lists -->
<!-- 3. The list itself is small -->
<template v-if="items.length < 20">
<FancyItem v-for="item in items" :key="item.id" />
</template>
<!-- 4. Virtualization is used (only ~20 items rendered at once) -->
<RecycleScroller :items="items">
<template #default="{ item }">
<ComplexItem :item="item" /> <!-- OK - only 20 instances exist -->
</template>
</RecycleScroller>
```
## Measuring Component Overhead
```javascript
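// In development, estimate how many component instances this component renders.
// This walks internal vnode fields (subTree, component, children), so treat the
// result as a rough check and confirm it in the Vue DevTools Components tab.
import { onMounted, getCurrentInstance } from 'vue'
const instance = getCurrentInstance()
onMounted(() => {
  let count = 0
  function countComponents(vnode) {
    if (!vnode || typeof vnode !== 'object') return
    if (vnode.component) {
      count++
      // Descend into the child component's own rendered tree
      countComponents(vnode.component.subTree)
    }
    if (Array.isArray(vnode.children)) {
      vnode.children.forEach(countComponents)
    }
  }
  countComponents(instance?.subTree)
  console.log(`Component instances below this one: ${count}`)
})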
```
## Alternatives to Wrapper Components
```vue
<!-- Instead of a <Button> component for styling: -->
<button class="btn btn-primary">Click</button>
<!-- Instead of a <Text> component: -->
<span class="text-body">{{ content }}</span>
<!-- Instead of layout wrapper components in lists: -->
<div class="flex items-center gap-2">
<!-- content -->
</div>
<!-- Use CSS classes or Tailwind instead of component abstractions for styling -->
```
## Impact Calculation
| List Size | Components per Item | Total Instances | Memory Impact |
|-----------|---------------------|-----------------|---------------|
| 100 items | 1 (flat) | 100 | Baseline |
| 100 items | 3 (nested) | 300 | ~3x memory |
| 100 items | 5 (deeply nested) | 500 | ~5x memory |
| 1000 items | 1 (flat) | 1000 | High |
| 1000 items | 5 (deeply nested) | 5000 | Very High |
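
The instance counts in the table are plain multiplication. As a rough sketch (the function names are hypothetical, and the model assumes every item renders the same number of components):

```ts
// Rough estimate: every list item renders `componentsPerItem` component instances.
function estimateInstances(listSize: number, componentsPerItem: number): number {
  return listSize * componentsPerItem
}

// How many instances flattening would remove for a given list.
function instancesSaved(listSize: number, nestedPerItem: number, flatPerItem = 1): number {
  return estimateInstances(listSize, nestedPerItem) - estimateInstances(listSize, flatPerItem)
}

console.log(estimateInstances(100, 5)) // 500
console.log(instancesSaved(1000, 5))   // 4000
```

Actual memory cost per instance depends on the component, so measure before and after rather than relying on the multiplier alone.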

View File

@@ -0,0 +1,182 @@
---
title: Use v-once and v-memo to Skip Unnecessary Updates
impact: MEDIUM
impactDescription: v-once skips all future updates for static content; v-memo conditionally memoizes subtrees
type: efficiency
tags: [vue3, performance, v-once, v-memo, optimization, directives]
---
# Use v-once and v-memo to Skip Unnecessary Updates
**Impact: MEDIUM** - Vue re-runs a component's render function whenever reactive state it depends on changes. For content that never changes or changes infrequently, `v-once` and `v-memo` tell Vue to skip updating those subtrees, reducing render work.
Use `v-once` for truly static content and `v-memo` for conditionally-static content in lists.
## Task List
- Apply `v-once` to elements that use runtime data but never need updating
- Apply `v-memo` to list items that should only update on specific condition changes
- Verify memoized content doesn't need to respond to other state changes
- Profile with Vue DevTools to confirm update skipping
## v-once: Render Once, Never Update
**BAD:**
```vue
<template>
<!-- BAD: Re-evaluated on every parent re-render -->
<div class="terms-content">
<h1>Terms of Service</h1>
<p>Version: {{ termsVersion }}</p>
<div v-html="termsContent"></div>
</div>
<!-- This content NEVER changes, but Vue checks it every render -->
<footer>
<p>Copyright {{ copyrightYear }} {{ companyName }}</p>
</footer>
</template>
```
**GOOD:**
```vue
<template>
<!-- GOOD: Rendered once, skipped on all future updates -->
<div class="terms-content" v-once>
<h1>Terms of Service</h1>
<p>Version: {{ termsVersion }}</p>
<div v-html="termsContent"></div>
</div>
<!-- v-once tells Vue this never needs to update -->
<footer v-once>
<p>Copyright {{ copyrightYear }} {{ companyName }}</p>
</footer>
</template>
<script setup>
// These values are set once at component creation
const termsVersion = '2.1'
const termsContent = fetchedTermsHTML
const copyrightYear = 2024
const companyName = 'Acme Corp'
</script>
```
## v-memo: Conditional Memoization for Lists
**BAD:**
```vue
<template>
<!-- BAD: All items re-render when selectedId changes -->
<div v-for="item in list" :key="item.id">
<div :class="{ selected: item.id === selectedId }">
<ExpensiveComponent :data="item" />
</div>
</div>
</template>
```
**GOOD:**
```vue
<template>
<!-- GOOD: Items only re-render when their selection state changes -->
<div
v-for="item in list"
:key="item.id"
v-memo="[item.id === selectedId]"
>
<div :class="{ selected: item.id === selectedId }">
<ExpensiveComponent :data="item" />
</div>
</div>
</template>
<script setup>
import { ref } from 'vue'
const list = ref([/* many items */])
const selectedId = ref(null)
// When selectedId changes:
// - Only the previously-selected item re-renders (selected: true -> false)
// - Only the newly-selected item re-renders (selected: false -> true)
// - All other items are SKIPPED (v-memo values unchanged)
</script>
```
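
To see why only two rows update, it can help to model the dependency check by hand. The sketch below is a simplified model, not Vue's implementation: it assumes a memoized subtree is reused when its dependency array has the same length and every entry is unchanged. The function names are hypothetical.

```ts
// Simplified model of v-memo: a subtree is reused when every dependency is unchanged.
function memoUnchanged(prev: readonly unknown[], next: readonly unknown[]): boolean {
  return prev.length === next.length && prev.every((value, i) => Object.is(value, next[i]))
}

// Count how many rows would re-render when the selection moves from one id to another,
// given that each row memoizes on `[id === selectedId]`.
function countRerenders(
  ids: readonly number[],
  prevSelected: number | null,
  nextSelected: number | null
): number {
  return ids.filter(id => !memoUnchanged([id === prevSelected], [id === nextSelected])).length
}

const rowIds = Array.from({ length: 1000 }, (_, i) => i)
console.log(countRerenders(rowIds, 3, 7))    // 2
console.log(countRerenders(rowIds, null, 7)) // 1
```

For every row other than the old and new selection, the dependency array is `[false]` before and after, so the row is skipped.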
## v-memo with Multiple Dependencies
```vue
<template>
<!-- Re-render only when item's selection OR editing state changes -->
<div
v-for="item in items"
:key="item.id"
v-memo="[item.id === selectedId, item.id === editingId]"
>
<ItemCard
:item="item"
:selected="item.id === selectedId"
:editing="item.id === editingId"
/>
</div>
</template>
<script setup>
import { ref } from 'vue'
const selectedId = ref(null)
const editingId = ref(null)
const items = ref([/* ... */])
</script>
```
## v-memo with Empty Array = v-once
```vue
<template>
<!-- v-memo="[]" is equivalent to v-once -->
<div v-for="item in staticList" :key="item.id" v-memo="[]">
{{ item.name }}
</div>
</template>
```
## When NOT to Use These Directives
```vue
<template>
<!-- DON'T: Content that DOES need to update -->
<div v-once>
<span>Count: {{ count }}</span> <!-- count won't update! -->
</div>
<!-- DON'T: When child components have their own reactive state -->
<div v-memo="[selected]">
<InputField v-model="item.name" /> <!-- v-model won't work properly -->
</div>
<!-- DON'T: When the memoization benefit is minimal -->
<span v-once>{{ simpleText }}</span> <!-- Overhead not worth it -->
</template>
```
## Performance Comparison
| Scenario | Without Directive | With v-once/v-memo |
|----------|-------------------|-------------------|
| Static header, parent re-renders 100x | Re-evaluated 100x | Evaluated 1x |
| 1000 items, selection changes | 1000 items re-render | 2 items re-render |
| Complex child component | Full re-render | Skipped if memoized |
## Debugging Memoized Components
```vue
<script setup>
import { onUpdated } from 'vue'
// This won't fire if v-memo prevents update
onUpdated(() => {
console.log('Component updated')
})
</script>
```

View File

@@ -0,0 +1,187 @@
---
title: Virtualize Large Lists to Avoid DOM Overload
impact: HIGH
impactDescription: Rendering thousands of list items creates excessive DOM nodes, causing slow renders and high memory usage
type: efficiency
tags: [vue3, performance, virtual-list, large-data, dom, optimization]
---
# Virtualize Large Lists to Avoid DOM Overload
**Impact: HIGH** - Rendering all items in a large list (hundreds or thousands) creates massive amounts of DOM nodes. Each node consumes memory, slows down initial render, and makes updates expensive. List virtualization only renders visible items, dramatically improving performance.
Use a virtualization library when dealing with lists that could exceed 50-100 items, especially if items have complex content.
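
The core calculation behind virtualization is small. As a minimal sketch, assuming fixed-height rows (the `visibleRange` function and its parameter names are illustrative and not taken from any library), the rendered slice can be derived from the scroll offset:

```ts
interface VisibleRange {
  start: number // index of the first rendered item
  end: number   // index one past the last rendered item
}

// Fixed-height rows: work out which slice of the list intersects the viewport,
// padded by `overscan` rows on each side.
function visibleRange(
  scrollTop: number,
  viewportHeight: number,
  itemSize: number,
  itemCount: number,
  overscan = 0
): VisibleRange {
  const first = Math.floor(scrollTop / itemSize)
  const last = Math.ceil((scrollTop + viewportHeight) / itemSize)
  return {
    start: Math.max(0, first - overscan),
    end: Math.min(itemCount, last + overscan)
  }
}

console.log(visibleRange(0, 600, 80, 10000))       // { start: 0, end: 8 }
console.log(visibleRange(8000, 600, 80, 10000, 5)) // { start: 95, end: 113 }
```

The number of rendered rows depends only on the viewport height, row height, and overscan, not on the list length. Libraries add recycling, variable heights, and scroll handling on top of this idea.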
## Task List
- Identify lists that render more than 50-100 items
- Install a virtualization library (vue-virtual-scroller, @tanstack/vue-virtual)
- Replace standard `v-for` with virtualized component
- Ensure list items have consistent or estimable heights
- Test with realistic data volumes during development
## Recommended Libraries
| Library | Best For | Notes |
|---------|----------|-------|
| `vue-virtual-scroller` | General use, easy setup | Most popular, good defaults |
| `@tanstack/vue-virtual` | Complex layouts, headless | Framework-agnostic, flexible |
| `vue-virtual-scroll-grid` | Grid layouts | 2D virtualization |
| `vueuc/VVirtualList` | Naive UI projects | Part of Naive UI ecosystem |
**BAD:**
```vue
<template>
<!-- BAD: Renders ALL 10,000 items immediately -->
<div class="user-list">
<UserCard
v-for="user in users"
:key="user.id"
:user="user"
/>
</div>
</template>
<script setup>
import { ref, onMounted } from 'vue'
import UserCard from './UserCard.vue'
const users = ref([])
onMounted(async () => {
// 10,000 DOM nodes created, browser struggles
users.value = await fetchAllUsers()
})
</script>
```
**GOOD:**
```vue
<template>
<!-- GOOD: Only renders ~20 visible items at a time -->
<RecycleScroller
class="user-list"
:items="users"
:item-size="80"
key-field="id"
v-slot="{ item }"
>
<UserCard :user="item" />
</RecycleScroller>
</template>
<script setup>
import { ref, onMounted } from 'vue'
import { RecycleScroller } from 'vue-virtual-scroller'
import 'vue-virtual-scroller/dist/vue-virtual-scroller.css'
import UserCard from './UserCard.vue'
const users = ref([])
onMounted(async () => {
// 10,000 items in memory, but only ~20 DOM nodes
users.value = await fetchAllUsers()
})
</script>
<style scoped>
.user-list {
height: 600px; /* Container must have fixed height */
}
</style>
```
## Using @tanstack/vue-virtual
```vue
<template>
  <div ref="parentRef" class="list-container">
    <div
      :style="{
        height: `${rowVirtualizer.getTotalSize()}px`,
        position: 'relative'
      }"
    >
      <div
        v-for="virtualRow in rowVirtualizer.getVirtualItems()"
        :key="virtualRow.key"
        :style="{
          position: 'absolute',
          top: 0,
          left: 0,
          width: '100%',
          height: `${virtualRow.size}px`,
          transform: `translateY(${virtualRow.start}px)`
        }"
      >
        <UserCard :user="users[virtualRow.index]" />
      </div>
    </div>
  </div>
</template>
<script setup>
import { ref } from 'vue'
import { useVirtualizer } from '@tanstack/vue-virtual'
import UserCard from './UserCard.vue'
const users = ref([/* 10,000 users */])
const parentRef = ref(null)
const rowVirtualizer = useVirtualizer({
  // Read once at setup; this value does not track later changes to users
  count: users.value.length,
  getScrollElement: () => parentRef.value,
  estimateSize: () => 80, // Estimated row height
  overscan: 5 // Render 5 extra items above/below viewport
})
</script>
<style scoped>
.list-container {
height: 600px;
overflow: auto;
}
</style>
```
## Dynamic Heights with vue-virtual-scroller
```vue
<template>
<!-- For variable height items, use DynamicScroller -->
<DynamicScroller
:items="messages"
:min-item-size="54"
key-field="id"
>
<template #default="{ item, index, active }">
<DynamicScrollerItem
:item="item"
:active="active"
:data-index="index"
>
<ChatMessage :message="item" />
</DynamicScrollerItem>
</template>
</DynamicScroller>
</template>
<script setup>
import { DynamicScroller, DynamicScrollerItem } from 'vue-virtual-scroller'
</script>
```
## Performance Comparison
| Metric | 100 Items | 1,000 Items | 10,000 Items |
|--------|-----------|-------------|--------------|
| DOM nodes, regular `v-for` | ~100 | ~1,000 | ~10,000 |
| DOM nodes, virtualized | ~20 | ~20 | ~20 |
| Initial render, regular `v-for` | Fast | Slow | Very slow / crashes |
| Initial render, virtualized | Fast | Fast | Fast |
## When NOT to Virtualize
- Lists under 50 items with simple content
- Lists where all items must be accessible to screen readers simultaneously
- Print layouts where all content must render
- SEO-critical content that must be in initial HTML

View File

@@ -0,0 +1,166 @@
---
title: Vue Plugin Best Practices
impact: MEDIUM
impactDescription: Incorrect plugin structure or injection key strategy causes install failures, collisions, and unsafe APIs
type: best-practice
tags: [vue3, plugins, provide-inject, typescript, dependency-injection]
---
# Vue Plugin Best Practices
**Impact: MEDIUM** - Vue plugins should follow the `app.use()` contract, expose explicit capabilities, and use collision-safe injection keys. This keeps plugin setup predictable and composable across large apps.
## Task List
- Export plugins as an object with `install()` or as an install function
- Use the `app` instance in `install()` to register components/directives/provides
- Type plugin APIs with `Plugin` (and options tuple types when needed)
- Use symbol keys (prefer `InjectionKey<T>`) for `provide/inject` in plugins
- Add a small typed composable wrapper for required injections to fail fast
## Structure Plugins for `app.use()`
A Vue plugin must be either:
- An object with `install(app, options?)`
- A function with the same signature
**BAD:**
```ts
const notAPlugin = {
doSomething() {}
}
app.use(notAPlugin)
```
**GOOD:**
```ts
import type { App } from 'vue'
interface PluginOptions {
prefix?: string
debug?: boolean
}
const myPlugin = {
install(app: App, options: PluginOptions = {}) {
const { prefix = 'my', debug = false } = options
if (debug) {
console.log('Installing myPlugin with prefix:', prefix)
}
app.provide('myPlugin', { prefix })
}
}
app.use(myPlugin, { prefix: 'custom', debug: true })
```
**GOOD:**
```ts
import type { App } from 'vue'
function simplePlugin(app: App, options?: { message: string }) {
app.config.globalProperties.$greet = () => options?.message ?? 'Hello!'
}
app.use(simplePlugin, { message: 'Welcome!' })
```
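
The two accepted shapes can be illustrated with a simplified model of what `app.use()` does with a plugin. This is a sketch under assumptions, not Vue's source: `AppLike`, `PluginLike`, and `use` are hypothetical stand-ins, and the sketch throws on an invalid plugin to make the failure obvious, which may differ from how Vue reports it.

```ts
// Hypothetical stand-ins for Vue's types, kept minimal so the sketch is self-contained.
interface AppLike {
  installed: Set<PluginLike>
  provided: Map<string | symbol, unknown>
}
type InstallFn = (app: AppLike, ...options: unknown[]) => void
type PluginLike = InstallFn | { install: InstallFn }

// Simplified model of app.use(): skip duplicates, then call install() or the function itself.
function use(app: AppLike, plugin: PluginLike, ...options: unknown[]): AppLike {
  if (app.installed.has(plugin)) return app
  if ('install' in plugin && typeof plugin.install === 'function') {
    plugin.install(app, ...options)
  } else if (typeof plugin === 'function') {
    plugin(app, ...options)
  } else {
    throw new Error('A plugin must be a function or an object with install()')
  }
  app.installed.add(plugin)
  return app
}
```

An object such as `notAPlugin` above has neither shape, so nothing sensible can be called on it.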
## Register Capabilities Explicitly in `install()`
Inside `install()`, wire behavior through Vue application APIs:
- `app.component()` for global components
- `app.directive()` for global directives
- `app.provide()` for injectable services and config
- `app.config.globalProperties` for optional global helpers (sparingly)
**BAD:**
```ts
const uselessPlugin = {
install(app, options) {
const service = createService(options)
}
}
```
**GOOD:**
```ts
const usefulPlugin = {
install(app, options) {
const service = createService(options)
app.provide(serviceKey, service)
}
}
```
## Type Plugin Contracts
Use Vue's `Plugin` type to keep install signatures and options type-safe.
```ts
import type { App, Plugin } from 'vue'
interface MyOptions {
apiKey: string
}
const myPlugin: Plugin<[MyOptions]> = {
install(app: App, options: MyOptions) {
app.provide(apiKeyKey, options.apiKey)
}
}
```
## Use Symbol Injection Keys in Plugins
String keys can collide (`'http'`, `'config'`, `'i18n'`). Use symbol keys with `InjectionKey<T>` so injections are unique and typed.
**BAD:**
```ts
export default {
install(app) {
app.provide('http', axios)
app.provide('config', appConfig)
}
}
```
**GOOD:**
```ts
import type { App, InjectionKey } from 'vue'
import axios, { type AxiosInstance } from 'axios'
interface AppConfig {
  apiUrl: string
  timeout: number
}
export const httpKey: InjectionKey<AxiosInstance> = Symbol('http')
export const configKey: InjectionKey<AppConfig> = Symbol('appConfig')
export default {
  install(app: App) {
    // axios is imported as a value above; a type-only import cannot be provided at runtime
    app.provide(httpKey, axios)
    app.provide(configKey, { apiUrl: '/api', timeout: 5000 })
  }
}
```
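
The collision risk is easy to reproduce outside Vue. The sketch below uses a tiny illustrative registry (the `provide` and `inject` functions here are hypothetical and are not Vue's implementation) to show that two plugins choosing the same string key overwrite each other, while two symbols with the same description stay distinct.

```ts
// A tiny provide/inject registry keyed by string or symbol (illustrative only).
const registry = new Map<string | symbol, unknown>()

function provide(key: string | symbol, value: unknown): void {
  registry.set(key, value)
}

function inject(key: string | symbol): unknown {
  return registry.get(key)
}

// Two plugins that both pick the string key 'http' overwrite each other.
provide('http', 'client from plugin A')
provide('http', 'client from plugin B')
console.log(inject('http')) // client from plugin B

// Symbols are unique even when their descriptions match, so each plugin keeps its own slot.
const httpKeyA = Symbol('http')
const httpKeyB = Symbol('http')
provide(httpKeyA, 'client from plugin A')
provide(httpKeyB, 'client from plugin B')
console.log(inject(httpKeyA)) // client from plugin A
console.log(inject(httpKeyB)) // client from plugin B
```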
## Provide Required Injection Helpers
Wrap required injections in composables that throw clear setup errors.
```ts
import { inject } from 'vue'
import { authKey, type AuthService } from '@/injection-keys'
export function useAuth(): AuthService {
const auth = inject(authKey)
if (!auth) {
throw new Error('Auth plugin not installed. Did you forget app.use(authPlugin)?')
}
return auth
}
```

View File

@@ -0,0 +1,344 @@
---
title: Reactivity Core Patterns (ref, reactive, shallowRef, computed, watch)
impact: MEDIUM
impactDescription: Clear reactivity choices keep state predictable and reduce unnecessary updates in Vue 3 apps
type: efficiency
tags: [vue3, reactivity, ref, reactive, shallowRef, computed, watch, watchEffect, external-state, best-practice]
---
# Reactivity Core Patterns (ref, reactive, shallowRef, computed, watch)
**Impact: MEDIUM** - Choose the right reactive primitive first, derive with `computed`, and use watchers only for side effects.
This reference covers the core reactivity decisions for local state, external data, derived values, and effects.
## Task List
- Declare reactive state correctly
- Always use `shallowRef()` instead of `ref()` for primitive values
- Choose the correct reactive declaration method for objects/arrays/map/set
- Follow best practices for `reactive`
- Avoid destructuring from `reactive()` directly
- Watch correctly for `reactive`
- Follow best practices for `computed`
- Prefer `computed` over watcher-assigned derived refs
- Keep filtered/sorted derivations out of templates
- Use `computed` for reusable class/style logic
- Keep computed getters pure (no side effects) and put side effects in watchers
- Follow best practices for watchers
- Use `immediate: true` instead of duplicate initial calls
- Clean up async effects for watchers
## Declare reactive state correctly
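### Prefer `shallowRef()` over `ref()` for primitive values (string, number, boolean, null, etc.)
For primitives the two behave identically; `shallowRef()` only skips the check for deep conversion, so the performance gain is small and `ref()` remains valid.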
**Incorrect:**
```ts
import { ref } from 'vue'
const count = ref(0)
```
**Correct:**
```ts
import { shallowRef } from 'vue'
const count = shallowRef(0)
```
### Choose the correct reactive declaration method for objects/arrays/map/set
Use `ref()` when you often **replace the entire value** (`state.value = newObj`) and still want deep reactivity inside it, usually used for:
- Frequently reassigned state (replace fetched object/list, reset to defaults, switch presets).
- Composable return values where updates happen mostly via `.value` reassignment.
Use `reactive()` when you mainly **mutate properties** and full replacement is uncommon, usually used for:
- “Single state object” patterns (stores/forms): `state.count++`, `state.items.push(...)`, `state.user.name = ...`.
- Situations where you want to avoid `.value` and update nested fields in place.
```ts
import { reactive } from 'vue'
const state = reactive({
count: 0,
user: { name: 'Alice', age: 30 }
})
state.count++ // ✅ reactive
state.user.age = 31 // ✅ reactive
// ❌ avoid replacing the reactive object reference:
// state = reactive({ count: 1 })
```
Use `shallowRef()` when the value is **opaque / should not be proxied** (class instances, external library objects, very large nested data) and you only want updates to trigger when you **replace** `state.value` (no deep tracking), usually used for:
- Storing external instances/handles (SDK clients, class instances) without Vue proxying internals.
- Large data where you update by replacing the root reference (immutable-style updates).
```ts
import { shallowRef } from 'vue'
const user = shallowRef({ name: 'Alice', age: 30 })
user.value.age = 31 // ❌ not reactive
user.value = { name: 'Bob', age: 25 } // ✅ triggers update
```
Use `shallowReactive()` when you want **only top-level properties** reactive; nested objects remain raw, usually used for:
- Container objects where only top-level keys change and nested payloads should stay unmanaged/unproxied.
- Mixed structures where Vue tracks the wrapper object, but not deeply nested or foreign objects.
```ts
import { shallowReactive } from 'vue'
const state = shallowReactive({
count: 0,
user: { name: 'Alice', age: 30 }
})
state.count++ // ✅ reactive
state.user.age = 31 // ❌ not reactive
```
## Best practices for `reactive`
### Avoid destructuring from `reactive()` directly
**BAD:**
```ts
import { reactive } from 'vue'
const state = reactive({ count: 0 })
const { count } = state // ❌ disconnected from reactivity
```
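
The disconnect comes from how destructuring works in JavaScript: it reads the property once and copies the value into a local variable. The sketch below uses a plain `Proxy` as a stand-in for `reactive()` (the `track` helper is hypothetical and much simpler than Vue's reactivity system) to show that later writes to the proxy never reach the copy.

```ts
// Stand-in for reactive(): a Proxy that records which properties are read (illustrative only).
const reads: string[] = []
function track<T extends object>(target: T): T {
  return new Proxy(target, {
    get(obj, key, receiver) {
      reads.push(String(key))
      return Reflect.get(obj, key, receiver)
    }
  })
}

const state = track({ count: 0 })
const { count } = state   // one tracked read, then `count` is a plain number
state.count = 5           // the proxy sees the write, the local copy does not
console.log(count)        // 0
console.log(state.count)  // 5
```

Reading through `state.count` (or a ref produced by `toRefs()`) keeps going through the proxy, which is what lets the reactivity system track the access.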
### Watch correctly for reactive
**BAD:**
passing a non-getter value into `watch()`
```ts
import { reactive, watch } from 'vue'
const state = reactive({ count: 0 })
// ❌ watch expects a getter, ref, reactive object, or array of these
watch(state.count, () => { /* ... */ })
```
**GOOD:**
preserve reactivity with `toRefs()` and use a getter for `watch()`
```ts
import { reactive, toRefs, watch } from 'vue'
const state = reactive({ count: 0 })
const { count } = toRefs(state) // ✅ count is a ref
watch(count, () => { /* ... */ }) // ✅
watch(() => state.count, () => { /* ... */ }) // ✅
```
## Best practices for `computed`
### Prefer `computed` over watcher-assigned derived refs
**BAD:**
```ts
import { ref, watchEffect } from 'vue'
const items = ref([{ price: 10 }, { price: 20 }])
const total = ref(0)
watchEffect(() => {
total.value = items.value.reduce((sum, item) => sum + item.price, 0)
})
```
**GOOD:**
```ts
import { ref, computed } from 'vue'
const items = ref([{ price: 10 }, { price: 20 }])
const total = computed(() =>
items.value.reduce((sum, item) => sum + item.price, 0)
)
```
### Keep filtered/sorted derivations out of templates
**BAD:**
```vue
<template>
<li v-for="item in items.filter(item => item.active)" :key="item.id">
{{ item.name }}
</li>
<li v-for="item in getSortedItems()" :key="item.id">
{{ item.name }}
</li>
</template>
<script setup>
import { ref } from 'vue'
const items = ref([
{ id: 1, name: 'B', active: true },
{ id: 2, name: 'A', active: false }
])
function getSortedItems() {
return [...items.value].sort((a, b) => a.name.localeCompare(b.name))
}
</script>
```
**GOOD:**
```vue
<script setup>
import { ref, computed } from 'vue'
const items = ref([
{ id: 1, name: 'B', active: true },
{ id: 2, name: 'A', active: false }
])
const visibleItems = computed(() =>
items.value
.filter(item => item.active)
.sort((a, b) => a.name.localeCompare(b.name))
)
</script>
<template>
<li v-for="item in visibleItems" :key="item.id">
{{ item.name }}
</li>
</template>
```
### Use `computed` for reusable class/style logic
**BAD:**
```vue
<template>
<button :class="{ btn: true, 'btn-primary': type === 'primary' && !disabled, 'btn-disabled': disabled }">
{{ label }}
</button>
</template>
```
**GOOD:**
```vue
<script setup>
import { computed } from 'vue'
const props = defineProps({
type: { type: String, default: 'primary' },
disabled: Boolean,
label: String
})
const buttonClasses = computed(() => ({
btn: true,
[`btn-${props.type}`]: !props.disabled,
'btn-disabled': props.disabled
}))
</script>
<template>
<button :class="buttonClasses">
{{ label }}
</button>
</template>
```
### Keep computed getters pure (no side effects) and put side effects in watchers instead
A computed getter should only derive a value. No mutation, no API calls, no storage writes, no event emits.
([Reference](https://vuejs.org/guide/essentials/computed.html#best-practices))
**BAD:**
side effects inside computed
```ts
import { ref, computed } from 'vue'
const count = ref(0)
const doubled = computed(() => {
  // ❌ side effect
  if (count.value > 10) console.warn('Too big!')
  return count.value * 2
})
```
**GOOD:**
pure computed + `watch()` for side effects
```ts
import { ref, computed, watch } from 'vue'
const count = ref(0)
const doubled = computed(() => count.value * 2)
watch(count, (value) => {
  if (value > 10) console.warn('Too big!')
})
```
## Best practices for watchers
### Use `immediate: true` instead of duplicate initial calls
**BAD:**
```ts
import { ref, watch, onMounted } from 'vue'
const userId = ref(1)
function loadUser(id) {
// ...
}
onMounted(() => loadUser(userId.value))
watch(userId, (id) => loadUser(id))
```
**GOOD:**
```ts
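import { ref, watch } from 'vue'
const userId = ref(1)
function loadUser(id) {
  // ...
}
// immediate: true runs the callback once up front, replacing the onMounted call
watch(
  userId,
  (id) => loadUser(id),
  { immediate: true }
)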
```
### Clean up async effects for watchers
When reacting to rapid changes (search boxes, filters), cancel the previous request.
**GOOD:**
```ts
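import { ref, watch } from 'vue'
const query = ref('')
const results = ref<string[]>([])
watch(query, async (q, _prev, onCleanup) => {
  const controller = new AbortController()
  // Abort the in-flight request when query changes again or the watcher stops
  onCleanup(() => controller.abort())
  try {
    const res = await fetch(`/api/search?q=${encodeURIComponent(q)}`, {
      signal: controller.signal,
    })
    results.value = await res.json()
  } catch (err) {
    // An aborted fetch rejects with an AbortError; ignore it and rethrow anything else
    if ((err as Error).name !== 'AbortError') throw err
  }
})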
```
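
The same "latest request wins" idea can be isolated from Vue. The sketch below assumes a runtime with a global `AbortController` (modern browsers and recent Node.js versions); `createLatestOnly` is a hypothetical helper name. Each call hands out a fresh signal and aborts the previous one, which mirrors what the `onCleanup` callback does when `query` changes.

```ts
// Returns a function that hands out a fresh AbortSignal per call and aborts the previous one.
function createLatestOnly(): () => AbortSignal {
  let current: AbortController | null = null
  return () => {
    current?.abort()
    current = new AbortController()
    return current.signal
  }
}

const nextSignal = createLatestOnly()
const firstSignal = nextSignal()
const secondSignal = nextSignal()
console.log(firstSignal.aborted)  // true
console.log(secondSignal.aborted) // false
```

Passing the returned signal to `fetch` means a stale response can never overwrite the result of a newer request.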

Some files were not shown because too many files have changed in this diff