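Chapter 37: Writing Tests

Published 2026-05-19 | Updated 2026-05-20 | Category: Claude Code Source Code Analysis

A complete guide to writing high-quality tests for Claude Code: best practices for isolating external dependencies with Vitest mocks in unit tests, setting up integration test environments and managing external resources, automation scripts and CI integration for end-to-end tests, and test coverage requirements and the quality gates applied in code review.

Source verified on 2026-05-15, based on commit 0d81bb6.

In the previous chapter you built a complete plugin. The features work and the verification passed. But one question still hangs over you: how do you know it will still work after the next change?

Manual verification is good, but people forget, cut corners, and neglect to retest B after changing A. Tests don't. Good tests are insurance you leave in the code: after every change they run automatically and confirm that everything still works.

This chapter covers how to write tests in the Claude Code codebase.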

Roadmap

graph LR
    CH36["Chapter 36<br/>Building a Complete Plugin"] --> CH37["🔧 Chapter 37<br/>Writing Tests"]
    CH37 --> CH38["Chapter 38<br/>The Art of Debugging"]

    style CH37 fill:#4CAF50,color:#fff,stroke:#333
    style CH36 fill:#e8f5e9,stroke:#333
    style CH38 fill:#e1f5fe,stroke:#333

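The Current State of Testing in the Project

One fact first: the source extracted from the npm package contains no standalone test files.

That may sound surprising, but it makes sense on reflection. This repository is not Anthropic's internal monorepo; it was extracted from the sourcesContent of the bundled cli.js.map. The bundle contains only runtime source, not development infrastructure such as test files, test configuration, or CI scripts.

You can still find traces of testing in the source, though:

TestingPermissionTool: src/tools/testing/TestingPermissionTool.tsx is a dedicated testing tool:

// File: src/tools/testing/TestingPermissionTool.tsx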
export const TestingPermissionTool: Tool<InputSchema, string> = buildTool({
  name: 'TestingPermission',
  isEnabled() {
    return "production" === 'test';  // enabled only in the test environment
  },
  async checkPermissions() {
    return {
      behavior: 'ask' as const,
      message: 'Run test?'
    };
  },
  async call() {
    return { data: 'TestingPermission executed successfully' };
  },
})

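Note the implementation of isEnabled(): "production" === 'test'. In a normal build this expression is always false. A plausible reading is that the original source compares a build-time value (for example process.env.NODE_ENV, or a flag resolved through Bun's feature()) against 'test', and the production build inlined that value as the literal "production". A test build would inline "test" instead and enable the tool.

This is a classic pattern: embed a test entry point in production code and control whether it is enabled with a build-time variable.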
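The pattern can be sketched in isolation. This is a minimal illustration, not Claude Code's actual implementation: BuildEnv, GatedTool, makeTestOnlyTool, and enabledToolNames are hypothetical names, and the buildEnv parameter stands in for a literal that a bundler would inline at build time.

```typescript
// Hypothetical stand-in for a value the bundler would inline at build time.
type BuildEnv = 'production' | 'test'

// Minimal tool shape for this sketch; the real Tool type has many more fields.
interface GatedTool {
  name: string
  isEnabled(): boolean
}

// A test-only tool: enabled only when the build was produced for tests.
function makeTestOnlyTool(buildEnv: BuildEnv): GatedTool {
  return {
    name: 'TestingPermission',
    isEnabled: () => buildEnv === 'test',
  }
}

// Reduce a tool list to the names of the tools that are currently enabled.
function enabledToolNames(tools: GatedTool[]): string[] {
  return tools.filter(tool => tool.isEnabled()).map(tool => tool.name)
}

const alwaysOn: GatedTool = { name: 'Timestamp', isEnabled: () => true }

// Production build: only the always-on tool is listed.
console.log(enabledToolNames([alwaysOn, makeTestOnlyTool('production')]))
// Test build: the test-only tool is listed as well.
console.log(enabledToolNames([alwaysOn, makeTestOnlyTool('test')]))
```

The point of the design is that the test-only tool ships in the same code path as every other tool, yet it never shows up for real users because the check is decided when the bundle is built rather than at runtime.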

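The test branch in setup.ts:

// File: src/setup.ts
if (process.env.NODE_ENV === 'test') {
  // special behavior for the test environment
}

Test filtering in debug.ts:

// File: src/utils/debug.ts
if (process.env.NODE_ENV === 'test' && !isDebugToStderr()) {
  return false;  // do not write debug logs during tests
}

These traces tell us that the project does have an internal test suite; it simply is not bundled into the release artifact.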

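Choosing a Test Framework

Since the source contains no test configuration, we have to pick a framework ourselves. Two options are recommended:

Vitest: integrates seamlessly with the Vite ecosystem, supports TypeScript natively, and has an API almost identical to Jest's (describe, it, expect).

Bun's built-in test runner: the bun test command ships with Bun, needs zero configuration, and is Jest-compatible. If you run the project with Bun (as Claude Code does), this is the most natural choice.

The two APIs are nearly the same, so the choice comes down to preference. The code examples below use the Vitest API.

Installation and Configuration

Initialize the test environment in the source root:

# Install Vitest
npm install -D vitest

Create vitest.config.ts: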

import { defineConfig } from 'vitest/config'

export default defineConfig({
  test: {
    include: ['src/**/*.test.ts'],
  },
})

Add test scripts to package.json:

{
  "scripts": {
    "test": "vitest run",
    "test:watch": "vitest"
  }
}

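Testing Zod Schemas

Input validation is one of the things you should test first. A tool's inputSchema is the contract between the AI and your tool. The JSON the AI sends may not match expectations, so your schema must correctly accept valid input and reject invalid input.

Recall the TimestampTool created in Chapter 31. Let's write tests for its schema:

// File: src/tools/TimestampTool/TimestampTool.test.ts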
import { describe, it, expect } from 'vitest'
import { z } from 'zod/v4'

const inputSchema = z.strictObject({
  format: z
    .enum(['iso', 'unix', 'locale'])
    .optional()
    .describe('Output format'),
})

describe('TimestampTool inputSchema', () => {
  it('accepts a call with no arguments', () => {
    const result = inputSchema.safeParse({})
    expect(result.success).toBe(true)
  })

  it('accepts the iso format', () => {
    const result = inputSchema.safeParse({ format: 'iso' })
    expect(result.success).toBe(true)
    if (result.success) {
      expect(result.data.format).toBe('iso')
    }
  })

  it('accepts the unix format', () => {
    const result = inputSchema.safeParse({ format: 'unix' })
    expect(result.success).toBe(true)
  })

  it('rejects an invalid format', () => {
    const result = inputSchema.safeParse({ format: 'rfc2822' })
    expect(result.success).toBe(false)
  })

  it('rejects extra fields (strictObject)', () => {
    const result = inputSchema.safeParse({ format: 'iso', extra: 'nope' })
    expect(result.success).toBe(false)
  })

  it('rejects the wrong type', () => {
    const result = inputSchema.safeParse({ format: 123 })
    expect(result.success).toBe(false)
  })
})

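Run the tests:

npx vitest run src/tools/TimestampTool/TimestampTool.test.ts

This test file covers the main branches: valid values, a missing parameter, an invalid value, an extra field, and a wrong type. (The locale value is not exercised here; add a case for it if you want every enum member covered.)

Why test the schema? Because the schema is the interface between your tool and the AI. A leaky schema (for example, one that forgot strictObject) accepts fields you never anticipated without raising any error, which can lead to downstream bugs. Tests make sure your line of defense is complete.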
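Once a schema grows, writing one it() per input gets repetitive. One option is a small table-driven helper, sketched here under some assumptions: SafeParser, SchemaCase, and findSchemaMismatches are hypothetical names, and stubSchema is a hand-written stand-in that mimics the TimestampTool schema so the snippet runs without Zod installed.

```typescript
// Structural type: anything with a Zod-style safeParse method fits.
interface SafeParser {
  safeParse(input: unknown): { success: boolean }
}

interface SchemaCase {
  label: string
  input: unknown
  shouldPass: boolean
}

// Run every case and return the labels whose outcome differs from the expectation.
function findSchemaMismatches(schema: SafeParser, cases: SchemaCase[]): string[] {
  return cases
    .filter(c => schema.safeParse(c.input).success !== c.shouldPass)
    .map(c => c.label)
}

// Stand-in for the real schema: optional "format" from a fixed set, no other keys.
const allowedFormats = ['iso', 'unix', 'locale']
const stubSchema: SafeParser = {
  safeParse(input) {
    if (typeof input !== 'object' || input === null || Array.isArray(input)) {
      return { success: false }
    }
    const record = input as Record<string, unknown>
    const onlyKnownKeys = Object.keys(record).every(key => key === 'format')
    const format = record['format']
    const formatOk =
      format === undefined ||
      (typeof format === 'string' && allowedFormats.indexOf(format) !== -1)
    return { success: onlyKnownKeys && formatOk }
  },
}

const timestampCases: SchemaCase[] = [
  { label: 'no arguments', input: {}, shouldPass: true },
  { label: 'locale format', input: { format: 'locale' }, shouldPass: true },
  { label: 'unknown format', input: { format: 'rfc2822' }, shouldPass: false },
  { label: 'extra field', input: { format: 'iso', extra: 'nope' }, shouldPass: false },
  { label: 'wrong type', input: { format: 123 }, shouldPass: false },
]

// An empty array means every case behaved as expected.
console.log(findSchemaMismatches(stubSchema, timestampCases))
```

In a real test you would pass the actual Zod schema instead of the stub and assert that the returned array is empty; a failure then names exactly which cases disagreed.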

Testing the call() Logic

The core of a tool is its call() method. When testing it, the question is: given input X, is the output correct?

import { describe, it, expect, vi } from 'vitest'

// Extracted as a pure function for easier testing
function formatTimestamp(format?: string): { timestamp: string; format: string } {
  const actualFormat = format ?? 'iso'
  switch (actualFormat) {
    case 'unix':
      return { timestamp: String(Math.floor(Date.now() / 1000)), format: 'unix' }
    case 'locale':
      return { timestamp: new Date().toLocaleString(), format: 'locale' }
    case 'iso':
    default:
      return { timestamp: new Date().toISOString(), format: 'iso' }
  }
}

describe('TimestampTool call logic', () => {
  it('returns ISO format by default', () => {
    const result = formatTimestamp()
    expect(result.format).toBe('iso')
    expect(result.timestamp).toMatch(/^\d{4}-\d{2}-\d{2}T/)
  })

  it('returns a Unix timestamp', () => {
    const result = formatTimestamp('unix')
    expect(result.format).toBe('unix')
    expect(result.timestamp).toMatch(/^\d+$/)
    // 1262304000 is 2010-01-01T00:00:00Z; any current timestamp must be later
    expect(Number(result.timestamp)).toBeGreaterThan(1262304000)
  })

  it('uses a fixed time for a deterministic result', () => {
    vi.useFakeTimers()
    try {
      vi.setSystemTime(new Date('2026-01-15T10:30:00.000Z'))

      const result = formatTimestamp('iso')
      expect(result.timestamp).toBe('2026-01-15T10:30:00.000Z')
    } finally {
      // Restore real timers even if the assertion above fails
      vi.useRealTimers()
    }
  })
})

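Note the last test and its vi.useFakeTimers() call. This is an important technique: tests that involve time should pin the clock. Otherwise the result depends on the moment the test runs, and exact-value assertions are impossible. Formats such as locale additionally depend on the machine's time zone and locale settings, so they can fail when the suite runs in a different environment.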
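Fake timers are not the only way to pin the clock. A different technique, named plainly here, is to inject the clock as a parameter, which also works outside any test framework. The sketch below is a variant of the formatTimestamp function above; formatTimestampAt, Clock, and fixedClock are hypothetical names introduced for illustration.

```typescript
type TimestampFormat = 'iso' | 'unix' | 'locale'

// A clock is any function that returns the current time.
type Clock = () => Date

// Same formatting logic as before, but the time source is a parameter.
function formatTimestampAt(
  format: TimestampFormat = 'iso',
  now: Clock = () => new Date(),
): { timestamp: string; format: TimestampFormat } {
  const current = now()
  switch (format) {
    case 'unix':
      return { timestamp: String(Math.floor(current.getTime() / 1000)), format }
    case 'locale':
      return { timestamp: current.toLocaleString(), format }
    case 'iso':
    default:
      return { timestamp: current.toISOString(), format: 'iso' }
  }
}

// A fixed clock makes exact-value assertions possible.
const fixedClock: Clock = () => new Date('2026-01-15T10:30:00.000Z')

console.log(formatTimestampAt('iso', fixedClock).timestamp)
console.log(formatTimestampAt('unix', fixedClock).timestamp)
```

Production callers omit the second argument and get the real clock; tests pass fixedClock. The trade-off is an extra parameter in the signature, in exchange for not touching global timer state.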

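Mocking External Dependencies

Many operations in Claude Code's tools involve external dependencies: the file system, API calls, child processes. Tests need to mock them.

The principle: mock the boundaries, not the core.

The file system is a boundary. The network is a boundary. Child processes are a boundary. Your core logic (data processing, decision logic, state transitions) is not a boundary and should not be mocked.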

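import { describe, it, expect, vi } from 'vitest'
import { readFile } from 'fs/promises'
// loadConfig is the function under test. The original listing omits its import;
// the './loadConfig' path is an assumption, so adjust it to your project.
import { loadConfig } from './loadConfig'

// Mock fs.readFile
vi.mock('fs/promises', () => ({
  readFile: vi.fn(),
}))

describe('loadConfig', () => {
  it('parses a valid config file', async () => {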
    vi.mocked(readFile).mockResolvedValue(
      JSON.stringify({ name: 'test', version: '1.0.0' })
    )

    const result = await loadConfig('/fake/path/config.json')

    expect(result).toEqual({ name: 'test', version: '1.0.0' })
    expect(readFile).toHaveBeenCalledWith('/fake/path/config.json', 'utf-8')
  })

  it('throws when the file does not exist', async () => {
    vi.mocked(readFile).mockRejectedValue(
      new Error('ENOENT: no such file or directory')
    )

    await expect(loadConfig('/no/such/file')).rejects.toThrow('ENOENT')
  })

  it('throws on invalid JSON', async () => {
    vi.mocked(readFile).mockResolvedValue('not json at all')

    await expect(loadConfig('/bad/config')).rejects.toThrow()
  })
})

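These three scenarios cover the happy path, a missing file, and corrupted data. You don't need to create real files; vi.mock handles the file system interaction for you.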
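The tests above never show loadConfig itself. One way such a function could be structured so that "mock the boundary, not the core" falls out naturally is sketched below. Everything here is an assumption for illustration: AppConfig, ReadText, parseConfig, and loadConfigFrom are hypothetical names, and this variant takes the reader as a parameter instead of importing readFile directly.

```typescript
interface AppConfig {
  name: string
  version: string
}

// The boundary: any function that turns a path into file contents.
type ReadText = (path: string) => Promise<string>

// Core logic: pure, so tests can call it directly without any mock.
function parseConfig(text: string): AppConfig {
  const parsed: unknown = JSON.parse(text) // throws SyntaxError on invalid JSON
  if (typeof parsed !== 'object' || parsed === null) {
    throw new Error('Config must be a JSON object')
  }
  const { name, version } = parsed as Record<string, unknown>
  if (typeof name !== 'string' || typeof version !== 'string') {
    throw new Error('Config requires string fields "name" and "version"')
  }
  return { name, version }
}

// Thin wrapper around the boundary; a test passes a fake reader instead of touching disk.
async function loadConfigFrom(path: string, readText: ReadText): Promise<AppConfig> {
  return parseConfig(await readText(path))
}

// A fake reader that ignores the path and returns fixed contents.
const fakeRead: ReadText = async () => JSON.stringify({ name: 'test', version: '1.0.0' })

loadConfigFrom('/fake/path/config.json', fakeRead).then(config => {
  console.log(config.name, config.version)
})
```

With this split, most assertions target parseConfig and need no mocking at all; only the thin wrapper needs a fake reader or vi.mock.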

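Testing Permission Logic

In Chapter 33 we learned about checkPermissions. Permission logic is a security-critical path and must be tested.

// DBQueryTool and ToolUseContext are the ones from Chapter 33; import them from wherever your project defines them.
describe('DBQueryTool checkPermissions', () => {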
  const tool = new DBQueryTool()
  const mockContext = {} as ToolUseContext

  it('auto-allows SELECT queries', async () => {
    const result = await tool.checkPermissions(
      { query: 'SELECT * FROM users' },
      mockContext,
    )
    expect(result.behavior).toBe('allow')
  })

  it('denies DROP operations outright', async () => {
    const result = await tool.checkPermissions(
      { query: 'DROP TABLE users' },
      mockContext,
    )
    expect(result.behavior).toBe('deny')
    expect(result.message).toContain('Destructive')
  })

  it('requires confirmation for INSERT operations', async () => {
    const result = await tool.checkPermissions(
      { query: 'INSERT INTO users VALUES (1, "test")' },
      mockContext,
    )
    expect(result.behavior).toBe('passthrough')
  })

  it('is case-insensitive', async () => {
    const result = await tool.checkPermissions(
      { query: 'select * from users' },
      mockContext,
    )
    expect(result.behavior).toBe('allow')
  })
})

The key to permission testing is covering every behavior branch: allow, deny, and passthrough each need at least one test.
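To make the three branches concrete, here is a minimal sketch of a decision function that would satisfy tests like the ones above, together with a table of cases that touches each branch. It is not the Chapter 33 implementation: classifyQuery, PermissionDecision, and permissionCases are hypothetical names, and a first-keyword check like this is only illustrative, not a real SQL security boundary.

```typescript
type PermissionBehavior = 'allow' | 'deny' | 'passthrough'

interface PermissionDecision {
  behavior: PermissionBehavior
  message?: string
}

// Decide based on the first keyword of the statement, compared case-insensitively.
function classifyQuery(query: string): PermissionDecision {
  const keyword = (query.trim().split(/\s+/)[0] ?? '').toUpperCase()
  if (keyword === 'SELECT') {
    return { behavior: 'allow' }
  }
  if (keyword === 'DROP' || keyword === 'TRUNCATE') {
    return { behavior: 'deny', message: `Destructive statement rejected: ${keyword}` }
  }
  // Everything else is left for the normal permission flow to decide.
  return { behavior: 'passthrough' }
}

// One row per branch, plus a row for case-insensitivity.
const permissionCases: Array<[string, PermissionBehavior]> = [
  ['SELECT * FROM users', 'allow'],
  ['select * from users', 'allow'],
  ['DROP TABLE users', 'deny'],
  ["INSERT INTO users VALUES (1, 'test')", 'passthrough'],
]

for (const [query, expected] of permissionCases) {
  const actual = classifyQuery(query).behavior
  console.log(`${actual === expected ? 'ok' : 'MISMATCH'}: ${query}`)
}
```

Keeping the cases in a table makes a missing branch easy to spot: if a new behavior is added to PermissionBehavior, the table is the one place to extend.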

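Integration Tests: Testing Tool Registration

Unit tests verify the behavior of a single function. Integration tests verify how components work together.

describe('tool registration', () => {
  it('TimestampTool appears in the tool list', () => {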
    const tools = getAllBaseTools()
    const names = tools.map(t => t.name)
    expect(names).toContain('Timestamp')
  })

  it('every tool has a name property', () => {
    const tools = getAllBaseTools()
    for (const tool of tools) {
      expect(tool.name).toBeTruthy()
      expect(typeof tool.name).toBe('string')
    }
  })

  it('every tool has an inputSchema', () => {
    const tools = getAllBaseTools()
    for (const tool of tools) {
      expect(tool.inputSchema).toBeDefined()
    }
  })

  it('tool names are unique', () => {
    const tools = getAllBaseTools()
    const names = tools.map(t => t.name)
    const uniqueNames = new Set(names)
    expect(uniqueNames.size).toBe(names.length)
  })
})

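Tests like these catch mistakes such as "defined the tool but forgot to add it to the list", "misspelled the name", or "forgot to define the schema".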
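The uniqueness check above only tells you that some name is duplicated, not which one. A small helper can make the failure message more useful. This is a sketch: NamedTool and findDuplicateNames are hypothetical names, and the sample list stands in for whatever getAllBaseTools() returns.

```typescript
// Only the field this check needs; real tools carry many more.
interface NamedTool {
  name: string
}

// Return each name that occurs more than once, in first-seen order.
function findDuplicateNames(tools: NamedTool[]): string[] {
  const counts = new Map<string, number>()
  for (const tool of tools) {
    counts.set(tool.name, (counts.get(tool.name) ?? 0) + 1)
  }
  return Array.from(counts.entries())
    .filter(([, count]) => count > 1)
    .map(([name]) => name)
}

// Stand-in tool list with one deliberate duplicate.
const sampleTools: NamedTool[] = [
  { name: 'Timestamp' },
  { name: 'Read' },
  { name: 'Timestamp' },
]

console.log(findDuplicateNames(sampleTools))
```

Asserting that the returned array is empty gives the same guarantee as comparing Set size to list length, but a failing run prints the offending names directly.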

Testing Async Generators

Some Claude Code tools return an AsyncGenerator (AgentTool, for example). To test such a tool, collect every yielded value with for await...of:

it('tests an async generator tool', async () => {
  const results = []
  for await (const item of tool.call(input, context, canUseTool, message)) {
    results.push(item)
  }
  expect(results).toHaveLength(1)
  expect(results[0].data).toBeDefined()
})
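If several tests need the same collection loop, it can be pulled into a helper. The sketch below assumes nothing about the real tool types: collect is a hypothetical helper, and fakeToolCall is a stand-in generator, since a real call() yields richer objects.

```typescript
// Drain any async iterable into an array so assertions can inspect every item.
async function collect<T>(source: AsyncIterable<T>): Promise<T[]> {
  const items: T[] = []
  for await (const item of source) {
    items.push(item)
  }
  return items
}

// Stand-in for a tool's call(): one progress event, then a final result.
async function* fakeToolCall(): AsyncGenerator<{ type: string; data?: string }> {
  yield { type: 'progress' }
  yield { type: 'result', data: 'done' }
}

collect(fakeToolCall()).then(items => {
  console.log(items.length)
  console.log(items[items.length - 1]?.data)
})
```

Collecting everything first lets a test assert on intermediate events (such as progress updates) as well as on the final item, instead of only checking the last value.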

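Common Errors

Error	How to check
`Cannot find module '../../Tool.js'`	ESM convention uses the `.js` suffix, which Vitest handles by default; check the `moduleResolution` setting
`Cannot find module 'bun:bundle'`	`bun:bundle` is specific to the Bun runtime; in tests it needs `vi.mock('bun:bundle', ...)`
`Cannot find module 'src/bootstrap/state.js'`	Add a path alias via `resolve.alias` in `vitest.config.ts`
Errors when testing async generators	Collect results with `for await...of`; do not `await` the generator directly
Time-related tests fail intermittently	Pin the clock with `vi.useFakeTimers()`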
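For the path alias row, a configuration along these lines is one possibility. It is a sketch that assumes the config file sits in the project root next to the src directory; adjust the alias key and target to match how your imports are actually written.

```typescript
// vitest.config.ts
import { fileURLToPath } from 'node:url'
import { defineConfig } from 'vitest/config'

export default defineConfig({
  resolve: {
    alias: {
      // Lets bare imports such as 'src/bootstrap/state.js' resolve from the project root.
      // Assumption: this file lives next to the src directory.
      src: fileURLToPath(new URL('./src', import.meta.url)),
    },
  },
  test: {
    include: ['src/**/*.test.ts'],
  },
})
```

This extends the earlier vitest.config.ts rather than replacing it: the test.include pattern is unchanged, and only the resolve.alias entry is new.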

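Try It Yourself

Write complete tests for the TimestampTool as upgraded in Chapter 32. Cover all four formats (iso, unix, locale, rfc2822) and test the cross-field validation logic for the timezone parameter. Pin the clock with vi.useFakeTimers().
Write schema tests for a file search tool. It takes pattern (required), path (optional), and case_sensitive (using semanticBoolean). Test valid input, a missing required field, and tolerance of the string "true".
Write an integration test. Verify that none of your registered tools has a name starting with mcp__ (built-in tools and MCP tools should be distinguishable by name).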

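Checkpoints

Current state: the source extracted from the npm package has no standalone test files, but it contains traces of testing such as TestingPermissionTool and NODE_ENV === 'test' branches
Framework choice: Vitest or Bun test is recommended; their APIs are compatible
Schema tests: the first thing to test is the Zod schema, the contract between the tool and the AI
Core logic tests: extract the pure logic of call() into testable pure functions, and handle time dependencies with vi.useFakeTimers()
Mocking strategy: mock the boundaries (file system, network, child processes), not the core logic
Permission tests: every behavior branch of checkPermissions needs a test
Integration tests: verify cross-component properties such as tool registration, name uniqueness, and schema presence

Tests are not a burden; they are an investment. Every test you write adds a tireless guard who watches whether old features still work while you change the code.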
