Add repository harness for coding agents (#1157)

Co-authored-by: xingzhi <chuzihao.czh@alibaba-inc.com>
sir1st
2026-05-30 18:57:22 +08:00
committed by GitHub
parent ce04b10eee
commit 6da5cd605a
9 changed files with 490 additions and 0 deletions
+40
@@ -0,0 +1,40 @@
# Harness Overview
This harness turns recurring project knowledge into files and checks that an
agent can discover without chat history.
## Goals
- Make repository context legible through short maps and deeper docs.
- Keep architecture constraints close to the code they protect.
- Give agents a deterministic validation path before opening or updating a PR.
- Prefer mechanical checks over reminder text when a rule can be verified.
## Entry Points
- `AGENTS.md` is the root map for coding agents.
- `ARCHITECTURE.md` documents package boundaries and state ownership.
- `DEVELOPMENT.md` remains the reference for contributor rules and commands.
- `docs/harness/validation.md` maps change types to checks.
- `docs/harness/worktree-runbook.md` explains isolated worktree development.
- `docs/harness/pr-review.md` provides a PR self-review checklist.
- `scripts/harness-check.mjs` enforces baseline repository invariants.
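
As an illustration of a baseline invariant, the entry points listed above can be checked mechanically. This is a minimal sketch, not the contents of `scripts/harness-check.mjs`; the helper names `findMissingEntryPoints` and `formatReport` are hypothetical, and the caller is assumed to supply the list of paths that exist in the repository.

```javascript
// Entry points named in this overview.
const REQUIRED_ENTRY_POINTS = [
  "AGENTS.md",
  "ARCHITECTURE.md",
  "DEVELOPMENT.md",
  "docs/harness/validation.md",
  "docs/harness/worktree-runbook.md",
  "docs/harness/pr-review.md",
  "scripts/harness-check.mjs",
];

// Hypothetical helper: returns the required paths absent from `existingPaths`.
function findMissingEntryPoints(existingPaths) {
  const present = new Set(existingPaths);
  return REQUIRED_ENTRY_POINTS.filter((path) => !present.has(path));
}

// Hypothetical reporter: a non-empty `missing` list should fail the check.
function formatReport(missing) {
  return missing.length === 0
    ? "harness entry points: ok"
    : `harness entry points missing: ${missing.join(", ")}`;
}
```

A real check would likely gather `existingPaths` from the file system and exit non-zero when the list of missing paths is not empty.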
## Operating Model
1. Read the root map and the specific doc for the task.
2. Make the smallest scoped change.
3. Add or update focused tests when behavior changes.
4. Run `npm run harness:check` and the relevant validation commands.
5. If a failure pattern repeats, improve this harness with docs, tests, scripts,
or CI instead of relying on a longer prompt.
## What Belongs In The Harness
- Facts that future agents must know to work safely.
- Checklists that prevent repeated PR review comments.
- Scripts that fail fast on repository-wide invariants.
- Runbooks for local, CI, release, and desktop packaging flows.

Do not put long implementation notes in `AGENTS.md`. Add them under `docs/` and
link to them from the map.
+41
@@ -0,0 +1,41 @@
# PR Self-Review
Use this checklist before pushing or updating a pull request.
## Scope
- The PR title states the behavior being changed.
- The diff is limited to the requested task and required harness updates.
- Unrelated formatting or refactors are not bundled into the change.
- User-facing text has locale coverage.
## Architecture
- Client code uses shared API helpers and existing UI patterns.
- Server routes stay thin and delegate reusable behavior to controllers/services.
- Web UI state paths are resolved through `config.appHome` or documented helpers.
- Hermes Agent state and Web UI state remain separate.
- Subprocess calls use argument arrays instead of shell string construction.
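
The last item can be sketched as follows. This is an illustrative example under assumptions, not code from the repository: `worktreeAddArgs` and `addWorktree` are hypothetical names, and the branch and path layout is borrowed from the worktree runbook. It relies on Node's `execFile`, which does not spawn a shell by default.

```javascript
// Hypothetical helper: builds argv for `git worktree add` without a shell string.
function worktreeAddArgs(topic) {
  // Each value is its own argv entry, so spaces or `;` in `topic` are passed
  // to git verbatim and never interpreted by a shell.
  return [
    "worktree",
    "add",
    "-b",
    `codex/${topic}`,
    `../worktrees/hermes-web-ui-${topic}`,
    "origin/main",
  ];
}

// Hypothetical wrapper: runs git through execFile with the argument array.
async function addWorktree(topic) {
  const { execFile } = await import("node:child_process");
  const { promisify } = await import("node:util");
  const { stdout } = await promisify(execFile)("git", worktreeAddArgs(topic));
  return stdout;
}
```

The pattern to avoid is interpolating `topic` into a single command string passed to `exec`, where shell metacharacters in the value would be executed.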
## Tests And Validation
- A focused test was added or updated for behavior changes.
- Browser-visible flows have e2e coverage when the risk justifies it.
- `npm run harness:check` passes.
- The PR body lists validation commands that actually ran.
- Known limitations or follow-ups are called out.
## Release And CI
- Workflow changes were checked with `npm run harness:check`.
- Desktop release artifacts remain platform-specific.
- `fail_on_unmatched_files: true` is preserved when each matrix target has its
own expected artifact list.
- Package manifest changes have matching lockfile changes when dependencies
change.
## Before Merge
- CI is green or failures are explained as unrelated.
- The branch is mergeable.
- The PR does not depend on hidden local state, credentials, or uncommitted files.
+68
@@ -0,0 +1,68 @@
# Validation Guide
Run the smallest relevant checks while iterating. Escalate to the broad checks
when touching shared behavior, release automation, auth, persistence, or chat.
## Always Run For PRs
```bash
npm run harness:check
```
For broad or shared changes, also run:
```bash
npm run test:coverage
npm run test:e2e
npm run build
```
## Change-Type Matrix
| Change | Minimum local validation |
| --- | --- |
| Docs only | `npm run harness:check` |
| Client component/store/API | focused `npm run test -- <pattern>`, then `npm run build` |
| User-visible browser flow | focused Vitest plus `npm run test:e2e` |
| Server controller/service/db | focused `npm run test -- tests/server/<file>` |
| Auth, profile, or credential behavior | focused server tests plus relevant e2e auth tests |
| Chat, Socket.IO, group chat | focused server tests plus relevant e2e chat tests |
| Desktop packaging | `npm run harness:check`, `npm run build`, and a platform-specific desktop build when practical |
| GitHub workflow | `npm run harness:check` and `actionlint` when available |
| Package manifests | `npm ci --ignore-scripts`, and confirm `package-lock.json` stays synchronized (the NPM lockfile workflow verifies this) |
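
The matrix can also be read as a lookup table. The sketch below restates a few rows as data; the keys (`docs`, `client`, `browserFlow`, `server`, `workflow`) and the function `plannedChecks` are hypothetical labels introduced for illustration, and `npm run harness:check` is always included because it is required for every PR.

```javascript
// Hypothetical keys; values restate rows of the change-type matrix.
const MINIMUM_VALIDATION = new Map([
  ["docs", ["npm run harness:check"]],
  ["client", ["npm run test -- <pattern>", "npm run build"]],
  ["browserFlow", ["focused Vitest", "npm run test:e2e"]],
  ["server", ["npm run test -- tests/server/<file>"]],
  ["workflow", ["npm run harness:check", "actionlint (when available)"]],
]);

// Returns the de-duplicated checks for a set of change types, in first-seen order.
function plannedChecks(changeTypes) {
  const checks = new Set(["npm run harness:check"]);
  for (const type of changeTypes) {
    const row = MINIMUM_VALIDATION.get(type);
    if (!row) throw new Error(`Unknown change type: ${type}`);
    for (const check of row) checks.add(check);
  }
  return [...checks];
}
```

For example, `plannedChecks(["client", "workflow"])` lists the harness check once, followed by the client and workflow checks.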
## CI Mapping
- Build workflow: installs dependencies, runs coverage, builds production assets,
then runs a Linux desktop smoke test on pull requests.
- Playwright workflow: runs browser e2e tests.
- NPM lockfile workflow: verifies `package-lock.json` is synchronized.
- Desktop release workflow: builds and uploads platform-specific desktop artifacts
for release tags.
- Docker workflow: builds and publishes release images.
## Release Workflow Guardrail
Desktop release jobs must upload only the artifacts that their matrix target can
produce. Keep artifact globs in matrix data and keep
`fail_on_unmatched_files: true` so that a missing expected file still fails the job.
Expected desktop release outputs:
| Target | Required release globs |
| --- | --- |
| macOS | `*.dmg`, `*.dmg.blockmap`, `latest*.yml` |
| Windows | `*.exe`, `*.exe.blockmap`, `latest*.yml` |
| Linux x64 | `*.AppImage`, `*.deb`, `latest*.yml` |
| Linux arm64 | `*.AppImage`, `latest*.yml` |
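
The intent of the guardrail can be sketched as a local check: every required glob for a target must match at least one produced file. This is an illustration only, not the release workflow itself; the target keys and the helpers `globToRegExp` and `unmatchedGlobs` are hypothetical, and the matcher handles only `*` against bare file names.

```javascript
// Hypothetical target keys; glob lists restate the table above.
const REQUIRED_RELEASE_GLOBS = new Map([
  ["macos", ["*.dmg", "*.dmg.blockmap", "latest*.yml"]],
  ["windows", ["*.exe", "*.exe.blockmap", "latest*.yml"]],
  ["linux-x64", ["*.AppImage", "*.deb", "latest*.yml"]],
  ["linux-arm64", ["*.AppImage", "latest*.yml"]],
]);

// Converts a glob that only uses `*` into an anchored regular expression.
function globToRegExp(glob) {
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp(`^${escaped.replace(/\*/g, ".*")}$`);
}

// Returns the required globs that no file name in `fileNames` satisfies.
function unmatchedGlobs(target, fileNames) {
  const globs = REQUIRED_RELEASE_GLOBS.get(target);
  if (!globs) throw new Error(`Unknown target: ${target}`);
  return globs.filter((glob) => {
    const pattern = globToRegExp(glob);
    return !fileNames.some((name) => pattern.test(name));
  });
}
```

A non-empty result corresponds to the case that `fail_on_unmatched_files: true` is meant to catch: an expected artifact that the build did not produce.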
## Failure Handling
When a command fails:
1. Read the first actionable error, not just the final stack trace.
2. Check whether the failure indicates missing context, missing test coverage,
or a missing mechanical rule.
3. Fix the product bug when there is one.
4. Update docs or `scripts/harness-check.mjs` when the same class of mistake
should be prevented next time.
+64
@@ -0,0 +1,64 @@
# Worktree Runbook
Use a separate git worktree for agent changes so local user work remains
untouched.
## Create A Worktree
```bash
git fetch origin --prune
git worktree add -b codex/<short-topic> ../worktrees/hermes-web-ui-<short-topic> origin/main
cd ../worktrees/hermes-web-ui-<short-topic>
```
If the repository uses a fork remote, push to the remote requested by the task.
Do not rewrite or reset unrelated branches.
## Install
```bash
npm ci --ignore-scripts
npm rebuild node-pty
```
Desktop package dependencies are separate:
```bash
npm ci --prefix packages/desktop --no-audit --no-fund
```
## Isolated Runtime
Use per-worktree state and ports to avoid colliding with a running local app:
```bash
export PORT=18648
export HERMES_WEB_UI_HOME="$PWD/.tmp/hermes-web-ui"
export HERMES_WEBUI_STATE_DIR="$HERMES_WEB_UI_HOME"
export UPLOAD_DIR="$PWD/.tmp/uploads"
npm run dev
```
Do not point `HERMES_WEB_UI_HOME` at a user's real `~/.hermes-web-ui` when a task
only needs local verification.
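
The same settings can be derived programmatically, for example when a script starts the dev server. This is a minimal sketch assuming POSIX-style paths; `isolatedEnv` and `pointsAtRealHome` are hypothetical helpers, and only the variable names and default port come from the commands above.

```javascript
// Hypothetical helper: derives per-worktree settings from the worktree root.
function isolatedEnv(worktreeRoot, port = 18648) {
  const home = `${worktreeRoot}/.tmp/hermes-web-ui`;
  return {
    PORT: String(port),
    HERMES_WEB_UI_HOME: home,
    HERMES_WEBUI_STATE_DIR: home,
    UPLOAD_DIR: `${worktreeRoot}/.tmp/uploads`,
  };
}

// Hypothetical guard: true when the state directory is the user's real
// `~/.hermes-web-ui` or a path inside it.
function pointsAtRealHome(env, userHome) {
  const realHome = `${userHome}/.hermes-web-ui`;
  const stateHome = env.HERMES_WEB_UI_HOME;
  return stateHome === realHome || stateHome.startsWith(`${realHome}/`);
}
```

A launcher could merge the result over `process.env` before spawning `npm run dev`, and refuse to start when `pointsAtRealHome` returns `true`.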
## Browser Checks
For browser-visible changes:
```bash
npm run test:e2e
```
Prefer existing Playwright fixtures and mocked backend services. Add real-service
requirements only when the behavior cannot be represented with mocks.
## Cleanup
After a PR is pushed and no more local work is needed:
```bash
git worktree remove ../worktrees/hermes-web-ui-<short-topic>
```
Only remove the worktree you created.