feat: 灵犀 Studio Web UI custom edition

Co-authored-by: Cursor <cursoragent@cursor.com>
yi
2026-06-05 11:29:11 +08:00
commit 7d10320a82
643 changed files with 164406 additions and 0 deletions
@@ -0,0 +1,182 @@
---
name: apikey-image-gen
description: "Generate or edit images through Hermes Web UI using the selected/requested profile's fun-codex provider from config.yaml."
version: 1.0.0
author: Ekko
license: MIT
platforms: [linux, macos, windows, termux]
metadata:
  hermes:
    tags: [api.apikey.fun, image-generation, image-editing, media]
prerequisites:
  commands: [curl]
---
# APIKEY Image Generation
Use this skill when the user wants to generate an image, generate an image from a reference image, or edit an existing image.
Always call Hermes Web UI's media endpoint. Do not call `api.apikey.fun` directly, and do not ask the user for an API key.
Do not use any built-in image generation tool as a fallback. If the Hermes Web UI endpoint returns `401` or `403`, the connection fails, or any other error occurs, stop and report the Hermes Web UI error to the user.
The server reads the selected/requested profile's `config.yaml` and uses the `custom_providers` entry named `fun-codex`:
```yaml
custom_providers:
  - name: fun-codex
    base_url: https://api.apikey.fun/v1
    api_key: ...
    model: gpt-5.5
    api_mode: codex_responses
```
Endpoint:
```bash
POST <Hermes Web UI base URL>/api/hermes/media/apikey-image-generate
```
Resolve the Hermes Web UI base URL in this order:
1. `HERMES_WEB_UI_URL` environment variable, if set.
2. `http://127.0.0.1:${PORT}`, if `PORT` is set.
3. `http://127.0.0.1:8648` for local development.
When Hermes Web UI is running from Docker Compose, the default external URL is `http://127.0.0.1:6060`.
Authentication:
Send the Hermes Web UI server bearer token. This token is accepted only by Hermes Web UI media generation endpoints for agent skills; it is not a general Web UI login token.
Resolve the token in this order:
1. `AUTH_TOKEN` environment variable, if set.
2. `${HERMES_WEB_UI_HOME}/.token`, if `HERMES_WEB_UI_HOME` is set.
3. `${HERMES_WEBUI_STATE_DIR}/.token`, if `HERMES_WEBUI_STATE_DIR` is set.
4. `~/.hermes-web-ui/.token`.
Profile selection:
Use the current Hermes profile from the run instructions by sending `X-Hermes-Profile`.
If the run instructions include `[Current Hermes profile: <name>]`, include:
```bash
-H "X-Hermes-Profile: <name>"
```
Replace `<name>` with the exact profile name from the run instructions. Never send a placeholder value such as `<name>` or `<current-hermes-profile>`.
If no current profile is provided, omit the header and let the server fall back to the current Hermes active profile.
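The header rule above can be sketched as a small shell helper. This is a minimal sketch, not part of Hermes Web UI: `profile_header` and the `HERMES_PROFILE` variable in the usage comment are hypothetical names, and the placeholder check only catches values wrapped in angle brackets.

```bash
# Hypothetical helper: print the profile header, or nothing when the header
# should be omitted. Not part of Hermes Web UI.
profile_header() {
  case "${1:-}" in
    "" | "<"*">")
      # No profile given, or an unreplaced placeholder: send no header.
      return 0
      ;;
    *)
      printf 'X-Hermes-Profile: %s\n' "$1"
      ;;
  esac
}

# Usage sketch (HERMES_PROFILE is an assumed variable name):
#   HEADER="$(profile_header "$HERMES_PROFILE")"
#   if [ -n "$HEADER" ]; then
#     curl -sS -X POST "$URL" -H "Authorization: Bearer $TOKEN" -H "$HEADER" -d "$BODY"
#   else
#     curl -sS -X POST "$URL" -H "Authorization: Bearer $TOKEN" -d "$BODY"
#   fi
```

Printing the header instead of calling curl directly keeps the decision testable and leaves the fallback to the server when no profile is known.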
## Modes
### Text To Image
Use when there is no input image.
```json
{
  "mode": "text",
  "prompt": "A high quality product image of a matte black mechanical keyboard on a clean desk",
  "size": "1024x1024",
  "output_path": "/absolute/path/to/output.png"
}
```
The server calls `POST /v1/images/generations` against the `fun-codex` base URL.
### Image To Image
Use when the user provides a reference image and wants a new image based on it.
```json
{
  "mode": "image",
  "prompt": "Use this reference composition and generate a refined technology brand poster",
  "image_path": "/absolute/path/to/reference.png",
  "size": "1024x1024",
  "output_path": "/absolute/path/to/output.png"
}
```
The server calls `POST /v1/responses` against the `fun-codex` base URL.
### Image Edit
Use when the user wants to modify an existing image while preserving parts of it.
```json
{
  "mode": "edit",
  "prompt": "Change the background to blue and keep the subject unchanged",
  "image_path": "/absolute/path/to/source.png",
  "size": "1024x1024",
  "output_path": "/absolute/path/to/edited.png"
}
```
The server calls `POST /v1/images/edits` against the `fun-codex` base URL.
## Request Fields
- `mode`: `text`, `image`, or `edit`.
- `prompt`: required.
- `image_path`: path to a local png, jpeg, or webp image. Required for `image` and `edit` modes unless `image_url` or `image_base64` is provided.
- `image_url`: optional alternative image input.
- `image_base64`: optional alternative image input. If it is not a data URI, include `mime_type`.
- `n`: number of images. Defaults to `1`.
- `size`: defaults to `1024x1024`. Common values: `1024x1024`, `1536x1024`, `1024x1536`, `2048x2048`, `3840x2160`, `2160x3840`, `auto`.
- `quality`: defaults to `auto`.
- `model`: optional override. Text and edit modes default to `gpt-image-2`; image mode defaults to the `fun-codex` model in `config.yaml`.
- `image_model`: optional image tool model for image mode. Defaults to `gpt-image-2`.
- `output_path`: optional absolute output file path. If omitted, the server saves to `${HERMES_WEB_UI_HOME:-~/.hermes-web-ui}/media/*.png`.
- `timeout_ms`: defaults to `600000`.
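When the image is only available as raw bytes, the `image_base64` and `mime_type` fields listed above can be filled in from a local file. The following is a minimal sketch under stated assumptions: `build_edit_payload` is a hypothetical helper, and it does no JSON escaping, so the prompt must not contain double quotes or backslashes.

```bash
# Hypothetical helper: build an "edit" request body that carries the image as
# plain base64 plus an explicit mime_type (needed because it is not a data URI).
build_edit_payload() {
  image_file="$1"
  mime_type="$2"
  prompt="$3"
  # Strip newlines so the result is one base64 string on both GNU and BSD base64.
  encoded="$(base64 < "$image_file" | tr -d '\n')"
  printf '{"mode":"edit","prompt":"%s","image_base64":"%s","mime_type":"%s"}\n' \
    "$prompt" "$encoded" "$mime_type"
}

# Usage sketch:
#   BODY="$(build_edit_payload /absolute/path/to/source.png image/png "Change the background to blue")"
#   curl ... -d "$BODY"
```

For files already on the server's disk, `image_path` stays the simpler choice because the server reads the file itself.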
## Curl Template
```bash
TOKEN="${AUTH_TOKEN:-}"
if [ -z "$TOKEN" ] && [ -n "${HERMES_WEB_UI_HOME:-}" ] && [ -f "$HERMES_WEB_UI_HOME/.token" ]; then
  TOKEN="$(cat "$HERMES_WEB_UI_HOME/.token")"
fi
if [ -z "$TOKEN" ] && [ -n "${HERMES_WEBUI_STATE_DIR:-}" ] && [ -f "$HERMES_WEBUI_STATE_DIR/.token" ]; then
  TOKEN="$(cat "$HERMES_WEBUI_STATE_DIR/.token")"
fi
if [ -z "$TOKEN" ] && [ -f "$HOME/.hermes-web-ui/.token" ]; then
  TOKEN="$(cat "$HOME/.hermes-web-ui/.token")"
fi
if [ -z "$TOKEN" ]; then
  echo "Missing Hermes Web UI token. Check AUTH_TOKEN, HERMES_WEB_UI_HOME, HERMES_WEBUI_STATE_DIR, or ~/.hermes-web-ui/.token." >&2
  exit 1
fi
BASE_URL="${HERMES_WEB_UI_URL:-}"
if [ -z "$BASE_URL" ]; then
  BASE_URL="http://127.0.0.1:${PORT:-8648}"
fi
BASE_URL="${BASE_URL%/}"
curl -sS -X POST "$BASE_URL/api/hermes/media/apikey-image-generate" \
  -H "Authorization: Bearer $TOKEN" \
  -H 'Content-Type: application/json' \
  -d '{
    "mode": "text",
    "prompt": "A cinematic 4K photo of a silver robot hand holding a small glowing cube",
    "size": "3840x2160",
    "output_path": "/absolute/path/to/output.png"
  }'
```
Successful responses include:
```json
{
  "ok": true,
  "mode": "text",
  "output_paths": ["/absolute/path/to/output.png"],
  "provider": "fun-codex",
  "base_url": "https://api.apikey.fun/v1"
}
```
If the response reports the error code `missing_fun_codex_provider`, tell the user to configure `fun-codex` in the selected/requested profile's `config.yaml`.
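The success and error cases can be told apart without extra tools such as `jq`. This is a rough sketch: `response_status` is a hypothetical helper, the `ok` field comes from the success example, and the shape of the error body is an assumption, so the check only looks for the code string anywhere in the body.

```bash
# Hypothetical helper: classify a response body as ok, a missing provider, or
# any other error. The error body shape is assumed, not documented here.
response_status() {
  body="$1"
  if printf '%s' "$body" | grep -Eq '"ok"[[:space:]]*:[[:space:]]*true'; then
    echo "ok"
  elif printf '%s' "$body" | grep -q 'missing_fun_codex_provider'; then
    echo "configure-fun-codex"
  else
    echo "error"
  fi
}

# Usage sketch:
#   RESPONSE="$(curl -sS -X POST "$BASE_URL/api/hermes/media/apikey-image-generate" ...)"
#   case "$(response_status "$RESPONSE")" in
#     ok) ;;                                   # report output_paths
#     configure-fun-codex) ;;                  # ask the user to configure fun-codex
#     *) printf '%s\n' "$RESPONSE" >&2 ;;      # stop and report the error
#   esac
```

Anything that is not a clear success is treated as an error, which matches the rule of stopping and reporting instead of falling back to another tool.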
@@ -0,0 +1,112 @@
---
name: grok-image-to-video
description: "Animate a local image into a short mp4 video through Hermes Web UI using xAI Grok Imagine."
version: 1.0.0
author: Ekko
license: MIT
platforms: [linux, macos, windows]
metadata:
  hermes:
    tags: [xAI, Grok, image-to-video, video-generation, media]
prerequisites:
  commands: [curl]
---
# Grok Image To Video
Use this skill when the user wants to animate a local image into a short video with xAI Grok Imagine.
Do not use any built-in image or video generation tool as a fallback. If the Hermes Web UI endpoint returns `401` or `403`, the connection fails, or any other error occurs, stop and report the Hermes Web UI error to the user.
## Workflow
Call the local Hermes Web UI media endpoint. Pass a local image path; the server checks for xAI credentials, reads the file, converts it to a base64 data URI, calls xAI, polls until completion, and saves the generated mp4 (to `output_path` if given, otherwise to the default media directory).
Endpoint:
```bash
POST <Hermes Web UI base URL>/api/hermes/media/grok-image-to-video
```
Resolve the Hermes Web UI base URL in this order:
1. `HERMES_WEB_UI_URL` environment variable, if set.
2. `http://127.0.0.1:${PORT}`, if `PORT` is set.
3. `http://127.0.0.1:8648` for local development.
When Hermes Web UI is running from the provided Docker Compose setup, the default external URL is `http://127.0.0.1:6060`.
Authentication:
The endpoint is protected by Hermes Web UI auth. Always send the Hermes Web UI server bearer token. This token is accepted only by Hermes Web UI media generation endpoints for agent skills; it is not a general Web UI login token.
Resolve the token in this order:
1. `AUTH_TOKEN` environment variable, if set.
2. `${HERMES_WEB_UI_HOME}/.token`, if `HERMES_WEB_UI_HOME` is set.
3. `${HERMES_WEBUI_STATE_DIR}/.token`, if `HERMES_WEBUI_STATE_DIR` is set.
4. `~/.hermes-web-ui/.token`.
Profile selection:
Use the current Hermes profile from the run instructions by sending `X-Hermes-Profile`.
If the run instructions include `[Current Hermes profile: <name>]`, include:
```bash
-H "X-Hermes-Profile: <name>"
```
Replace `<name>` with the exact profile name from the run instructions. Never send a placeholder value such as `<name>` or `<current-hermes-profile>`.
If no current profile is provided, omit the header and let the server fall back to the current Hermes active profile.
Required JSON fields:
- `image_path`: local path to a png, jpeg, or webp image.
- `prompt`: motion and style instructions for the generated video.
Optional JSON fields:
- `duration`: seconds, 1 to 15. Defaults to 8.
- `output_path`: local path where the server should save the mp4. If omitted, the server saves to `${HERMES_WEB_UI_HOME:-~/.hermes-web-ui}/media/<request_id>.mp4` and creates the `media` directory if needed.
- `timeout_ms`: maximum wait time. Defaults to 600000.
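The `duration` range can be checked locally before the request is sent. A minimal sketch: `valid_duration` is a hypothetical helper, and it accepts whole seconds only, which is an assumption since the field list does not say whether fractional values are allowed.

```bash
# Hypothetical helper: succeed only for an integer duration between 1 and 15.
valid_duration() {
  case "${1:-}" in
    "" | *[!0-9]*)
      # Reject empty input and anything that is not a plain non-negative integer.
      return 1
      ;;
  esac
  [ "$1" -ge 1 ] && [ "$1" -le 15 ]
}

# Usage sketch:
#   valid_duration "$DURATION" || { echo "duration must be 1 to 15 seconds" >&2; exit 1; }
```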
Example:
```bash
TOKEN="${AUTH_TOKEN:-}"
if [ -z "$TOKEN" ] && [ -n "${HERMES_WEB_UI_HOME:-}" ] && [ -f "$HERMES_WEB_UI_HOME/.token" ]; then
  TOKEN="$(cat "$HERMES_WEB_UI_HOME/.token")"
fi
if [ -z "$TOKEN" ] && [ -n "${HERMES_WEBUI_STATE_DIR:-}" ] && [ -f "$HERMES_WEBUI_STATE_DIR/.token" ]; then
  TOKEN="$(cat "$HERMES_WEBUI_STATE_DIR/.token")"
fi
if [ -z "$TOKEN" ] && [ -f "$HOME/.hermes-web-ui/.token" ]; then
  TOKEN="$(cat "$HOME/.hermes-web-ui/.token")"
fi
if [ -z "$TOKEN" ]; then
  echo "Missing Hermes Web UI token. Check AUTH_TOKEN, HERMES_WEB_UI_HOME, HERMES_WEBUI_STATE_DIR, or ~/.hermes-web-ui/.token." >&2
  exit 1
fi
BASE_URL="${HERMES_WEB_UI_URL:-}"
if [ -z "$BASE_URL" ]; then
  BASE_URL="http://127.0.0.1:${PORT:-8648}"
fi
BASE_URL="${BASE_URL%/}"
curl -sS -X POST "$BASE_URL/api/hermes/media/grok-image-to-video" \
  -H "Authorization: Bearer $TOKEN" \
  -H 'Content-Type: application/json' \
  -d '{
    "image_path": "/absolute/path/to/input.png",
    "prompt": "Animate the subject with a slow cinematic push-in and subtle natural motion.",
    "duration": 8,
    "output_path": "/absolute/path/to/output.mp4"
  }'
```
If the response has `code: "missing_xai_token"`, tell the user to set `XAI_API_KEY` or complete xAI OAuth login in Hermes Web UI before retrying.
When the request succeeds, return the generated `output_path` to the user.
@@ -0,0 +1,87 @@
---
name: hyperframes
description: "Create AI videos with HyperFrames in Hermes using HTML, CSS, and JavaScript compositions, then validate and render them to MP4. Use for short video intros, cinematic trailers, product promos, subtitle animations, HUD/tech visuals, web-to-video work, and motion graphics."
version: 1.0.0
author: Ekko
license: MIT
platforms: [linux, macos, windows]
metadata:
  hermes:
    tags: [hyperframes, ai-video, html-video, animation, motion-graphics, mp4]
prerequisites:
  commands: [node, npx]
---
# HyperFrames
Use this skill when the user asks Hermes to make a video with HyperFrames, such as a 30-second vertical video, a short intro, a cinematic micro-trailer, a product promo, animated captions, HUD-style tech visuals, a website-to-video piece, or an HTML/CSS/JS motion graphics render.
HyperFrames treats HTML as the video source of truth. Build video scenes as HTML compositions with CSS layout and JavaScript animation, validate the layout, then render the result to MP4.
## Setup
If HyperFrames is not installed or the official skill is missing, install it first:
```bash
hermes skills install official/creative/hyperframes
```
Use `npx hyperframes` for project operations. HyperFrames requires Node.js and FFmpeg. If rendering or preview fails, run:
```bash
npx hyperframes doctor
```
## Workflow
1. Convert the user's request into a short production brief: duration, aspect ratio, target platform, language, style, music or voiceover needs, and final output path.
2. For incomplete briefs, fill in reasonable defaults. Use 1080x1920 for vertical short video, 1920x1080 for horizontal video, 30 fps, and MP4 output.
3. Create or reuse a HyperFrames project:
```bash
npx hyperframes init my-video --non-interactive
```
4. Write the composition in HTML/CSS/JS. Make the static hero frame layout correct before adding animation.
5. Validate before rendering:
```bash
npx hyperframes lint
npx hyperframes inspect --samples 15
```
6. Preview when useful:
```bash
npx hyperframes preview
```
7. Render the final video:
```bash
npx hyperframes render --output final.mp4 --quality standard
```
Use `--quality draft` for fast iteration and `--quality high` for final delivery when the user asks for a polished export.
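The validate-then-render steps above can be chained so that rendering never runs on a composition that failed validation. This is a sketch, not an official wrapper: `HF_CMD` and `hyperframes_pipeline` are hypothetical names, and only the subcommands and flags already listed in this workflow are used.

```bash
# Hypothetical wrapper around the lint, inspect, and render steps.
# HF_CMD can be overridden, for example to a locally installed binary.
HF_CMD="${HF_CMD:-npx hyperframes}"

hyperframes_pipeline() {
  output="${1:-final.mp4}"
  quality="${2:-standard}"
  # $HF_CMD is intentionally unquoted so "npx hyperframes" splits into two words.
  # Each step runs only if the previous one succeeded.
  $HF_CMD lint &&
    $HF_CMD inspect --samples 15 &&
    $HF_CMD render --output "$output" --quality "$quality"
}

# Usage sketch:
#   hyperframes_pipeline final.mp4 draft   # fast iteration
#   hyperframes_pipeline final.mp4 high    # polished export
```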
## Composition Rules
- Use a root element with `data-composition-id`, `data-width`, and `data-height`.
- Use `data-start`, `data-duration`, and `data-track-index` for timed clips.
- Register GSAP timelines synchronously on `window.__timelines`.
- Use CSS as the final layout state, then animate from or to that state.
- Keep media playback under the HyperFrames runtime. Do not manually call `play()`, `pause()`, or seek media.
- Avoid nondeterministic animation logic such as `Math.random()` or `Date.now()` unless using a seeded generator.
- Do not use infinite repeats. Calculate finite repeat counts from the composition duration.
- Check that text, captions, UI panels, and HUD elements stay inside the frame on every inspected timestamp.
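The finite-repeat rule comes down to simple arithmetic. In GSAP, `repeat: N` plays an animation N + 1 times in total, so the count is the number of whole cycles that fit in the composition minus one. A minimal sketch, assuming no `repeatDelay` and a non-zero cycle length; `finite_repeat_count` is a hypothetical helper and works in milliseconds to stay within integer arithmetic.

```bash
# Hypothetical helper: compute a finite repeat count for a looping animation.
#   $1 = composition duration in milliseconds
#   $2 = duration of one animation cycle in milliseconds (must be > 0)
finite_repeat_count() {
  composition_ms="$1"
  cycle_ms="$2"
  plays=$(( composition_ms / cycle_ms ))   # whole cycles that fit
  if [ "$plays" -lt 1 ]; then
    echo 0                                 # the cycle is longer than the composition
  else
    echo $(( plays - 1 ))                  # repeat counts plays after the first one
  fi
}

# Usage sketch: a 2-second pulse inside a 30-second composition.
#   finite_repeat_count 30000 2000
```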
## Delivery
When finished, tell the user:
- the rendered MP4 path;
- the preview URL if a preview server is running;
- any assumptions made about duration, aspect ratio, style, narration, or music;
- any validation issues that remain unresolved.
Do not stop after writing HTML. A HyperFrames task is only complete after the composition has been checked with `lint` and `inspect`, and rendered to an MP4 unless the user explicitly asks for source files only.
@@ -0,0 +1,86 @@
---
name: markdown-viewer
description: "Create rich diagrams, data visualizations, technical architecture views, and editorial content cards directly in Markdown using the Markdown Viewer Agent Skills pack. Use for Mermaid-like diagram requests, PlantUML architecture diagrams, Vega charts, JSON Canvas maps, infographics, UML, cloud/network/security/data/IoT diagrams, and polished Markdown documentation visuals."
version: 1.0.0
author: Ekko
license: MIT
platforms: [linux, macos, windows]
metadata:
  hermes:
    source: markdown-viewer/skills
    tags: [markdown-viewer, diagrams, visualization, plantuml, vega, infographic, documentation]
prerequisites:
  commands: [node, npx]
---
# Markdown Viewer
Use this skill when the user wants a diagram, visualization, architecture view, data chart, technical documentation graphic, infographic, mind map, or editorial-quality content card directly inside Markdown.
Markdown Viewer Agent Skills is an opinionated skill pack for AI coding agents. It covers diagram generation, data visualization, and technical documentation using multiple Markdown-rendered engines, including PlantUML, Vega/Vega-Lite, JSON Canvas, infographic blocks, and direct HTML/CSS embeds.
## Setup
If the upstream skill pack is not installed, install it first:
```bash
npx skills add markdown-viewer/skills
```
After installation, prefer reading the specific upstream skill for the requested output type before writing complex diagrams. The pack includes detailed syntax rules, examples, and common pitfalls for each renderer.
## Skill Selection
Choose the smallest renderer that fits the user's goal:
| User goal | Use |
| --- | --- |
| Bar, line, scatter, heatmap, area, radar, word cloud, or data-driven chart | `vega` / `vega-lite` |
| KPI card, roadmap, timeline, SWOT, funnel, org chart, or structured visual summary | `infographic` |
| Free-position mind map, concept map, knowledge graph, or planning board | `canvas` |
| System layers, microservices, app/data/infrastructure layers | `architecture` |
| Editorial knowledge card, event card, data highlight, or polished content tile | `infocard` |
| UML class, sequence, activity, state, component, deployment, package, or use-case diagram | `uml` |
| AWS, Azure, GCP, Alibaba Cloud, Kubernetes, serverless, or multi-cloud diagram | `cloud` |
| LAN/WAN, data center, enterprise network, or device topology | `network` |
| Threat model, zero-trust, IAM, firewall, encryption, or compliance view | `security` |
| Enterprise architecture with business/application/technology layers | `archimate` |
| BPMN workflow, swim lanes, integration pattern, or value stream map | `bpmn` |
| ETL/ELT, warehouse, lakehouse, ML pipeline, or analytics workflow | `data-analytics` |
| Sensors, edge computing, smart factory/home, fleet, or digital twin view | `iot` |
| Hierarchical brainstorm tree or study outline | `mindmap` |
## Output Rules
- Write the result in Markdown unless the user asks for a separate file.
- Use the correct code fence for the chosen renderer:
  - `vega-lite` or `vega` for data charts.
  - `infographic` for infographic YAML blocks.
  - `canvas` for JSON Canvas maps.
  - `plantuml` or `puml` for UML, cloud, network, security, ArchiMate, BPMN, data analytics, IoT, and PlantUML mind maps.
- For `architecture` and `infocard`, embed the HTML/CSS directly in Markdown when that renderer expects raw HTML instead of a code fence.
- Keep diagrams focused. Prefer a clear, accurate first version over decorative complexity.
- Label nodes and edges with domain language the user already used.
- For technical diagrams, include enough structure to be useful in docs: boundaries, data flow, dependencies, trust zones, layers, or ownership where relevant.
- For data visualizations, include explicit sample data or use the data the user supplied. Do not invent real metrics without marking them as placeholders.
- For security or compliance diagrams, avoid implying guarantees. Show controls, boundaries, and risks factually.
## Workflow
1. Identify the user's artifact type: chart, diagram, architecture, process, mind map, infographic, or card.
2. Select the renderer from the Skill Selection table.
3. If the pack is installed locally, read the corresponding upstream `SKILL.md` for exact syntax and pitfalls.
4. Draft the Markdown artifact with the correct code fence or raw HTML/CSS style.
5. Check syntax before delivery: matching fences, valid JSON/YAML where required, PlantUML starts and ends correctly, and labels are readable.
6. If the user needs a file, save it as `.md` and include only the final artifact plus concise notes.
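The "matching fences" check in step 5 can be partly automated for saved files. This is a rough sketch: `fences_balanced` is a hypothetical helper that only counts lines starting with three backticks, so it catches an unclosed fence but does not validate the JSON, YAML, or PlantUML inside.

```bash
# Hypothetical helper: succeed when a Markdown file has an even number of
# code-fence lines, meaning every opened fence is also closed.
fences_balanced() {
  # grep -c prints 0 and exits non-zero when nothing matches, hence "|| true".
  count="$(grep -c '^```' "$1" || true)"
  [ $(( count % 2 )) -eq 0 ]
}

# Usage sketch:
#   fences_balanced diagram.md || echo "unclosed code fence in diagram.md" >&2
```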
## Delivery
When finished, tell the user:
- which renderer or sub-skill you used;
- where the Markdown file is, if one was created;
- any placeholder data or assumptions;
- any viewer requirement, such as needing a Markdown Viewer extension or compatible renderer.
Do not use static screenshots when the user asked for Markdown-native visuals. The value of this skill is that the output stays editable, reviewable, and renderable from Markdown.
@@ -0,0 +1,98 @@
---
name: remotion
description: "Create editable AI video projects with Remotion and React, then preview and render them to MP4. Use for vertical short videos, product demos, story-driven animations, HUD/tech visuals, feed ads, tutorial videos, subtitles, voiceover, sound effects, and code-based video iteration."
version: 1.0.0
author: Ekko
license: MIT
platforms: [linux, macos, windows]
metadata:
  hermes:
    source: skills-sh/google-labs-code/stitch-skills/remotion
    tags: [remotion, react-video, ai-video, mp4, animation, short-video]
prerequisites:
  commands: [node, npx]
---
# Remotion
Use this skill when the user wants Hermes to turn a short video idea into an editable, renderable React video project with Remotion.
Remotion is different from prompt-only AI video tools: it produces a code project. That means the agent can repeatedly edit subtitles, timing, characters, scenes, voiceover, sound effects, and visual rhythm, then render a new MP4.
Good fits include vertical short videos, product demos, story-driven animations, HUD/tech-style videos, feed ad creatives, tutorial explainers, caption-heavy clips, and reusable video templates.
## Setup
If the upstream Remotion skill is not installed, install it first:
```bash
hermes skills install skills-sh/google-labs-code/stitch-skills/remotion
```
For a new Remotion project, scaffold from an empty folder:
```bash
npx create-video@latest --yes --blank --no-tailwind my-video
```
Replace `my-video` with a short project name based on the user's brief.
## Workflow
1. Turn the request into a concise production brief: purpose, audience, duration, aspect ratio, style, scenes, text, narration, music, sound effects, and output path.
2. Use practical defaults when the user does not specify them: 1080x1920 for vertical short video, 1920x1080 for horizontal video, 30 fps, MP4 output, and a duration that fits the requested platform.
3. Create or reuse a Remotion project.
4. Build the video as React components and Remotion compositions. Keep scene data, captions, colors, timing, and copy easy to edit.
5. Use Remotion primitives for timing and media: `Composition`, `Sequence`, `AbsoluteFill`, `Audio`, `Video`, `Img`, `useCurrentFrame`, `useVideoConfig`, `interpolate`, and `spring`.
6. Preview in Remotion Studio while iterating:
```bash
npx remotion studio
```
7. For non-trivial layouts, render at least one still frame to catch layout, color, and timing issues:
```bash
npx remotion still <composition-id> --scale=0.25 --frame=30
```
8. Render the final MP4:
```bash
npx remotion render <composition-id> out/final.mp4
```
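Because Remotion timing is frame-based, a requested length in seconds has to be converted to frames before it can be used as the `durationInFrames` value of a composition. A minimal sketch of that arithmetic, using the 30 fps default from step 2; `duration_in_frames` is a hypothetical helper and handles whole seconds only.

```bash
# Hypothetical helper: convert a length in whole seconds to a frame count.
#   $1 = duration in seconds
#   $2 = frames per second (defaults to 30, the workflow default)
duration_in_frames() {
  seconds="$1"
  fps="${2:-30}"
  echo $(( seconds * fps ))
}

# Usage sketch: a 30-second vertical short at the default frame rate.
#   duration_in_frames 30
```

The same conversion explains the `--frame=30` flag in the still command: at 30 fps it captures the frame one second into the video.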
## Implementation Guidelines
- Prefer code that is easy to revise over one-off generated visuals.
- Keep copy, scene timing, colors, and asset references in clear constants or data arrays.
- Make captions readable on mobile: high contrast, generous line height, and safe margins.
- Use deterministic animation. Avoid time-based randomness that changes between renders.
- Use Remotion's frame-based timing instead of browser timers.
- Use separate components for scenes, captions, overlays, lower thirds, and recurring visual motifs.
- When adding voiceover or sound effects, keep audio timing explicit and easy to adjust.
- When using user assets, keep their original files in the project and reference them through Remotion's asset path conventions.
## Checks
Before delivery, run the strongest practical validation for the scope:
```bash
npm run build
npx remotion still <composition-id> --scale=0.25 --frame=30
npx remotion render <composition-id> out/final.mp4
```
If the project uses a different package script, follow that project instead. If rendering fails because of missing browser, FFmpeg, codec, or dependency setup, report the blocker and run the relevant Remotion or environment diagnostic before retrying.
## Delivery
When finished, tell the user:
- the Remotion project path;
- the rendered MP4 path;
- the preview command or Studio URL if a preview server is running;
- the composition ID used for rendering;
- any assumptions about duration, aspect ratio, voiceover, music, assets, or style.
Do not stop at a concept. A Remotion video task is complete when the project is editable and the requested MP4 is rendered, unless the user explicitly asks for source code only.