doc: README updated for whisper

2026-03-28 20:47:57 +08:00
parent b0a8a69373
commit bd7776d1b7
4 changed files with 61 additions and 8 deletions
@@ -98,7 +98,28 @@ npm run dev
 ```


-### 3. Initial Account Setup 👤
+### 3. Speech Recognition Service (Optional) 🎙️
+
+If you want to use voice input in the chat input box, start the `whisper` service separately:
+
+```bash
+cd whisper
+python -m venv .venv
+source .venv/bin/activate
+pip install -r requirements.txt
+python main.py
+```
+
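+Default service URL: `http://localhost:8001`  
+Health check endpoint: `GET /health`
+
+Frontend configuration:
+1. Click the username in the bottom-left corner to open the menu;
+2. Open 「语音输入配置」 (Voice Input Settings);
+3. Enter the service URL (for example, `http://localhost:8001`);
+4. Click 「测试连接」 (Test Connection) and, once the test passes, save.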
+
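+### 4. Initial Account Setup 👤
 The first user to register automatically becomes the administrator. On the login page, click “注册” (Register) to create your administrator account (for example, username `admin`, password `admin`), then log in to manage projects, data sources, and users.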

 ***
@@ -97,7 +97,28 @@ npm install
 npm run dev
 ```

-### 3. Initial Account Setup 👤
+### 3. Optional Voice Service 🎙️
+
+If you want to use voice input in chat, run the standalone `whisper` service:
+
+```bash
+cd whisper
+python -m venv .venv
+source .venv/bin/activate
+pip install -r requirements.txt
+python main.py
+```
+
+Default service URL: `http://localhost:8001`  
+Health endpoint: `GET /health`
+
+Frontend setup:
+1. Click the username in the bottom-left to open the user menu;
+2. Open `Voice Input Settings`;
+3. Fill in the service URL (e.g. `http://localhost:8001`);
+4. Click `Test Connection`; once the test succeeds, click `Save`.
+
+### 4. Initial Account Setup 👤
 The first user to register in the system will automatically be granted admin privileges. You can simply click the "Register" button on the login page to create your admin account (e.g., Username: `admin`, Password: `admin`), and then log in to manage projects, data sources, and users.

 ***
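
The health endpoint described in this hunk can also be checked from a script before configuring the frontend. The following is a minimal sketch using only the Python standard library; the function name `whisper_is_healthy` and the timeout value are illustrative assumptions, while the `{"status": "ok"}` response body is taken from the whisper service README in this commit.

```python
import json
import urllib.error
import urllib.request


def whisper_is_healthy(base_url: str = "http://localhost:8001", timeout: float = 3.0) -> bool:
    """Return True if GET {base_url}/health answers with {"status": "ok"}.

    Hypothetical helper for illustration; it is not part of the repository.
    """
    url = base_url.rstrip("/") + "/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            payload = json.loads(resp.read().decode("utf-8"))
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, HTTP error status, or a non-JSON body.
        return False
    return isinstance(payload, dict) and payload.get("status") == "ok"
```

Calling `whisper_is_healthy()` with no arguments targets the default service URL; a `False` result most likely means the service has not been started or is listening on a different address.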
@@ -286,7 +286,7 @@ export function Skills() {

  return (
    <div className="h-full flex flex-col bg-background overflow-hidden">
-      <div className="border-b border-border px-8 pt-5 bg-background shrink-0">
+      <div className="border-b border-border px-8 pt-5 pr-24 bg-background shrink-0">
        <div className="flex items-center justify-between mb-4">
          <div>
            <h1 className="text-2xl font-bold text-foreground flex items-center gap-2">
@@ -4,12 +4,9 @@ This is a standalone HTTP service for transcribing audio files using the OpenAI

 ## Prerequisites

-Make sure you have Python 3.9+ and `ffmpeg` installed on your system.
+Make sure you have Python 3.9+ installed.

-To install `ffmpeg` on macOS:
-```bash
-brew install ffmpeg
-```
+The service uses `imageio-ffmpeg` to provide the ffmpeg binary automatically, so you do not need to install ffmpeg on your system manually.

 ## Setup & Run

@@ -34,6 +31,20 @@ The service will run on `http://localhost:8001`.

 ## API Endpoint

+- `GET /health`
+  - Returns: `{"status": "ok"}`
+
 - `POST /transcribe`
  - Body: `multipart/form-data` with a `file` field containing the audio blob.
  - Returns: `{"text": "transcribed text..."}`
+
+## Frontend Integration
+
+In the DataClaw frontend:
+
+1. Click the username at the bottom-left to open the user menu.
+2. Click `语音输入配置` (Voice Input Settings).
+3. Fill in the service URL, e.g. `http://localhost:8001`.
+4. Click `测试连接` (Test Connection) first, then click `保存` (Save).
+
+After configuration, click the mic button in the chat input area to start voice input.
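
For testing the service without the frontend, the `POST /transcribe` endpoint documented above can be called directly. The sketch below uses only the Python standard library and assumes what the README states: a `multipart/form-data` body with a `file` field, and a JSON response with a `text` key. The helper names `encode_file_field` and `transcribe`, the timeout, and the MIME-type guessing are illustrative assumptions, not code from the repository.

```python
import json
import mimetypes
import urllib.request
import uuid
from pathlib import Path


def encode_file_field(filename: str, data: bytes, field: str = "file") -> tuple[bytes, str]:
    """Encode one file as a multipart/form-data body.

    Returns (body, content_type_header). Hypothetical helper for illustration.
    """
    boundary = uuid.uuid4().hex
    mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {mime}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def transcribe(audio_path: str, base_url: str = "http://localhost:8001", timeout: float = 120.0) -> str:
    """POST the audio file to {base_url}/transcribe and return the "text" field."""
    path = Path(audio_path)
    body, content_type = encode_file_field(path.name, path.read_bytes())
    req = urllib.request.Request(
        base_url.rstrip("/") + "/transcribe",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    # urllib raises HTTPError for non-2xx responses; callers may want to catch it.
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))["text"]
```

A call such as `transcribe("sample.wav")` (the file name is a placeholder) sends the audio to the default service URL. The generous timeout is a guess, since transcription time depends on the audio length and the model the service loads.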