diff --git a/README.md b/README.md
index de8b3fc..6ee23bc 100644
--- a/README.md
+++ b/README.md
@@ -98,7 +98,28 @@
 npm run dev
 ```
 
-### 3. 初始账号配置 👤
+### 3. 语音识别服务(可选)🎙️
+
+若你希望使用聊天输入框中的语音输入能力,请单独启动 `whisper` 服务:
+
+```bash
+cd whisper
+python -m venv .venv
+source .venv/bin/activate
+pip install -r requirements.txt
+python main.py
+```
+
+默认服务地址:`http://localhost:8001`
+健康检查接口:`GET /health`
+
+前端配置方式:
+1. 点击左下角用户名,打开菜单;
+2. 进入「语音输入配置」;
+3. 填写服务地址(例如 `http://localhost:8001`);
+4. 点击「测试连接」通过后保存。
+
+### 4. 初始账号配置 👤
 系统首次注册的用户将自动成为管理员。您可以在登录页面直接点击“注册”按钮创建您的管理员账号(例如:用户名 `admin`,密码 `admin`),随后即可登录并管理项目、数据源和用户。
 
 ***
diff --git a/README_en.md b/README_en.md
index 5430fed..049ac9b 100644
--- a/README_en.md
+++ b/README_en.md
@@ -97,7 +97,28 @@ npm install
 npm run dev
 ```
 
-### 3. Initial Account Setup 👤
+### 3. Optional Voice Service 🎙️
+
+If you want to use voice input in chat, run the standalone `whisper` service:
+
+```bash
+cd whisper
+python -m venv .venv
+source .venv/bin/activate
+pip install -r requirements.txt
+python main.py
+```
+
+Default service URL: `http://localhost:8001`
+Health endpoint: `GET /health`
+
+Frontend setup:
+1. Click the username in the bottom-left to open the user menu;
+2. Open `Voice Input Settings`;
+3. Fill in the service URL (e.g. `http://localhost:8001`);
+4. Click `Test Connection`, then `Save`.
+
+### 4. Initial Account Setup 👤
 The first user to register in the system will automatically be granted admin privileges. You can simply click the "Register" button on the login page to create your admin account (e.g., Username: `admin`, Password: `admin`), and then log in to manage projects, data sources, and users.
 
 ***
diff --git a/frontend/src/pages/Skills.tsx b/frontend/src/pages/Skills.tsx
index a59bc5a..06f7f0d 100644
--- a/frontend/src/pages/Skills.tsx
+++ b/frontend/src/pages/Skills.tsx
@@ -286,7 +286,7 @@ export function Skills() {
   return (
-
+

diff --git a/whisper/README.md b/whisper/README.md
index 581d654..f3be262 100644
--- a/whisper/README.md
+++ b/whisper/README.md
@@ -4,12 +4,9 @@ This is a standalone HTTP service for transcribing audio files using the OpenAI
 
 ## Prerequisites
 
-Make sure you have Python 3.9+ and `ffmpeg` installed on your system.
+Make sure you have Python 3.9+.
 
-To install `ffmpeg` on macOS:
-```bash
-brew install ffmpeg
-```
+The service uses `imageio-ffmpeg` to provide ffmpeg binary automatically. You do not need to install system ffmpeg manually.
 
 ## Setup & Run
 
@@ -34,6 +31,20 @@ The service will run on `http://localhost:8001`.
 
 ## API Endpoint
 
+- `GET /health`
+  - Returns: `{"status": "ok"}`
+
 - `POST /transcribe`
   - Body: `multipart/form-data` with a `file` field containing the audio blob.
   - Returns: `{"text": "transcribed text..."}`
+
+## Frontend Integration
+
+In DataClaw frontend:
+
+1. Click username at bottom-left to open user menu.
+2. Click `语音输入配置`.
+3. Fill in service URL, e.g. `http://localhost:8001`.
+4. Click `测试连接` first, then click `保存`.
+
+After configuration, click the mic button in chat input area to start voice input.
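
Reviewer note: the two endpoints documented in `whisper/README.md` can be exercised without the frontend. The sketch below is a minimal standard-library client, not code from this PR. It assumes the default URL `http://localhost:8001` and the documented response shapes; the helper names (`is_healthy`, `build_multipart`, `check_health`, `transcribe`) and the `audio/webm` content type are hypothetical choices for illustration, and whether the server inspects the part's filename or content type is not stated in the diff.

```python
import json
import os
import urllib.request
import uuid

BASE_URL = "http://localhost:8001"  # default service URL stated in the README


def is_healthy(payload: bytes) -> bool:
    """Return True if a /health response body is exactly {"status": "ok"}."""
    try:
        return json.loads(payload) == {"status": "ok"}
    except ValueError:  # invalid JSON or undecodable bytes
        return False


def build_multipart(filename: str, data: bytes, content_type: str = "audio/webm"):
    """Encode a single `file` field as multipart/form-data.

    Returns (body, content_type_header). The audio/webm default is an
    assumption; the README only says the field holds "the audio blob".
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def check_health(base_url: str = BASE_URL, timeout: float = 3.0) -> bool:
    """GET /health; any network or HTTP error counts as unhealthy."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return resp.status == 200 and is_healthy(resp.read())
    except OSError:  # URLError, HTTPError and socket timeouts all derive from OSError
        return False


def transcribe(path: str, base_url: str = BASE_URL, timeout: float = 60.0) -> str:
    """POST an audio file to /transcribe and return the `text` field."""
    with open(path, "rb") as f:
        body, ctype = build_multipart(os.path.basename(path), f.read())
    req = urllib.request.Request(
        f"{base_url}/transcribe",
        data=body,
        headers={"Content-Type": ctype},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read())["text"]
```

With the service running, `check_health()` mirrors what the frontend's connection test presumably does, and `transcribe("sample.webm")` (a hypothetical local file) would return the transcribed string.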