doc: README updated for whisper
16 additions and 5 deletions
@@ -4,12 +4,9 @@ This is a standalone HTTP service for transcribing audio files using the OpenAI
## Prerequisites
-Make sure you have Python 3.9+ and `ffmpeg` installed on your system.
+Make sure you have Python 3.9+.
-To install `ffmpeg` on macOS:
-```bash
-brew install ffmpeg
-```
+The service uses `imageio-ffmpeg` to provide the ffmpeg binary automatically, so you do not need to install a system ffmpeg manually.
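As a rough illustration of how a bundled binary can be located, the sketch below asks `imageio-ffmpeg` for its executable and falls back to a system `ffmpeg` on `PATH`. The helper name `resolve_ffmpeg` is hypothetical, and the README does not say how the service itself wires the binary in, so treat this as one possible approach rather than the service's actual code.

```python
import shutil
from typing import Optional


def resolve_ffmpeg() -> Optional[str]:
    """Return a path to an ffmpeg executable, or None if none is found.

    Hypothetical helper: not taken from the service's source.
    """
    try:
        # imageio-ffmpeg ships a prebuilt ffmpeg binary inside the package.
        import imageio_ffmpeg

        return imageio_ffmpeg.get_ffmpeg_exe()
    except (ImportError, RuntimeError):
        # Package missing, or it could not find a usable binary:
        # fall back to a system-wide ffmpeg on PATH, if there is one.
        return shutil.which("ffmpeg")


# Example: print(resolve_ffmpeg() or "no ffmpeg binary found")
```

Note that the bundled file is usually not named plain `ffmpeg`, so code that shells out to `ffmpeg` by name may need the resolved path passed in explicitly.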
## Setup & Run
@@ -34,6 +31,20 @@ The service will run on `http://localhost:8001`.
## API Endpoint
+- `GET /health`
+  - Returns: `{"status": "ok"}`
- `POST /transcribe`
  - Body: `multipart/form-data` with a `file` field containing the audio blob.
  - Returns: `{"text": "transcribed text..."}`
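The two endpoints above could be exercised from Python roughly as follows. This is a minimal standard-library sketch, assuming the service is reachable at `http://localhost:8001`; the function names (`encode_multipart`, `check_health`, `transcribe`) and the `sample.wav` path in the usage comment are illustrative and not part of the service.

```python
import json
import os
import urllib.request
import uuid

BASE_URL = "http://localhost:8001"  # assumption: the default address from the README


def encode_multipart(field: str, filename: str, data: bytes,
                     content_type: str = "application/octet-stream") -> tuple[bytes, str]:
    """Build a multipart/form-data body holding a single file field.

    Returns the encoded body and the matching Content-Type header value.
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def check_health(base_url: str = BASE_URL) -> dict:
    """Call GET /health and return the decoded JSON object."""
    with urllib.request.urlopen(f"{base_url}/health", timeout=10) as resp:
        return json.load(resp)


def transcribe(path: str, base_url: str = BASE_URL) -> str:
    """Upload an audio file to POST /transcribe and return the "text" value."""
    with open(path, "rb") as f:
        body, ctype = encode_multipart("file", os.path.basename(path), f.read())
    req = urllib.request.Request(
        f"{base_url}/transcribe",
        data=body,
        headers={"Content-Type": ctype},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=300) as resp:
        return json.load(resp)["text"]


# Example (requires the service to be running; sample.wav is a placeholder):
#   check_health()
#   transcribe("sample.wav")
```

The generous timeout on `transcribe` is a guess; transcription time depends on the audio length and the model in use.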
+## Frontend Integration
+In the DataClaw frontend:
+1. Click the username at the bottom-left to open the user menu.
+2. Click `语音输入配置` (Voice Input Configuration).
+3. Fill in the service URL, e.g. `http://localhost:8001`.
+4. Click `测试连接` (Test Connection) first, then click `保存` (Save).
+After configuration, click the mic button in the chat input area to start voice input.