Files
DataClaw/whisper/README.md
T

51 lines
1.2 KiB
Markdown
Raw Normal View History

2026-03-28 20:00:48 +08:00
# Whisper Transcription Service
This is a standalone HTTP service for transcribing audio files using the OpenAI Whisper model.
## Prerequisites
2026-03-28 20:47:57 +08:00
Make sure you have Python 3.9+.
2026-03-28 20:00:48 +08:00
2026-03-28 20:47:57 +08:00
The service uses `imageio-ffmpeg` to provide ffmpeg binary automatically. You do not need to install system ffmpeg manually.
2026-03-28 20:00:48 +08:00
## Setup & Run
1. Create a virtual environment and install dependencies:
```bash
cd whisper
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
```
2. Start the server:
```bash
python main.py
```
Or run with uvicorn directly:
```bash
uvicorn main:app --host 0.0.0.0 --port 8001 --reload
```
The service will run on `http://localhost:8001`.
## API Endpoint
2026-03-28 20:47:57 +08:00
- `GET /health`
- Returns: `{"status": "ok"}`
2026-03-28 20:00:48 +08:00
- `POST /transcribe`
- Body: `multipart/form-data` with a `file` field containing the audio blob.
- Returns: `{"text": "transcribed text..."}`
2026-03-28 20:47:57 +08:00
## Frontend Integration
In DataClaw frontend:
1. Click username at bottom-left to open user menu.
2. Click `语音输入配置`.
3. Fill in service URL, e.g. `http://localhost:8001`.
4. Click `测试连接` first, then click `保存`.
After configuration, click the mic button in chat input area to start voice input.