Whisper Transcription Service

This is a standalone HTTP service for transcribing audio files using the OpenAI Whisper model.

Prerequisites

Make sure you have Python 3.9+.

The service uses imageio-ffmpeg to provide the ffmpeg binary automatically, so you do not need to install a system ffmpeg manually.

Install the dependencies in a virtual environment:

cd whisper
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

Start the service:

python main.py

Or run with uvicorn directly:

uvicorn main:app --host 0.0.0.0 --port 8001 --reload

The service will run on http://localhost:8001.

GET /health
- Returns: {"status": "ok"}

POST /transcribe
- Body: multipart/form-data with a file field containing the audio blob.
- Returns: {"text": "transcribed text..."}
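
The two endpoints above can be exercised from a small client. The following is a minimal sketch using only the Python standard library, not the service's own code. It assumes the multipart field is literally named "file" and that the server accepts a generic application/octet-stream part. The helper names (encode_multipart, check_health, transcribe) and the sample.wav path are hypothetical.

```python
import json
import os
import urllib.request
import uuid

BASE_URL = "http://localhost:8001"  # address from the run instructions above


def encode_multipart(field_name, filename, data,
                     content_type="application/octet-stream"):
    """Build a multipart/form-data body holding a single file part.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; '
        f'filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n"
        "\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def check_health(base_url=BASE_URL):
    """Call GET /health and return the decoded JSON body."""
    with urllib.request.urlopen(f"{base_url}/health", timeout=5) as resp:
        return json.load(resp)


def transcribe(path, base_url=BASE_URL, field_name="file"):
    """Upload an audio file to POST /transcribe and return the "text" value.

    field_name="file" is an assumption about the form field the service reads.
    """
    with open(path, "rb") as f:
        audio = f.read()
    body, content_type = encode_multipart(
        field_name, os.path.basename(path), audio
    )
    req = urllib.request.Request(
        f"{base_url}/transcribe",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    # Transcription can be slow, so allow a generous timeout.
    with urllib.request.urlopen(req, timeout=300) as resp:
        return json.load(resp)["text"]


if __name__ == "__main__":
    print(check_health())
    print(transcribe("sample.wav"))  # hypothetical local audio file
```

Run the script while the service is up. If the upload is rejected, the field name or part content type assumed here is the first thing to check against main.py.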

In the DataClaw frontend:

After configuration, click the mic button in the chat input area to start voice input.