whisper-smith

whisper-smith is a small Python CLI and library for transcribing audio, running speaker diarization, and producing speaker-aligned transcript JSON.

The main workflow is:

audio file -> transcript JSON + diarization JSON -> aligned JSON
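
To make the final step concrete, here is a minimal sketch of how a transcript and a diarization result could be merged into speaker-aligned output. It is not whisper-smith's actual implementation or API: the `align` and `overlap` functions are hypothetical, and the field names (`start`, `end`, `text`, `speaker`) and the `SPEAKER_00` labels are assumed for illustration. The sketch assigns each transcript segment to the speaker turn it overlaps the longest, which is one common alignment strategy.

```python
def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Return the length in seconds of the intersection of two time intervals."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def align(segments: list[dict], turns: list[dict]) -> list[dict]:
    """Attach a speaker label to each transcript segment (hypothetical helper).

    Each segment gets the speaker of the diarization turn it overlaps the
    longest. A segment that overlaps no turn gets speaker None.
    """
    aligned = []
    for seg in segments:
        best_speaker = None
        best_overlap = 0.0
        for turn in turns:
            shared = overlap(seg["start"], seg["end"], turn["start"], turn["end"])
            if shared > best_overlap:
                best_overlap = shared
                best_speaker = turn["speaker"]
        # Copy the segment and add the speaker field; the input is not mutated.
        aligned.append({**seg, "speaker": best_speaker})
    return aligned


# Assumed input shapes, standing in for the transcript and diarization JSON.
transcript = [
    {"start": 0.0, "end": 2.0, "text": "Hello there."},
    {"start": 2.0, "end": 5.0, "text": "Hi, how are you?"},
    {"start": 9.0, "end": 10.0, "text": "Goodbye."},
]
diarization = [
    {"start": 0.0, "end": 2.2, "speaker": "SPEAKER_00"},
    {"start": 2.2, "end": 5.5, "speaker": "SPEAKER_01"},
]

aligned = align(transcript, diarization)
for item in aligned:
    print(item["speaker"], item["text"])
```

The nested loop is quadratic in the number of segments and turns, which is fine for short recordings; for long ones, sorting both lists by start time and sweeping through them once would avoid the repeated scans.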

No local setup? Run the full pipeline on a free GPU in Google Colab:

Reference