CLI Guide

Basic usage

whisper-smith data/sample.m4a

Save the transcript to a file:

whisper-smith data/sample.m4a --output data/sample.txt

Choose an output format:

whisper-smith data/sample.m4a --format json --output data/sample.json

Supported transcript formats are txt, json, srt, and vtt. When --format is omitted, the format is inferred from --output.
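
To produce every supported format in one pass, the documented flags can be wrapped in a small shell loop. This is only a sketch: the function name transcribe_all_formats and the WHISPER_SMITH_BIN override are hypothetical conveniences, not part of the tool. It passes --format explicitly, so it does not depend on how inference from --output behaves.

```shell
# Hypothetical helper: transcribe one audio file into each supported format.
# WHISPER_SMITH_BIN is an assumed override for the executable name (handy for dry runs).
transcribe_all_formats() {
    input=$1
    base=${input%.*}    # strip the final extension: data/sample.m4a -> data/sample
    for fmt in txt json srt vtt; do
        "${WHISPER_SMITH_BIN:-whisper-smith}" "$input" \
            --format "$fmt" --output "$base.$fmt" || return 1
    done
}
```

Calling transcribe_all_formats data/sample.m4a would run the tool four times, writing data/sample.txt, data/sample.json, data/sample.srt, and data/sample.vtt. Setting WHISPER_SMITH_BIN=echo prints the commands instead of running them.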

Overwrite an existing output file:

whisper-smith data/sample.m4a --output data/sample.txt --overwrite

Speaker diarization

whisper-smith data/sample.m4a --diarize --output data/sample.diarization.json

Diarization currently supports JSON output only.

Optional speaker-count hints:

whisper-smith data/sample.m4a --diarize --num-speakers 2
whisper-smith data/sample.m4a --diarize --min-speakers 1 --max-speakers 3

Speaker-aligned JSON

whisper-smith data/sample.m4a --align --output data/sample.aligned.json

This writes the aligned transcript JSON to the --output path and, by default, writes the intermediate transcript and diarization JSON files to the same directory.

Use a separate artifact directory:

whisper-smith data/sample.m4a --align --output data/sample.aligned.json --artifacts-dir data/artifacts

Suppress intermediate files:

whisper-smith data/sample.m4a --align --output data/sample.aligned.json --no-artifacts
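
When several recordings need aligning, the flags above can be combined in a loop that gives each recording its own artifact directory. A minimal sketch, assuming a directory of .m4a files; align_dir, its arguments, and WHISPER_SMITH_BIN are hypothetical names chosen for illustration.

```shell
# Hypothetical helper: align every .m4a file in a directory.
# Usage: align_dir <audio_dir> <artifacts_root>
align_dir() {
    audio_dir=$1
    artifacts_root=$2
    for audio in "$audio_dir"/*.m4a; do
        [ -e "$audio" ] || continue      # skip the literal pattern when nothing matches
        name=$(basename "$audio" .m4a)
        # Create the artifact directory up front in case the tool does not.
        mkdir -p "$artifacts_root/$name" || return 1
        "${WHISPER_SMITH_BIN:-whisper-smith}" "$audio" --align \
            --output "$audio_dir/$name.aligned.json" \
            --artifacts-dir "$artifacts_root/$name" || return 1
    done
}
```

For example, align_dir data data/artifacts would write data/sample.aligned.json and place the intermediate files for that recording under data/artifacts/sample.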

Convert JSON to CSV

Turn transcript or aligned JSON into a spreadsheet-friendly CSV:

whisper-smith json-to-csv data/sample.aligned.json --output data/sample.csv

The CSV columns are start, end, start_dttm, speaker, and text.
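
A quick sanity check on the exported file can confirm the column layout before it is loaded into a spreadsheet. The sketch below assumes the CSV begins with a header row that lists the columns in the order given above, comma-separated and unquoted; that layout is an assumption, and check_csv_header is a hypothetical helper name.

```shell
# Hypothetical helper: verify that a CSV begins with the expected header row.
# Assumes an unquoted, comma-separated header in the documented column order.
check_csv_header() {
    expected='start,end,start_dttm,speaker,text'
    # Read the first line and drop a trailing carriage return, if any.
    actual=$(head -n 1 "$1" | tr -d '\r')
    if [ "$actual" = "$expected" ]; then
        return 0
    fi
    echo "unexpected header in $1: $actual" >&2
    return 1
}
```

Running check_csv_header data/sample.csv returns success when the header matches and prints the mismatching header to standard error otherwise.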

Use --initial-datetime to anchor start_dttm to the actual start time of the recording:

whisper-smith json-to-csv data/sample.aligned.json --output data/sample.csv --initial-datetime 2026-06-10T09:00:00
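
The alignment and conversion commands can be chained so that a recording goes from audio to CSV in one step. A minimal sketch using only flags shown in this guide; audio_to_csv and WHISPER_SMITH_BIN are hypothetical names, and the output file naming is an assumption made for the example.

```shell
# Hypothetical helper: align a recording, then export the aligned JSON to CSV.
# Usage: audio_to_csv <audio_file> <recording_start, e.g. 2026-06-10T09:00:00>
audio_to_csv() {
    audio=$1
    started_at=$2
    base=${audio%.*}    # data/sample.m4a -> data/sample
    bin=${WHISPER_SMITH_BIN:-whisper-smith}
    # Step 1: speaker-aligned JSON, without intermediate artifact files.
    "$bin" "$audio" --align --output "$base.aligned.json" --no-artifacts || return 1
    # Step 2: CSV export, passing the recording start time for start_dttm.
    "$bin" json-to-csv "$base.aligned.json" --output "$base.csv" \
        --initial-datetime "$started_at"
}
```

For example, audio_to_csv data/sample.m4a 2026-06-10T09:00:00 would leave data/sample.aligned.json and data/sample.csv next to the source audio.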