I Built a Local Transcription, Diarization, and Speaker Memory Tool That Transcribes Meetings and Saves Embeddings for Known Speakers, So They Are Automatically Labeled in Future Transcripts (It Also Checks Existing Transcripts to Update Them)

r/LocalLLaMA

I wanted to share a tool I built: NoobScribe (because my nickname is meganoob1337 ^^). The base was parakeet-diarized; the link is in ATTRIBUTIONS.md in the repository. It exposes a Whisper-compatible API for transcribing audio, although my main additions are the web UI and the endpoints for managing recordings, transcripts, and speakers. It runs in Docker (on CPU, or on GPU with the NVIDIA Container Toolkit) and uses pyannote.audio for diarization and nvidia/canary-1b-v2 for transcription.
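
Since the API is described as Whisper-compatible, a client call might look roughly like the sketch below. This is not taken from the NoobScribe repository: the host and port (`localhost:8000`), the path (`/v1/audio/transcriptions`, borrowed from the OpenAI convention), the `model` value, and the helper names are all assumptions, so check the repo for the real values. It uses only the Python standard library.

```python
import json
import os
import urllib.request
import uuid

# ASSUMPTIONS: host, port, and path are placeholders following the OpenAI
# Whisper API convention; the actual NoobScribe values may differ.
BASE_URL = "http://localhost:8000"
ENDPOINT = "/v1/audio/transcriptions"


def build_multipart(fields, file_field, filename, file_bytes, boundary=None):
    """Encode text fields plus one file as a multipart/form-data body."""
    boundary = boundary or uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(f"--{boundary}\r\n".encode())
        parts.append(
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'.encode()
        )
        parts.append(f"{value}\r\n".encode())
    parts.append(f"--{boundary}\r\n".encode())
    parts.append(
        f'Content-Disposition: form-data; name="{file_field}"; '
        f'filename="{filename}"\r\n'.encode()
    )
    parts.append(b"Content-Type: application/octet-stream\r\n\r\n")
    parts.append(file_bytes)
    parts.append(f"\r\n--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def build_request(audio_bytes, filename="meeting.wav", model="whisper-1"):
    """Build the POST request without sending it.

    ASSUMPTION: "whisper-1" is the OpenAI model name; the server may ignore
    it or expect a different value.
    """
    body, content_type = build_multipart(
        {"model": model, "response_format": "json"},
        "file",
        filename,
        audio_bytes,
    )
    return urllib.request.Request(
        BASE_URL + ENDPOINT,
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )


def transcribe(path):
    """Send an audio file to the server and return the parsed JSON response."""
    with open(path, "rb") as f:
        req = build_request(f.read(), filename=os.path.basename(path))
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

With the container running, `transcribe("meeting.wav")` would return whatever JSON the server sends back. In the OpenAI format that includes a `text` field, but how speaker labels show up in the response is something to look up in the repo.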