AI RESEARCH
Why don't automatic speech recognition (ASR) models use prompting? [D]
r/MachineLearning
I've been working on the listening part of my full-duplex speech model and realized that ASR prompting could be very useful. Deepgram allows word boosting, but that doesn't work very well in real-world applications. Another thing that's missing is feeding the whole conversation history to the ASR model as context, which could be very useful for voice agents. TL;DR: during testing I realized the model can be fine-tuned for prompting with text like "Expect a license plate (3 letters, 3 numbers). For example ABC123. <|start|>" or "Expect a person's name."
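To make the prompt format concrete, here's a minimal sketch of how such a text prefix could be assembled before it goes to the model. Everything here is an assumption for illustration: the `build_asr_prompt` helper, the `max_turns` cutoff, and putting conversation history on separate lines ahead of the hint are all hypothetical. Only the hint wording and the `<|start|>` marker come from the examples above.

```python
from typing import List, Optional


def build_asr_prompt(
    hint: str,
    history: Optional[List[str]] = None,
    max_turns: int = 4,
    start_token: str = "<|start|>",
) -> str:
    """Assemble a text prompt for a prompt-tuned ASR model (hypothetical layout).

    history     -- earlier conversation turns, oldest first (assumed format)
    max_turns   -- keep only the most recent turns so the prompt stays short
    start_token -- marker after which the model is expected to transcribe
    """
    # Keep the last `max_turns` turns; guard against max_turns == 0,
    # because history[-0:] would return the whole list.
    recent = history[-max_turns:] if history and max_turns > 0 else []
    lines = list(recent)
    # The hint goes last, directly before the start marker.
    lines.append(f"{hint.strip()} {start_token}")
    return "\n".join(lines)


plate_prompt = build_asr_prompt(
    "Expect a license plate (3 letters, 3 numbers). For example ABC123.",
    history=["Agent: What is your license plate number?"],
)
name_prompt = build_asr_prompt("Expect a person's name.")
print(plate_prompt)
print(name_prompt)
```

The same string format would have to be used when building the fine-tuning data, so the model sees identical prompts during training and inference.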