Documentation ¶
Overview ¶
Package stt provides a Twilio implementation of stt.Provider.
Twilio STT works within call contexts, either via the TwiML <Gather> verb or via real-time transcription on Media Streams. This provider supports both TwiML generation and real-time transcription configuration.
Index ¶
- type GatherConfig
- type GatherElement
- type Option
- type Provider
- func (p *Provider) GenerateGatherTwiML(config GatherConfig) string
- func (p *Provider) GenerateRealTimeTranscriptionConfig() map[string]any
- func (p *Provider) Name() string
- func (p *Provider) Transcribe(ctx context.Context, audio []byte, config stt.TranscriptionConfig) (*stt.TranscriptionResult, error)
- func (p *Provider) TranscribeFile(ctx context.Context, filePath string, config stt.TranscriptionConfig) (*stt.TranscriptionResult, error)
- func (p *Provider) TranscribeStream(ctx context.Context, config stt.TranscriptionConfig) (io.WriteCloser, <-chan stt.StreamEvent, error)
- func (p *Provider) TranscribeURL(ctx context.Context, url string, config stt.TranscriptionConfig) (*stt.TranscriptionResult, error)
- type ResponseElement
- type SayElement
- type TranscriptionEvent
Types ¶
type GatherConfig ¶
type GatherConfig struct {
	// Input specifies the accepted input types: "speech", "dtmf", or "speech dtmf".
	Input string
	// Language is the speech recognition language (e.g., "en-US").
	Language string
	// SpeechTimeout is the number of seconds of silence to wait before finalizing speech.
	SpeechTimeout string
	// Timeout is the number of seconds to wait for input.
	Timeout int
	// NumDigits is the maximum number of DTMF digits to collect.
	NumDigits int
	// FinishOnKey is the DTMF key that ends input.
	FinishOnKey string
	// Action is the URL to which results are submitted.
	Action string
	// Method is the HTTP method used for Action ("GET" or "POST").
	Method string
	// Enhanced enables enhanced speech recognition.
	Enhanced bool
	// SpeechModel is the speech model to use.
	SpeechModel string
	// ProfanityFilter enables profanity filtering.
	ProfanityFilter bool
	// Prompt is the text to say before gathering input.
	Prompt string
}
GatherConfig configures a TwiML <Gather> element.
type GatherElement ¶
type GatherElement struct {
	XMLName         xml.Name    `xml:"Gather"`
	Input           string      `xml:"input,attr,omitempty"`
	Language        string      `xml:"language,attr,omitempty"`
	SpeechTimeout   string      `xml:"speechTimeout,attr,omitempty"`
	Timeout         int         `xml:"timeout,attr,omitempty"`
	NumDigits       int         `xml:"numDigits,attr,omitempty"`
	FinishOnKey     string      `xml:"finishOnKey,attr,omitempty"`
	Action          string      `xml:"action,attr,omitempty"`
	Method          string      `xml:"method,attr,omitempty"`
	Enhanced        bool        `xml:"enhanced,attr,omitempty"`
	SpeechModel     string      `xml:"speechModel,attr,omitempty"`
	ProfanityFilter bool        `xml:"profanityFilter,attr,omitempty"`
	Say             *SayElement `xml:",omitempty"`
}
GatherElement represents a TwiML <Gather> element.
type Option ¶
type Option func(*options)
Option configures the Provider.
func WithLanguage ¶
WithLanguage sets the default language.
func WithProfanityFilter ¶
WithProfanityFilter enables or disables the profanity filter.
func WithSpeechModel ¶
WithSpeechModel sets the speech recognition model. Valid values are "default", "numbers_and_commands", "phone_call", "video", and "enhanced".
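The Option type above has the shape used by Go's functional-options pattern. The following is a minimal sketch of how such options could be applied, not the package's actual code: the fields of the unexported options struct, the default values, and the lowercase with* helpers (stand-ins for WithLanguage, WithSpeechModel, and WithProfanityFilter, whose signatures are not shown here) are all assumptions made for illustration.

```go
package main

import "fmt"

// options is a hypothetical version of the unexported settings struct.
// The real field names are not documented.
type options struct {
	language        string
	speechModel     string
	profanityFilter bool
}

// Option matches the documented type shape: a function that mutates options.
type Option func(*options)

// withLanguage is a stand-in for WithLanguage (assumed signature).
func withLanguage(lang string) Option {
	return func(o *options) { o.language = lang }
}

// withSpeechModel is a stand-in for WithSpeechModel (assumed signature).
func withSpeechModel(model string) Option {
	return func(o *options) { o.speechModel = model }
}

// withProfanityFilter is a stand-in for WithProfanityFilter (assumed signature).
func withProfanityFilter(enabled bool) Option {
	return func(o *options) { o.profanityFilter = enabled }
}

// applyOptions starts from assumed defaults and applies each option in order,
// so later options override earlier ones.
func applyOptions(opts ...Option) options {
	o := options{language: "en-US", speechModel: "default"}
	for _, opt := range opts {
		opt(&o)
	}
	return o
}

func main() {
	o := applyOptions(
		withLanguage("en-GB"),
		withSpeechModel("phone_call"),
		withProfanityFilter(true),
	)
	fmt.Printf("%+v\n", o)
}
```

Because each option is an ordinary function value, callers can pass any subset of options and any setting they omit keeps its default.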
type Provider ¶
type Provider struct {
	// contains filtered or unexported fields
}
Provider implements stt.Provider using Twilio's speech recognition.
func (*Provider) GenerateGatherTwiML ¶
func (p *Provider) GenerateGatherTwiML(config GatherConfig) string
GenerateGatherTwiML generates TwiML for speech recognition. Use this to create interactive voice response (IVR) flows.
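To illustrate the kind of document GenerateGatherTwiML produces, here is a hedged sketch that marshals local, trimmed copies of the documented element types with encoding/xml. It is not the package's implementation. The Text field on SayElement and the gatherTwiML helper are hypothetical, since the documentation does not show the fields of SayElement or how the provider builds its output.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// SayElement is a local stand-in; the Text field is an assumption.
type SayElement struct {
	XMLName xml.Name `xml:"Say"`
	Text    string   `xml:",chardata"`
}

// GatherElement is a trimmed local copy of the documented struct.
type GatherElement struct {
	XMLName       xml.Name    `xml:"Gather"`
	Input         string      `xml:"input,attr,omitempty"`
	Language      string      `xml:"language,attr,omitempty"`
	SpeechTimeout string      `xml:"speechTimeout,attr,omitempty"`
	Timeout       int         `xml:"timeout,attr,omitempty"`
	Action        string      `xml:"action,attr,omitempty"`
	Method        string      `xml:"method,attr,omitempty"`
	Say           *SayElement `xml:",omitempty"`
}

// ResponseElement is a local copy of the documented struct.
type ResponseElement struct {
	XMLName xml.Name       `xml:"Response"`
	Gather  *GatherElement `xml:",omitempty"`
}

// gatherTwiML is a hypothetical helper that wraps a <Gather> in a <Response>.
// Zero-valued attributes are dropped because of the omitempty tags.
func gatherTwiML(input, language, action, prompt string) (string, error) {
	resp := ResponseElement{
		Gather: &GatherElement{Input: input, Language: language, Action: action},
	}
	if prompt != "" {
		resp.Gather.Say = &SayElement{Text: prompt}
	}
	out, err := xml.Marshal(resp)
	if err != nil {
		return "", err
	}
	return xml.Header + string(out), nil
}

func main() {
	twiml, err := gatherTwiML("speech", "en-US", "/handle-speech", "Please say your name.")
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	fmt.Println(twiml)
}
```

In this sketch the <Say> prompt is nested inside <Gather>, which matches the documented Say field on GatherElement.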
func (*Provider) GenerateRealTimeTranscriptionConfig ¶
func (p *Provider) GenerateRealTimeTranscriptionConfig() map[string]any
GenerateRealTimeTranscriptionConfig generates the configuration for enabling real-time transcription on Twilio Media Streams.
func (*Provider) Transcribe ¶
func (p *Provider) Transcribe(ctx context.Context, audio []byte, config stt.TranscriptionConfig) (*stt.TranscriptionResult, error)
Transcribe is not directly supported for arbitrary audio, because Twilio STT operates within call contexts.
func (*Provider) TranscribeFile ¶
func (p *Provider) TranscribeFile(ctx context.Context, filePath string, config stt.TranscriptionConfig) (*stt.TranscriptionResult, error)
TranscribeFile is not supported by Twilio.
func (*Provider) TranscribeStream ¶
func (p *Provider) TranscribeStream(ctx context.Context, config stt.TranscriptionConfig) (io.WriteCloser, <-chan stt.StreamEvent, error)
TranscribeStream creates a streaming transcription session. This works with Twilio's real-time transcription on Media Streams.
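The return values of TranscribeStream (a writer for audio plus a channel of events) suggest a producer/consumer usage pattern. The sketch below shows that pattern against a fake stream, not against the real provider. The streamEvent type, its fields, the openStream fake, and the assumption that closing the writer ends the session are all hypothetical, since the fields of stt.StreamEvent and the provider's session semantics are not documented here.

```go
package main

import (
	"fmt"
	"io"
)

// streamEvent is a hypothetical stand-in for stt.StreamEvent.
type streamEvent struct {
	Transcript string
	IsFinal    bool
}

// openStream is a fake with the same return shape as TranscribeStream.
// It drains the audio and emits one final event with the byte count.
func openStream() (io.WriteCloser, <-chan streamEvent, error) {
	pr, pw := io.Pipe()
	events := make(chan streamEvent)
	go func() {
		defer close(events)
		n, _ := io.Copy(io.Discard, pr) // returns once the writer is closed
		events <- streamEvent{Transcript: fmt.Sprintf("received %d bytes", n), IsFinal: true}
	}()
	return pw, events, nil
}

// streamAudio writes chunks from a separate goroutine, closes the writer to
// signal end of audio, and collects events until the channel is closed.
func streamAudio(chunks [][]byte) ([]streamEvent, error) {
	w, events, err := openStream()
	if err != nil {
		return nil, err
	}
	go func() {
		defer w.Close()
		for _, c := range chunks {
			if _, err := w.Write(c); err != nil {
				return
			}
		}
	}()
	var got []streamEvent
	for ev := range events {
		got = append(got, ev)
	}
	return got, nil
}

func main() {
	evs, err := streamAudio([][]byte{[]byte("abc"), []byte("de")})
	if err != nil {
		fmt.Println("stream failed:", err)
		return
	}
	for _, ev := range evs {
		fmt.Println(ev.Transcript, ev.IsFinal)
	}
}
```

Writing from a separate goroutine keeps the caller free to read events while audio is still being sent, which avoids a deadlock if the stream applies backpressure on writes.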
func (*Provider) TranscribeURL ¶
func (p *Provider) TranscribeURL(ctx context.Context, url string, config stt.TranscriptionConfig) (*stt.TranscriptionResult, error)
TranscribeURL is not supported by Twilio.
type ResponseElement ¶
type ResponseElement struct {
	XMLName xml.Name       `xml:"Response"`
	Gather  *GatherElement `xml:",omitempty"`
}
ResponseElement represents a TwiML <Response> element.
type SayElement ¶
SayElement represents a TwiML <Say> element.
type TranscriptionEvent ¶
type TranscriptionEvent struct {
	Type       string    `json:"type"`
	Transcript string    `json:"transcript"`
	Confidence float64   `json:"confidence"`
	IsFinal    bool      `json:"is_final"`
	Language   string    `json:"language"`
	StartTime  float64   `json:"start_time"`
	EndTime    float64   `json:"end_time"`
	Timestamp  time.Time `json:"timestamp"`
}
TranscriptionEvent represents a real-time transcription event from Twilio.
func ParseTranscriptionEvent ¶
func ParseTranscriptionEvent(data []byte) (*TranscriptionEvent, error)
ParseTranscriptionEvent parses a real-time transcription event from JSON.
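Given the JSON tags on TranscriptionEvent, parsing could plausibly be a thin wrapper around encoding/json. The sketch below uses a local copy of the documented struct and a hypothetical parseTranscriptionEvent helper; it is an assumption about the approach, not the package's code, and the sample payload (including the "transcript" value for type) is invented for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// TranscriptionEvent is a local copy of the documented struct.
type TranscriptionEvent struct {
	Type       string    `json:"type"`
	Transcript string    `json:"transcript"`
	Confidence float64   `json:"confidence"`
	IsFinal    bool      `json:"is_final"`
	Language   string    `json:"language"`
	StartTime  float64   `json:"start_time"`
	EndTime    float64   `json:"end_time"`
	Timestamp  time.Time `json:"timestamp"`
}

// parseTranscriptionEvent is a hypothetical stand-in for ParseTranscriptionEvent.
// It decodes the payload and wraps any decoding error with context.
func parseTranscriptionEvent(data []byte) (*TranscriptionEvent, error) {
	var ev TranscriptionEvent
	if err := json.Unmarshal(data, &ev); err != nil {
		return nil, fmt.Errorf("parse transcription event: %w", err)
	}
	return &ev, nil
}

func main() {
	// Illustrative payload; field values are made up.
	payload := []byte(`{
		"type": "transcript",
		"transcript": "hello world",
		"confidence": 0.92,
		"is_final": true,
		"language": "en-US",
		"start_time": 0.4,
		"end_time": 1.3,
		"timestamp": "2024-01-01T12:00:00Z"
	}`)
	ev, err := parseTranscriptionEvent(payload)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(ev.Transcript, ev.IsFinal, ev.Timestamp.Format(time.RFC3339))
}
```

Note that time.Time decodes from an RFC 3339 string, so a payload carrying the timestamp in another format would need a custom type or a preprocessing step.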
func (*TranscriptionEvent) ToStreamEvent ¶
func (e *TranscriptionEvent) ToStreamEvent() stt.StreamEvent
ToStreamEvent converts a Twilio transcription event to an OmniVoice stream event.