stt

package
v0.1.0
Warning

This package is not in the latest version of its module.

Published: Dec 28, 2025 License: MIT Imports: 8 Imported by: 0

Documentation

Overview

Package stt provides a Twilio implementation of stt.Provider.

Twilio STT works within call contexts, either through the TwiML <Gather> verb or through real-time transcription on Media Streams. This provider supports TwiML generation and real-time transcription configuration.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type GatherConfig

type GatherConfig struct {
	// Input specifies input types: "speech", "dtmf", or "speech dtmf"
	Input string

	// Language is the speech recognition language (e.g., "en-US")
	Language string

	// SpeechTimeout is the silence, in seconds, that finalizes speech, or "auto"
	SpeechTimeout string

	// Timeout is seconds to wait for input
	Timeout int

	// NumDigits is max DTMF digits to collect
	NumDigits int

	// FinishOnKey is the DTMF key that ends input
	FinishOnKey string

	// Action is the URL to submit results to
	Action string

	// Method is the HTTP method ("GET" or "POST")
	Method string

	// Enhanced enables enhanced speech recognition
	Enhanced bool

	// SpeechModel is the speech model to use
	SpeechModel string

	// ProfanityFilter filters profanity
	ProfanityFilter bool

	// Prompt is the text to say before gathering
	Prompt string
}

GatherConfig configures a TwiML <Gather> element.

type GatherElement

type GatherElement struct {
	XMLName         xml.Name    `xml:"Gather"`
	Input           string      `xml:"input,attr,omitempty"`
	Language        string      `xml:"language,attr,omitempty"`
	SpeechTimeout   string      `xml:"speechTimeout,attr,omitempty"`
	Timeout         int         `xml:"timeout,attr,omitempty"`
	NumDigits       int         `xml:"numDigits,attr,omitempty"`
	FinishOnKey     string      `xml:"finishOnKey,attr,omitempty"`
	Action          string      `xml:"action,attr,omitempty"`
	Method          string      `xml:"method,attr,omitempty"`
	Enhanced        bool        `xml:"enhanced,attr,omitempty"`
	SpeechModel     string      `xml:"speechModel,attr,omitempty"`
	ProfanityFilter bool        `xml:"profanityFilter,attr,omitempty"`
	Say             *SayElement `xml:",omitempty"`
}

GatherElement represents a TwiML <Gather> element.
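As a rough illustration of how these struct tags map to TwiML, the sketch below marshals local copies of the element types with encoding/xml. The lowercase types are trimmed stand-ins declared only for this example, and renderGather is a hypothetical helper; neither is part of this package.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// Local stand-ins that copy a subset of the documented struct tags.
// They are declared here only so the example is self-contained.
type sayElement struct {
	XMLName xml.Name `xml:"Say"`
	Text    string   `xml:",chardata"`
}

type gatherElement struct {
	XMLName       xml.Name    `xml:"Gather"`
	Input         string      `xml:"input,attr,omitempty"`
	Language      string      `xml:"language,attr,omitempty"`
	SpeechTimeout string      `xml:"speechTimeout,attr,omitempty"`
	Timeout       int         `xml:"timeout,attr,omitempty"`
	Enhanced      bool        `xml:"enhanced,attr,omitempty"`
	Say           *sayElement `xml:",omitempty"`
}

type responseElement struct {
	XMLName xml.Name       `xml:"Response"`
	Gather  *gatherElement `xml:",omitempty"`
}

// renderGather (hypothetical helper) wraps a Gather in a Response
// and serializes it. Zero-valued attributes are dropped by omitempty.
func renderGather(g *gatherElement) (string, error) {
	out, err := xml.Marshal(responseElement{Gather: g})
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	twiml, err := renderGather(&gatherElement{
		Input:         "speech",
		Language:      "en-US",
		SpeechTimeout: "auto",
		Timeout:       5,
		Say:           &sayElement{Text: "Please say your account number."},
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(twiml)
}
```

With these inputs the program prints `<Response><Gather input="speech" language="en-US" speechTimeout="auto" timeout="5"><Say>Please say your account number.</Say></Gather></Response>`. Because of omitempty, a false Enhanced flag or a zero Timeout simply leaves the attribute out, so Twilio's own defaults apply.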

type Option

type Option func(*options)

Option configures the Provider.

func WithLanguage

func WithLanguage(language string) Option

WithLanguage sets the default language.

func WithProfanityFilter

func WithProfanityFilter(enabled bool) Option

WithProfanityFilter enables or disables the profanity filter.

func WithSpeechModel

func WithSpeechModel(model string) Option

WithSpeechModel sets the speech recognition model. Accepted values are "default", "numbers_and_commands", "phone_call", "video", and "enhanced".
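The Option type and the With* functions follow Go's functional options pattern. The sketch below shows how such options are typically applied. The fields of the options struct, the default values, and the applyOptions helper are all assumptions made for illustration, since the package keeps its options type unexported.

```go
package main

import "fmt"

// options is an assumed shape; the real struct is unexported and
// its fields are not documented.
type options struct {
	language        string
	speechModel     string
	profanityFilter bool
}

// Option mirrors the documented signature: func(*options).
type Option func(*options)

func WithLanguage(language string) Option {
	return func(o *options) { o.language = language }
}

func WithSpeechModel(model string) Option {
	return func(o *options) { o.speechModel = model }
}

func WithProfanityFilter(enabled bool) Option {
	return func(o *options) { o.profanityFilter = enabled }
}

// applyOptions (hypothetical) starts from assumed defaults and lets
// each option overwrite one setting, in the order given.
func applyOptions(opts ...Option) options {
	o := options{language: "en-US", speechModel: "default"} // assumed defaults
	for _, opt := range opts {
		opt(&o)
	}
	return o
}

func main() {
	o := applyOptions(WithLanguage("en-GB"), WithProfanityFilter(true))
	fmt.Printf("%+v\n", o)
}
```

With the real package, the equivalent call would presumably be New(WithLanguage("en-GB"), WithProfanityFilter(true)), with any option left out keeping the provider's own default.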

type Provider

type Provider struct {
	// contains filtered or unexported fields
}

Provider implements stt.Provider using Twilio's speech recognition.

func New

func New(opts ...Option) (*Provider, error)

New creates a new Twilio STT provider.

func (*Provider) GenerateGatherTwiML

func (p *Provider) GenerateGatherTwiML(config GatherConfig) string

GenerateGatherTwiML generates TwiML for speech recognition. Use this to create interactive voice response (IVR) flows.

func (*Provider) GenerateRealTimeTranscriptionConfig

func (p *Provider) GenerateRealTimeTranscriptionConfig() map[string]any

GenerateRealTimeTranscriptionConfig generates the configuration for enabling real-time transcription on Twilio Media Streams.

func (*Provider) Name

func (p *Provider) Name() string

Name returns the provider name.

func (*Provider) Transcribe

func (p *Provider) Transcribe(ctx context.Context, audio []byte, config stt.TranscriptionConfig) (*stt.TranscriptionResult, error)

Transcribe is not directly supported for arbitrary audio. Twilio STT works within call contexts.

func (*Provider) TranscribeFile

func (p *Provider) TranscribeFile(ctx context.Context, filePath string, config stt.TranscriptionConfig) (*stt.TranscriptionResult, error)

TranscribeFile is not supported by Twilio.

func (*Provider) TranscribeStream

func (p *Provider) TranscribeStream(ctx context.Context, config stt.TranscriptionConfig) (io.WriteCloser, <-chan stt.StreamEvent, error)

TranscribeStream creates a streaming transcription session. This works with Twilio's real-time transcription on Media Streams.

func (*Provider) TranscribeURL

func (p *Provider) TranscribeURL(ctx context.Context, url string, config stt.TranscriptionConfig) (*stt.TranscriptionResult, error)

TranscribeURL is not supported by Twilio.

type ResponseElement

type ResponseElement struct {
	XMLName xml.Name       `xml:"Response"`
	Gather  *GatherElement `xml:",omitempty"`
}

ResponseElement represents a TwiML <Response> element.

type SayElement

type SayElement struct {
	XMLName xml.Name `xml:"Say"`
	Text    string   `xml:",chardata"`
}

SayElement represents a TwiML <Say> element.

type TranscriptionEvent

type TranscriptionEvent struct {
	Type       string    `json:"type"`
	Transcript string    `json:"transcript"`
	Confidence float64   `json:"confidence"`
	IsFinal    bool      `json:"is_final"`
	Language   string    `json:"language"`
	StartTime  float64   `json:"start_time"`
	EndTime    float64   `json:"end_time"`
	Timestamp  time.Time `json:"timestamp"`
}

TranscriptionEvent represents a real-time transcription event from Twilio.

func ParseTranscriptionEvent

func ParseTranscriptionEvent(data []byte) (*TranscriptionEvent, error)

ParseTranscriptionEvent parses a real-time transcription event from JSON.
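The JSON tags on TranscriptionEvent suggest the payload shape that the parser expects. The sketch below decodes a sample payload into a local copy of the struct with encoding/json. The parseEvent helper, the sample values, and the "transcription" type string are assumptions for illustration; the actual function may validate or normalize more than this.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// transcriptionEvent is a local copy of the documented struct.
type transcriptionEvent struct {
	Type       string    `json:"type"`
	Transcript string    `json:"transcript"`
	Confidence float64   `json:"confidence"`
	IsFinal    bool      `json:"is_final"`
	Language   string    `json:"language"`
	StartTime  float64   `json:"start_time"`
	EndTime    float64   `json:"end_time"`
	Timestamp  time.Time `json:"timestamp"`
}

// parseEvent (hypothetical) decodes one event and wraps any error.
func parseEvent(data []byte) (*transcriptionEvent, error) {
	var ev transcriptionEvent
	if err := json.Unmarshal(data, &ev); err != nil {
		return nil, fmt.Errorf("parse transcription event: %w", err)
	}
	return &ev, nil
}

func main() {
	// Sample payload invented for this example.
	payload := []byte(`{
		"type": "transcription",
		"transcript": "hello world",
		"confidence": 0.92,
		"is_final": true,
		"language": "en-US",
		"start_time": 0.5,
		"end_time": 1.4,
		"timestamp": "2025-12-28T10:00:00Z"
	}`)

	ev, err := parseEvent(payload)
	if err != nil {
		panic(err)
	}
	if ev.IsFinal {
		fmt.Printf("final: %q (confidence %.2f)\n", ev.Transcript, ev.Confidence)
	}
}
```

Checking IsFinal before acting on a transcript is a common way to skip interim results that may still be revised.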

func (*TranscriptionEvent) ToStreamEvent

func (e *TranscriptionEvent) ToStreamEvent() stt.StreamEvent

ToStreamEvent converts a Twilio transcription event to an OmniVoice stream event.
