Adds a pause or break within the sentence. Useful for creating natural pauses between clauses or for dramatic effect.
Breaks within sentences can help improve comprehension and create more natural-sounding speech by adding pauses where commas or other punctuation might naturally occur.
Optional
options: string | BreakOptions
Break configuration or duration string
Configuration options for break/pause elements.
Defines pauses in speech either by strength (semantic) or explicit duration. If both are specified, time takes precedence.
Optional
strength?: BreakStrength
Semantic strength of the pause.
Each strength corresponds to a typical pause duration:
x-weak: 250ms (very short)
weak: 500ms (short, like a comma)
medium: 750ms (default, like a period)
strong: 1000ms (long, like a paragraph break)
x-strong: 1250ms (very long, for emphasis)
Ignored if time is specified.
Optional
time?: string
Explicit duration of the pause.
Specified in milliseconds (ms) or seconds (s). Valid range: 0-20000ms (20 seconds max). Values above 20000ms are capped at 20000ms.
Takes precedence over strength if both are specified.
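The precedence and capping rules above can be sketched as a small resolver. This is an illustration only, not the library's implementation: `resolveBreakMs`, `parseDurationMs`, and the local type declarations are hypothetical names, and falling back to strength when the duration string is malformed is an assumption.

```typescript
// Hypothetical sketch: names below are illustrative, not the library's API.
type BreakStrength = 'x-weak' | 'weak' | 'medium' | 'strong' | 'x-strong';

interface BreakOptions {
  strength?: BreakStrength;
  time?: string;
}

// Typical durations per strength, as listed in the documentation above.
const STRENGTH_MS: Record<BreakStrength, number> = {
  'x-weak': 250,
  weak: 500,
  medium: 750,
  strong: 1000,
  'x-strong': 1250,
};

const MAX_BREAK_MS = 20000;

// Parses '300ms' or '2s' into milliseconds; returns undefined if malformed.
function parseDurationMs(time: string): number | undefined {
  const match = /^(\d+(?:\.\d+)?)(ms|s)$/.exec(time.trim());
  if (!match) return undefined;
  const value = Number(match[1]);
  return match[2] === 's' ? value * 1000 : value;
}

// Resolves the effective pause: time wins over strength and is capped at 20000ms.
function resolveBreakMs(options: string | BreakOptions): number {
  const opts: BreakOptions =
    typeof options === 'string' ? { time: options } : options;
  if (opts.time !== undefined) {
    const ms = parseDurationMs(opts.time);
    if (ms !== undefined) return Math.min(ms, MAX_BREAK_MS);
    // Assumption: a malformed time falls through to the strength value.
  }
  return STRENGTH_MS[opts.strength ?? 'medium'];
}
```

For example, `resolveBreakMs({ strength: 'weak', time: '30s' })` ignores the strength and yields the capped value of 20000.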
This SentenceBuilder instance for method chaining
// Using duration string for precise control
sentence
.text('First')
.break('300ms')
.text('let me think about that.');
// Using strength for semantic pauses
sentence
.text('Well')
.break({ strength: 'weak' })
.text('that\'s interesting.');
// Creating dramatic pause
sentence
.text('The winner is')
.break('2s')
.text('Team Alpha!');
https://docs.microsoft.com/azure/cognitive-services/speech-service/speech-synthesis-markup#add-a-break Break Element Documentation
Adds emphasized speech with adjustable intensity to the sentence. Emphasis changes the speaking style to highlight important words or phrases.
The emphasis level affects both the pitch and timing of the emphasized text, making it stand out from the surrounding speech.
Text to emphasize
Optional
level: EmphasisLevel
Emphasis level: 'strong' | 'moderate' | 'reduced'. Default is 'moderate'.
This SentenceBuilder instance for method chaining
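As a rough picture of what this method contributes to the output, the sketch below renders the standard SSML `emphasis` element with a default level of 'moderate'. The function `renderEmphasis` and the local `escapeXml` helper are hypothetical stand-ins, not the library's internals.

```typescript
// Hypothetical sketch; names are illustrative, not the library's API.
type EmphasisLevel = 'strong' | 'moderate' | 'reduced';

// Minimal stand-in for the XML escaping the builder applies to text content.
function escapeXml(text: string): string {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Wraps the text in an <emphasis> element; level defaults to 'moderate'.
function renderEmphasis(text: string, level: EmphasisLevel = 'moderate'): string {
  return `<emphasis level="${level}">${escapeXml(text)}</emphasis>`;
}

console.log(renderEmphasis('remember'));
// prints: <emphasis level="moderate">remember</emphasis>
```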
// Strong emphasis for important information
sentence
.text('This is ')
.emphasis('absolutely critical', 'strong')
.text(' for success.');
// Moderate emphasis (default)
sentence
.text('Please ')
.emphasis('remember')
.text(' to save your work.');
// Reduced emphasis for de-emphasis
sentence
.emphasis('(optional)', 'reduced')
.text(' You can also add notes.');
https://docs.microsoft.com/azure/cognitive-services/speech-service/speech-synthesis-markup#adjust-emphasis Emphasis Element Documentation
Protected
escapeXml
Escapes special XML characters in text content to ensure valid XML output.
This method replaces XML special characters with their corresponding entity references to prevent XML parsing errors and potential security issues (XML injection). It should be used whenever inserting user-provided or dynamic text content into XML elements.
The following characters are escaped:
& becomes &amp; (must be escaped first to avoid double-escaping)
< becomes &lt; (prevents opening of unintended tags)
> becomes &gt; (prevents closing of unintended tags)
" becomes &quot; (prevents breaking out of attribute values)
' becomes &apos; (prevents breaking out of attribute values)
This method is marked as protected so it's only accessible to classes that extend SSMLElement, ensuring proper encapsulation while allowing all element implementations to use this essential functionality.
The text content to escape
The text with all special XML characters properly escaped
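A minimal sketch of such an escaping routine is shown below, assuming the five predefined XML entity references (&amp;amp;, &amp;lt;, &amp;gt;, &amp;quot;, &amp;apos;). It is written as a standalone function for illustration; the library's actual method body may differ. The second function, `escapeXmlWrongOrder`, is a deliberately broken counterexample showing why the ampersand has to be handled first.

```typescript
// Illustrative sketch; the library's actual implementation may differ.
function escapeXml(text: string): string {
  return text
    .replace(/&/g, '&amp;') // must run first, or later entities get double-escaped
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

// Counterexample: escaping '&' after '<' corrupts the entity just produced.
function escapeXmlWrongOrder(text: string): string {
  return text.replace(/</g, '&lt;').replace(/&/g, '&amp;');
}

console.log(escapeXml('5 < 10 & 10 > 5'));
// prints: 5 &lt; 10 &amp; 10 &gt; 5
console.log(escapeXmlWrongOrder('a < b'));
// prints: a &amp;lt; b
```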
// In a render method implementation
class TextElement extends SSMLElement {
  private text: string = 'Hello & "world" <script>';
  render(): string {
    // Escapes to: Hello &amp; &quot;world&quot; &lt;script&gt;
    return `<text>${this.escapeXml(this.text)}</text>`;
  }
}
// Edge cases handled correctly
this.escapeXml('5 < 10 & 10 > 5');
// Returns: '5 &lt; 10 &amp; 10 &gt; 5'
this.escapeXml('She said "Hello"');
// Returns: 'She said &quot;Hello&quot;'
this.escapeXml("It's a test");
// Returns: 'It&apos;s a test'
// Prevents XML injection
this.escapeXml('</voice><voice name="evil">');
// Returns: '&lt;/voice&gt;&lt;voice name=&quot;evil&quot;&gt;'
Specifies exact phonetic pronunciation for words within the sentence. Provides precise control over pronunciation using phonetic alphabets.
This is essential for proper names, technical terms, or words that might be mispronounced by the default text-to-speech engine.
The text to pronounce
Phoneme configuration
Configuration options for phoneme elements.
Provides exact phonetic pronunciation using standard phonetic alphabets. Essential for proper names, technical terms, or words with ambiguous pronunciation.
alphabet
Phonetic alphabet used for transcription. (Required)
Available alphabets:
ipa: International Phonetic Alphabet (universal standard)
sapi: Microsoft SAPI phonemes (English-focused)
ups: Universal Phone Set (Microsoft's unified system)
ph
Phonetic transcription of the word. (Required)
The exact phonetic representation in the specified alphabet. Must be valid according to the chosen alphabet's rules.
This SentenceBuilder instance for method chaining
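To make the two required options concrete, the sketch below shows how they might map onto the standard SSML `phoneme` element. `renderPhoneme`, the local `PhonemeOptions` interface, and the `escapeXml` helper are hypothetical names used for illustration only.

```typescript
// Hypothetical sketch; names are illustrative, not the library's API.
interface PhonemeOptions {
  alphabet: 'ipa' | 'sapi' | 'ups';
  ph: string;
}

// Minimal stand-in for the XML escaping the builder applies.
function escapeXml(text: string): string {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

// The written text is the element body; the transcription goes in the ph attribute.
function renderPhoneme(text: string, options: PhonemeOptions): string {
  const ph = escapeXml(options.ph);
  return `<phoneme alphabet="${options.alphabet}" ph="${ph}">${escapeXml(text)}</phoneme>`;
}

console.log(renderPhoneme('read', { alphabet: 'ipa', ph: 'riːd' }));
// prints: <phoneme alphabet="ipa" ph="riːd">read</phoneme>
```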
// IPA example for technical terms
sentence
.text('The ')
.phoneme('API', {
alphabet: 'ipa',
ph: 'eɪpiːˈaɪ'
})
.text(' returns JSON data.');
// SAPI example for names
sentence
.text('Contact ')
.phoneme('Nguyen', {
alphabet: 'sapi',
ph: 'w ih n'
})
.text(' for more information.');
// Disambiguating homographs
sentence
.text('I need to ')
.phoneme('read', {
alphabet: 'ipa',
ph: 'riːd' // present tense, not past tense 'rɛd'
})
.text(' this book.');
Modifies prosody (pitch, rate, volume, contour, range) of speech within the sentence. Allows fine-grained control over how text is spoken for expressive speech.
Prosody modifications can convey emotion, emphasis, or create specific speaking styles like whispering or shouting.
Text to modify with prosody settings
Prosody configuration options
Configuration options for prosody (speech characteristics).
Controls various aspects of speech delivery including pitch, speaking rate, volume, and intonation contours. Multiple properties can be combined for complex speech modifications.
Optional
contour?: string
Pitch contour changes over time.
Defines how pitch changes during speech using time-position pairs. Format: "(time1,pitch1) (time2,pitch2) ...", with time as a percentage and pitch as Hz or a percentage change.
Optional
pitch?: string
Pitch adjustment for the speech.
Can be specified as:
Optional
range?: string
Pitch range variation.
Controls the variability of pitch (monotone vs expressive). Can be a relative change or a named value.
Optional
rate?: string
Speaking rate/speed.
Can be specified as:
Optional
volume?: string
Volume level of the speech.
Can be specified as:
This SentenceBuilder instance for method chaining
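The contour format is easy to mistype by hand, so a small helper that assembles it from structured points can be useful. The sketch below is not part of the documented API: `buildContour` and `ContourPoint` are hypothetical names, and the 0-100 range check is an assumption based on time being a percentage.

```typescript
// Hypothetical helper; not part of the documented API.
interface ContourPoint {
  timePercent: number; // position within the spoken text, 0 to 100
  pitch: string;       // e.g. '+5Hz' or '+10%'
}

// Builds a contour string of the form "(time1,pitch1) (time2,pitch2) ...".
function buildContour(points: ContourPoint[]): string {
  return points
    .map((p) => {
      if (p.timePercent < 0 || p.timePercent > 100) {
        throw new RangeError(`timePercent out of range: ${p.timePercent}`);
      }
      return `(${p.timePercent}%,${p.pitch})`;
    })
    .join(' ');
}

const rising = buildContour([
  { timePercent: 0, pitch: '+5Hz' },
  { timePercent: 50, pitch: '+10Hz' },
  { timePercent: 100, pitch: '+20Hz' },
]);
console.log(rising);
// prints: (0%,+5Hz) (50%,+10Hz) (100%,+20Hz)
```

The resulting string can be passed as the `contour` option.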
// Whispering effect
sentence
.prosody('This is a secret', {
volume: 'x-soft',
rate: 'slow',
pitch: 'low'
})
.text('!');
// Excited/energetic speech
sentence
.prosody('Amazing news everyone', {
rate: '1.2',
pitch: '+10%',
volume: 'loud'
})
.text('!');
// Question intonation with contour
sentence.prosody('Are you sure', {
contour: '(0%,+5Hz) (50%,+10Hz) (100%,+20Hz)'
});
// Monotone/robotic effect
sentence.prosody('I am a robot', {
pitch: 'medium',
range: 'x-low',
rate: '0.9'
});
https://docs.microsoft.com/azure/cognitive-services/speech-service/speech-synthesis-markup#adjust-prosody Prosody Element Documentation
Internal
Renders the sentence element as an XML string. This method is called internally by the SSML builder to generate the final XML.
The rendered output includes the <s> tags and all child elements properly formatted as valid SSML XML content.
The sentence element as an XML string with all its content
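The relationship between chained calls and the rendered string can be sketched with a stripped-down builder. `SentenceSketch` below is a hypothetical stand-in that supports only text and a time-based break; it assumes children are concatenated in call order inside the `<s>` element and is not the library's implementation.

```typescript
// Hypothetical stand-in for illustration; not the library's SentenceBuilder.
class SentenceSketch {
  private parts: string[] = [];

  // Adds escaped plain text.
  text(content: string): this {
    const escaped = content
      .replace(/&/g, '&amp;')
      .replace(/</g, '&lt;')
      .replace(/>/g, '&gt;');
    this.parts.push(escaped);
    return this;
  }

  // Adds a pause with an explicit duration such as '300ms'.
  break(time: string): this {
    this.parts.push(`<break time="${time}"/>`);
    return this;
  }

  // Wraps all children, in insertion order, in <s> tags.
  render(): string {
    return `<s>${this.parts.join('')}</s>`;
  }
}

const xml = new SentenceSketch()
  .text('Well')
  .break('300ms')
  .text(' this & that.')
  .render();
console.log(xml);
// prints: <s>Well<break time="300ms"/> this &amp; that.</s>
```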
Controls how text is interpreted and pronounced within the sentence. Essential for proper pronunciation of numbers, dates, times, and other formatted text.
The say-as element ensures that specialized text formats are spoken correctly according to their semantic meaning rather than their literal characters.
Text to interpret
Say-as configuration
Configuration options for say-as elements.
Controls interpretation and pronunciation of formatted text like dates, numbers, currency, and other specialized content.
Optional
detail?: string
Additional detail for interpretation.
Provides extra context for certain interpretAs types:
Optional
format?: string
Format hint for interpretation.
Provides additional formatting information. Available formats depend on the interpretAs value:
For dates:
For time:
interpretAs
How to interpret the text content. (Required)
Determines the pronunciation rules applied to the text. Each type has specific formatting requirements.
This SentenceBuilder instance for method chaining
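One way to picture how these options reach the output is as attributes on the standard SSML `say-as` element, with the optional ones omitted when unset. The sketch below is illustrative: `renderSayAs`, the local `SayAsOptions` interface, and `escapeXml` are hypothetical names, not the library's internals.

```typescript
// Hypothetical sketch; names are illustrative, not the library's API.
interface SayAsOptions {
  interpretAs: string;
  format?: string;
  detail?: string;
}

// Minimal stand-in for the XML escaping the builder applies.
function escapeXml(text: string): string {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

// Emits interpret-as always; format and detail only when provided.
function renderSayAs(text: string, options: SayAsOptions): string {
  const attrs = [`interpret-as="${escapeXml(options.interpretAs)}"`];
  if (options.format !== undefined) attrs.push(`format="${escapeXml(options.format)}"`);
  if (options.detail !== undefined) attrs.push(`detail="${escapeXml(options.detail)}"`);
  return `<say-as ${attrs.join(' ')}>${escapeXml(text)}</say-as>`;
}

console.log(renderSayAs('2025-12-31', { interpretAs: 'date', format: 'ymd' }));
// prints: <say-as interpret-as="date" format="ymd">2025-12-31</say-as>
```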
// Date interpretation
sentence
.text('The deadline is ')
.sayAs('2025-12-31', {
interpretAs: 'date',
format: 'ymd' // year-month-day
});
// Currency with detail
sentence
.text('The total is ')
.sayAs('42.50', {
interpretAs: 'currency',
detail: 'USD'
});
// Phone number
sentence
.text('Call us at ')
.sayAs('1-800-555-1234', {
interpretAs: 'telephone'
});
// Ordinal numbers
sentence
.text('She came in ')
.sayAs('3', { interpretAs: 'ordinal' })
.text(' place.'); // "third place"
// Spell out acronyms
sentence
.sayAs('API', { interpretAs: 'spell-out' })
.text(' stands for Application Programming Interface.');
// Time with 24-hour format
sentence
.text('The meeting starts at ')
.sayAs('14:30:00', {
interpretAs: 'time',
format: 'hms24'
});
https://docs.microsoft.com/azure/cognitive-services/speech-service/speech-synthesis-markup#say-as-element Say-As Element Documentation
Substitutes text with an alias for pronunciation within the sentence. Useful for acronyms, abbreviations, or any text that should be spoken differently than written.
The sub element allows you to display one text while having the speech synthesizer pronounce something different.
The text as it appears in writing
How the text should be pronounced
This SentenceBuilder instance for method chaining
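In SSML terms, the written text becomes the body of a `sub` element and the spoken form becomes its `alias` attribute. The sketch below illustrates that mapping; `renderSub` and `escapeXml` are hypothetical names and not the library's internals.

```typescript
// Hypothetical sketch; names are illustrative, not the library's API.
function escapeXml(text: string): string {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

// original: the text as written; alias: what the synthesizer should say.
function renderSub(original: string, alias: string): string {
  return `<sub alias="${escapeXml(alias)}">${escapeXml(original)}</sub>`;
}

console.log(renderSub('CEO', 'Chief Executive Officer'));
// prints: <sub alias="Chief Executive Officer">CEO</sub>
```

Both values are escaped because the alias sits inside an attribute and the original text sits inside element content.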
// Acronym expansion
sentence
.text('The ')
.sub('CEO', 'Chief Executive Officer')
.text(' will speak at the ')
.sub('AGM', 'Annual General Meeting')
.text('.');
// Technical abbreviations
sentence
.sub('Dr.', 'Doctor')
.text(' Smith studies ')
.sub('DNA', 'deoxyribonucleic acid')
.text('.');
// Custom pronunciations
sentence
.text('Visit ')
.sub('www.example.com', 'w w w dot example dot com')
.text(' for more information.');
// Chemical formulas
sentence
.text('Water is ')
.sub('H2O', 'H two O')
.text('.');
https://www.w3.org/TR/speech-synthesis11/#S3.1.11 W3C Sub Element Specification
Adds plain text content to the sentence. Special characters (&, <, >, ", ') are automatically escaped to ensure valid XML.
Multiple text segments can be added and will be concatenated in order. The text will be spoken with the natural intonation for a sentence.
The text content to add to the sentence
This SentenceBuilder instance for method chaining
Builder class for creating sentence elements within an SSML document. Provides a fluent API for structuring content into grammatically complete sentences with proper intonation.
The <s> element explicitly marks sentence boundaries, which helps the speech synthesizer apply appropriate intonation patterns, pauses, and prosody. This is particularly useful when the default sentence detection might not work correctly for your specific content.