Description
Describe the bug
When using SSML to apply a style (e.g., angry, cheerful) with <mstts:express-as>, the service does not speak the provided text. Instead, it plays a default, unrelated English audio clip ("The range hood had droned so long...").
This only happens when a style is specified. If the SSML is removed or if no style is used, the text-to-speech conversion works correctly with the provided text.
To Reproduce
The issue can be reproduced with the following minimal Python script:
import asyncio

import edge_tts

VOICE = "en-US-AvaNeural"
STYLE = "angry"
TEXT = "This is a simple test."  # My actual text
OUTPUT_FILE = "error_output.mp3"

# Construct the SSML string
ssml = (
    "<speak version='1.0' xmlns='http://www.w3.org/2001/10/synthesis' "
    "xmlns:mstts='https://www.w3.org/2001/mstts' xml:lang='en-US'>"
    f"<voice name='{VOICE}'>"
    f"<mstts:express-as style='{STYLE}'>"
    f"{TEXT}"
    "</mstts:express-as>"
    "</voice>"
    "</speak>"
)


async def main():
    try:
        print("--- SSML data being sent ---")
        print(ssml)
        print("--------------------------")
        # Pass the SSML string where plain text would normally go
        communicate = edge_tts.Communicate(ssml, VOICE)
        await communicate.save(OUTPUT_FILE)
        print(f"File saved to {OUTPUT_FILE}. Please check its content.")
    except Exception as e:
        print(f"An error occurred: {e}")


if __name__ == "__main__":
    asyncio.run(main())
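To rule out malformed SSML as the cause, the string can be checked locally before it is sent. The following is a minimal sketch, not part of edge_tts: `build_ssml` and `spoken_text` are hypothetical helper names, and the sketch only uses the Python standard library. It builds the same SSML as the script above (with the text and attribute values escaped), parses it, and returns the text inside `<mstts:express-as>`.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

# Namespace URIs copied from the SSML in the reproduction script
SSML_NS = "http://www.w3.org/2001/10/synthesis"
MSTTS_NS = "https://www.w3.org/2001/mstts"


def build_ssml(text: str, voice: str, style: str) -> str:
    """Hypothetical helper: build the SSML from the report, escaping inputs."""
    return (
        f"<speak version='1.0' xmlns='{SSML_NS}' "
        f"xmlns:mstts='{MSTTS_NS}' xml:lang='en-US'>"
        f"<voice name={quoteattr(voice)}>"
        f"<mstts:express-as style={quoteattr(style)}>"
        f"{escape(text)}"
        f"</mstts:express-as>"
        f"</voice>"
        f"</speak>"
    )


def spoken_text(ssml: str) -> str:
    """Hypothetical helper: parse the SSML and return the text to be spoken."""
    root = ET.fromstring(ssml)  # raises xml.etree.ElementTree.ParseError if malformed
    express = root.find(f"{{{SSML_NS}}}voice/{{{MSTTS_NS}}}express-as")
    if express is None:
        raise ValueError("no <mstts:express-as> element found")
    return express.text or ""
```

Calling `spoken_text(build_ssml(TEXT, VOICE, STYLE))` with the values from the script returns the original text if the SSML is well-formed. This only verifies the XML on the client side; it says nothing about how the service or edge_tts handles the string.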