Tags: samtupy/star

v4

Revision 4

This update to STAR contains all changes made to the project over the last 4+ months, including a slightly better visual UI, more providers, the coagulator web frontend, security and stability improvements, and bug fixes.
* Improved the visual layout of the user client UI, though it is still very likely quite far from perfect.
* New providers in the STAR source package: bestspeech / Keynote Gold, openai, elevenlabs, and googlecloud.
* Somewhat improved the consolidated render feature, though it still needs work. All clips now get rendered, and in the correct order, though rendering is still a bit slow and has weird resampling.
* The coagulator now provides an HTTP frontend and API as a lightweight alternative to the STAR client.
* Fixed a bug in the balcony provider which could cause text containing quotes to be output through speakers!
* Major provider stability improvements, ranging from the ability to specify maximum concurrent requests, to vastly improved synthesis cancellation, to general robustness including a 10 MB default maximum packet size. Before this update, providers would easily crash if too much text was fed to them. Now they handle that situation much more gracefully.
* Fixed a bug in the user client which caused the render complete noise to be played on synthesis error.
* The STAR repository now includes a script which requests permission for macsay to access and provide your macOS personal voices!
* STAR can now handle audio in formats other than wav when required. For example, some cloud services actually offer the best sounding quality as mp3 or vorbis, and it would just be a waste of bandwidth to deceptively decode to wav before providing.
* Implemented default pitch and rate functionality in the provider, and set macsay's default rate to 195 wpm.
* Minor provider code cleanup, including reducing very noisy error output when connections can't be established.
* Fixed user client not reporting synthesis errors sent from a provider.
* Fixed broken SAPI4 voice selection when a SAPI4 and SAPI5 voice existed with the same name.
* Minor documentation updates including correcting a misdocumented keyboard shortcut.
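
As a rough illustration of the maximum concurrent requests idea mentioned above, here is a minimal sketch using an `asyncio.Semaphore`. This is not STAR's actual provider code: `synthesize`, `LimitedSynthesizer`, and `run_demo` are hypothetical names invented for this example, and the real implementation may work quite differently.

```python
import asyncio


async def synthesize(text: str) -> bytes:
    # Hypothetical stand-in for a real speech engine call.
    await asyncio.sleep(0.01)
    return text.encode("utf-8")


class LimitedSynthesizer:
    """Allows at most max_concurrent synthesis requests to run at once."""

    def __init__(self, max_concurrent: int):
        self._sem = asyncio.Semaphore(max_concurrent)
        self.active = 0  # requests currently being synthesized
        self.peak = 0    # highest value of active seen so far

    async def request(self, text: str) -> bytes:
        # Extra requests wait here instead of overloading the engine.
        async with self._sem:
            self.active += 1
            self.peak = max(self.peak, self.active)
            try:
                return await synthesize(text)
            finally:
                self.active -= 1


async def run_demo(n_requests: int = 10, max_concurrent: int = 3):
    limiter = LimitedSynthesizer(max_concurrent)
    results = await asyncio.gather(
        *(limiter.request(f"line {i}") for i in range(n_requests))
    )
    return limiter.peak, results


if __name__ == "__main__":
    peak, results = asyncio.run(run_demo())
    print(f"{len(results)} clips synthesized, peak concurrency {peak}")
```

Queuing excess requests rather than rejecting them is only one possible design; a provider could just as reasonably return an error once the limit is reached.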

v3

Revision 3

This is a major update to STAR which includes a complete user client rewrite and consequently the introduction of several useful features.
* The user client was completely rewritten from scratch in Python and wxWidgets. Though feedback must still be gathered to make it look right, or even to ensure that controls are visible at all, the user client should be usable without a screen reader within a couple of revisions!
* Due to the script text field being a true rich text control, the previewing hotkeys were changed to control+alt+up, down, and space rather than just control with those keys. This was forced on us because NVDA at least seems to always speak when control+up and down are pressed on a text field, regardless of any application code.
* It is now possible to press ctrl+alt+enter on any valid speech line in the script to begin auto previewing the entire script up to its end or the next error.
* You can press alt+backspace anywhere in the main screen to pause and resume any playing speech, thus the stop currently playing speech button was removed from the interface.
* The STAR client can now run all needed components such as the coagulator and providers locally with one click, meaning that STAR should require 0 extra setup assuming you just want to use only the voices on your computer!
* There is now an options dialog with various customizations you can make from the render filename template to the default render output location to the voice preview text and more.
* Now that the options dialog exists, the output device list has been moved to that dialog.
* It is now possible to consolidate the entire script into one audio file! Next to the script field, there is an output subdirectory or consolidated filename field. If you set this to a filename such as output.wav instead of a folder name such as output, a single output.wav file will be created containing all voice clips separated by a configurable amount of silence.
* The synthesis caching is much improved. Now if you preview a script before rendering it, the render will be almost instant as the cached phrases from the synthesis will now be used while rendering. It is also possible to manually clear the audio cache in the options dialog.
* The render progress sound was removed in favor of a real native progress bar. It sounds a bit less cool but should be visually accessible soon and is far less expensive than playing the sound.
* There are some cool new features in the script syntax, ranging from the minor, such as selecting only certain lines to render, to the very useful, such as defining character aliases per script!
* STAR coagulators now typically require authentication. To make dealing with this easier, STAR can now save previous hosts you connect to, allowing you to select between them in the options dialog.
* Full documentation is now provided, as well as automatic Windows client builds on GitHub.
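
To illustrate the consolidated render idea described above (all voice clips joined into one file, separated by silence), here is a minimal sketch using Python's standard `wave` module. It is not STAR's actual rendering code: `make_clip` and `consolidate` are hypothetical helpers, and the sketch assumes every clip is 16-bit PCM with the same channel count and sample rate, so it does no resampling.

```python
import io
import wave


def make_clip(n_frames: int, rate: int = 22050) -> bytes:
    """Build a tiny in-memory mono 16-bit WAV clip for demonstration."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(rate)
        w.writeframes(b"\x01\x00" * n_frames)
    return buf.getvalue()


def consolidate(clips, silence_seconds: float, out) -> None:
    """Join WAV clips in order, separated by silence.

    Assumes all clips share one format; raises ValueError otherwise.
    """
    params = None
    frames = []
    for data in clips:
        with wave.open(io.BytesIO(data), "rb") as r:
            fmt = (r.getnchannels(), r.getsampwidth(), r.getframerate())
            if params is None:
                params = fmt
            elif fmt != params:
                raise ValueError("clips must share channels, width, and rate")
            frames.append(r.readframes(r.getnframes()))
    if params is None:
        raise ValueError("no clips to consolidate")
    channels, width, rate = params
    # Zero bytes are silence for 16-bit PCM (8-bit WAV would need 0x80).
    gap = b"\x00" * (int(rate * silence_seconds) * channels * width)
    with wave.open(out, "wb") as w:
        w.setnchannels(channels)
        w.setsampwidth(width)
        w.setframerate(rate)
        w.writeframes(gap.join(frames))


if __name__ == "__main__":
    clips = [make_clip(100), make_clip(50)]
    with open("output.wav", "wb") as f:
        consolidate(clips, 0.5, f)
```

The silence is placed only between clips, not before the first or after the last one; whether STAR pads the ends of the file is not stated in the notes above.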