Python Speech Software

Browse free open source Python Speech Software and projects below.

  • 1
    DeepSpeech

    Open source embedded speech-to-text engine

    DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine that can run in real time on devices ranging from a Raspberry Pi 4 to high-power GPU servers. It uses a model trained with machine learning techniques based on Baidu's Deep Speech research paper, and it uses Google's TensorFlow to make the implementation easier. A pre-trained English model, along with other important inference material, is available from the DeepSpeech releases page and can be downloaded by following the instructions in the usage docs.
    Downloads: 31 This Week
  • 2
    SpeechRecognition

    Speech recognition module for Python

    Library for performing speech recognition, with support for several engines and APIs, online and offline. It can recognize speech input from the microphone, transcribe an audio file, save audio data to an audio file, show extended recognition results, calibrate the recognizer energy threshold for ambient noise levels (see recognizer_instance.energy_threshold for details), and listen to a microphone in the background, among other recognizer features. The easiest way to install it is with pip install SpeechRecognition. The library requires Python 2.6, 2.7, or 3.3+. PyAudio is required if and only if you want to use microphone input (Microphone); version 0.2.11+ is required, as earlier versions have known memory management bugs when recording from microphones in certain situations. To hack on this library, first make sure you have all the requirements listed in the "Requirements" section.
    Downloads: 4 This Week
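The idea of calibrating an energy threshold against ambient noise, mentioned in the SpeechRecognition entry, can be illustrated with a minimal standard-library sketch. This is not the library's own implementation: the function names, the sample values, and the 1.5 margin are all assumptions chosen for illustration.

```python
import math


def rms(frame):
    """Root-mean-square energy of one frame of integer PCM samples."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def calibrate_threshold(ambient_frames, margin=1.5):
    """Place the threshold a fixed margin above the mean ambient energy.

    The margin of 1.5 is an arbitrary illustrative value.
    """
    levels = [rms(frame) for frame in ambient_frames]
    return margin * sum(levels) / len(levels)


def is_speech(frame, threshold):
    """Treat a frame as speech when its energy exceeds the threshold."""
    return rms(frame) > threshold


# Hypothetical frames: two of background noise, then one quiet and one loud.
ambient = [[10, -12, 9, -11], [8, -9, 10, -10]]
threshold = calibrate_threshold(ambient)

quiet = [11, -10, 9, -12]
loud = [900, -1100, 1000, -950]
print(is_speech(quiet, threshold), is_speech(loud, threshold))
```

A frame at roughly the ambient level stays below the calibrated threshold, while a much louder frame exceeds it.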
  • 3
    AGTK is a suite of software components for building tools for annotating linguistic signals, that is, time-series data that document any kind of linguistic behavior (e.g. audio, video). The internal data structures are based on annotation graphs.
    Downloads: 8 This Week
  • 4
    ASR-Builder provides an easy-to-use interface to the HTK toolkit that allows users to build ASR systems. It performs housekeeping tasks when using HTK and also provides default training, testing, and recognition scripts.
    Downloads: 2 This Week
  • 5
    The PyGE (Python Gutenberg E-text) project is a suite of GUI desktop utilities written in Python to promote and facilitate awareness and enjoyment of works of literature that are available from the archives of Project Gutenberg.
    Downloads: 2 This Week
  • 6

    Steel TTS

    A cross-platform wrapper for common text-to-speech engines in Python

    Steel is a cross-platform package for using common text-to-speech (speech synthesis) engines in Python. Steel currently supports the following TTS software:
    - Microsoft Speech API 5 (SAPI5)
    - eSpeak
    - NS Speech Synthesis
    - FreeTTS
    Documentation: http://sourceforge.net/p/steeltts/wiki/
    Bug Tracker: http://sourceforge.net/p/steeltts/tickets/
    If you are interested in contributing to the Steel TTS codebase, or would like to make a feature request, please contact the lead developer, Jasper Danielson, at [email protected].
    Downloads: 2 This Week
  • 7
    TeamBlibbityBlabbity is an attempt to document and provide an example implementation of the proprietary TeamSpeak 2 protocol.
    Downloads: 1 This Week
  • 8
    Yet Another Audio Feature Extractor is a toolbox for audio analysis. It is easy to use and efficient at extracting a large number of audio features simultaneously. WAV and MP3 files are supported, and it can be embedded in C++, Python or Matlab applications.
    Downloads: 1 This Week
  • 9
    Performs actions when a volume threshold is detected. Examples:
    - Launch music on clap
    - Launch speech recording when you start speaking
    - Launch guard webcam when a significant sound is detected
    - Increase or decrease headphones volume when ambient noise pass
    Downloads: 1 This Week
  • 10
    This is a Linux project that acts as a front end to cdparanoia, sox, and ffmpeg, with the aim of making it simple to rip many audiobook CDs into a single mono audiobook (m4b) file for use in audio players capable of playing audiobooks.
    Downloads: 1 This Week
  • 11
    FM2TXT

    Listens to radio with RTL-SDR, recognizes audio, and writes a text file log

    Log speech from your favorite FM station to a text file using an rtl-sdr dongle and speech recognition. Cross-platform tool. Follow the README on the download page for Windows installation: https://sourceforge.net/projects/fm2txt-rtlsdr/files/ If you prefer the GitHub source to SourceForge: https://github.com/randaller/fm2txt For those who want to recognize from a sound card rather than from rtl-sdr (this allows transcribing NFM, etc.): https://github.com/randaller/souncard2txt
    Downloads: 1 This Week
  • 12
    InproTK

    An Incremental Spoken Dialogue Processing Toolkit

    InproTK is an Incremental Spoken Dialogue Processing Toolkit, that is, a toolkit to help you build dialogue systems that listen and talk incrementally, allowing for advanced interactional behaviour. Please see our Wiki for more information: http://sourceforge.net/p/inprotk/wiki/
    Downloads: 1 This Week
  • 13
    This is a fast C implementation of Arturo Camacho's SWIPE' pitch extraction algorithm. See the project homepage for more about the advantages of the SWIPE' algorithm. swipe-1.0.tar.gz contains the current source, which should compile quite neatly.
    Downloads: 1 This Week
  • 14
    A collection of tools for generating audio and visual content (PNG/HTML/WAVE) for use in web sites, including CAPTCHA challenges and PNG image creation tools with JavaScript mouse tracking support.
    Downloads: 1 This Week
  • 15
    A.L.V.I. was created to be a simple but modular bot, able to interact with humans through natural language and perform various tasks, such as reading mail, news, and feeds aloud. All in Italian!
    Downloads: 0 This Week
  • 16
    An extensible (by plugin) chatbot project
    Downloads: 0 This Week
  • 17
    A collection of scripts and programs to automatically annotate video/audio for subtitles. Basically relies on a MARSYAS (Music Analysis, Retrieval and Synthesis for Audio Signals) plug-in for detecting human voice in polyphonic recordings.
    Downloads: 0 This Week
  • 18
    AarTon
    AarTon is an automated text-to-speech application. It lets users enter text in a web-based front end and renders that text via a multi-channel sound card.
    Downloads: 0 This Week
  • 19
    The Blind Audio Tactile Mapping System (BATS) attempts to address the lack of spatial information available for visually impaired students.
    Downloads: 0 This Week
  • 20
    DJBorg turns your MP3 playlist into a personalized radio station, adding randomly-generated DJ banter between tracks. Song information (based on ID3 tags), news, weather, and headlines are announced via a text-to-speech engine.
    Downloads: 0 This Week
  • 21
    Defox text to speech and downloader

    Read written or imported text offline, or download the speech online.

    This software is designed to convert text to speech and to download the converted speech. Description:
    - Installation setup in two languages (English, French)
    - Two areas: text reading and speech downloading
    - Many languages supported in the download center
    Note 1: I am still a student and do not work in the software industry, so my software development skills may be limited.
    Note 2: When you double-click the software, it may take a few seconds to open. That is not my fault: the software is written in Python, which does not start quickly.
    Downloads: 0 This Week
  • 22
    Eve is an AI project written in Python that takes verbal or text commands to control the computer and everyday functions.
    Downloads: 0 This Week
  • 23
    Moshi

    A speech-text foundation model for real time dialogue

    Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audio codec. Mimi processes 24 kHz audio down to a 12.5 Hz representation with a bandwidth of 1.1 kbps in a fully streaming manner (with a latency of 80 ms, the frame size), yet performs better than existing non-streaming codecs such as SpeechTokenizer (50 Hz, 4 kbps) or SemantiCodec (50 Hz, 1.3 kbps). Moshi models two streams of audio: one corresponds to Moshi, and the other to the user. At inference, the user's stream is taken from the audio input, and Moshi's stream is sampled from the model's output. Alongside these two audio streams, Moshi predicts text tokens corresponding to its own speech (its inner monologue), which greatly improves the quality of its generation. A small Depth Transformer models inter-codebook dependencies for a given time step, while a large, 7B-parameter Temporal Transformer models the temporal dependencies.
    Downloads: 0 This Week
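The codec figures quoted in the Moshi entry can be cross-checked with a little arithmetic. The sketch below uses only the frame rates, bitrates, and sample rate stated in the description; the helper name and the dictionary layout are hypothetical. Because the quoted bitrates are rounded, the bits-per-frame values should be read as approximate.

```python
def frame_stats(frame_rate_hz, bitrate_bps, sample_rate_hz=None):
    """Derive per-frame quantities from a codec's frame rate and bitrate."""
    frame_ms = 1000.0 / frame_rate_hz          # duration of one frame
    bits_per_frame = bitrate_bps / frame_rate_hz
    samples_per_frame = (
        sample_rate_hz / frame_rate_hz if sample_rate_hz else None
    )
    return frame_ms, bits_per_frame, samples_per_frame


# (frame rate in Hz, bitrate in bits/s, input sample rate in Hz if stated)
codecs = {
    "Mimi": (12.5, 1100, 24000),
    "SpeechTokenizer": (50, 4000, None),
    "SemantiCodec": (50, 1300, None),
}

for name, (rate, bps, sample_rate) in codecs.items():
    frame_ms, bits, samples = frame_stats(rate, bps, sample_rate)
    print(name, frame_ms, "ms/frame,", bits, "bits/frame,", samples, "samples/frame")
```

At 12.5 Hz each frame spans 80 ms, which matches the latency quoted for Mimi, and each frame covers 1920 input samples of 24 kHz audio.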
  • 24
    The Open Interface for Speech Synthesis (OISS) provides an interface to speech synthesis hardware and software for end-user applications under Unix.
    Downloads: 0 This Week
  • 25
    PhoneBlogger allows you to post to a weblog by phone. PhoneBlogger is written in VoiceXML, Python, and JavaScript.
    Downloads: 0 This Week