brukg/hri_speech

Speech-Based Human Robot Interaction and Task Assignment

The motivation for this project is to communicate with robots through natural speech. For this we use DeepSpeech, an open-source speech-to-text engine developed by Mozilla.

Getting Started

To get started, first install the following dependencies.

Dependencies

ROS-Noetic

    sudo apt install ros-noetic-turtlebot3-gazebo   #simulation env
    sudo apt install ros-noetic-turtlebot3-slam     #slam
    sudo apt install ros-noetic-turtlebot3-navigation #navigation stack
    sudo apt install ros-noetic-gmapping                #for mapping
    sudo apt install ros-noetic-dwa-local-planner       #dynamic windowing approach controller
    sudo apt install ros-noetic-behaviortree-cpp-v3     #for task planning

Install the following Python packages:

    pip3 install mediapipe  #gesture recognition

    #for using deepspeech
    pip3 install deepspeech #speech to text
    sudo apt-get install python3-pyaudio #audio input/output (PyAudio)

    # for using whisper
    pip install git+https://github.com/openai/whisper.git 
    # to update an existing whisper install to the latest version
    pip install --upgrade --no-deps --force-reinstall git+https://github.com/openai/whisper.git
    # on Ubuntu or Debian
    sudo apt update && sudo apt install ffmpeg
    #for text to speech
    sudo apt install libespeak-dev
    pip install pyttsx3
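Before moving on, it can help to confirm that the Python packages above are importable. The following is a minimal sketch, not part of the project: the helper name `missing_modules` is hypothetical, and the import names in `REQUIRED_MODULES` are assumptions about how each package is imported (for example, `python3-pyaudio` is imported as `pyaudio` and the openai/whisper package as `whisper`).

```python
import importlib.util

# Assumed import names for the packages installed above.
REQUIRED_MODULES = ["mediapipe", "deepspeech", "pyaudio", "whisper", "pyttsx3"]


def missing_modules(names):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing Python modules:", ", ".join(missing))
    else:
        print("All Python dependencies were found.")
```

If you only plan to use one speech engine (DeepSpeech or whisper), trim the list accordingly.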

    

How to run the project

After installing all the dependencies, go to the src folder of your ROS workspace and clone this repository.

    cd <your_catkin_workspace>/src
    git clone git@github.com:brukg/hri_speech.git

Then go back to the workspace root, build the project using the following commands, and source your workspace.

    cd <your_catkin_workspace>
    catkin build
    source devel/setup.bash   # or: source devel/setup.zsh, depending on your shell

Next, set the TurtleBot3 model, go to the project directory, and create a folder named models.

    export TURTLEBOT3_MODEL=waffle
    roscd hri_speech   # or: cd <your_catkin_workspace>/src/hri_speech
    mkdir models

Go inside the models folder and download the two DeepSpeech files (the acoustic model and the scorer).

    cd models
    # make sure to place/download the files below in the project's models directory
    wget https://github.com/mozilla/DeepSpeech/releases/download/v0.9.3/deepspeech-0.9.3-models.pbmm
    wget https://github.com/mozilla/DeepSpeech/releases/download/v0.9.3/deepspeech-0.9.3-models.scorer
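To check that both files landed in the models directory, a small Python sketch like the one below can be used. The helper names `find_missing_model_files` and `load_deepspeech` are hypothetical, and whether the project loads the model exactly this way is an assumption; the sketch only uses the `Model` constructor and `enableExternalScorer` method from the DeepSpeech 0.9.3 Python package.

```python
from pathlib import Path

# File names taken from the wget commands above.
MODEL_FILES = ("deepspeech-0.9.3-models.pbmm", "deepspeech-0.9.3-models.scorer")


def find_missing_model_files(models_dir):
    """Return the expected model file names that are not present in models_dir."""
    models_dir = Path(models_dir)
    return [name for name in MODEL_FILES if not (models_dir / name).is_file()]


def load_deepspeech(models_dir):
    """Load the acoustic model and attach the scorer (illustrative only)."""
    missing = find_missing_model_files(models_dir)
    if missing:
        raise FileNotFoundError("Missing model files: " + ", ".join(missing))
    # Imported lazily so the file check works without deepspeech installed.
    from deepspeech import Model

    models_dir = Path(models_dir)
    model = Model(str(models_dir / MODEL_FILES[0]))
    model.enableExternalScorer(str(models_dir / MODEL_FILES[1]))
    return model
```

For example, `find_missing_model_files("models")` run from the project directory returns an empty list once both downloads have finished.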

The OpenAI davinci model is used to extract semantic meaning from text. For this you need an OpenAI API key. Export it before running the project, replacing the string in the command below with your key.

    export OPENAI_API_KEY="key obtained from openai account"
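A node that depends on this key can fail early with a clear message when the variable is missing. The sketch below is an illustration under assumptions, not the project's code: the function name `get_openai_key` and the optional `env` parameter are hypothetical, and only the variable name `OPENAI_API_KEY` comes from the command above.

```python
import os


def get_openai_key(env=None):
    """Return the OpenAI API key from the environment, or raise if it is unset."""
    # env is a hypothetical parameter that lets callers pass a plain dict.
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set; export it before launching the project."
        )
    return key
```

Calling `get_openai_key()` at startup surfaces a missing key immediately instead of at the first API request.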

Finally run the project using the following command.

    roslaunch hri_speech start_all.launch
