Chatbot project over Raspberry Pi (5 / Zero 2w)
🚨 This is a work-in-progress project. Take it or leave it. Suggestions are welcome.
This project works with a bunch of system resources. This means that for Python to be able to compile the dependencies, we need some packages installed at the OS level.
To use Pitxu without Poetry we need to install the dependencies for the Python bundled with the OS (which must already be 3.11 or newer).
This is only in case you want to avoid using Poetry, as on the Raspberry Pi we don't need virtual environments (because you don't use that RPi for anything else, right?).
Please have the python3-pip package installed beforehand:
| Modern OSes and Python distinguish the Python bundled with the system, which is there to support internal applications, from a Python meant for the end user. That's why we should always use a virtual environment. This section is kept here for knowledge sharing, but should not be used. |
For Debian based linux distros:
sudo apt install python3-pip
Some dependencies are built at install time. Please have the python3-dev package installed beforehand:
For Debian based linux distros:
sudo apt install python3-dev
This is needed for the internal Pillow support, for the e-Ink display
For Debian based linux distros:
sudo apt install libjpeg-dev zlib1g-dev libfreetype6-dev
This is needed for the internal Gemini support, for the dictation feature
For Debian based linux distros:
sudo apt install libffi-dev
This is needed for the internal Pyaudio support, for the audio support
For Debian based linux distros:
sudo apt install portaudio19-dev python3-pyaudio
This is needed for the internal GPIO support
For Debian based linux distros:
sudo apt install swig liblgpio-dev
This is not needed for the Python / Poetry application to work, but it's useful to debug and identify your own hardware.
For Debian based linux distros:
sudo apt install i2c-tools
All of the above put together. Just make sure that I did not forget to add anything from the sections above.
sudo apt install python3-dev libjpeg-dev zlib1g-dev libfreetype6-dev libffi-dev portaudio19-dev python3-pyaudio swig liblgpio-dev i2c-tools
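To double check that nothing from the list is missing on a given machine, something like the following could be run. It is a minimal sketch, assuming a Debian based system where dpkg-query is available; the helper names (is_installed, missing_packages) are made up for this example and are not part of Pitxu.

```python
import shutil
import subprocess

# Package names copied from the combined apt command above.
REQUIRED = [
    "python3-dev", "libjpeg-dev", "zlib1g-dev", "libfreetype6-dev",
    "libffi-dev", "portaudio19-dev", "python3-pyaudio", "swig",
    "liblgpio-dev", "i2c-tools",
]


def is_installed(package: str) -> bool:
    """Return True if dpkg reports the package as fully installed."""
    result = subprocess.run(
        ["dpkg-query", "-W", "-f=${Status}", package],
        capture_output=True, text=True, check=False,
    )
    return result.returncode == 0 and "install ok installed" in result.stdout


def missing_packages(required, check=is_installed):
    """Return the packages from `required` that `check` reports as absent."""
    return [package for package in required if not check(package)]


if __name__ == "__main__":
    if shutil.which("dpkg-query") is None:
        print("dpkg-query not found: this check only applies to Debian based systems")
    else:
        missing = missing_packages(REQUIRED)
        if missing:
            print("Missing packages: sudo apt install " + " ".join(missing))
        else:
            print("All system packages are installed")
```

The check function is passed as a parameter so the logic can be tried on any machine, even one without dpkg.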
This is needed for the internal Pyaudio support, for the audio support
brew install portaudio
git clone [email protected]:XaviArnaus/pitxu.git
cd pitxu
curl -sSL https://install.python-poetry.org | python3 -
Remember that after the installation, most likely you need to add an export line in your shell config file (for example /home/username/.bashrc). The end of the Poetry installation announces that.
make init
In a lot of cases, Poetry builds the dependencies and they fail for diverse reasons.
Warning: Apparently I was able to evolve all of this so that the 2 packages below now live inside the pyproject.toml, so maybe first try the normal install and then jump directly to
installing packages in the shell. Numpy and Onnxruntime should get installed:
numpy = [{version="^2.3.4", markers="sys_platform=='darwin'"}]
onnxruntime = [{version="^1.23.2", markers="sys_platform=='darwin'"}]
piper-tts = [{version="^1.3.0", markers="sys_platform=='linux'"}]
... because it would be great to know a way to force "--no-deps" for a darwin marker in there.
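The sys_platform marker used in those lines is compared against the same value that Python exposes as sys.platform ("darwin" on macOS, "linux" on the Raspberry Pi). The following is only a small sketch to see which of the marker-gated packages would apply on the current machine; the mapping is copied by hand from the lines above and packages_for is a made-up helper, not something Poetry provides.

```python
import sys

# Marker value -> packages, copied by hand from the pyproject.toml lines above.
PLATFORM_DEPENDENCIES = {
    "darwin": ["numpy", "onnxruntime"],
    "linux": ["piper-tts"],
}


def packages_for(platform: str = sys.platform) -> list[str]:
    """Return the marker-gated packages that apply to the given platform."""
    return PLATFORM_DEPENDENCIES.get(platform, [])


if __name__ == "__main__":
    print(f"sys.platform is {sys.platform!r}")
    print("Marker-gated packages:", packages_for() or "none")
```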
Numpy is a dependency of Piper. It is also mentioned in one of the Vosk (STT) examples as a tool for calculation.
I have it as a direct dependency as it gave some headaches. In the end it gets installed, but it needs A LOT OF TIME.
Its installation (isolated back then with poetry add numpy -vvv) was monitored from another ssh window running htop,
and it only worked after a reboot, installing it directly.
Onnxruntime is a dependency of Piper. It is needed for the TTS, as it runs the model. It simply does not get installed
due to the --no-deps param in the section below, so it needs to be installed with poetry add onnxruntime -vvv.
Some packages are found in the repository but will fail to install, for diverse reasons.
The workaround is to enter the shell of Poetry's virtual environment and pip3 install the packages there.
In general, the idea is that they then don't get compiled; the prebuilt wheel is used instead.
ℹ️ It affects gpiozero & piper-tts, apparently only on macOS.
Then, install the shell plugin:
poetry self add poetry-plugin-shell
ℹ️ Added the plugin as a requirement in pyproject.toml, so maybe this manual plugin installation is not needed now.
... and then you can continue as usual:
poetry shell
pip3 install gpiozero
pip3 install piper-tts --no-deps
cp config/main.yaml.dist config/main.yaml
... and edit it to your taste
nano .env
... and add your Google Gemini key there (you can get one for free from https://aistudio.google.com/app/apikey), like this:
API_KEY=abcdefghijkl
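For reference, this is roughly how a key stored that way can be read back. It is a minimal sketch using only the standard library, and it is not necessarily how Pitxu loads its configuration; parse_env and load_api_key are names invented for this example.

```python
import os
from pathlib import Path


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=value lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_api_key(env_path: str = ".env") -> str:
    """Return API_KEY from the environment, falling back to the .env file."""
    path = Path(env_path)
    file_values = parse_env(path.read_text()) if path.exists() else {}
    key = os.environ.get("API_KEY") or file_values.get("API_KEY")
    if not key:
        raise RuntimeError(f"API_KEY is not set; add it to {env_path}")
    return key
```

A real environment variable takes precedence over the file here, which is handy when running under systemd.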
make run
No more to say. Not usable. Will bring data.
Works quite decently, with no significant difference from macOS
- Be sure to feed the RPi properly. Old USB chargers do not work: the RPi used to die of hunger while loading the Piper model.
- From Piper 1.2.0 to 1.3.0 the API changed from synthesize_stream_raw() to synthesize(), and the subsequent loop changed a bit as well.
- I did a lot of tinkering in the underlying Linux (Debian/RaspberryOS) system to make the sound work (ALSA, USB dongle, PulseAudio), to the point that I don't know what actually makes it play and record. I've dropped some test commands in /bin for the next time. I remember that I deactivated the sound from the boot/config.txt, hoping to properly select the USB Audio as the output device.
- Some cracks and noise, mostly at the beginning and at the end of the playback
- Must activate the SPI interface from sudo raspi-config.
- Getting very stuck with the display saying waveshare_epd.epd2in13_V4 e-Paper busy...
  - Check for malfunctioning cables and faulty in-between pieces (GPIO HATs and headers). Happened to me twice.
- It has plenty of problems getting the subprocess to close properly, which prevents the next one from succeeding: it complains about GPIO being busy while initialising the next Process. The solution was to move to a long-lasting subprocess, like with Piper.
- Most of the time, on the very first start, the splash screen is shown greyish.
- Works well in general
- The very first time the KITT mouth is shown, it appears mangled. After that it looks fine.
- Spotted a few times where the KITT mouth did not appear while the TTS speaks. Smells like the Shared Memory Flags were not updated in time.
- API public transport
- Button to mute, so it does not listen to what is spoken in front of it
- Button to skip what the TTS is saying, so the user can discard the explanation (which can be annoyingly long)
Python offers several powerful libraries for sentiment analysis. Some of the most popular and effective ones include:
- NLTK (Natural Language Toolkit): A comprehensive library for natural language processing, NLTK includes tools for sentiment analysis, notably the VADER (Valence Aware Dictionary and Sentiment Reasoner) sentiment analyzer, which is particularly effective for social media texts.
- TextBlob: Built on top of NLTK, TextBlob is known for its simplicity and ease of use, making it ideal for beginners and quick sentiment evaluations. It provides a pre-trained sentiment analyzer and offers fine-grained polarity scores and subjectivity analysis.
- VADER (Valence Aware Dictionary and Sentiment Reasoner): Specifically designed for analyzing sentiment in social media and short text content, VADER is a rule-based sentiment analysis tool. It generates compound polarity scores and can handle informal language, slang, and emojis.
- spaCy: A modern NLP library focused on efficiency and production use. It does not ship a pre-trained sentiment model, but its text categorizer component can be trained for sentiment classification.
- Transformers (Hugging Face): A state-of-the-art library offering a wide range of pre-trained models, including BERT (Bidirectional Encoder Representations from Transformers), which achieve strong performance on sentiment analysis benchmarks.
- Flair: Another advanced library offering sophisticated features and capabilities for more complex sentiment analysis tasks, including strong multilingual support.
- Scikit-learn: A popular machine learning library, Scikit-learn includes tools for building custom sentiment analysis models using classifiers and feature extraction.
- PyTorch: A deep learning framework used for building custom sentiment analysis models, PyTorch provides full flexibility to design and train neural networks.
https://www.waveshare.com/wiki/2.13inch_e-Paper_HAT%2B
It also explains dependencies from Debian. Useful to deal with PIL. Remember to port to Poetry. https://github.com/waveshareteam/e-Paper/blob/master/RaspberryPi_JetsonNano/python/readme_rpi_EN.txt
https://www.waveshare.com/wiki/2.13inch_e-Paper_HAT_Manual#Demo_code
https://python-sounddevice.readthedocs.io/en/0.5.1/usage.html
https://alphacephei.com/vosk/install https://alphacephei.com/vosk/models
https://ai.google.dev/gemini-api/docs/migrate https://ai.google.dev/gemini-api/docs/rate-limits https://aistudio.google.com/usage?timeRange=last-28-days&project=gen-lang-client-0547047381&tab=rate-limit
https://gofastmcp.com/integrations/gemini https://github.com/stepanogil/mcp-sse-demo?tab=readme-ov-file
https://github.com/IllFil/gemma3-ollama-tools https://www.philschmid.de/gemma-function-calling
Use the USB-C (5V/5A) from the UPS and not the one from the Raspberry Pi. If connected without software, it will behave as follows:
- When connected, the Raspberry Pi will start automatically
- The charging will also start automatically. One LED blinks. There are 3 green LEDs that indicate the battery level.
- When shutting down the Raspberry Pi, it will remain on.
- To completely shut it down, press the UPS power button 3 times.
- If a momentary button is connected to the dedicated XH2.54 socket, it also needs 3 presses.
- To turn it on again, a single push of any of the above buttons will do.
The software and some instructions can be found here: https://wiki.geekworm.com/X1203 https://suptronics.com/Raspberrypi/Power_mgmt/x120x-v1.0_software.html
The UPS is queried over I2C. At this point we should already have it
activated, as the eInk and the LED matrix need it as well. Otherwise, read how to activate the
I2C feature through the sudo raspi-config command.
To see at which address the UPS is connected (the docs say 0x36):
sudo i2cdetect -y 1
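The grid printed by i2cdetect can also be checked from Python. This is a small sketch that only parses text pasted or captured from the command above; parse_i2cdetect is a made-up helper, and the SAMPLE grid is an illustration written by hand, not output captured from a real device.

```python
import re

# Hand-written illustration of the i2cdetect grid, with a device at 0x36.
SAMPLE = """\
     0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f
00:                         -- -- -- -- -- -- -- --
10: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
20: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
30: -- -- -- -- -- -- 36 -- -- -- -- -- -- -- -- --
40: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
50: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
60: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
70: -- -- -- -- -- -- -- --
"""

HEX_BYTE = re.compile(r"[0-9a-f]{2}")


def parse_i2cdetect(output: str) -> set[int]:
    """Return the addresses that answered, as shown in an i2cdetect grid."""
    addresses = set()
    for line in output.splitlines():
        row, separator, cells = line.partition(":")
        if not separator or not HEX_BYTE.fullmatch(row.strip()):
            continue  # header line or unrelated text
        for cell in cells.split():
            # "--" means no answer; "UU" means the address is held by a driver.
            if HEX_BYTE.fullmatch(cell):
                addresses.add(int(cell, 16))
    return addresses


if __name__ == "__main__":
    found = parse_i2cdetect(SAMPLE)
    print("UPS found at 0x36" if 0x36 in found else "UPS not found at 0x36")
```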
In a terminal in the RPi, edit the EEPROM config:
sudo rpi-eeprom-config -e
Change the setting of POWER_OFF_ON_HALT from 0 to 1 and
add PSU_MAX_CURRENT=5000 at the end of the file, so that it reads like this:
[all]
BOOT_UART=1
BOOT_ORDER=0xf14
POWER_OFF_ON_HALT=1
PSU_MAX_CURRENT=5000
Reboot
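The edit itself is just "replace one key, append another". As an illustration only, the same transformation on the config text could be sketched like this; apply_settings is a hypothetical helper that works on a string and does not write to the EEPROM, so the change still has to be applied through sudo rpi-eeprom-config -e.

```python
def apply_settings(config_text: str, settings: dict[str, str]) -> str:
    """Replace existing KEY=value lines and append the keys that are missing."""
    remaining = dict(settings)
    lines = []
    for line in config_text.splitlines():
        key = line.split("=", 1)[0].strip()
        if "=" in line and key in remaining:
            lines.append(f"{key}={remaining.pop(key)}")
        else:
            lines.append(line)
    # Keys that were not present yet go at the end of the file.
    lines.extend(f"{key}={value}" for key, value in remaining.items())
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    before = "[all]\nBOOT_UART=1\nBOOT_ORDER=0xf14\nPOWER_OFF_ON_HALT=0\n"
    wanted = {"POWER_OFF_ON_HALT": "1", "PSU_MAX_CURRENT": "5000"}
    print(apply_settings(before, wanted), end="")
```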
https://www.thedigitalpictureframe.com/ultimate-guide-systemd-autostart-scripts-raspberry-pi/
See the pitxu.service file in the /bin folder
- Ensure that this file has 644 permissions
- Create a soft link from /etc/systemd/system/ to this file:
cd /etc/systemd/system/
sudo ln -s /home/xavier/pitxu/bin/pitxu.service pitxu.service
- Reload the systemd daemon and enable the service
sudo systemctl daemon-reload
sudo systemctl enable pitxu.service
Further updates do not need to repeat step 3 (reload and enable), unless the filename changes.
https://askubuntu.com/a/416330
See the k99_cleanup_pitxu file in the /bin folder
- Ensure that this file has 755 permissions
- Create a soft link from /etc/rc0.d/ to this file. This will clear the displays on shutdown (runlevel 0 is halt)
cd /etc/rc0.d/
sudo ln -s /home/xavier/pitxu/bin/k99_cleanup_pitxu k99_cleanup_pitxu
- Create a soft link from /etc/rc6.d/ to this file. This will clear the displays on reboot (runlevel 6 is reboot)
cd /etc/rc6.d/
sudo ln -s /home/xavier/pitxu/bin/k99_cleanup_pitxu k99_cleanup_pitxu
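After creating the links, a quick sanity check can save a reboot cycle. The following is a sketch under the assumption that the repository lives in /home/xavier/pitxu, as in the commands above; check_link is a made-up helper that only reads the filesystem and changes nothing.

```python
import os
import stat


def check_link(link_path: str, expected_target: str, expected_mode: int) -> list[str]:
    """Return a list of problems found with a soft link and its target file."""
    if not os.path.islink(link_path):
        return [f"{link_path} is not a symbolic link"]
    problems = []
    if os.path.realpath(link_path) != os.path.realpath(expected_target):
        problems.append(f"{link_path} does not point to {expected_target}")
    if not os.path.exists(expected_target):
        problems.append(f"{expected_target} does not exist")
        return problems
    mode = stat.S_IMODE(os.stat(expected_target).st_mode)
    if mode != expected_mode:
        problems.append(f"{expected_target} has mode {mode:o}, expected {expected_mode:o}")
    return problems


if __name__ == "__main__":
    # Paths taken from the commands above; adjust them to your own home folder.
    script = "/home/xavier/pitxu/bin/k99_cleanup_pitxu"
    for directory in ("/etc/rc0.d", "/etc/rc6.d"):
        link = os.path.join(directory, "k99_cleanup_pitxu")
        for problem in check_link(link, script, 0o755):
            print(problem)
```

No output means both links resolve to the script and the script has the expected permissions.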
https://unix.stackexchange.com/a/414301
By default Debian's journal saves no files to disk. We need to change that so that we can see what happens during shutdown (if we want)
- Edit /etc/systemd/journald.conf
- Change the configuration so that the following parameters are uncommented and have the following values:
# This will persist the logs, even after reboot.
Storage=persistent
# This will rotate the logs, cleaning them after a week.
MaxRetentionSec=1week
- Add your user to the journal group
sudo usermod -a -G systemd-journal xavier
- Restart the journal service
systemctl restart systemd-journald
The very first conversation with Pitxu was on 2025-06-02 [commit hash: fcaccfc]
In Mac OS - catalan
- Dictation is good
- eInk is mocked
- Gemini improved very significantly after switching from single query (without context) to chat (with context)
- Speech is ok
- Works pretty well
In RPi 02W - catalan
- Dictation takes ~4s
- eInk takes ~2s (still fullscan)
- Gemini takes ~1s (same as above)
- Speech is ok.
- Good quality but pretty slow all in all.
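Numbers like the ones above can be collected with a small timing helper around each stage. This is a generic sketch, not the way Pitxu measures them: the timed context manager and the stage names are made up for the example, and the sleep calls are placeholders standing in for the real dictation, display and Gemini calls.

```python
import time
from contextlib import contextmanager

# Stage name -> elapsed seconds for the last run of that stage.
timings: dict[str, float] = {}


@contextmanager
def timed(stage: str):
    """Measure the wall-clock time spent inside the with-block."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[stage] = time.perf_counter() - start


# Placeholders: replace the sleeps with the real calls to be measured.
with timed("dictation"):
    time.sleep(0.01)
with timed("eink"):
    time.sleep(0.01)
with timed("gemini"):
    time.sleep(0.01)

for stage, seconds in timings.items():
    print(f"{stage}: {seconds:.2f}s")
```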