🧑‍💻 Project Description
This is the first conversational ESP32 project to demonstrate real-time conversations using a custom speech pipeline (Silero for VAD, Whisper for STT, GPT-4o for text-to-text, and ElevenLabs for TTS). As demonstrated in the video, the device can talk to users directly in the voice and tone of Wheatley (Portal 2).
It achieves this by interfacing directly with a LiveKit real-time pipeline, which hosts a custom voice trained on ElevenLabs and a fine-tuned GPT-4o.
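For readers who want a mental model of the four stages, here is a minimal Python sketch of how a VAD, STT, text-to-text, and TTS chain could be wired together. It is not the code from the repository and it does not call Silero, Whisper, GPT-4o, ElevenLabs, or LiveKit; `VoicePipeline`, `handle_utterance`, and `demo_pipeline` are hypothetical names, and every stage is a stand-in function that only illustrates the order of the steps.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class VoicePipeline:
    """Hypothetical container for the four stages named in the description."""
    vad: Callable[[bytes], bool]   # True when an audio frame contains speech
    stt: Callable[[bytes], str]    # speech audio -> transcript
    llm: Callable[[str], str]      # transcript -> reply text
    tts: Callable[[str], bytes]    # reply text -> synthesized audio

    def handle_utterance(self, frames: Sequence[bytes]) -> Optional[bytes]:
        # 1. VAD: keep only the frames flagged as speech.
        speech = b"".join(frame for frame in frames if self.vad(frame))
        if not speech:
            return None  # nothing was said, so no reply is produced
        # 2. STT: transcribe the speech audio.
        transcript = self.stt(speech)
        # 3. Text-to-text: generate the reply.
        reply = self.llm(transcript)
        # 4. TTS: turn the reply back into audio for the device to play.
        return self.tts(reply)


def demo_pipeline() -> VoicePipeline:
    """Build a pipeline from stand-in stages (assumptions, not real models)."""
    return VoicePipeline(
        vad=lambda frame: any(frame),  # any non-zero byte counts as speech
        stt=lambda audio: f"<{len(audio)} bytes of speech>",
        llm=lambda text: f"reply to {text}",
        tts=lambda text: text.encode("utf-8"),
    )


if __name__ == "__main__":
    pipeline = demo_pipeline()
    print(pipeline.handle_utterance([b"\x00\x00", b"\x01\x02", b"\x03"]))
```

In a real deployment each stand-in would be replaced by a call into the corresponding service, and the audio would be streamed rather than collected into a list first.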
💻 GitHub Repository
The source code and written tutorial for the project can be found here: https://github.com/pham-tuan-binh/wheatley-ai
🛒 SenseCap Watcher
If you would like a SenseCap Watcher yourself, consider buying it through my affiliate link: https://www.seeedstudio.com/SenseCAP-Watcher-W1-A-p-5979.html?sensecap_affiliate=3gToNR2&referring_service=link
This helps me a lot, since I'm not yet in the YouTube monetization program and making these videos costs me quite a lot as a creator.
Seeed Studio Coupon (applicable to most items in their shop): 5EB420ZS
👨‍💼 Collaboration
The project is distributed under a copyleft license; see GitHub for details. If you want to collaborate with me or commercialize this project, please email me at [email protected].
#esp32 #embedded #ai #livekit #elevenlabs
🎞️ Chapters
00:00 - Beginning
00:54 - Chapter 1
01:46 - Chapter 2
02:06 - Front end
03:21 - Back end
04:44 - Chapter 3
09:35 - Demo