+++
date = '2026-05-17T23:57:15-06:00'
draft = false
title = "Katchi, a dragon's best friend"
tags = ['kobold', 'esp32']
+++

## A smart-home for a Dragon

{{< typeit
  tag=h3
  speed=50
  breakLines=false
  loop=true
>}}
"It's the Future..."
"Dumb homes are so 2010"
"Is all of this really necessary?" — Concerned Friends
{{< /typeit >}}

### Smart-homes

The current crop of smart-home platforms is fairly acceptable. You have your major players, **Google**, **Apple** and **Amazon**, along with
their associated services like Google Home or **Alexa**. These systems are fairly easy to set up: plug in the new device,
type in some credentials or tap through a prompt on your phone, and you're done. Most of these systems rely on a central hub that
orchestrates the entire smart home.

But all these systems have one fatal annoyance: they all require access to the internet.

### Internet dependency

In recent years it has become common to run into issues with major providers. Privacy concerns, outages and the forced
obsolescence of existing systems weighed heavily on me when building my first smart home. Sure, the big players
make it easy to set up and use, but for me the non-monetary cost was just too great. Software limitations aside,
knowing that an internet outage, or god forbid, a provider outage, would leave me shit out of luck when
turning off my lights is what turned me away from the major providers.

### So what did I use?

After spending a lot of time frustrated with my options and struggling to automate my smart-home the way
I wanted, I went down the rabbit hole of alternatives and found
**[Home Assistant](https://www.home-assistant.io/)**.

Unlike the big-name smart-homes, **Home Assistant** is a self-hosted option that runs on your own hardware and connects
locally to supported devices. It supports a wide range of [devices and integrations](https://www.home-assistant.io/integrations/?brands=featured)
and is fairly easy to set up.

I won't expound on it much more here, but I will link to the [getting started](https://www.home-assistant.io/installation/) guide, [documentation](https://www.home-assistant.io/docs/)
and [community](https://community.home-assistant.io/) for more information.

### So what's the problem?

For all the amazing options that **Home Assistant** gives us, it has one fairly significant miss: smart speaker integration.

## Home Assistant Smart Speaker

The options for **Home Assistant** smart speakers are quite limited: there is only one official product as of the date of publishing this post.

{{< externalLink url="https://www.home-assistant.io/voice-pe/" >}}

While the **Home Assistant Voice PE** works decently, it is the only off-the-shelf option for **Home Assistant**, which, considering
all the freedom **Home Assistant** gives us, feels quite limiting. However, there is a solution.

### The Solution

Thankfully, microcontrollers free us from the limitations of existing hardware, specifically the ESP family of microcontrollers.

Using **[ESPHome](https://esphome.io)** you can create a whole myriad of smart devices based on the [ESP32 microcontroller](https://www.espressif.com/en/products/socs/esp32). The ESP32 comes in a very diverse
family of variants that can fit nearly any use-case. Think of ESPHome as an alternative to Arduino, where instead of writing C++ code
you write YAML configuration files that define and configure your ESP device.

Knowing this, I set out to make my own Smart Speaker.

## Building a Katchi (smart speaker)

So what does it take to make a smart speaker? Ultimately you need a few key things: a speaker, a microphone and a
Wi-Fi-enabled microcontroller. For my purposes I decided I also wanted a screen so I could give my Katchi a little
more personality. My main requirement for the display was that it be circular, as my intent was to use it
as the eye of my smart speaker.

### Waveshare ESP32-S3

I ended up landing on the [Waveshare ESP32-S3 1.75inch AMOLED Round Touch Display Development Board](https://www.waveshare.com/esp32-s3-touch-amoled-1.75.htm).
Despite its name being quite a mouthful, this handy little device is absolutely packed with sensors and features, as well as a
glorious round AMOLED display.

{{< gallery >}}
  {{< figure src="https://www.waveshare.com/media/catalog/product/cache/1/image/800x800/9df78eab33525d08d6e5fb8d27136e95/e/s/esp32-s3-touch-amoled-1.75-1.jpg" alt="Gallery image 1" figureClass="grid-w33" >}}
  {{< figure src="https://www.waveshare.com/media/catalog/product/cache/1/image/800x800/9df78eab33525d08d6e5fb8d27136e95/e/s/esp32-s3-touch-amoled-1.75-2.jpg" alt="Gallery image 2" figureClass="grid-w33" >}}
  {{< figure src="https://www.waveshare.com/media/catalog/product/cache/1/image/800x800/9df78eab33525d08d6e5fb8d27136e95/e/s/esp32-s3-touch-amoled-1.75-3.jpg" alt="Gallery image 3" figureClass="grid-w33" >}}
{{< /gallery >}}

One of the fundamental things to check when considering an ESP device is which of its components have drivers
available in **ESPHome**. The Waveshare device I picked has the following components:

| Device  | Purpose                                 | Supported                                              |
|---------|-----------------------------------------|--------------------------------------------------------|
| TCA9554 | GPIO expander for additional interfaces | [Yes](https://esphome.io/components/pca9554/)          |
| ES7210  | ADC for microphones                     | [Yes](https://esphome.io/components/audio_adc/es7210/) |
| ES8311  | DAC for speaker                         | [Yes](https://esphome.io/components/audio_dac/es8311/) |
| CO5300  | Display controller for AMOLED           | [Yes](https://esphome.io/components/display/mipi_spi/) |
| CST9217 | Touchscreen controller                  | No                                                     |

To fill the support gap for the touchscreen, we can use an [external driver](https://github.com/shelson/esphome-cst9217) to get it
working.

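A minimal sketch of how an external driver like this is typically pulled in, assuming the repository follows the usual ESPHome external component layout. The touchscreen platform options themselves are defined by the driver, so check its README for those.

```yaml
# Sketch: fetch external components straight from the driver's GitHub repo.
# Assumes the repo uses the standard ESPHome components/ layout.
external_components:
  - source: github://shelson/esphome-cst9217
```
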
### Configuration

Since **ESPHome** uses a YAML configuration file to define the device, configuring it is fairly straightforward.

#### Basic Configuration

We first need to start with the basic configuration of the device. When setting up the `esp32` section, it is crucial
to get the flash size and CPU frequency right, otherwise your device may not run correctly. The device I am using
has **16MB of flash** and a **240MHz CPU**. We also use the ESP-IDF framework. This is the preferred framework for
ESP devices, as the Arduino framework is not as feature-rich and is not supported on some newer devices.

```yaml
esphome:
  name: kobold
  friendly_name: Kobold

esp32:
  board: esp32-s3-devkitc-1
  flash_size: 16MB
  cpu_frequency: 240MHZ
  framework:
    type: esp-idf
```

We also need to set up **PSRAM**, the **I2C bus** and the **SPI bus**. This requires
a firm understanding of the GPIO pins and their associated functions. For this board, Waveshare
provides the following pinout diagram: [ESP32-S3-Touch-AMOLED-1.75.pdf](https://files.waveshare.com/wiki/ESP32-S3-Touch-AMOLED-1.75/ESP32-S3-Touch-AMOLED-1.75.pdf).
We will rely on this document extensively for the rest of the configuration.

**PSRAM** is important for making sure that the device does not run out of memory,
and its config is quite simple.

```yaml
psram:
  mode: octal
  speed: 80MHz
```

The **I2C bus** is used for the touchscreen and for other future components. It is
an important communication protocol for microcontrollers, as it allows a large number of
sensors and devices to share the same bus.

```yaml
i2c:
  sda: GPIO15
  scl: GPIO14
  scan: true
  id: bus_a
```

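The TCA9554 GPIO expander from the table earlier sits on this same bus. A minimal sketch of declaring it with the `pca9554` component, assuming the chip answers at address `0x20` (not checked against the schematic; with `scan: true` the detected addresses show up in the boot log). The `io_expander` id is a made-up name.

```yaml
pca9554:
  - id: io_expander   # hypothetical id, pick your own
    i2c_id: bus_a     # the I2C bus defined above
    address: 0x20     # assumption: verify against the I2C scan output
```
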
The **SPI bus** is essential for the display module, which communicates via quad SPI: functionally
a serial bus with four data lines instead of one.

```yaml
spi:
  - id: spi_bus
    clk_pin: GPIO2
    mosi_pin: GPIO1
    miso_pin:
      number: GPIO3
      ignore_strapping_warning: true
  - id: quad_spi_bus
    type: quad
    clk_pin: GPIO38
    data_pins:
      - GPIO4
      - GPIO5
      - GPIO6
      - GPIO7
```

#### Display Configuration

For our basic display configuration we will use the **mipi_spi** display driver. For this panel, the driver
requires the **quad SPI bus** to be configured as well as a suitable `data_rate`.
You can experiment with the data rate for a **quad SPI display**, as it affects how quickly the display refreshes and draws images.

```yaml
display:
  - platform: mipi_spi
    id: disp1
    model: CO5300
    bus_mode: quad
    reset_pin: GPIO39
    cs_pin: GPIO12
    data_rate: 80MHz
    dimensions:
      height: 466
      width: 466
      offset_width: 6
```

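Since the display is meant to be an eye, here is a minimal sketch of drawing one with a display `lambda`. It repeats the display config above and only adds the `lambda`; the radii and colours are placeholder values, not taken from the Katchi repo.

```yaml
display:
  - platform: mipi_spi
    id: disp1
    model: CO5300
    bus_mode: quad
    reset_pin: GPIO39
    cs_pin: GPIO12
    data_rate: 80MHz
    dimensions:
      height: 466
      width: 466
      offset_width: 6
    # Sketch only: a placeholder eye with arbitrary sizes and colours.
    lambda: |-
      int cx = it.get_width() / 2;
      int cy = it.get_height() / 2;
      it.filled_circle(cx, cy, 150, Color(255, 255, 255));  // eyeball
      it.filled_circle(cx, cy, 60, Color(0, 0, 0));         // pupil
```
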
#### Audio Configuration

Our audio configuration is quite a bit more complex. It requires its own **I2S bus** as well as the **DAC**
and **ADC** configs. Only then can we configure the actual audio components.

Our **audio I2S bus** is simpler than our **quad SPI bus**.

> We set `ignore_strapping_warning` on the strapping pin here to prevent warnings being thrown. Read more about this [here](https://esphome.io/guides/configuration-types/#pin-schema)

```yaml
i2s_audio:
  - id: i2s_audio_bus
    i2s_mclk_pin: GPIO42
    i2s_bclk_pin: GPIO9
    i2s_lrclk_pin:
      number: GPIO45
      ignore_strapping_warning: true
```

We then need to configure both our **DAC** and **ADC** drivers. To keep our configs in sync and avoid confusing changes
in the future, we will first add substitutions.

```yaml
substitutions:
  i2s_bps_spk: 16bit
  i2s_bps_mic: 16bit
  i2s_sample_rate_spk: 44100
  i2s_sample_rate_mic: 16000
```

We can then configure our **ADC** and **DAC** drivers and make use of these substitutions.
```yaml
audio_adc:
  - platform: es7210
    id: es7210_adc
    bits_per_sample: $i2s_bps_mic
    sample_rate: $i2s_sample_rate_mic

audio_dac:
  - platform: es8311
    id: es8311_dac
    bits_per_sample: $i2s_bps_spk
    sample_rate: $i2s_sample_rate_spk
```

Once we have our audio drivers configured, we can configure our **audio output** and **audio input** devices. We configure
our audio devices using the same substitutions, allowing us to change sample rates and bit depths without risking a
mismatch between driver and device.

```yaml
microphone:
  - platform: i2s_audio
    id: box_mic
    sample_rate: $i2s_sample_rate_mic
    i2s_din_pin: GPIO10
    bits_per_sample: $i2s_bps_mic
    adc_type: external

speaker:
  - platform: i2s_audio
    id: box_speaker
    i2s_dout_pin: GPIO8
    dac_type: external
    sample_rate: $i2s_sample_rate_spk
    bits_per_sample: $i2s_bps_spk
    audio_dac: es8311_dac
    buffer_duration: 90ms
    use_apll: true
```

Altogether, we end up with a long block of configuration that looks like this:

```yaml
i2s_audio:
  - id: i2s_audio_bus
    i2s_mclk_pin: GPIO42
    i2s_bclk_pin: GPIO9
    i2s_lrclk_pin:
      number: GPIO45
      ignore_strapping_warning: true

audio_adc:
  - platform: es7210
    id: es7210_adc
    bits_per_sample: $i2s_bps_mic
    sample_rate: $i2s_sample_rate_mic

audio_dac:
  - platform: es8311
    id: es8311_dac
    bits_per_sample: $i2s_bps_spk
    sample_rate: $i2s_sample_rate_spk

microphone:
  - platform: i2s_audio
    id: box_mic
    sample_rate: $i2s_sample_rate_mic
    i2s_din_pin: GPIO10
    bits_per_sample: $i2s_bps_mic
    adc_type: external

speaker:
  - platform: i2s_audio
    id: box_speaker
    i2s_dout_pin: GPIO8
    dac_type: external
    sample_rate: $i2s_sample_rate_spk
    bits_per_sample: $i2s_bps_spk
    audio_dac: es8311_dac
    buffer_duration: 90ms
    use_apll: true
```

#### Final Configuration

There is a lot more config to go through, and I don't want to cover all of it in this blog. You can find all resources
for the ESPHome portion of Katchi at my Gitea repo.

{{< gitea server="https://git.toomuchtaco.net" repo="taco/voice-assistant" >}}

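To give a rough idea of where the remaining config is headed, the microphone and speaker defined earlier are ultimately handed to ESPHome's `voice_assistant` component, which talks to Home Assistant over the native API. This is a minimal sketch and is not copied from the repo; the `!secret` names are placeholders.

```yaml
wifi:
  ssid: !secret wifi_ssid          # placeholder secret names
  password: !secret wifi_password

# Native API connection to Home Assistant
api:

# Sketch: wire the audio devices from above into the voice pipeline
voice_assistant:
  microphone: box_mic
  speaker: box_speaker
```
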
### Designing a kobold