More updates to my first blog post.
All checks were successful
Build and Publish Docker Image / build (push) Successful in 38s
This commit is contained in:
@@ -20,8 +20,8 @@ loop=true

### Smart-homes

The present state of smart-home choices is fairly acceptable. You have your major players, **Google**, **Apple**, and **Amazon**, and
their associated services like Google Home or **Alexa**. These systems are fairly easy to set up: plug in the new device,
enter some credentials or follow a prompt on your phone, and done. Most of these systems rely on a central hub that
orchestrates the entire smart home.

@@ -39,10 +39,10 @@ turning off my lights turned me away from major providers.

After spending a lot of time frustrated with my options and struggling to automate my smart-home the way
I wanted, I went down the rabbit hole of options and found
**[Home Assistant](https://www.home-assistant.io/)**.

Unlike the big-name smart-homes, **Home Assistant** is a self-hosted option that runs on your own hardware and locally
connects to supported devices. It supports a wide range of [devices and integrations](https://www.home-assistant.io/integrations/?brands=featured)
and is fairly easy to set up.

@@ -51,26 +51,271 @@ and [community](https://community.home-assistant.io/) for more information.

### So what's the problem?

For all the amazing options that **Home Assistant** gives us, it has one fairly significant gap: smart speaker integration.

## Home Assistant Smart Speaker

The options for **Home Assistant** smart speakers are quite limited; there is only one official product as of the date of publishing this post.

{{< externalLink url="https://www.home-assistant.io/voice-pe/" >}}

While the **Home Assistant Voice PE** works decently, it is the only off-the-shelf option for **Home Assistant**, which, considering
all the freedom **Home Assistant** gives us, feels quite limiting. However, there is a solution.

### The Solution

Thankfully, we are not constrained by the limitations of existing hardware, thanks to microcontrollers, specifically the ESP family of microcontrollers.

Using **[ESPHome](https://esphome.io)** you can create a whole myriad of smart devices based on the [ESP32 microcontroller](https://www.espressif.com/en/products/socs/esp32). It provides a very diverse
family of options that can fit nearly any use-case. Think of it as an alternative to Arduino, where instead of writing C/C++ code
you write YAML configuration files that define and configure your ESP device.

Knowing this, I set out to make my own Smart Speaker.

## Building a Katchi (smart speaker)

So what does it take to make a smart speaker? You ultimately need a few key things: a speaker, a microphone, and a
Wi-Fi-enabled microcontroller. For my purposes I decided I also wanted a screen so I could give my Katchi a little
more personality. The main requirement I had for the display was for it to be circular, as my intent was to use the display
as the eye for my smart speaker.

### Waveshare ESP32-S3

I ended up landing on the [Waveshare ESP32-S3 1.75inch AMOLED Round Touch Display Development Board](https://www.waveshare.com/esp32-s3-touch-amoled-1.75.htm).
Despite being quite a mouthful, this handy little device is absolutely packed with sensors and features, as well as a
glorious AMOLED round display.

{{< gallery >}}
{{< figure src="https://www.waveshare.com/media/catalog/product/cache/1/image/800x800/9df78eab33525d08d6e5fb8d27136e95/e/s/esp32-s3-touch-amoled-1.75-1.jpg" alt="Gallery image 1" figureClass="grid-w33" >}}
{{< figure src="https://www.waveshare.com/media/catalog/product/cache/1/image/800x800/9df78eab33525d08d6e5fb8d27136e95/e/s/esp32-s3-touch-amoled-1.75-2.jpg" alt="Gallery image 2" figureClass="grid-w33" >}}
{{< figure src="https://www.waveshare.com/media/catalog/product/cache/1/image/800x800/9df78eab33525d08d6e5fb8d27136e95/e/s/esp32-s3-touch-amoled-1.75-3.jpg" alt="Gallery image 3" figureClass="grid-w33" >}}
{{< /gallery >}}

One of the fundamental things to take note of when considering ESP devices is which components and associated drivers
are available in **ESPHome**. The Waveshare device I picked has the following components:

| Device  | Purpose                                 | Supported in ESPHome                                   |
|---------|-----------------------------------------|--------------------------------------------------------|
| TCA9554 | GPIO expander for additional interfaces | [Yes](https://esphome.io/components/pca9554/)          |
| ES7210  | ADC for microphones                     | [Yes](https://esphome.io/components/audio_adc/es7210/) |
| ES8311  | DAC for speaker                         | [Yes](https://esphome.io/components/audio_dac/es8311/) |
| CO5300  | Display controller for AMOLED           | [Yes](https://esphome.io/components/display/mipi_spi/) |
| CST9217 | Touchscreen controller                  | No                                                     |

To fill the support gap for the touchscreen, we can use an [external driver](https://github.com/shelson/esphome-cst9217) to get the touchscreen
working.
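
As a rough sketch, an external driver like this is usually pulled in through ESPHome's `external_components` block. The component name `cst9217` below is an assumption based on the repository name; check the repository's README for the exact name and for the touchscreen options it expects.

```yaml
# Sketch only: pulls the community touchscreen driver straight from GitHub.
# The component name "cst9217" is assumed from the repo name, not confirmed.
external_components:
  - source: github://shelson/esphome-cst9217
    components: [cst9217]
```

ESPHome fetches the repository at build time, so nothing has to be copied into the project by hand.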

### Configuration

Since **ESPHome** uses a YAML configuration file to define the device, configuring it is fairly straightforward.

#### Basic Configuration

We first need to start with the basic configuration of the device. When setting up the `esp32` configuration, it is crucial
to know the flash size and CPU frequency; otherwise your device may not run correctly. The device I am using
has **16MB of flash** and a **240MHz CPU**. We also use the esp-idf framework. This is the preferred framework for
ESP devices, as the Arduino framework is not as feature-rich and is not supported on some newer devices.

```yaml
esphome:
  name: kobold
  friendly_name: Kobold

esp32:
  board: esp32-s3-devkitc-1
  flash_size: 16MB
  cpu_frequency: 240MHZ
  framework:
    type: esp-idf
```
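
The block above only describes the chip. To actually reach Home Assistant, a device typically also needs the standard connectivity components. A minimal sketch, assuming the Wi-Fi credentials and API key live in a `secrets.yaml` file under the hypothetical names shown:

```yaml
# Assumed secret names; define them in secrets.yaml next to this config.
wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password

# Native API that Home Assistant uses to talk to the device.
api:
  encryption:
    key: !secret api_encryption_key

# Over-the-air updates, so the board only needs USB for the first flash.
ota:
  - platform: esphome

# Logging, useful while bringing up new hardware.
logger:
```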

We also need to set up **psram**, the **i2c bus**, and the **SPI bus**. This requires
a firm understanding of the GPIO pins and their associated functions. For this device, Waveshare
provides the following pinout diagram: [ESP32-S3-Touch-AMOLED-1.75.pdf](https://files.waveshare.com/wiki/ESP32-S3-Touch-AMOLED-1.75/ESP32-S3-Touch-AMOLED-1.75.pdf).
We will rely on this document extensively for the rest of the configuration.

The **psram** is important for making sure that the device does not run out of memory,
and its config is quite simple.

```yaml
psram:
  mode: octal
  speed: 80MHz
```

The **i2c bus** is used for the touchscreen and for other future components. It is
an important communication protocol for microcontrollers, as it allows a large number of
sensors and devices to share the same bus.

```yaml
i2c:
  sda: GPIO15
  scl: GPIO14
  scan: true
  id: bus_a
```
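
With the bus defined, I2C peripherals can reference it by its `id`. As an illustration, the TCA9554 GPIO expander from the component table could be attached roughly like this. The `io_expander` id is a made-up name, and `0x20` is the component's default address, assumed here rather than taken from the schematic; the boot log produced by `scan: true` shows which addresses actually respond.

```yaml
pca9554:
  - id: io_expander   # hypothetical id, pick any name
    i2c_id: bus_a     # the bus defined above
    address: 0x20     # component default; verify against the scan output
```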

The **SPI bus** is essential for the display module, which communicates via quad SPI: functionally
a serial bus with four data lines.

```yaml
spi:
  - id: spi_bus
    clk_pin: GPIO2
    mosi_pin: GPIO1
    miso_pin:
      number: GPIO3
      ignore_strapping_warning: true
  - id: quad_spi_bus
    type: quad
    clk_pin: GPIO38
    data_pins:
      - GPIO4
      - GPIO5
      - GPIO6
      - GPIO7
```

#### Display Configuration

For our basic display configuration we will use the **mipi_spi** display driver. For this panel, the driver
requires the **quad SPI bus** to be configured, as well as the correct `data_rate`.
You can experiment with the data rate for a **quad SPI display**, as it affects how the display refreshes and draws images.

```yaml
display:
  - platform: mipi_spi
    id: disp1
    model: CO5300
    bus_mode: quad
    reset_pin: GPIO39
    cs_pin: GPIO12
    data_rate: 80MHz
    dimensions:
      height: 466
      width: 466
      offset_width: 6
```
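
To check that the panel is working, a drawing `lambda` can be added to the same display entry. The sketch below repeats the entry above and draws a crude eye as two concentric circles; the radii are arbitrary values picked for illustration, not the final Katchi design.

```yaml
display:
  - platform: mipi_spi
    id: disp1
    model: CO5300
    bus_mode: quad
    reset_pin: GPIO39
    cs_pin: GPIO12
    data_rate: 80MHz
    dimensions:
      height: 466
      width: 466
      offset_width: 6
    # Illustrative only: white eyeball with a black pupil, centered on screen.
    lambda: |-
      const int cx = it.get_width() / 2;
      const int cy = it.get_height() / 2;
      it.fill(Color::BLACK);
      it.filled_circle(cx, cy, 150, Color::WHITE);
      it.filled_circle(cx, cy, 60, Color::BLACK);
```

This replaces the earlier `display` entry rather than sitting alongside it, since both use the same `id`.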

#### Audio Configuration

Our audio configuration is quite a bit more complex. It requires its own **I2S bus** as well as the **DAC**
and **ADC** configs. Finally, we need to configure the audio components themselves.

Our **I2S audio bus** is simpler than our **quad SPI bus**.

> We ignore the strapping pin warning here to prevent warnings being thrown. Read more about this [here](https://esphome.io/guides/configuration-types/#pin-schema)

```yaml
i2s_audio:
  - id: i2s_audio_bus
    i2s_mclk_pin: GPIO42
    i2s_bclk_pin: GPIO9
    i2s_lrclk_pin:
      number: GPIO45
      ignore_strapping_warning: true
```

We then need to configure both our **DAC** and **ADC** drivers. To keep our configs in sync and avoid confusing changes
in the future, we will first add substitutions.

```yaml
substitutions:
  i2s_bps_spk: 16bit
  i2s_bps_mic: 16bit
  i2s_sample_rate_spk: 44100
  i2s_sample_rate_mic: 16000
```

We can then configure our **ADC** and **DAC** drivers and make use of these substitutions.

```yaml
audio_adc:
  - platform: es7210
    id: es7210_adc
    bits_per_sample: $i2s_bps_mic
    sample_rate: $i2s_sample_rate_mic

audio_dac:
  - platform: es8311
    id: es8311_dac
    bits_per_sample: $i2s_bps_spk
    sample_rate: $i2s_sample_rate_spk
```

Once we have our audio drivers configured, we can configure our **audio output** and **audio input** devices. We configure
our audio devices using the same substitutions, allowing us to change sample rates and bit depths without risking a
mismatch between driver and device.

```yaml
microphone:
  - platform: i2s_audio
    id: box_mic
    sample_rate: $i2s_sample_rate_mic
    i2s_din_pin: GPIO10
    bits_per_sample: $i2s_bps_mic
    adc_type: external

speaker:
  - platform: i2s_audio
    id: box_speaker
    i2s_dout_pin: GPIO8
    dac_type: external
    sample_rate: $i2s_sample_rate_spk
    bits_per_sample: $i2s_bps_spk
    audio_dac: es8311_dac
    buffer_duration: 90ms
    use_apll: true
```
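
With a microphone and speaker defined, they can be handed to ESPHome's voice components. This is a minimal sketch rather than the configuration from my repo: it assumes on-device wake word detection with the stock `okay_nabu` model, and wires the `box_mic` and `box_speaker` ids from above into `voice_assistant`.

```yaml
# Listens on the microphone for the wake word, then starts the assistant.
micro_wake_word:
  microphone: box_mic
  models:
    - model: okay_nabu   # assumed choice of wake word model
  on_wake_word_detected:
    - voice_assistant.start:

# Streams audio to Home Assistant and plays the response on the speaker.
voice_assistant:
  microphone: box_mic
  speaker: box_speaker
```

The 16 kHz microphone sample rate chosen in the substitutions is what the wake word and voice pipeline typically expect, which is one reason the microphone and speaker rates are kept as separate substitutions.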

Altogether, we end up with a long block of configuration that looks like this:

```yaml
i2s_audio:
  - id: i2s_audio_bus
    i2s_mclk_pin: GPIO42
    i2s_bclk_pin: GPIO9
    i2s_lrclk_pin:
      number: GPIO45
      ignore_strapping_warning: true

audio_adc:
  - platform: es7210
    id: es7210_adc
    bits_per_sample: $i2s_bps_mic
    sample_rate: $i2s_sample_rate_mic

audio_dac:
  - platform: es8311
    id: es8311_dac
    bits_per_sample: $i2s_bps_spk
    sample_rate: $i2s_sample_rate_spk

microphone:
  - platform: i2s_audio
    id: box_mic
    sample_rate: $i2s_sample_rate_mic
    i2s_din_pin: GPIO10
    bits_per_sample: $i2s_bps_mic
    adc_type: external

speaker:
  - platform: i2s_audio
    id: box_speaker
    i2s_dout_pin: GPIO8
    dac_type: external
    sample_rate: $i2s_sample_rate_spk
    bits_per_sample: $i2s_bps_spk
    audio_dac: es8311_dac
    buffer_duration: 90ms
    use_apll: true
```

#### Final Configuration

There is a lot more config to go through, and I don't want to go over all of it in this blog. You can find all resources
for the ESPHome portion of Katchi at my Gitea repo.

{{< gitea server="https://git.toomuchtaco.net" repo="taco/voice-assistant" >}}

### Designing a kobold