Since this is a pure Python environment, the CUDA toolkit is not required. The Python packages that can use CUDA download and install the CUDA dependencies they need, which shaves at least 8 GB off the final image. The Python package install order had to be adjusted so that onnxruntime, tensorflow, and torch are installed in that order; any other order results in clashes between their dependent CUDA packages. Resolves: #12
🥔 MicroWakeWord Trainer – Tater Approved
✅ Tater Totterson tested & working on an NVIDIA RTX 3070 Laptop GPU (8 GB VRAM).
Easily train microWakeWord detection models with this pre-built Docker image and JupyterLab notebook.
🚀 Quick Start
Follow these steps to get up and running:
1️⃣ Pull the Pre-Built Docker Image
docker pull ghcr.io/tatertotterson/microwakeword:latest
2️⃣ Run the Docker Container
docker run --rm -it \
--gpus all \
-p 8888:8888 \
-v $(pwd):/data \
ghcr.io/tatertotterson/microwakeword:latest
What these flags do:
- --gpus all → Enables GPU acceleration
- -p 8888:8888 → Exposes JupyterLab on port 8888
- -v $(pwd):/data → Saves your work in the current folder
3️⃣ Open JupyterLab
Visit http://localhost:8888 in your browser — the notebook UI will open.
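If the page does not load, it can help to confirm that something is actually listening on the published port. The following is a minimal sketch in Python, assuming the container was started with the -p 8888:8888 mapping shown earlier; the helper name port_is_open is hypothetical and not part of the image.

```python
import socket


def port_is_open(host: str = "localhost", port: int = 8888, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    print("JupyterLab port reachable:", port_is_open())
```

A False result only says that nothing accepted a TCP connection; it does not tell you whether the container exited or the port mapping is missing.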
4️⃣ Set Your Wake Word
At the top of the notebook, find this line:
TARGET_WORD = "hey_tater" # Change this to your desired wake word
Change "hey_tater" to your desired wake word (phonetic spellings often work best).
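The example value is lowercase with an underscore between words. Below is a small sketch for turning a spoken phrase into that shape. It assumes the notebook expects the same style as "hey_tater", which is not confirmed here; the helper to_target_word is hypothetical and the notebook may accept other forms.

```python
import re


def to_target_word(phrase: str) -> str:
    """Lowercase a phrase and join its words with underscores (assumed style)."""
    # Keep runs of letters, digits, and apostrophes; other punctuation is dropped.
    words = re.findall(r"[a-z0-9']+", phrase.lower())
    return "_".join(words)


TARGET_WORD = to_target_word("Hey Tater")
print(TARGET_WORD)  # hey_tater
```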
5️⃣ Run the Notebook
Run all cells in the notebook. This process will:
- Generate wake word samples
- Train a detection model
- Output a quantized .tflite model ready for on-device use
6️⃣ Retrieve the Trained Model & JSON
When training finishes, download links for both the .tflite model and its .json manifest will be displayed in the last cell.
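Once the two files are saved to disk, a quick check can confirm that each model has a matching manifest. This is only a sketch: the function name, the folder argument, and the assumption that the .tflite and .json files share a base name are illustrative and not confirmed by the notebook.

```python
from pathlib import Path


def find_model_pairs(folder: str) -> list[tuple[Path, Path]]:
    """Return (model, manifest) pairs where both files share a base name."""
    pairs = []
    for model in sorted(Path(folder).glob("*.tflite")):
        # Assumed naming: my_word.tflite sits next to my_word.json.
        manifest = model.with_suffix(".json")
        if manifest.is_file():
            pairs.append((model, manifest))
    return pairs


if __name__ == "__main__":
    for model, manifest in find_model_pairs("."):
        print(f"{model.name} <-> {manifest.name}")
```

Models without a manifest are silently skipped, so an empty result means either no models were found or the naming assumption does not hold.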
🔄 Resetting to a Clean State
If you need to start fresh:
- Delete the data folder that was mapped to your Docker container.
- Restart the container using the steps above.
- A fresh copy of the notebook will be placed into the data directory.
🙌 Credits
This project builds upon the excellent work of kahrendt/microWakeWord.
Huge thanks to the original authors for their contributions to the open-source community!
