# 🎙️ microWakeWord Nvidia Trainer & Recorder

Train **microWakeWord** detection models using a simple **web-based recorder + trainer UI**, packaged in a Docker container.

No Jupyter notebooks required. No manual cell execution. Just record your voice (optional) and train.

---

## 🚀 Quick Start

### 1️⃣ Pull the Docker Image

```bash
docker pull ghcr.io/tatertotterson/microwakeword:latest
```

---

### 2️⃣ Run the Container

```bash
docker run --rm -it \
  --gpus all \
  -p 8888:8888 \
  -v "$(pwd)":/data \
  ghcr.io/tatertotterson/microwakeword:latest
```

**What these flags do:**

- `--gpus all` → Enables GPU acceleration
- `-p 8888:8888` → Exposes the Recorder + Trainer WebUI
- `-v "$(pwd)":/data` → Mounts the current host directory at `/data`, which persists all models, datasets, and cache

---

### 3️⃣ Open the Recorder WebUI

Open your browser and go to:

👉 **http://localhost:8888**

You'll see the **microWakeWord Recorder & Trainer UI**.

---

## 🎤 Recording Voice Samples (Optional)

Personal voice recordings are **optional**.

- You may **record your own voice** for better accuracy
- Or simply **click "Train" without recording anything**

If no recordings are present, training will proceed using **synthetic TTS samples only**.

### Remote systems (important)

Browsers only grant microphone access on secure origins (HTTPS or `localhost`). If you are running this on a **remote PC / server**, browser-based recording will not work unless:

- You use a **reverse proxy** (HTTPS + mic permissions), **or**
- You access the UI via **localhost** on the same machine

Training itself works fine remotely; only recording requires local microphone access.

---

## 🧠 Training Behavior (Important Notes)

### ⏬ First training run

The **first time you click Train**, the system will download **large training datasets** (background noise, speech corpora, etc.).

- This can take **several minutes**
- This happens **only once**
- Data is cached inside `/data`

You **will NOT need to download these again** unless you delete `/data`.
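
One quick way to confirm that the first-run download was cached is to inspect the host directory mounted at `/data`. The sketch below assumes you mounted a dedicated host directory; `DATA_DIR`, its default path, and the `cache_status` helper are hypothetical names, not part of the image. It only reports whether the directory holds any files and their total size, and makes no assumption about the layout inside `/data`.

```bash
#!/usr/bin/env bash
# Hypothetical helper: report whether the host directory mounted at /data
# already contains files (for example, cached datasets from a first run).
cache_status() {
  local dir="$1"
  if [ -d "$dir" ] && [ -n "$(ls -A "$dir" 2>/dev/null)" ]; then
    # du -sh prints "<size><TAB><path>"; keep only the size column.
    echo "cached: $(du -sh "$dir" | cut -f1)"
  else
    echo "empty"
  fi
}

# DATA_DIR is an assumed variable; point it at the directory you passed to -v.
cache_status "${DATA_DIR:-$HOME/microwakeword-data}"
```

If you mounted the current directory as in the Quick Start, set `DATA_DIR="$(pwd)"` before running it. Note that a non-empty result only means the directory has content; it does not verify that every dataset finished downloading.
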

---

### 🔁 Re-training is safe and incremental

- You can train **multiple wake words** back-to-back
- You do **NOT** need to clear any folders between runs
- Old models are preserved in timestamped output directories
- All required cleanup and reuse logic is handled automatically

---

## 📦 Output Files

When training completes, you'll get:

- `.tflite` – quantized streaming model
- `.json` – ESPHome-compatible metadata

Both are saved under:

```text
/data/output/
```

Each run is placed in its own timestamped folder.

---

## 🎤 Optional: Personal Voice Samples (Advanced)

If you record personal samples:

- They are automatically augmented
- They are **up-weighted during training**
- This significantly improves real-world accuracy

No configuration required; detection is automatic.

---

## 🔄 Resetting Everything (Optional)

If you want a **completely clean slate**, delete the `/data` folder (the host directory you mounted with `-v`), then restart the container.

⚠️ This will:

- Remove cached datasets
- Require re-downloading training data
- Delete trained models

---

## 🙌 Credits

Built on top of the excellent **https://github.com/kahrendt/microWakeWord**

Huge thanks to the original authors ❤️