George Joseph dc92dc7d8b Update notebook to fix issues with environment inheritance.
Two issues:

* The notebook cell that actually runs model_train_eval was running it in a
  subprocess, so while it inherited environment variables from the running
  Python kernel, it couldn't inherit the TensorFlow state from it.
  As a result, the `set_memory_growth(g, True)` and
  `mixed_precision.set_global_policy("mixed_float16")` calls in the previous
  cell were lost.

* TFLite doesn't support "mixed_float16" anyway, and applying it makes the
  model export fail, so it's a good thing the policy wasn't being applied.

So..

* The code that sets the TensorFlow environment variables and memory_growth
  was moved from the notebook cell that also wrote the config yaml to the next
  cell, which does the train and test.  This leaves the "config" cell to just
  write the yaml.  This is really just a cosmetic change to group
  functionality better.

* The code that tried to set "mixed_float16" has been removed, but since
  setting memory_growth to true is a good thing, model_train_eval is now run
  using runpy instead of in a subprocess.  This way it runs in the same Python
  kernel instance and TensorFlow environment as the rest of the notebook and
  inherits the memory_growth setting.
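
The difference between the two launch styles can be sketched without TensorFlow. In this minimal illustration, `fake_tf_state` is a hypothetical stand-in for state that lives only inside the kernel process (such as the memory-growth setting), and `DEMO_ENV` is a made-up environment variable; neither is part of the notebook.

```python
import os
import runpy
import subprocess
import sys
import tempfile
import textwrap
import types

# Hypothetical stand-in for library state that exists only in this process.
fake_tf_state = types.ModuleType("fake_tf_state")
fake_tf_state.memory_growth = True
sys.modules["fake_tf_state"] = fake_tf_state

# Environment variables, by contrast, are copied into child processes.
os.environ["DEMO_ENV"] = "1"

SCRIPT = textwrap.dedent(
    """
    import os
    try:
        import fake_tf_state
        growth = fake_tf_state.memory_growth
    except ImportError:
        growth = None  # a fresh interpreter has no such module
    RESULT = (os.environ.get("DEMO_ENV"), growth)
    """
)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "child.py")
    with open(path, "w") as fh:
        fh.write(SCRIPT + "print(RESULT)\n")

    # Subprocess: a new interpreter, so only the environment variable survives.
    sub_out = subprocess.run(
        [sys.executable, path], capture_output=True, text=True, check=True
    ).stdout.strip()

    # runpy: the same interpreter, so the in-process state is visible too.
    in_process = runpy.run_path(path)["RESULT"]
```

Here `sub_out` reports the environment variable but no in-process state, while `in_process` sees both, which mirrors why the memory_growth setting only takes effect with runpy.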

Resolves: #14
2025-12-20 10:22:33 -07:00


microWakeWord Trainer Docker

🥔 MicroWakeWord Trainer Tater Approved

Tater Totterson tested & working on an NVIDIA RTX 3070 Laptop GPU (8 GB VRAM).
Easily train microWakeWord detection models with this pre-built Docker image and JupyterLab notebook.


🚀 Quick Start

Follow these steps to get up and running:

1. Pull the Pre-Built Docker Image

docker pull ghcr.io/tatertotterson/microwakeword:latest

2. Run the Docker Container

docker run --rm -it \
    --gpus all \
    -p 8888:8888 \
    -v $(pwd):/data \
    ghcr.io/tatertotterson/microwakeword:latest

What these flags do:

  • --gpus all → Enables GPU acceleration
  • -p 8888:8888 → Exposes JupyterLab on port 8888
  • -v $(pwd):/data → Mounts the current folder at /data in the container, so your work is saved on the host

3. Open JupyterLab

Visit http://localhost:8888 in your browser — the notebook UI will open.
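
If the page does not load, it can help to confirm that something is actually listening on the published port. The following is a minimal sketch using only the Python standard library; the helper name `jupyter_port_open` is hypothetical, and it only checks that a TCP connection succeeds, not that JupyterLab itself is healthy.

```python
import socket


def jupyter_port_open(host="localhost", port=8888, timeout=2.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host not resolvable.
        return False


if __name__ == "__main__":
    print("port 8888 reachable:", jupyter_port_open())
```

A `False` result usually means the container is not running or the `-p 8888:8888` mapping is missing.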


4. Set Your Wake Word

At the top of the notebook, find this line:

TARGET_WORD = "hey_tater"  # Change this to your desired wake word

Change "hey_tater" to your desired wake word (phonetic spellings often work best).
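
As a rough illustration, a spoken phrase can be turned into the same shape as the `"hey_tater"` example with a small helper. This sketch assumes the notebook expects lowercase words joined by underscores, which is inferred only from that example; `to_target_word` is a hypothetical name, not part of the notebook.

```python
import re


def to_target_word(phrase: str) -> str:
    """Lowercase a phrase and join its words with underscores."""
    # Assumption: only letters and digits matter; punctuation is dropped.
    words = re.findall(r"[a-z0-9]+", phrase.lower())
    if not words:
        raise ValueError("wake phrase contains no letters or digits")
    return "_".join(words)


TARGET_WORD = to_target_word("Hey Tater")
```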


5. Run the Notebook

Run all cells in the notebook. This process will:

  • Generate wake word samples
  • Train a detection model
  • Output a quantized .tflite model ready for on-device use

6. Retrieve the Trained Model & JSON

When training finishes, download links for both the .tflite model and its .json manifest will be displayed in the last cell.
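
If the download links are no longer on screen, the files can also be located from Python. This is a minimal sketch that assumes the outputs are written somewhere under the mounted data folder and that a model and its manifest share a base name; both points are assumptions, and `find_model_artifacts` is a hypothetical helper.

```python
from pathlib import Path


def find_model_artifacts(data_dir="."):
    """Pair each .tflite file with a same-named .json manifest, if present."""
    root = Path(data_dir)
    pairs = []
    for model in sorted(root.rglob("*.tflite")):
        manifest = model.with_suffix(".json")
        # None marks a model whose manifest was not found next to it.
        pairs.append((model, manifest if manifest.exists() else None))
    return pairs


if __name__ == "__main__":
    for model, manifest in find_model_artifacts("."):
        print(model, "->", manifest or "no manifest found")
```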


🔄 Resetting to a Clean State

If you need to start fresh:

  1. Delete the data folder that was mapped to your Docker container.
  2. Restart the container using the steps above.
  3. A fresh copy of the notebook will be placed into the data directory.

🙌 Credits

This project builds upon the excellent work of kahrendt/microWakeWord.
Huge thanks to the original authors for their contributions to the open-source community!
