Tuning models

The second of MOVE’s pipeline consists of training multiple models with different hyperparameters in order to determine which set is optimal, i.e., produces models that generate the most accurate reconstructions and/or the most stable latent representations.

The hyperparameters can be anything from number of training epochs, to size of samples per batch, to number and size of hidden layers in the encoder-decoder architecture.

The experiment config

To start with this step, we define a experiment configuration (please first consult the introductory and data preparation tutorial if you have not set up your workspace and data). This type of config references a data config, a task config, and the values of hyperparameters to test out.

The first lines of our config should look like:

# @package _global_

# Define the default configuration for the data and task (model and training)

defaults:
  - override /data: random_small
  - override /task: tune_model_reconstruction

The override directives indicate (1) the name of our data config (in this example we reference the config of our simulated dataset, see tutorial for more info about this dataset) and (2) the name of the tuning task. There are two possible values for tuning task:

  • tune_model_reconstruction, which reports the reconstruction accuracy of models trained with different hyperparameter combinations; and

  • tune_model_stability, which reports the stability of the latent space of differently hyperparameterized models.

Next, we have to define the hyperparameters that we wish to test out. An example would be:

hydra:
  mode: MULTIRUN
  sweeper:
    params:
      task.batch_size: 10, 50
      task.model.num_hidden: "[500],[1000]"
      task.training_loop.num_epochs: 40, 60, 100

The above config would result in 12 hyperparameter combinations (2 options of batch size times 2 options of encoder-decoder architecture times 3 options of training epochs).

Any parameter of the training loop, model, and task can be swept. However, do note that the more options you provide, the more models that will be trained, and the more resource-intensive this task will become.

Below is a list of hyperparameters that we recommend tuning:

Tunable hyperparameters

Hyperparameter

Description

task.batch_size

Number of samples per training batch

task.model.num_hidden

Architecture of the encoder network (reversed for the decoder network)

task.model.num_latent

Number of units of the latent space

task.model.beta

Weight applied to the KLD term in the loss function

task.model.dropout

Dropout

task.training_loop.num_epochs

Number of training epochs

task.training_loop.lr

Learning rate

task.training_loop.kld_warmup_steps

Epochs at which KLD is warmed

task.training_loop.batch_dilation_steps

Epochs at which batch size is increased

task.training_loop.early_stopping

Whether early stopping is triggered

Finally, to run the tuning:

>>> cd tutorial
>>> move-dl experiment=random_small__tune_reconstruction

This process may take a while (depending on the number of hyperparameter combinations that will be trained and tested), and it will produce a TSV table in a {results_path}/tune_model directory summarizing the metrics (either reconstruction metrics like accuracy or stability). These metrics can be plotted to visualize and select the optimal hyperparameter combination.