
Using YAML Configuration Files

In both the command line and the Python module, options for video loading, training, and prediction can be set by passing a YAML file instead of passing arguments directly. YAML files (.yml or .yaml) are commonly used to serialize data in an easily readable way.

The basic structure of a YAML model configuration is:

$ cat basic_config.yaml
video_loader_config:
  model_input_height: 240
  model_input_width: 426
  total_frames: 16
  # other video loading parameters

predict_config:
  model_name: time_distributed
  data_directory: example_vids/
  # other prediction parameters, e.g. batch_size

train_config:
  model_name: time_distributed
  data_directory: example_vids/
  labels: example_labels.csv
  # other training parameters, e.g. batch_size
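Each top-level section is just a mapping of parameter names to values, so the file deserializes into nested Python dictionaries. As a quick sketch of what that looks like (assuming PyYAML is installed; zamba handles this parsing for you):

```python
import yaml  # PyYAML

config_text = """
video_loader_config:
  model_input_height: 240
  model_input_width: 426
  total_frames: 16

predict_config:
  model_name: time_distributed
  data_directory: example_vids/
"""

# YAML sections become nested dicts keyed by parameter name
config = yaml.safe_load(config_text)
print(config["video_loader_config"]["total_frames"])  # 16
print(config["predict_config"]["model_name"])         # time_distributed
```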

For example, the configuration below will predict labels for the videos in example_vids using the time_distributed model. When videos are loaded, each will be resized to a height of 240 pixels and a width of 426 pixels, and 16 frames will be selected:

video_loader_config:
  model_input_height: 240
  model_input_width: 426
  total_frames: 16

predict_config:
  model_name: time_distributed
  data_directory: example_vids/

Required arguments

Either predict_config or train_config is required, depending on whether you will be running inference or training a model. See All Optional Arguments for a full list of what can be specified under each class. To run inference, either data_directory or filepaths must be specified. To train a model, both data_directory and labels must be specified.

In video_loader_config, you must specify at least model_input_height, model_input_width, and total_frames.

  • For time_distributed or european, total_frames must be 16
  • For slowfast, total_frames must be 32
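For instance, a configuration intended for the slowfast model would need to set 32 frames (the height and width below are illustrative; see each model's page for its expected input size):

```yaml
video_loader_config:
  model_input_height: 240  # illustrative
  model_input_width: 426   # illustrative
  total_frames: 32         # required for slowfast

predict_config:
  model_name: slowfast
  data_directory: example_vids/
```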

See the Available Models page for more details on each model's requirements.
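The rules above can be summarized as a small standalone check. This is a hypothetical helper for illustration only; zamba performs this validation itself through its config classes:

```python
# Frame counts required by each model, per the rules above
FRAMES_REQUIRED = {"time_distributed": 16, "european": 16, "slowfast": 32}


def check_config(config: dict) -> list[str]:
    """Return a list of problems with a zamba-style config dict."""
    problems = []

    # video_loader_config must specify height, width, and total_frames
    vlc = config.get("video_loader_config", {})
    for key in ("model_input_height", "model_input_width", "total_frames"):
        if key not in vlc:
            problems.append(f"video_loader_config missing {key}")

    # either predict_config or train_config is required
    if "predict_config" not in config and "train_config" not in config:
        problems.append("either predict_config or train_config is required")

    # inference needs data_directory or filepaths
    predict = config.get("predict_config")
    if predict is not None and "data_directory" not in predict and "filepaths" not in predict:
        problems.append("predict_config needs data_directory or filepaths")

    # training needs both data_directory and labels
    train = config.get("train_config")
    if train is not None:
        for key in ("data_directory", "labels"):
            if key not in train:
                problems.append(f"train_config missing {key}")

    # total_frames must match the chosen model
    model_name = (predict or train or {}).get("model_name")
    required = FRAMES_REQUIRED.get(model_name)
    if required is not None and vlc.get("total_frames") != required:
        problems.append(f"{model_name} requires total_frames={required}")

    return problems


good = {
    "video_loader_config": {"model_input_height": 240, "model_input_width": 426, "total_frames": 16},
    "predict_config": {"model_name": "time_distributed", "data_directory": "example_vids/"},
}
print(check_config(good))  # []

bad = {
    "video_loader_config": {"model_input_height": 240, "model_input_width": 426, "total_frames": 16},
    "predict_config": {"model_name": "slowfast", "data_directory": "example_vids/"},
}
print(check_config(bad))  # ['slowfast requires total_frames=32']
```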

Command line interface

A YAML configuration file can be passed to the command line interface with the --config argument. For example, say the example configuration above is saved as example_config.yaml. To run prediction:

$ zamba predict --config example_config.yaml

Only some of the possible parameters can be passed directly as arguments to the command line. Those not listed in zamba predict --help or zamba train --help must be passed in a YAML file (see the Quickstart guide for details).

Python package

The main API for zamba is the ModelManager class, which can be imported with:

from zamba.models.model_manager import ModelManager

The ModelManager class is used by zamba’s command line interface to handle preprocessing the filenames, loading the videos, serving them to the model, and saving predictions. Therefore, any functionality available to the command line interface is also accessible via the ModelManager class.

To instantiate the ModelManager based on a configuration file saved at test_config.yaml:

>>> manager = ModelManager.from_yaml('test_config.yaml')
>>> manager.config

ModelConfig(
  video_loader_config=VideoLoaderConfig(crop_bottom_pixels=None, i_frames=False, 
                                        scene_threshold=None, megadetector_lite_config=None, 
                                        model_input_height=240, model_input_width=426, 
                                        total_frames=16, ensure_total_frames=True, 
                                        fps=None, early_bias=False, frame_indices=None,
                                        evenly_sample_total_frames=False, pix_fmt='rgb24'
                                      ), 
  train_config=None, 
  predict_config=PredictConfig(data_directory=PosixPath('vids'), 
                               filepaths=                         filepath
                                          0    /home/ubuntu/vids/eleph.MP4
                                          1  /home/ubuntu/vids/leopard.MP4
                                          2    /home/ubuntu/vids/blank.MP4
                                          3    /home/ubuntu/vids/chimp.MP4, 
                                checkpoint='zamba_time_distributed.ckpt', 
                                model_params=ModelParams(scheduler=None, scheduler_params=None),
                                model_name='time_distributed', species=None, 
                                gpus=1, num_workers=3, batch_size=8, 
                                save=True, dry_run=False, proba_threshold=None,
                                output_class_names=False, weight_download_region='us', 
                                cache_dir=None, skip_load_validation=False)
                              )

We can now run inference or model training without specifying any additional parameters, because they are already associated with our instance of the ModelManager class. To run inference or training:

manager.predict() # inference

manager.train() # training

In our user tutorials, we refer to the train_model and predict_model functions. The ModelManager class calls these same functions behind the scenes when .predict() or .train() is run.
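For example, a prediction run can be configured and launched directly with those functions. The sketch below mirrors the training code shown later on this page; the data directory is a placeholder:

```python
from zamba.models.config import PredictConfig
from zamba.models.model_manager import predict_model

predict_config = PredictConfig(
    data_directory="example_vids/",  # placeholder path
    model_name="time_distributed",
)

predict_model(predict_config=predict_config)
```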

Default configurations

In the command line interface, the default configuration for each model is loaded from a YAML file that ships with zamba. You can see the default configuration YAML files on GitHub, in the config.yaml file within each model's folder.

For example, the default configuration for the time_distributed model is:

train_config:
  model_name: time_distributed
  backbone_finetune_config:
    backbone_initial_ratio_lr: 0.01
    multiplier: 1
    pre_train_bn: True
    train_bn: False
    unfreeze_backbone_at_epoch: 3
    verbose: True
  early_stopping_config:
    patience: 5
  scheduler_config:
    scheduler: MultiStepLR
    scheduler_params:
      gamma: 0.5
      milestones:
      - 3
      verbose: true

video_loader_config:
  model_input_height: 240
  model_input_width: 426
  crop_bottom_pixels: 50
  fps: 4
  total_frames: 16
  ensure_total_frames: True
  megadetector_lite_config:
    confidence: 0.25
    fill_mode: score_sorted
    n_frames: 16

predict_config:
  model_name: time_distributed

For reference, the code below shows how to specify the same video loading and training parameters using only the Python package:

from zamba.data.video import VideoLoaderConfig
from zamba.models.config import TrainConfig
from zamba.models.model_manager import train_model

video_loader_config = VideoLoaderConfig(
    model_input_height=240,
    model_input_width=426,
    crop_bottom_pixels=50,
    fps=4,
    total_frames=16,
    ensure_total_frames=True,
    megadetector_lite_config={
        "confidence": 0.25, 
        "fill_mode": "score_sorted", 
        "n_frames": 16,
    },
)

train_config = TrainConfig(
    # data_directory=YOUR_DATA_DIRECTORY_HERE,
    # labels=YOUR_LABELS_CSV_HERE,
    model_name="time_distributed",
    backbone_finetune_config={
        "backbone_initial_ratio_lr": 0.01,
        "unfreeze_backbone_at_epoch": 3,
        "verbose": True,
        "pre_train_bn": True,
        "train_bn": False,
        "multiplier": 1,
    },
    early_stopping_config={"patience": 5},
    scheduler_config={
        "scheduler": "MultiStepLR",
        "scheduler_params": {"gamma": 0.5, "milestones": [3], "verbose": True},
    },
)

train_model(
    train_config=train_config,
    video_loader_config=video_loader_config,
)