
Echofilter model training

usage: echofilter-train [-h] [--version] [--data-dir DIR]
                        [--dataset DATASET_NAME]
                        [--train-partition TRAIN_PARTITION]
                        [--val-partition VAL_PARTITION]
                        [--shape SAMPLE_SHAPE SAMPLE_SHAPE]
                        [--crop-depth CROP_DEPTH] [--resume PATH]
                        [--cold-restart] [--warm-restart] [--log LOG_NAME]
                        [--log-append LOG_NAME_APPEND] [--conditional]
                        [--nblock N_BLOCK] [--latent-channels LATENT_CHANNELS]
                        [--expansion-factor EXPANSION_FACTOR]
                        [--blocks-per-downsample BLOCKS_PER_DOWNSAMPLE [BLOCKS_PER_DOWNSAMPLE ...]]
                        [--blocks-before-first-downsample BLOCKS_BEFORE_FIRST_DOWNSAMPLE [BLOCKS_BEFORE_FIRST_DOWNSAMPLE ...]]
                        [--deepest-inner DEEPEST_INNER]
                        [--intrablock-expansion INTRABLOCK_EXPANSION]
                        [--se-reduction SE_REDUCTION]
                        [--downsampling-modes DOWNSAMPLING_MODES [DOWNSAMPLING_MODES ...]]
                        [--upsampling-modes UPSAMPLING_MODES [UPSAMPLING_MODES ...]]
                        [--fused-conv] [--no-residual] [--actfn ACTFN]
                        [--kernel KERNEL_SIZE] [--device DEVICE] [--multigpu]
                        [--no-amp] [--amp-opt AMP_OPT] [-j N] [-p PRINT_FREQ]
                        [-b BATCH_SIZE] [--no-stratify] [--epochs N_EPOCH]
                        [--seed SEED] [--optim OPTIMIZER]
                        [--schedule SCHEDULE] [--lr LR] [--momentum MOMENTUM]
                        [--base-momentum BASE_MOMENTUM] [--wd WEIGHT_DECAY]
                        [--warmup-pct WARMUP_PCT]
                        [--warmdown-pct WARMDOWN_PCT]
                        [--anneal-strategy ANNEAL_STRATEGY]
                        [--overall-loss-weight OVERALL_LOSS_WEIGHT]
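
As a reading aid for the synopsis above: a metavar repeated twice (as for --shape) means exactly two values, a trailing "[X ...]" means one or more values, and a flag with no metavar is a switch. The following is a minimal argparse sketch that reproduces those patterns. It is not echofilter's actual parser; the type=int choices, the dest names, and the prog name are assumptions made for illustration.

```python
import argparse

# Hypothetical stand-in parser; only the flag names and defaults come from the docs.
parser = argparse.ArgumentParser(prog="sketch")
# Exactly two values: rendered as "--shape SAMPLE_SHAPE SAMPLE_SHAPE".
parser.add_argument("--shape", dest="sample_shape", nargs=2, type=int, default=(128, 512))
# Two spellings of the same option share one destination.
parser.add_argument("--nblock", "--num-blocks", dest="n_block", type=int, default=6)
# One or more values: rendered as "X [X ...]".
parser.add_argument("--blocks-per-downsample", nargs="+", type=int, default=(2, 1))
# A switch: present means True, absent means False.
parser.add_argument("--cold-restart", action="store_true")

args = parser.parse_args(
    ["--shape", "256", "512", "--num-blocks", "4",
     "--blocks-per-downsample", "2", "1", "--cold-restart"]
)
print(args.sample_shape, args.n_block, args.blocks_per_downsample, args.cold_restart)
```

Note that values supplied on the command line arrive as lists, while the untouched defaults stay as tuples.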


These arguments specify special actions to perform. The main action of this program is suppressed if any of these are given.

--version, -V

Show program’s version number and exit.

Data parameters

--data-dir

path to root data directory

--dataset

which dataset to use

--train-partition

which partition to train on (default depends on dataset)

--val-partition

which partition to validate on (default depends on dataset)

--shape

input shape [W, H] (default: (128, 512))

--crop-depth

depth, in metres, at which data should be truncated (default: None)

--resume

path to latest checkpoint (default: "")

--cold-restart

when resuming from a checkpoint, use this only for initial weights

--warm-restart

when resuming from a checkpoint, use the existing weights and optimizer state but start a new LR schedule

--log

output directory name (default: DATE_TIME)

--log-append

string to append to output directory name (default: HOSTNAME)
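
The three ways of resuming described above (--resume alone, with --cold-restart, or with --warm-restart) differ in how much of the checkpoint is reused. A minimal sketch of that decision follows. The function name and the component labels are hypothetical, the behaviour of a plain --resume (restore everything) is an assumption, and so is the choice to let --cold-restart win if both restart flags are given.

```python
def plan_restore(resume, cold_restart=False, warm_restart=False):
    """Return which parts of a checkpoint would be reused.

    The labels are illustrative only; they are not echofilter's checkpoint keys.
    """
    if not resume:
        # No checkpoint path given: train from scratch.
        return set()
    if cold_restart:
        # Checkpoint supplies initial weights only.
        return {"weights"}
    if warm_restart:
        # Keep weights and optimizer state, but start a new LR schedule.
        return {"weights", "optimizer"}
    # Assumed behaviour for a plain resume: continue exactly where training stopped.
    return {"weights", "optimizer", "schedule", "epoch"}


for flags in ({}, {"cold_restart": True}, {"warm_restart": True}):
    print(flags, sorted(plan_restore("checkpoint.pt", **flags)))
```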

Model parameters

--conditional

train a model conditioned on the direction the sounder is facing (in addition to an unconditional model)

--nblock, --num-blocks

number of blocks down and up in the UNet (default: 6)

--latent-channels

number of initial/final latent channels to use in the model (default: 32)

--expansion-factor

expansion for number of channels as model becomes deeper (default: 1.0, constant number of channels)


only expand channels on downsampling blocks

--blocks-per-downsample

for each dim (time, depth), number of blocks between downsample steps (default: (2, 1))

--blocks-before-first-downsample

for each dim (time, depth), number of blocks before first downsample step (default: (2, 1))


only include skip connections when downsampling

--deepest-inner

layer to include at the deepest point of the UNet (default: "horizontal_block"). Set to "identity" to disable.

--intrablock-expansion

expansion within inverted residual blocks (default: 6.0)

--se-reduction, --se

reduction within squeeze-and-excite blocks (default: 4.0)

--downsampling-modes

for each downsampling step, the method to use (default: "max")

--upsampling-modes

for each upsampling step, the method to use (default: "bilinear")

--fused-conv

use fused instead of depthwise separable convolutions

--no-residual

don’t use residual blocks

--actfn

activation function to use

--kernel

convolution kernel size (default: 5)
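
To illustrate how --latent-channels, --expansion-factor and --nblock interact, here is a minimal sketch. It assumes that block i (counting from 0) has the latent channel count multiplied by expansion_factor**i, rounded to the nearest integer. That rule is an assumption for illustration, and the function name is hypothetical; the rounding and growth rule echofilter actually applies may differ.

```python
def channels_per_block(latent_channels=32, expansion_factor=1.0, n_block=6):
    """Channel width of each block going down the UNet, under the assumed rule."""
    return [round(latent_channels * expansion_factor ** i) for i in range(n_block)]


# Default expansion factor of 1.0 keeps the width constant.
print(channels_per_block())                      # [32, 32, 32, 32, 32, 32]
# A factor above 1.0 widens the model as it becomes deeper.
print(channels_per_block(expansion_factor=1.5))  # [32, 48, 72, 108, 162, 243]
```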

Training parameters

--device

device to use (default: "cuda", using first GPU)

--multigpu

train on multiple GPUs

--no-amp

use fp32 instead of mixed precision (default: use mixed precision on GPU)

--amp-opt

optimizer level for apex automatic mixed precision (default: "O1")

-j, --workers

number of data loading workers (default: 8)

-p, --print-freq

print frequency (default: 50)

-b, --batch-size

mini-batch size (default: 16)

--no-stratify

disable stratified sampling; use fully random sampling instead

--epochs

total number of epochs to run (default: 20)

--seed

seed for initializing training
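
The interaction between --device, --no-amp and --amp-opt described above can be sketched as follows. This is an illustration under stated assumptions, not echofilter's code: the function name and the returned dictionary are hypothetical, and it assumes that any device other than a CUDA device falls back to fp32, which is how the phrase "mixed precision on GPU" is read here.

```python
def resolve_precision(device="cuda", no_amp=False, amp_opt="O1"):
    """Decide whether mixed precision would be used for a given configuration."""
    on_gpu = device.startswith("cuda")
    if no_amp or not on_gpu:
        # fp32 training: the AMP optimization level is irrelevant.
        return {"mixed_precision": False, "opt_level": None}
    return {"mixed_precision": True, "opt_level": amp_opt}


print(resolve_precision())                # defaults: mixed precision at level "O1"
print(resolve_precision(no_amp=True))     # explicit opt-out
print(resolve_precision(device="cpu"))    # assumed fp32 fallback on CPU
```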

Optimizer parameters

--optim, --optimiser, --optimizer

optimizer name (default: "rangerva")

--schedule

LR schedule (default: "constant")

--lr, --learning-rate

initial learning rate (default: 0.1)

--momentum

momentum (default: 0.9)

--base-momentum

base momentum; only used for OneCycle schedule (default: same as momentum)

--wd, --weight-decay

weight decay (default: 1e-05)

--warmup-pct

fraction of training to spend warming up LR; only used for OneCycle and MesaOneCycle schedules (default: 0.2)

--warmdown-pct

fraction of training before warming down LR; only used for MesaOneCycle schedule (default: 0.7)

--anneal-strategy

annealing strategy; only used for OneCycle schedule (default: "cos")

--overall-loss-weight

weighting for overall loss term (default: 0.0)
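
One plausible reading of the --warmup-pct and --warmdown-pct descriptions above is a "mesa" shaped schedule: the LR rises for the first 20% of training, holds at its peak until 70%, then decays. The sketch below implements that reading with cosine ramps. It is an assumption about the shape, not echofilter's implementation; start_div and final_div are hypothetical knobs (not echofilter-train flags) that set the starting and final LR relative to the peak.

```python
import math


def cosine_anneal(start, end, pct):
    """Cosine interpolation from start (pct=0) to end (pct=1)."""
    return end + (start - end) * (1 + math.cos(math.pi * pct)) / 2


def mesa_lr(progress, max_lr=0.1, warmup_pct=0.2, warmdown_pct=0.7,
            start_div=25.0, final_div=1e4):
    """LR at a given fraction of training, with progress in [0, 1].

    start_div and final_div are illustrative parameters only.
    """
    start_lr = max_lr / start_div
    final_lr = max_lr / final_div
    if progress < warmup_pct:
        # Warm-up phase: ramp from start_lr up to max_lr.
        return cosine_anneal(start_lr, max_lr, progress / warmup_pct)
    if progress < warmdown_pct or warmdown_pct >= 1.0:
        # Plateau phase: hold the peak LR.
        return max_lr
    # Warm-down phase: decay from max_lr to final_lr.
    return cosine_anneal(max_lr, final_lr,
                         (progress - warmdown_pct) / (1.0 - warmdown_pct))


for p in (0.0, 0.1, 0.2, 0.5, 0.7, 0.85, 1.0):
    print(f"{p:.2f} -> {mesa_lr(p):.5f}")
```

With warmdown_pct equal to warmup_pct the plateau disappears and the shape reduces to a plain rise-then-fall cycle.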