echofilter-train

Echofilter model training

usage: echofilter-train [-h] [--version] [--data-dir DIR]
                        [--dataset DATASET_NAME]
                        [--train-partition TRAIN_PARTITION]
                        [--val-partition VAL_PARTITION]
                        [--shape SAMPLE_SHAPE SAMPLE_SHAPE]
                        [--crop-depth CROP_DEPTH] [--resume PATH]
                        [--cold-restart] [--warm-restart] [--log LOG_NAME]
                        [--log-append LOG_NAME_APPEND] [--conditional]
                        [--nblock N_BLOCK] [--latent-channels LATENT_CHANNELS]
                        [--expansion-factor EXPANSION_FACTOR]
                        [--expand-only-on-down]
                        [--blocks-per-downsample BLOCKS_PER_DOWNSAMPLE [BLOCKS_PER_DOWNSAMPLE ...]]
                        [--blocks-before-first-downsample BLOCKS_BEFORE_FIRST_DOWNSAMPLE [BLOCKS_BEFORE_FIRST_DOWNSAMPLE ...]]
                        [--only-skip-connection-on-downsample]
                        [--deepest-inner DEEPEST_INNER]
                        [--intrablock-expansion INTRABLOCK_EXPANSION]
                        [--se-reduction SE_REDUCTION]
                        [--downsampling-modes DOWNSAMPLING_MODES [DOWNSAMPLING_MODES ...]]
                        [--upsampling-modes UPSAMPLING_MODES [UPSAMPLING_MODES ...]]
                        [--fused-conv] [--no-residual] [--actfn ACTFN]
                        [--kernel KERNEL_SIZE] [--device DEVICE] [--multigpu]
                        [--no-amp] [--amp-opt AMP_OPT] [-j N] [-p PRINT_FREQ]
                        [-b BATCH_SIZE] [--no-stratify] [--epochs N_EPOCH]
                        [--seed SEED] [--optim OPTIMIZER]
                        [--schedule SCHEDULE] [--lr LR] [--momentum MOMENTUM]
                        [--base-momentum BASE_MOMENTUM] [--wd WEIGHT_DECAY]
                        [--warmup-pct WARMUP_PCT]
                        [--warmdown-pct WARMDOWN_PCT]
                        [--anneal-strategy ANNEAL_STRATEGY]
                        [--overall-loss-weight OVERALL_LOSS_WEIGHT]

Actions

These arguments specify special actions to perform. The main action of this program is suppressed if any of these are given.

--version, -V

Show program’s version number and exit.

Data parameters

--data-dir

path to root data directory

--dataset

which dataset to use

--train-partition

which partition to train on (default depends on dataset)

--val-partition

which partition to validate on (default depends on dataset)

--shape

input shape [W, H] (default: (128, 512))
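
The usage line lists `--shape SAMPLE_SHAPE SAMPLE_SHAPE`, so the option takes two separate values rather than a single comma-separated string. A minimal sketch of how such an option behaves, using a stand-in `argparse` parser (the parser below is an illustration, not echofilter's own code, and the `int` type is an assumption):

```python
import argparse

# Stand-in parser for illustration only; this is not echofilter's real parser.
parser = argparse.ArgumentParser(prog="echofilter-train")
parser.add_argument(
    "--shape",
    dest="sample_shape",
    nargs=2,             # two values, matching "SAMPLE_SHAPE SAMPLE_SHAPE" in the usage
    type=int,            # assumption: both values are integers
    default=(128, 512),  # default taken from the help text above
    metavar="SAMPLE_SHAPE",
    help="input shape [W, H]",
)

# Equivalent to typing: echofilter-train --shape 256 512
args = parser.parse_args(["--shape", "256", "512"])
width, height = args.sample_shape
print(width, height)
```

With `nargs=2`, passing a single value (for example `--shape 256`) makes `argparse` exit with a usage error.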

--crop-depth

depth, in metres, at which data should be truncated (default: None)

--resume

path to latest checkpoint (default: "")

--cold-restart

when resuming from a checkpoint, use the checkpoint only for the initial weights

--warm-restart

when resuming from a checkpoint, use the existing weights and optimizer state but start a new LR schedule
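
The difference between a plain `--resume`, `--warm-restart`, and `--cold-restart` comes down to which parts of the checkpoint are reused. The sketch below summarizes that reading of the help text. The helper `select_resume_state` and the checkpoint key names are hypothetical. Two further assumptions: a plain resume also continues the existing LR schedule, and the two restart flags are mutually exclusive.

```python
def select_resume_state(checkpoint, cold_restart=False, warm_restart=False):
    """Return the parts of a checkpoint that each restart mode would reuse.

    Hypothetical helper for illustration; the key names are assumptions.
    """
    if cold_restart and warm_restart:
        raise ValueError("choose at most one of cold_restart and warm_restart")
    # Every mode reuses the model weights.
    keys = ["model_weights"]
    if not cold_restart:
        # Warm restart and plain resume both keep the optimizer state.
        keys.append("optimizer_state")
    if not cold_restart and not warm_restart:
        # Assumption: only a plain resume continues the existing LR schedule.
        keys.append("schedule_state")
    return {key: checkpoint[key] for key in keys}


# Placeholder checkpoint contents, used only to show which keys are selected.
ckpt = {"model_weights": "W", "optimizer_state": "O", "schedule_state": "S"}
for mode in ({}, {"warm_restart": True}, {"cold_restart": True}):
    print(mode, sorted(select_resume_state(ckpt, **mode)))
```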

--log

output directory name (default: DATE_TIME)

--log-append

string to append to output directory name (default: HOSTNAME)

Model parameters

--conditional

train a model conditioned on the direction the sounder is facing (in addition to an unconditional model)

--nblock, --num-blocks

number of blocks down and up in the UNet (default: 6)

--latent-channels

number of initial/final latent channels to use in the model (default: 32)

--expansion-factor

expansion for number of channels as model becomes deeper (default: 1.0, constant number of channels)
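
To illustrate how `--latent-channels` and `--expansion-factor` might interact, the sketch below computes a channel count per block. It assumes the count grows geometrically with depth, which is one plausible reading of the help text rather than a confirmed description of the model. The function `channels_per_block` is hypothetical.

```python
def channels_per_block(latent_channels=32, expansion_factor=1.0, n_block=6):
    """Channel count at each block depth.

    Assumption: channels are multiplied by expansion_factor at every block.
    """
    return [round(latent_channels * expansion_factor**depth) for depth in range(n_block)]


# Defaults from the help text: a factor of 1.0 keeps the channel count constant.
print(channels_per_block())
# A factor above 1.0 widens the model as it gets deeper.
print(channels_per_block(expansion_factor=2.0, n_block=3))
```

Under this assumption the default settings give 32 channels at every depth, matching the "constant number of channels" note in the help text.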

--expand-only-on-down

only expand channels on downsampling blocks

--blocks-per-downsample

for each dim (time, depth), number of blocks between downsample steps (default: (2, 1))

--blocks-before-first-downsample

for each dim (time, depth), number of blocks before first downsample step (default: (2, 1))

--only-skip-connection-on-downsample

only include skip connections when downsampling

--deepest-inner

layer to include at the deepest point of the UNet (default: “horizontal_block”). Set to “identity” to disable.

--intrablock-expansion

expansion within inverted residual blocks (default: 6.0)

--se-reduction, --se

reduction within squeeze-and-excite blocks (default: 4.0)

--downsampling-modes

for each downsampling step, the method to use (default: "max")

--upsampling-modes

for each upsampling step, the method to use (default: "bilinear")

--fused-conv

use fused instead of depthwise separable convolutions

--no-residual

don’t use residual blocks

--actfn

activation function to use

--kernel

convolution kernel size (default: 5)

Training parameters

--device

device to use (default: "cuda", using first GPU)

--multigpu

train on multiple GPUs

--no-amp

use fp32 instead of mixed precision (default: use mixed precision on GPU)

--amp-opt

optimizer level for apex automatic mixed precision (default: "O1")

-j, --workers

number of data loading workers (default: 8)

-p, --print-freq

print frequency (default: 50)

-b, --batch-size

mini-batch size (default: 16)

--no-stratify

disable stratified sampling; use fully random sampling instead

--epochs

number of total epochs to run (default: 20)

--seed

seed for initializing training.

Optimizer parameters

--optim, --optimiser, --optimizer

optimizer name (default: "rangerva")

--schedule

LR schedule (default: "constant")

--lr, --learning-rate

initial learning rate (default: 0.1)

--momentum

momentum (default: 0.9)

--base-momentum

base momentum; only used for OneCycle schedule (default: same as momentum)

--wd, --weight-decay

weight decay (default: 1e-05)

--warmup-pct

fraction of training to spend warming up LR; only used for OneCycle and MesaOneCycle schedules (default: 0.2)

--warmdown-pct

fraction of training before warming down LR; only used for MesaOneCycle schedule (default: 0.7)

--anneal-strategy

annealing strategy; only used for OneCycle schedule (default: "cos")
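
The way `--warmup-pct` and `--warmdown-pct` divide a training run can be sketched as a three-phase curve: warm up, hold, warm down. This is an illustration only, not echofilter's scheduler. It assumes the MesaOneCycle schedule holds the LR at its peak between the two fractions and uses cosine annealing for both ramps. The names `mesa_lr`, `peak_lr`, `start_lr`, and `end_lr`, and the values of the last two, are hypothetical.

```python
import math


def cos_anneal(start, end, t):
    """Cosine interpolation from start (at t=0) to end (at t=1)."""
    return end + (start - end) * (1 + math.cos(math.pi * t)) / 2


def mesa_lr(progress, peak_lr=0.1, warmup_pct=0.2, warmdown_pct=0.7,
            start_lr=0.004, end_lr=0.0):
    """LR at a given fraction of training (progress in [0, 1]).

    Assumed shape: ramp up until warmup_pct, stay at peak_lr until
    warmdown_pct, then ramp down to end_lr. Requires warmdown_pct < 1.
    """
    if progress < warmup_pct:
        return cos_anneal(start_lr, peak_lr, progress / warmup_pct)
    if progress < warmdown_pct:
        return peak_lr
    return cos_anneal(peak_lr, end_lr, (progress - warmdown_pct) / (1 - warmdown_pct))


# Sample the curve at a few points through training.
for progress in (0.0, 0.1, 0.2, 0.5, 0.7, 0.85, 1.0):
    print(f"{progress:.2f} -> {mesa_lr(progress):.4f}")
```

With the default fractions, the first 20% of training is warm-up, the LR stays flat until 70%, and the final 30% is warm-down.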

--overall-loss-weight

weighting for overall loss term (default: 0.0)
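
As a usage illustration, the sketch below assembles a command line from options documented on this page and prints it in a form that can be pasted into a shell. The data path is a placeholder, and the values are the documented defaults apart from `--nblock`. Nothing is launched; the `subprocess` call is left as a comment.

```python
import shlex

# Build the argument list from flags documented above.
cmd = [
    "echofilter-train",
    "--data-dir", "/path/to/data",  # placeholder path, replace with your data root
    "--shape", "128", "512",        # two values: W then H
    "--nblock", "4",                # illustrative value (documented default is 6)
    "--epochs", "20",
    "--batch-size", "16",
    "--lr", "0.1",
    "--no-amp",                     # train in fp32 instead of mixed precision
]

# Print a shell-safe version of the command.
print(shlex.join(cmd))

# To launch training from Python instead, something like this could be used:
# import subprocess
# subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems when paths contain spaces.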