echofilter-train
Echofilter model training
usage: echofilter-train [-h] [--version] [--data-dir DIR]
                        [--dataset DATASET_NAME]
                        [--train-partition TRAIN_PARTITION]
                        [--val-partition VAL_PARTITION]
                        [--shape SAMPLE_SHAPE SAMPLE_SHAPE]
                        [--crop-depth CROP_DEPTH] [--resume PATH]
                        [--cold-restart] [--warm-restart] [--log LOG_NAME]
                        [--log-append LOG_NAME_APPEND] [--conditional]
                        [--nblock N_BLOCK] [--latent-channels LATENT_CHANNELS]
                        [--expansion-factor EXPANSION_FACTOR]
                        [--expand-only-on-down]
                        [--blocks-per-downsample BLOCKS_PER_DOWNSAMPLE [BLOCKS_PER_DOWNSAMPLE ...]]
                        [--blocks-before-first-downsample BLOCKS_BEFORE_FIRST_DOWNSAMPLE [BLOCKS_BEFORE_FIRST_DOWNSAMPLE ...]]
                        [--only-skip-connection-on-downsample]
                        [--deepest-inner DEEPEST_INNER]
                        [--intrablock-expansion INTRABLOCK_EXPANSION]
                        [--se-reduction SE_REDUCTION]
                        [--downsampling-modes DOWNSAMPLING_MODES [DOWNSAMPLING_MODES ...]]
                        [--upsampling-modes UPSAMPLING_MODES [UPSAMPLING_MODES ...]]
                        [--fused-conv] [--no-residual] [--actfn ACTFN]
                        [--kernel KERNEL_SIZE] [--device DEVICE] [--multigpu]
                        [--no-amp] [--amp-opt AMP_OPT] [-j N] [-p PRINT_FREQ]
                        [-b BATCH_SIZE] [--no-stratify] [--epochs N_EPOCH]
                        [--seed SEED] [--optim OPTIMIZER]
                        [--schedule SCHEDULE] [--lr LR] [--momentum MOMENTUM]
                        [--base-momentum BASE_MOMENTUM] [--wd WEIGHT_DECAY]
                        [--warmup-pct WARMUP_PCT]
                        [--warmdown-pct WARMDOWN_PCT]
                        [--anneal-strategy ANNEAL_STRATEGY]
                        [--overall-loss-weight OVERALL_LOSS_WEIGHT]
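As a rough illustration of how these flags combine, the sketch below assembles an invocation from Python. It uses only flags listed in the usage above. The helper build_train_command is hypothetical (not part of echofilter), and the data directory and dataset name are placeholders.

```python
import shlex


def build_train_command(data_dir, dataset, epochs=20, batch_size=16, lr=0.1):
    """Assemble an echofilter-train argument list (hypothetical helper).

    The keyword defaults mirror the documented defaults for
    --epochs, --batch-size and --lr.
    """
    return [
        "echofilter-train",
        "--data-dir", str(data_dir),
        "--dataset", str(dataset),
        "--epochs", str(epochs),
        "--batch-size", str(batch_size),
        "--lr", str(lr),
    ]


# Placeholder values; substitute your own data directory and dataset name.
cmd = build_train_command("/path/to/data", "my_dataset", epochs=5)
print(shlex.join(cmd))
# To launch training, pass the list to subprocess.run(cmd, check=True).
```

Building the command as a list rather than a single string avoids shell quoting problems when paths contain spaces.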
Actions
These arguments specify special actions to perform. The main action of this program is suppressed if any of these are given.
- --version, -V
Show program’s version number and exit.
Data parameters
- --data-dir
path to root data directory
- --dataset
which dataset to use
- --train-partition
which partition to train on (default depends on dataset)
- --val-partition
which partition to validate on (default depends on dataset)
- --shape
input shape [W, H] (default: (128, 512))
- --crop-depth
depth, in metres, at which data should be truncated (default: None)
- --resume
- --cold-restart
when resuming from a checkpoint, use this only for initial weights
- --warm-restart
when resuming from a checkpoint, use the existing weights and optimizer state but start a new LR schedule
- --log
output directory name (default: DATE_TIME)
- --log-append
string to append to output directory name (default: HOSTNAME)
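The difference between the two restart flags can be sketched as a choice of which parts of a checkpoint are restored. This is only an illustration of the descriptions above: the checkpoint keys are hypothetical, the behaviour of a plain --resume (restoring everything) is an assumption, and so is giving --cold-restart precedence when both flags are set.

```python
def select_resume_state(checkpoint, cold_restart=False, warm_restart=False):
    """Pick which parts of a checkpoint to restore (illustrative only).

    The keys "model", "optimizer", "schedule" and "epoch" are hypothetical
    and may not match the layout of real echofilter checkpoints.
    """
    if cold_restart:
        # Checkpoint is used only for the initial weights.
        keys = ["model"]
    elif warm_restart:
        # Keep weights and optimizer state; the LR schedule starts afresh.
        keys = ["model", "optimizer"]
    else:
        # Assumed default: continue exactly where training left off.
        keys = ["model", "optimizer", "schedule", "epoch"]
    return {key: checkpoint[key] for key in keys}


checkpoint = {"model": "weights", "optimizer": "opt", "schedule": "sched", "epoch": 7}
print(sorted(select_resume_state(checkpoint, warm_restart=True)))
```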
Model parameters
- --conditional
train a model conditioned on the direction the sounder is facing (in addition to an unconditional model)
- --nblock, --num-blocks
number of blocks down and up in the UNet (default: 6)
- --latent-channels
number of initial/final latent channels to use in the model (default: 32)
- --expansion-factor
expansion for number of channels as model becomes deeper (default: 1.0, constant number of channels)
- --expand-only-on-down
only expand channels on downsampling blocks
- --blocks-per-downsample
for each dim (time, depth), number of blocks between downsample steps (default: (2, 1))
- --blocks-before-first-downsample
for each dim (time, depth), number of blocks before first downsample step (default: (2, 1))
- --only-skip-connection-on-downsample
only include skip connections when downsampling
- --deepest-inner
layer to include at the deepest point of the UNet (default: “horizontal_block”). Set to “identity” to disable.
- --intrablock-expansion
expansion within inverse residual blocks (default: 6.0)
- --se-reduction, --se
reduction within squeeze-and-excite blocks (default: 4.0)
- --downsampling-modes
for each downsampling step, the method to use (default: "max")
- --upsampling-modes
for each upsampling step, the method to use (default: "bilinear")
- --fused-conv
use fused instead of depthwise separable convolutions
- --no-residual
don’t use residual blocks
- --actfn
activation function to use
- --kernel
convolution kernel size (default: 5)
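To make the interaction of --latent-channels, --expansion-factor and --nblock concrete, here is a minimal sketch of how channel widths might grow with depth. It assumes the factor is applied multiplicatively once per block and the result is rounded to an integer. That rule is an assumption for illustration and may differ from what echofilter does internally (for example when --expand-only-on-down is set).

```python
def channels_per_block(latent_channels=32, expansion_factor=1.0, n_block=6):
    """Channel width at each block depth (assumed multiplicative growth)."""
    widths = []
    width = float(latent_channels)
    for _ in range(n_block):
        widths.append(int(round(width)))
        # Assumption: the expansion factor is applied once per block.
        width *= expansion_factor
    return widths


# With the default factor of 1.0 the width stays constant.
print(channels_per_block())  # [32, 32, 32, 32, 32, 32]
# A factor of 2.0 doubles the width at each successive block.
print(channels_per_block(expansion_factor=2.0, n_block=4))  # [32, 64, 128, 256]
```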
Training parameters
- --device
device to use (default: "cuda", using first gpu)
- --multigpu
train on multiple GPUs
- --no-amp
use fp32 instead of mixed precision (default: use mixed precision on gpu)
- --amp-opt
optimizer level for apex automatic mixed precision (default: "O1")
- -j, --workers
number of data loading workers (default: 8)
- -p, --print-freq
print frequency (default: 50)
- -b, --batch-size
mini-batch size (default: 16)
- --no-stratify
disable stratified sampling; use fully random sampling instead
- --epochs
number of total epochs to run (default: 20)
- --seed
seed for initializing training.
Optimizer parameters
- --optim, --optimiser, --optimizer
optimizer name (default: "rangerva")
- --schedule
LR schedule (default: "constant")
- --lr, --learning-rate
initial learning rate (default: 0.1)
- --momentum
momentum (default: 0.9)
- --base-momentum
base momentum; only used for OneCycle schedule (default: same as momentum)
- --wd, --weight-decay
weight decay (default: 1e-05)
- --warmup-pct
fraction of training to spend warming up LR; only used for OneCycle and MesaOneCycle schedules (default: 0.2)
- --warmdown-pct
fraction of training before warming down LR; only used for MesaOneCycle schedule (default: 0.7)
- --anneal-strategy
annealing strategy; only used for OneCycle schedule (default: "cos")
- --overall-loss-weight
weighting for overall loss term (default: 0.0)
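The roles of --warmup-pct and --warmdown-pct are easiest to see as a curve: the LR rises during the first warmup-pct of training, holds at its peak until warmdown-pct, then falls. The sketch below is one plausible reading of a "mesa" one-cycle schedule, not echofilter's implementation. The cosine ramps and the start_div and final_div ratios are assumptions chosen for illustration.

```python
import math


def mesa_one_cycle_lr(progress, max_lr=0.1, warmup_pct=0.2, warmdown_pct=0.7,
                      start_div=25.0, final_div=1e4):
    """LR at a fraction `progress` in [0, 1] of training (illustrative).

    start_div and final_div are hypothetical ratios setting the initial
    and final LR relative to max_lr.
    """
    start_lr = max_lr / start_div
    final_lr = max_lr / final_div

    def cos_interp(a, b, t):
        # Cosine interpolation from a (at t=0) to b (at t=1).
        return b + (a - b) * (1 + math.cos(math.pi * t)) / 2

    if progress < warmup_pct:
        # Warm-up phase: ramp from start_lr up to max_lr.
        return cos_interp(start_lr, max_lr, progress / warmup_pct)
    if progress <= warmdown_pct:
        # Plateau (the flat top of the mesa): hold at max_lr.
        return max_lr
    # Warm-down phase: anneal from max_lr to final_lr.
    t = (progress - warmdown_pct) / (1 - warmdown_pct)
    return cos_interp(max_lr, final_lr, t)


for p in (0.0, 0.1, 0.2, 0.5, 0.7, 0.85, 1.0):
    print(f"{p:.2f} -> {mesa_one_cycle_lr(p):.6f}")
```

With the defaults, the plateau covers the middle half of training (from 20% to 70%), which is what distinguishes this shape from a plain one-cycle schedule that starts annealing as soon as warm-up ends.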