echofilter#

Remove echosounder noise by identifying the ocean floor and entrained air at the ocean surface.

usage: echofilter [-h] [--version] [--show-cache-dir] [--list-checkpoints]
                  [--list-colors [{alphabetic,full,full-alphabetic,xkcd,xkcd-alphabetic}]]
                  [-c CONFIG_FILE] [--source-dir SOURCE_DIR]
                  [--recursive-dir-search] [--no-recursive-dir-search]
                  [--extension SEARCH_EXTENSION [SEARCH_EXTENSION ...]]
                  [--skip-existing] [--skip-incompatible]
                  [--continue-on-error] [--output-dir OUTPUT_DIR] [--dry-run]
                  [--overwrite-files] [--overwrite-ev-lines] [--force]
                  [--no-ev-import] [--no-turbulence-line] [--no-bottom-line]
                  [--no-surface-line] [--no-nearfield-line]
                  [--suffix-file SUFFIX_FILE] [--suffix-var SUFFIX_VAR]
                  [--color-turbulence COLOR_TURBULENCE]
                  [--color-turbulence-offset COLOR_TURBULENCE_OFFSET]
                  [--color-bottom COLOR_BOTTOM]
                  [--color-bottom-offset COLOR_BOTTOM_OFFSET]
                  [--color-surface COLOR_SURFACE]
                  [--color-surface-offset COLOR_SURFACE_OFFSET]
                  [--color-nearfield COLOR_NEARFIELD]
                  [--thickness-turbulence THICKNESS_TURBULENCE]
                  [--thickness-turbulence-offset THICKNESS_TURBULENCE_OFFSET]
                  [--thickness-bottom THICKNESS_BOTTOM]
                  [--thickness-bottom-offset THICKNESS_BOTTOM_OFFSET]
                  [--thickness-surface THICKNESS_SURFACE]
                  [--thickness-surface-offset THICKNESS_SURFACE_OFFSET]
                  [--thickness-nearfield THICKNESS_NEARFIELD]
                  [--cache-dir CACHE_DIR] [--cache-csv [CSV_DIR]]
                  [--suffix-csv SUFFIX_CSV] [--keep-ext]
                  [--line-status LINE_STATUS] [--offset OFFSET]
                  [--offset-turbulence OFFSET_TURBULENCE]
                  [--offset-bottom OFFSET_BOTTOM]
                  [--offset-surface OFFSET_SURFACE] [--nearfield NEARFIELD]
                  [--cutoff-at-nearfield | --no-cutoff-at-nearfield]
                  [--lines-during-passive {interpolate-time,interpolate-index,predict,redact,undefined}]
                  [--collate-passive-length COLLATE_PASSIVE_LENGTH]
                  [--collate-removed-length COLLATE_REMOVED_LENGTH]
                  [--minimum-passive-length MINIMUM_PASSIVE_LENGTH]
                  [--minimum-removed-length MINIMUM_REMOVED_LENGTH]
                  [--minimum-patch-area MINIMUM_PATCH_AREA]
                  [--patch-mode PATCH_MODE] [--variable-name VARIABLE_NAME]
                  [--keep-exclusions]
                  [--row-len-selector {init,min,max,median,mode}]
                  [--facing {downward,upward,auto}]
                  [--training-standardization]
                  [--prenorm-nan-value PRENORM_NAN_VALUE]
                  [--postnorm-nan-value POSTNORM_NAN_VALUE]
                  [--crop-min-depth CROP_MIN_DEPTH]
                  [--crop-max-depth CROP_MAX_DEPTH]
                  [--autocrop-threshold AUTOCROP_THRESHOLD]
                  [--image-height IMAGE_HEIGHT] [--checkpoint CHECKPOINT]
                  [--unconditioned]
                  [--logit-smoothing-sigma SIGMA [SIGMA ...]]
                  [--device DEVICE]
                  [--hide-echoview | --show-echoview | --always-hide-echoview]
                  [--minimize-echoview] [--verbose] [--quiet]
                  FILE_OR_DIRECTORY [FILE_OR_DIRECTORY ...]

Actions#

These arguments specify special actions to perform. The main action of this program is suppressed if any of these are given.

--version, -V

Show program’s version number and exit.

--show-cache-dir

Show the path to the cache directory and exit.

--list-checkpoints

Show the available model checkpoints and exit.

--list-colors, --list-colours

Possible choices: alphabetic, full, full-alphabetic, xkcd, xkcd-alphabetic

Show the available line color names and exit. The available color palette can be viewed at https://matplotlib.org/stable/gallery/color/named_colors.html#css-colors. The XKCD color palette is also available, but is not shown in the output by default due to its size. To show just the main palette, run as --list-colors without an argument, or --list-colors alphabetic to view it in alphabetic order. The default ordering is by hue. To show the full palette, run as --list-colors full or --list-colors full-alphabetic.

Configuration#

-c, --config

Path to a configuration file. The settings in the configuration file will override the default values described in the rest of the help documentation, but will themselves be overridden by any arguments provided at the command prompt. Config file syntax allows: key=value, flag=true, stuff=[a,b,c] (for details, see syntax at https://goo.gl/R74nmi).
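The precedence described above (built-in defaults, then the configuration file, then command-line arguments) can be pictured as a layered lookup. The following is only a sketch of that ordering, not echofilter's own parsing code; the three dictionaries and their values are hypothetical, although the key names mirror options documented on this page.

```python
from collections import ChainMap

# Hypothetical values for illustration only.
defaults = {"offset": 1.0, "nearfield": 1.7, "facing": "auto"}
config_file = {"offset": 0.5, "facing": "upward"}  # e.g. read from key=value lines
command_line = {"offset": 0.25}                    # given at the command prompt

# ChainMap searches its mappings from left to right, so earlier layers win.
settings = ChainMap(command_line, config_file, defaults)

print(settings["offset"])     # the command line beats the config file
print(settings["facing"])     # the config file beats the default
print(settings["nearfield"])  # falls through to the default
```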

Positional arguments#

FILE_OR_DIRECTORY

File(s)/directory(ies) to process. Inputs can be absolute paths or relative paths to either files or directories. Paths can be given relative to the current directory, or optionally be relative to the SOURCE_DIR argument specified with --source-dir. For each directory given, the directory will be searched recursively for files bearing an extension specified by SEARCH_EXTENSION (see the --extension argument for details). Multiple files and directories can be specified, separated by spaces. This is a required argument. At least one input file or directory must be given, unless one of the arguments listed above under “Actions” is given. In order to process the directory given by SOURCE_DIR, specify “.” for this argument, such as:

echofilter . --source-dir SOURCE_DIR

Input file arguments#

Optional parameters specifying which files will be processed.

--source-dir, -d

Path to source directory which contains the files and folders specified by the paths argument. Default: "." (the current directory).

--recursive-dir-search, -r

For any directories provided in the FILE_OR_DIRECTORY input, all subdirectories will also be recursively walked through to find files to process. This is the default behaviour.

--no-recursive-dir-search, -R

For any directories provided in the FILE_OR_DIRECTORY input, only files within the specified directory will be included in the files to process. Subfolders within the directory will not be included.

--extension, -x

File extension(s) to process. This argument is used when the FILE_OR_DIRECTORY is a directory; files within the directory (and all its recursive subdirectories) are filtered against this list of extensions to identify which files to process. Default: ['csv']. (Note that the default SEARCH_EXTENSION value is OS-specific.)
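As a rough illustration of the filtering step, the sketch below keeps only the paths whose extension appears in a list of wanted extensions. It is not echofilter's implementation: the function name filter_by_extension is hypothetical, and the case-insensitive comparison is an assumption made for the example.

```python
from pathlib import PurePosixPath

def filter_by_extension(paths, extensions=("csv",)):
    """Keep the paths whose file extension is in `extensions`.

    Assumption: the comparison ignores case and any leading dot.
    """
    wanted = {ext.lower().lstrip(".") for ext in extensions}
    return [
        p for p in paths
        if PurePosixPath(p).suffix.lower().lstrip(".") in wanted
    ]

candidates = ["a/x.csv", "a/y.EV", "a/b/z.CSV", "notes.txt"]
print(filter_by_extension(candidates))
print(filter_by_extension(candidates, extensions=("ev",)))
```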

--skip-existing, --skip, -s

Skip processing files for which all outputs already exist.

--skip-incompatible

Skip over incompatible input CSV files, without raising an error. Default behaviour is to stop if an input CSV file cannot be processed. This argument is useful if you are processing a directory which contains a mixture of CSV files, where some are Sv data exported from EV files and others are not.

--continue-on-error

Continue running on remaining files if one file hits an error.

Destination file arguments#

Optional parameters specifying where output files will be located.

--output-dir, -o

Path to output directory. If empty (default), each output is placed in the same directory as its input file. If OUTPUT_DIR is specified, the full output path for each file contains the subtree of the input file relative to the base directory given by SOURCE_DIR.
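The way the input subtree is mirrored under OUTPUT_DIR can be sketched with pathlib. This is an illustration under stated assumptions rather than the program's actual code: the function name output_directory and the example paths are hypothetical, and POSIX-style paths are used for simplicity.

```python
from pathlib import PurePosixPath

def output_directory(input_file, source_dir, output_dir=""):
    """Return the directory in which outputs for `input_file` would be placed.

    With an empty output_dir, outputs sit beside the input file. Otherwise the
    input's location relative to source_dir is recreated under output_dir.
    """
    input_file = PurePosixPath(input_file)
    if not output_dir:
        return input_file.parent
    relative = input_file.relative_to(source_dir)
    return PurePosixPath(output_dir) / relative.parent

print(output_directory("data/site1/day2/ping.csv", "data"))
print(output_directory("data/site1/day2/ping.csv", "data", "processed"))
```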

--dry-run, -n

Perform a trial run, with no changes made. Text printed to the command prompt indicates which files would be processed, but work is only simulated and not performed.

--overwrite-files

Overwrite existing files without warning. Default behaviour is to stop processing if an output file already exists.

--overwrite-ev-lines

Overwrite existing lines within the Echoview file without warning. Default behaviour is to append the current datetime to the name of the line in the event of a collision.

--force, -f

Short-hand equivalent to supplying both --overwrite-files and --overwrite-ev-lines.

--no-ev-import

Do not import lines and regions back into any EV file inputs. Default behaviour is to import lines and regions and then save the file, overwriting the original EV file.

--no-turbulence-line

Do not output an evl file for the turbulence line, and do not import a turbulence line into the EV file.

--no-bottom-line

Do not output an evl file for the bottom line, and do not import a bottom line into the EV file.

--no-surface-line

Do not output an evl file for the surface line, and do not import a surface line into the EV file.

--no-nearfield-line

Do not add a nearfield line to the EV file.

--suffix-file, --suffix

Suffix to append to the output evl and evr files, between the name of the file and the extension. If SUFFIX_FILE begins with an alphanumeric character, “-” is prepended to it to act as a delimiter. The default behavior is to not append a suffix.
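The delimiter rule can be written as a small helper. This is a minimal sketch of the rule as described, not the code echofilter uses; the function name with_delimiter is hypothetical.

```python
def with_delimiter(suffix, delimiter="-"):
    """Prepend `delimiter` when the suffix starts with an alphanumeric character."""
    if suffix and suffix[0].isalnum():
        return delimiter + suffix
    return suffix

print(with_delimiter("v2"))   # starts with a letter, so a delimiter is added
print(with_delimiter("_v2"))  # already starts with punctuation, left unchanged
print(with_delimiter(""))     # no suffix requested
```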

--suffix-var

Suffix to append to line and region names when imported back into EV file. If SUFFIX_VAR begins with an alphanumeric character, “-” is prepended to it to act as a delimiter. The default behaviour is to match SUFFIX_FILE if it is set, and use "_echofilter" otherwise.

--color-turbulence

Color to use for the turbulence line when it is imported into Echoview. This can either be the name of a supported color (see --list-colors for options), or a hexadecimal string, or a string representation of an RGB color to supply directly to Echoview (such as "(0,255,0)"). Default: "orangered".

--color-turbulence-offset

Color to use for the offset turbulence line when it is imported into Echoview. If unset, this will be the same as COLOR_TURBULENCE.

--color-bottom

Color to use for the bottom line when it is imported into Echoview. This can either be the name of a supported color (see --list-colors for options), or a hexadecimal string, or a string representation of an RGB color to supply directly to Echoview (such as "(0,255,0)"). Default: "orangered".

--color-bottom-offset

Color to use for the offset bottom line when it is imported into Echoview. If unset, this will be the same as COLOR_BOTTOM.

--color-surface

Color to use for the surface line when it is imported into Echoview. This can either be the name of a supported color (see --list-colors for options), or a hexadecimal string, or a string representation of an RGB color to supply directly to Echoview (such as "(0,255,0)"). Default: "green".

--color-surface-offset

Color to use for the offset surface line when it is imported into Echoview. If unset, this will be the same as COLOR_SURFACE.

--color-nearfield

Color to use for the nearfield line when it is created in Echoview. This can either be the name of a supported color (see --list-colors for options), or a hexadecimal string, or a string representation of an RGB color to supply directly to Echoview (such as "(0,255,0)"). Default: "mediumseagreen".

--thickness-turbulence

Thickness with which the turbulence line will be displayed in Echoview. Default: 2.

--thickness-turbulence-offset

Thickness with which the offset turbulence line will be displayed in Echoview. If unset, this will be the same as THICKNESS_TURBULENCE.

--thickness-bottom

Thickness with which the bottom line will be displayed in Echoview. Default: 2.

--thickness-bottom-offset

Thickness with which the offset bottom line will be displayed in Echoview. If unset, this will be the same as THICKNESS_BOTTOM.

--thickness-surface

Thickness with which the surface line will be displayed in Echoview. Default: 1.

--thickness-surface-offset

Thickness with which the offset surface line will be displayed in Echoview. If unset, this will be the same as THICKNESS_SURFACE.

--thickness-nearfield

Thickness with which the nearfield line will be displayed in Echoview. Default: 1.

--cache-dir

Path to checkpoint cache directory. Default: "/home/docs/.cache/echofilter".

--cache-csv

Path to directory where CSV files generated from EV inputs should be cached. If this argument is supplied with an empty string, exported CSV files will be saved in the same directory as each input EV file. The default behaviour is to discard any CSV files generated by this program once it has finished running.

--suffix-csv

Suffix to append to the file names of cached CSV files which are exported from EV files. The suffix is inserted between the input file name and the new file extension, “.csv”. If SUFFIX_CSV begins with an alphanumeric character, a delimiter is prepended. The delimiter is “-”, or “.” if --keep-ext is given. The default behavior is to not append a suffix.

--keep-ext

If provided, the output file names (evl, evr, csv) maintain the input file extension before their suffix (including a new file extension). Default behaviour is to strip the input file name extension before constructing the output paths.

Output configuration arguments#

Optional parameters specifying the properties of the output.

--line-status

Status value for all the lines which are generated. Options are:

0: none, 1: unverified, 2: bad, 3: good

Default: 3.

--offset

Offset for turbulence, bottom, and surface lines, in metres. This will shift turbulence and surface lines downwards and the bottom line upwards by the same distance of OFFSET. Default: 1.0.

--offset-turbulence

Offset for the turbulence line, in metres. This shifts the turbulence line downwards by some distance OFFSET_TURBULENCE. If this is set, it overwrites the value provided by --offset.

--offset-bottom

Offset for the bottom line, in metres. This shifts the bottom line upwards by some distance OFFSET_BOTTOM. If this is set, it overwrites the value provided by --offset.

--offset-surface

Offset for the surface line, in metres. This shifts the surface line downwards by some distance OFFSET_SURFACE. If this is set, it overwrites the value provided by --offset.
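How the general --offset and the three line-specific offsets combine can be sketched as follows. This is an illustrative sketch only, not echofilter's code: the function names are hypothetical, and it assumes depths are expressed in metres with larger values being deeper, so that a downwards shift adds to the depth and an upwards shift subtracts from it.

```python
def resolve_offsets(offset=1.0, offset_turbulence=None,
                    offset_bottom=None, offset_surface=None):
    """Use the line-specific offset when given, else fall back to the general one."""
    return {
        "turbulence": offset if offset_turbulence is None else offset_turbulence,
        "bottom": offset if offset_bottom is None else offset_bottom,
        "surface": offset if offset_surface is None else offset_surface,
    }

def apply_offsets(depths, offsets):
    """Shift turbulence and surface lines downwards and the bottom line upwards.

    Assumption: depths are in metres, increasing downwards.
    """
    return {
        "turbulence": depths["turbulence"] + offsets["turbulence"],
        "bottom": depths["bottom"] - offsets["bottom"],
        "surface": depths["surface"] + offsets["surface"],
    }

offsets = resolve_offsets(offset=1.0, offset_bottom=2.0)
shifted = apply_offsets({"turbulence": 5.0, "bottom": 40.0, "surface": 0.5}, offsets)
print(shifted)
```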

--nearfield

Nearfield distance, in metres. Default: 1.7. If the echogram is downward facing, the nearfield cutoff will be NEARFIELD metres below the shallowest depth recorded in the input data. If the echogram is upward facing, the nearfield cutoff will be NEARFIELD metres above the deepest depth recorded in the input data. When processing an EV file, by default a nearfield line will be added at the nearfield cutoff depth. To prevent this behaviour, use the --no-nearfield-line argument.
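The cutoff depth described above reduces to a small calculation. The sketch below illustrates it under the assumption that depths are in metres and increase downwards; the function name nearfield_cutoff is hypothetical and this is not the program's own code.

```python
def nearfield_cutoff(depths, nearfield=1.7, facing="downward"):
    """Depth of the nearfield cutoff for the recorded depth samples."""
    if facing == "downward":
        # NEARFIELD below the shallowest recorded depth
        return min(depths) + nearfield
    if facing == "upward":
        # NEARFIELD above the deepest recorded depth
        return max(depths) - nearfield
    raise ValueError(f"unsupported facing: {facing!r}")

samples = [0.5, 10.0, 50.0]
print(nearfield_cutoff(samples, nearfield=1.5, facing="downward"))
print(nearfield_cutoff(samples, nearfield=1.5, facing="upward"))
```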

--cutoff-at-nearfield

Enable cut-off at the nearfield distance for both the turbulence line (on downfacing data) as well as the bottom line (on upfacing data). Default behavior is to only clip the bottom line.

--no-cutoff-at-nearfield

Disable cut-off at the nearfield distance for both the turbulence line (on downfacing data) and the bottom line (on upfacing data). Default behavior is to clip the bottom line but not the turbulence line.

--lines-during-passive

Possible choices: interpolate-time, interpolate-index, predict, redact, undefined

Method used to handle line depths during collection periods determined to be passive recording instead of active recording. Options are:

interpolate-time:

depths are linearly interpolated from active recording periods, using the time at which recordings were made.

interpolate-index:

depths are linearly interpolated from active recording periods, using the index of the recording.

predict:

the model’s prediction for the lines during passive data collection will be kept; the nature of the prediction depends on how the model was trained.

redact:

no depths are provided during periods determined to be passive data collection.

undefined:

depths are replaced with the placeholder value used by Echoview to denote undefined values, which is -10000.99.

Default: "interpolate-time".
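The difference between interpolate-time and interpolate-index is only the axis along which the interpolation is done. The following sketch illustrates this with a plain-Python linear interpolation; it is not echofilter's implementation, the function name interpolate_passive is hypothetical, and holding the nearest active value at the ends of the recording is an assumption made for the example.

```python
def interpolate_passive(x, depths, is_passive):
    """Fill line depths at passive pings from the surrounding active pings.

    x is the interpolation axis: timestamps for "interpolate-time",
    or ping indices for "interpolate-index".
    """
    active = [i for i, passive in enumerate(is_passive) if not passive]
    if not active:
        raise ValueError("at least one active ping is needed")
    out = list(depths)
    for i, passive in enumerate(is_passive):
        if not passive:
            continue
        left = max((j for j in active if j < i), default=None)
        right = min((j for j in active if j > i), default=None)
        if left is None:
            out[i] = depths[right]  # assumption: hold the nearest active value
        elif right is None:
            out[i] = depths[left]   # assumption: hold the nearest active value
        else:
            weight = (x[i] - x[left]) / (x[right] - x[left])
            out[i] = depths[left] + weight * (depths[right] - depths[left])
    return out

times = [0.0, 1.0, 2.0, 3.0, 8.0]       # unevenly spaced pings
depths = [10.0, 0.0, 0.0, 0.0, 18.0]    # passive entries are placeholders
passive = [False, True, True, True, False]

by_time = interpolate_passive(times, depths, passive)
by_index = interpolate_passive(list(range(len(depths))), depths, passive)
print(by_time)
print(by_index)
```

With evenly spaced pings the two methods agree; they differ only when the time between pings varies, as in this example.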

--collate-passive-length

Maximum interval, in ping indices, between detected passive regions which will be removed to merge consecutive passive regions together into a single, collated, region. Default: 10.

--collate-removed-length

Maximum interval, in ping indices, between detected blocks (vertical rectangles) marked for removal which will also be removed to merge consecutive removed blocks together into a single, collated, region. Default: 10.

--minimum-passive-length

Minimum length, in ping indices, which a detected passive region must have to be included in the output. Set to -1 to omit all detected passive regions from the output. Default: 10.
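Collating nearby regions and then discarding short ones can be sketched as an interval merge followed by a length filter. This is an illustration rather than the program's code: the function names are hypothetical, and it assumes regions are (start, end) ping indices with an inclusive end, so the gap between two regions is the number of pings strictly between them.

```python
def collate_regions(regions, max_gap=10):
    """Merge regions separated by at most `max_gap` pings."""
    merged = []
    for start, end in sorted(regions):
        if merged and start - merged[-1][1] - 1 <= max_gap:
            previous_start, previous_end = merged[-1]
            merged[-1] = (previous_start, max(previous_end, end))
        else:
            merged.append((start, end))
    return merged

def filter_by_length(regions, min_length=10):
    """Keep regions at least `min_length` pings long; -1 omits every region."""
    if min_length < 0:
        return []
    return [(s, e) for s, e in regions if e - s + 1 >= min_length]

detected = [(0, 4), (8, 20), (100, 103)]
collated = collate_regions(detected, max_gap=10)
kept = filter_by_length(collated, min_length=10)
print(collated)
print(kept)
```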

--minimum-removed-length

Minimum length, in ping indices, which a detected removal block (vertical rectangle) must have to be included in the output. Set to -1 to omit all detected removal blocks from the output (default). When enabling this feature, the recommended minimum length is 10.

--minimum-patch-area

Minimum area, in pixels, which a detected removal patch (contour/polygon) region must have to be included in the output. Set to -1 to omit all detected patches from the output (default). When enabling this feature, the recommended minimum area is 25.

--patch-mode

Type of mask patches to use. Must be supported by the model checkpoint used. Should be one of:

merged:

Target patches for training were determined after merging as much as possible into the turbulence and bottom lines.

original:

Target patches for training were determined using original lines, before expanding the turbulence and bottom lines.

ntob:

Target patches for training were determined using the original bottom line and the merged turbulence line.

Default: “merged” is used if downfacing; “ntob” if upfacing.

Input processing arguments#

Optional parameters specifying how data will be loaded from the input files and transformed before it is given to the model.

--variable-name, --vn

Name of the Echoview acoustic variable to load from EV files. Default: "Fileset1: Sv pings T1".

--keep-exclusions, --keep-thresholds

Export CSV with all thresholds, exclusion regions, and bad data exclusions set as per the EV file. Default behavior is to ignore these settings and export the underlying raw data.

--row-len-selector

Possible choices: init, min, max, median, mode

How to handle inputs with differing numbers of depth samples across time. This method is used to select the “master” number of depth samples and minimum and maximum depth. The Sv values for all timepoints are interpolated onto this range of depths in order to create an input which is sampled in a rectangular manner. Default: "mode", the modal number of depths is used, and the modal depth range is selected amongst time samples which bear this number of depths.
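Selecting the “master” number of depth samples can be illustrated with the standard library. The sketch below covers only the choice of row length, not the depth range or the interpolation, and it is not echofilter's implementation: the function name select_row_length is hypothetical, "init" is assumed to mean the first row, and median_low is used so that the median is always an observed length.

```python
from collections import Counter
from statistics import median_low

def select_row_length(row_lengths, selector="mode"):
    """Choose one representative number of depth samples per ping."""
    if selector == "init":
        return row_lengths[0]  # assumption: the first row's length
    if selector == "min":
        return min(row_lengths)
    if selector == "max":
        return max(row_lengths)
    if selector == "median":
        return median_low(row_lengths)
    if selector == "mode":
        return Counter(row_lengths).most_common(1)[0][0]
    raise ValueError(f"unknown selector: {selector!r}")

lengths = [500, 512, 512, 512, 498]
for name in ("init", "min", "max", "median", "mode"):
    print(name, select_row_length(lengths, name))
```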

--facing

Possible choices: downward, upward, auto

Orientation of echosounder. If this is “auto” (default), the orientation is automatically determined from the ordering of the depths field in the input (increasing depth values = “downward”; diminishing depths = “upward”).
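The automatic orientation check amounts to looking at whether the depth values increase or decrease. A minimal sketch, assuming the depths field is available as a sequence of numbers; the function name detect_facing is hypothetical and comparing only the first and last values is a simplification.

```python
def detect_facing(depths):
    """Infer echosounder orientation from the ordering of the depth values."""
    if depths[-1] > depths[0]:
        return "downward"  # increasing depth values
    if depths[-1] < depths[0]:
        return "upward"    # diminishing depth values
    raise ValueError("orientation cannot be determined from constant depths")

print(detect_facing([0.5, 1.0, 1.5, 2.0]))
print(detect_facing([40.0, 39.5, 39.0]))
```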

--training-standardization

If this is given, Sv intensities are scaled using the values used when the model was trained before being given to the model for inference. The default behaviour is to derive the standardization values from the Sv statistics of the input instead.

--prenorm-nan-value

If set, NaN values in the imported CSV data will be replaced with this Sv intensity value.

--postnorm-nan-value

If set, NaN values in the imported CSV data will be replaced with this Sv intensity value after the input distribution has been standardized to have zero mean and unit variance.

--crop-min-depth

Shallowest depth, in metres, to analyse. Data will be truncated at this depth, with shallower data removed before the Sv input is shown to the model. Default behaviour is not to truncate.

--crop-max-depth

Deepest depth, in metres, to analyse. Data will be truncated at this depth, with deeper data removed before the Sv input is shown to the model. Default behaviour is not to truncate.

--autocrop-threshold, --autozoom-threshold

The inference routine will re-run the model with a zoomed-in version of the data, if the fraction of the depth which it deems irrelevant exceeds the AUTOCROP_THRESHOLD. The extent of the depth which is deemed relevant is from the shallowest point on the surface line to the deepest point on the bottom line. The data will only be zoomed in and re-analysed at most once. To always run the model through once (never auto zoomed), set to 1. To always run the model through exactly twice (always one round of auto-zoom), set to 0. Default: 0.35.
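The decision to zoom can be sketched as a comparison between the irrelevant fraction of the depth range and the threshold. This is an illustration of the rule as described, not the inference routine itself; the function name should_autocrop and its parameters are hypothetical, and depths are assumed to be in metres.

```python
def should_autocrop(depth_min, depth_max, surface_shallowest,
                    bottom_deepest, threshold=0.35):
    """Return True if the irrelevant fraction of the depth range exceeds the threshold."""
    total_extent = depth_max - depth_min
    relevant_extent = bottom_deepest - surface_shallowest
    irrelevant_fraction = 1.0 - relevant_extent / total_extent
    return irrelevant_fraction > threshold

# Half of a 100 m range lies below the deepest point of the bottom line.
print(should_autocrop(0.0, 100.0, 0.0, 50.0))
# Only a fifth of the range is irrelevant, which is under the default threshold.
print(should_autocrop(0.0, 100.0, 0.0, 80.0))
```

With a strict comparison like this one, a threshold of 1 can never be exceeded, which matches the "never auto zoomed" setting. How the real routine treats an irrelevant fraction of exactly 0 when the threshold is 0 is not stated on this page.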

--image-height, --height

Height to which the Sv image will be rescaled, in pixels, before being given to the model. The default behaviour is to use the same height as was used when the model was trained.

Model arguments#

Optional parameters specifying which model checkpoint will be used and how it is run.

--checkpoint

Name of checkpoint to load, or path to a checkpoint file. Default: "echofilter-v1_bifacing_700ep".

--unconditioned, --force-unconditioned

If this flag is present and a conditional model is loaded, it will be run for its unconditioned output. This means the model’s output is not conditioned on the orientation of the echosounder. By default, conditional models are used for their conditional output.

--logit-smoothing-sigma

Standard deviation of Gaussian smoothing kernel applied to the logits provided as the model’s output. The smoothing regularises the output to make it smoother. Multiple values can be given to use different kernel sizes for each dimension, in which case the first value is for the timestamp dimension and the second value is for the depth dimension. If a single value is given, the kernel is symmetric. Values are relative to the pixel space returned by the UNet model. Disabled by default.
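To show what smoothing with a separate sigma per dimension means, here is a small pure-Python sketch of separable Gaussian smoothing on a grid indexed as grid[time][depth]. It is not the code echofilter runs: the function names are hypothetical, and the kernel truncation at four standard deviations and the clamping at the edges are assumptions made for the example.

```python
import math

def gaussian_kernel(sigma, truncate=4.0):
    """Normalised 1D Gaussian weights (assumption: truncated at 4 sigma)."""
    radius = max(1, int(truncate * sigma + 0.5))
    weights = [math.exp(-0.5 * (k / sigma) ** 2) for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth_1d(values, sigma):
    """Convolve a sequence with a Gaussian, repeating the edge values."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(values)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)  # clamp index at the edges
            acc += w * values[j]
        out.append(acc)
    return out

def smooth_2d(grid, sigma_time, sigma_depth):
    """Smooth grid[time][depth] along time first, then along depth."""
    columns = [smooth_1d(list(col), sigma_time) for col in zip(*grid)]
    rows = [list(row) for row in zip(*columns)]
    return [smooth_1d(row, sigma_depth) for row in rows]

logits = [[0.0] * 5 for _ in range(5)]
logits[2][2] = 1.0  # a single isolated spike
smoothed = smooth_2d(logits, sigma_time=1.0, sigma_depth=2.0)
print(round(smoothed[2][2], 4))
```

Passing the same value for both sigmas corresponds to the symmetric kernel used when a single value is given.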

--device

Device to use for running the model for inference. Default: use first GPU if available, otherwise use the CPU. Note: echofilter.exe is compiled without GPU support and can only run on the CPU. To use the GPU you must use the source version.

Echoview window management#

Optional parameters specifying how to interact with any Echoview windows which are used during this process.

--hide-echoview

Hide any Echoview window spawned by this program. If it must use an Echoview instance which was already running, that window is not hidden. This is the default behaviour.

--show-echoview

Don’t hide an Echoview window created to run this code. (Disables the default behaviour which is equivalent to --hide-echoview.)

--always-hide-echoview, --always-hide

Hide the Echoview window while this code runs, even if this process is utilising an Echoview window which was already open.

--minimize-echoview

Minimize any Echoview window used to run this code while it runs. The window will be restored once the program is finished. If this argument is supplied, --show-echoview is implied unless --hide-echoview is also given.

Verbosity arguments#

Optional parameters controlling how verbose the program should be while it is running.

--verbose, -v

Increase the level of verbosity of the program. This can be specified multiple times; each occurrence increases the amount of detail printed to the terminal. The default verbosity level is 2.

--quiet, -q

Decrease the level of verbosity of the program. This can be specified multiple times; each occurrence reduces the amount of detail printed to the terminal.
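Repeatable flags of this kind are commonly implemented with argparse's counting action. The sketch below shows how the two flags could combine into one verbosity level starting from the documented default of 2. It is an assumption about the mechanism, not a copy of echofilter's parser; the demo parser and the function name verbosity_level are hypothetical.

```python
import argparse

# Hypothetical demo parser, separate from the real echofilter command.
parser = argparse.ArgumentParser(prog="verbosity-demo")
parser.add_argument("--verbose", "-v", action="count", default=0)
parser.add_argument("--quiet", "-q", action="count", default=0)

def verbosity_level(argv, base=2):
    """Each -v raises the level by one and each -q lowers it by one."""
    args = parser.parse_args(argv)
    return base + args.verbose - args.quiet

print(verbosity_level([]))
print(verbosity_level(["-vv"]))
print(verbosity_level(["-v", "-q", "-q"]))
```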