echofilter-generate-shards

Generate dataset shards

usage: echofilter-generate-shards [-h] [--version] [--root ROOT_DATA_DIR]
                                  [--partitioning-version PARTITIONING_VERSION]
                                  [--max-depth MAX_DEPTH]
                                  [--shard-len SHARD_LEN] [--ncores NCORES]
                                  [--verbose]
                                  partition dataset
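
As an illustration of the usage string above, the following Python sketch assembles a command line from the documented options. The helper `build_command` and the values `train` and `mydataset` are hypothetical placeholders, not names defined by echofilter; only flags that appear in the usage string are used.

```python
import shlex


def build_command(partition, dataset, root=None, shard_len=None,
                  ncores=None, verbose=False):
    """Build an argument list for echofilter-generate-shards (hypothetical helper)."""
    cmd = ["echofilter-generate-shards"]
    # Optional flags are only added when a value is supplied, so the
    # program's own defaults apply otherwise.
    if root is not None:
        cmd += ["--root", str(root)]
    if shard_len is not None:
        cmd += ["--shard-len", str(shard_len)]
    if ncores is not None:
        cmd += ["--ncores", str(ncores)]
    if verbose:
        cmd.append("--verbose")
    # Positional arguments go last, in the order given by the usage string.
    cmd += [partition, dataset]
    return cmd


# "train" and "mydataset" are placeholder values chosen for this example.
cmd = build_command("train", "mydataset", shard_len=128, ncores=1, verbose=True)
print(shlex.join(cmd))
```

On a machine where echofilter is installed, the resulting list could be passed to `subprocess.run(cmd, check=True)`.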

Positional Arguments

partition

partition to shard

dataset

dataset to shard

Named Arguments

--version, -V

show program’s version number and exit

--root

root data directory

Default: “/data/dsforce/surveyExports”

--partitioning-version

partitioning version

Default: “firstpass”

--max-depth

maximum depth to include in sharded data

--shard-len

number of samples in each shard

Default: 128

--ncores

number of cores to use (default: all). Set to 1 to disable multiprocessing.

--verbose, -v

increase verbosity

Default: 0
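
To see how the options and defaults listed above fit together, here is a minimal `argparse` sketch that mirrors the documented interface. It is not echofilter's actual source: the argument types (`int`, `float`), the use of `action="count"` for `--verbose`, and the `root_data_dir` destination name (inferred from the `ROOT_DATA_DIR` metavar) are assumptions, and `--version` is omitted. The values `train` and `mydataset` are placeholders.

```python
import argparse


def make_parser():
    """Build a parser that mirrors the documented options (illustrative only)."""
    parser = argparse.ArgumentParser(
        prog="echofilter-generate-shards",
        description="Generate dataset shards",
    )
    parser.add_argument("partition", help="partition to shard")
    parser.add_argument("dataset", help="dataset to shard")
    parser.add_argument(
        "--root",
        dest="root_data_dir",  # assumed from the ROOT_DATA_DIR metavar
        default="/data/dsforce/surveyExports",
        help="root data directory",
    )
    parser.add_argument("--partitioning-version", default="firstpass")
    # No default is documented for --max-depth; None and float are assumptions.
    parser.add_argument("--max-depth", type=float, default=None)
    parser.add_argument("--shard-len", type=int, default=128)
    # None stands in for "use all cores" here; that mapping is an assumption.
    parser.add_argument("--ncores", type=int, default=None)
    # Counting flag: each -v raises the verbosity level by one (assumed).
    parser.add_argument("--verbose", "-v", action="count", default=0)
    return parser


args = make_parser().parse_args(["-vv", "--ncores", "1", "train", "mydataset"])
print(args.partition, args.dataset, args.shard_len, args.ncores, args.verbose)
```

In this sketch, options that are not given on the command line fall back to the defaults documented above, and repeating `-v` raises the verbosity count.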