ncdiff
Compare either two NetCDF files or all NetCDF files in two directories.
Usage
usage: ncdiff.py [-h]
[--engine {netcdf4,scipy,pydap,h5netcdf,pynio,cfgrib,pseudonetcdf}]
[--quiet] [--recursive] [--match MATCH] [--rtol RTOL]
[--atol ATOL] [--brief_dims DIM [DIM ...] | --brief]
lhs rhs
Compare either two NetCDF files or all NetCDF files in two directories.
positional arguments:
lhs Left-hand-side NetCDF file or (if --recursive) directory
rhs Right-hand-side NetCDF file or (if --recursive) directory
optional arguments:
-h, --help show this help message and exit
--engine {netcdf4,scipy,pydap,h5netcdf,pynio,cfgrib,pseudonetcdf},
-e {netcdf4,scipy,pydap,h5netcdf,pynio,cfgrib,pseudonetcdf}
NetCDF engine (may require additional modules)
--quiet, -q Suppress logging
--recursive, -r Compare all NetCDF files with matching names in two directories
--match MATCH, -m MATCH
Bash wildcard match for file names when using --recursive (default: **/*.nc)
--rtol RTOL Relative comparison tolerance (default: 1e-9)
--atol ATOL Absolute comparison tolerance (default: 0)
--brief_dims DIM [DIM ...]
Just count differences along one or more dimensions instead of printing them out individually
--brief, -b Just count differences for every variable instead of printing them out individually
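The help text does not say how --rtol and --atol combine. The sketch below assumes they follow the convention of numpy.isclose, where two values match if |lhs - rhs| <= atol + rtol * |rhs|. This is an assumption about ncdiff, not something the help text confirms, and the helper name within_tolerance is hypothetical.

```python
def within_tolerance(lhs: float, rhs: float,
                     rtol: float = 1e-9, atol: float = 0.0) -> bool:
    """Return True if lhs and rhs are equal within the given tolerances.

    Assumes the numpy.isclose convention; ncdiff itself may differ.
    Defaults mirror the documented defaults of --rtol and --atol.
    """
    return abs(lhs - rhs) <= atol + rtol * abs(rhs)


print(within_tolerance(1.0, 1.0 + 1e-12))       # True
print(within_tolerance(1.0, 1.001))             # False
print(within_tolerance(0.0, 1e-12))             # False
print(within_tolerance(0.0, 1e-12, atol=1e-9))  # True
```

Under this assumption, the default --atol of 0 means that values close to zero only match when they are almost exactly equal, so a small absolute tolerance may be worth setting for data that crosses zero.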
Examples:
Compare two NetCDF files:
ncdiff a.nc b.nc
Compare all NetCDF files with identical names in two directories:
ncdiff -r dir1 dir2
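To illustrate what comparing files with identical names in two directories could involve, here is a minimal sketch that pairs files by their path relative to each directory, using the default --match pattern. The function pair_files is hypothetical and is not ncdiff's implementation. In particular, treating "matching names" as "matching relative paths" is an assumption.

```python
from pathlib import Path


def pair_files(lhs_dir, rhs_dir, match="**/*.nc"):
    """Pair files by their path relative to each root directory.

    Returns (common, lhs_only, rhs_only) as sorted lists of relative paths.
    """
    lhs_root, rhs_root = Path(lhs_dir), Path(rhs_dir)
    # Collect the relative paths of all regular files matching the pattern
    lhs = {p.relative_to(lhs_root).as_posix()
           for p in lhs_root.glob(match) if p.is_file()}
    rhs = {p.relative_to(rhs_root).as_posix()
           for p in rhs_root.glob(match) if p.is_file()}
    return sorted(lhs & rhs), sorted(lhs - rhs), sorted(rhs - lhs)
```

Calling pair_files("dir1", "dir2") returns the files present on both sides, which would then be compared one pair at a time, plus the files that exist on only one side.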
Chunking and RAM design
This tool does not support chunked files, nor does it load only part of a large dataset into memory at a time. Instead, the files of a chunked dataset are loaded as individual files. Variables are then processed one at a time: each is loaded into memory completely, compared, and discarded.
This has the big advantage of simplicity, but a few disadvantages:
No option to compare datasets with mismatched prefixes (e.g. foo.*.nc vs. bar.*.nc)
No option to compare chunked datasets that differ only in chunking
Slower, as variables that don’t sit on the concat_dim are loaded over and over again, with no option to skip them. See also xarray#2039.
Huge RAM usage in case of monolithic variables
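The load, compare, discard cycle described above can be sketched as follows. This is an illustrative model, not ncdiff's code. Each dataset is represented as a mapping from variable name to a hypothetical loader (a zero-argument callable that reads the whole variable), and compare_lazily is an invented name.

```python
from typing import Callable, Dict, List


def compare_lazily(
    lhs: Dict[str, Callable[[], list]],
    rhs: Dict[str, Callable[[], list]],
) -> List[str]:
    """Compare two datasets one variable at a time.

    Hypothetical model: with xarray, a loader could wrap
    ``ds[name].load()`` on a lazily opened dataset.
    """
    messages = []
    for name in sorted(lhs.keys() | rhs.keys()):
        if name not in rhs:
            messages.append(f"{name}: missing on the right-hand side")
            continue
        if name not in lhs:
            messages.append(f"{name}: missing on the left-hand side")
            continue
        # Load both copies of this variable completely...
        a, b = lhs[name](), rhs[name]()
        if a != b:
            messages.append(f"{name}: values differ")
        # ...then drop them before moving on, so that peak memory is
        # roughly two copies of the largest single variable.
        del a, b
    return messages


lhs = {"t": lambda: [1.0, 2.0], "p": lambda: [5, 6]}
rhs = {"t": lambda: [1.0, 2.5], "p": lambda: [5, 6], "q": lambda: [0]}
print(compare_lazily(lhs, rhs))
# ['q: missing on the left-hand side', 't: values differ']
```

In this model, peak memory is set by the largest single variable, which is why a monolithic variable dominates RAM usage.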
Further limitations
Won’t compare NetCDF settings such as store version, compression, or chunking
Doesn’t support indices with duplicate elements
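Since indices with duplicate elements are not supported, it may help to check for them before running a comparison. The source does not say how ncdiff reacts to duplicates, so the sketch below only shows one way to detect them. The helper duplicated_labels is hypothetical and works on any sequence of hashable index labels.

```python
from collections import Counter


def duplicated_labels(index):
    """Return the labels that occur more than once, in first-seen order."""
    counts = Counter(index)
    return [label for label, n in counts.items() if n > 1]


print(duplicated_labels(["2020-01", "2020-02", "2020-01"]))  # ['2020-01']
print(duplicated_labels([0, 1, 2]))                          # []
```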