Working with Jupyter notebooks

display_diffs() can be used to compare two NumPy, Pandas, or Xarray objects in a Jupyter notebook:

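import sys

# Make the package importable from a source checkout;
# not needed when recursive_diff is already installed.
sys.path.insert(0, "..")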

import xarray

from recursive_diff import display_diffs

a = xarray.Dataset(
    {
        "v1": (("r", "c"), [[1, 2], [3, 4]]),
        "v2": ("r", ["foo", "bar"]),
        "r": ["r1", "r2"],
        "extra": [5],
    },
    attrs={"some_tag": "Hello"},
)

b = xarray.Dataset(
    {
        "v1": (("r", "c"), [[1, 5], [3.1, 4]]),
        "v2": ("r", ["bar", "bar"]),
        "r": ["r1", "r2"],
    },
    attrs={"some_tag": "World"},
)


display_diffs(a, b)

[data_vars][v1]

      lhs  rhs  abs_delta  rel_delta
c r
0 r2    3  3.1        0.1   0.033333
1 r1    2  5.0        3.0   1.500000

[data_vars][v2]

    lhs  rhs
r
r1  foo  bar

Other differences

  • [attrs][some_tag]: Hello != World
  • [index]: Dimension extra is in LHS only
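
The abs_delta and rel_delta columns in the v1 table above are consistent with a plain difference and a difference relative to the left-hand value. The following is a minimal sketch that reproduces those numbers. The deltas() helper is hypothetical and not part of recursive_diff, and the formulas are inferred from the output shown here rather than confirmed against the library:

```python
def deltas(lhs: float, rhs: float) -> tuple[float, float]:
    """Hypothetical helper: return (abs_delta, rel_delta) for two numbers.

    Assumes abs_delta = rhs - lhs and rel_delta = abs_delta / lhs.
    This matches the table above but is inferred from it, not taken
    from the recursive_diff source.
    """
    abs_delta = rhs - lhs
    rel_delta = abs_delta / lhs
    return abs_delta, rel_delta


# The two differing cells of v1 from the example above
for lhs, rhs in [(3, 3.1), (2, 5.0)]:
    abs_delta, rel_delta = deltas(lhs, rhs)
    print(f"{lhs} -> {rhs}: abs_delta={abs_delta:.1f}, rel_delta={rel_delta:.6f}")
```

Dividing by the left-hand value means the relative delta is undefined when lhs is zero; this sketch does not handle that case.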

Like recursive_diff.recursive_diff(), display_diffs() can also visualize differences in nested structures:

c = {"foo": [1, 2, [3, 4]]}
d = {"foo": [1.0000000001, 5, [3]], "bar": 6}

display_diffs(c, d)

  • Pair bar:6 is in RHS only
  • [foo][1]: 2 != 5 (abs: 3.0e+00, rel: 1.5e+00)
  • [foo][2]: LHS has 1 more elements than RHS: [4]
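
Note that [foo][0] does not appear in the output above even though 1 and 1.0000000001 are not identical: the difference is small enough to fall within the comparison tolerance. The sketch below is a rough illustration using the standard library's math.isclose. It assumes a relative tolerance of 1e-9, taken here to be the recursive_diff default, and the library's exact comparison formula may differ:

```python
import math

REL_TOL = 1e-9  # assumed default relative tolerance, not confirmed here

# First two elements of c["foo"] and d["foo"] from the example above
pairs = {
    "[foo][0]": (1, 1.0000000001),
    "[foo][1]": (2, 5),
}

for path, (lhs, rhs) in pairs.items():
    same = math.isclose(lhs, rhs, rel_tol=REL_TOL, abs_tol=0.0)
    print(f"{path}: {'within tolerance' if same else 'reported as different'}")
```

Under this assumption only [foo][1] is flagged, which agrees with the list of differences shown above.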

Comparing directories

If you have two directories full of data, you can compare them in one go with recursive_open():

import json
import tempfile

lhs = tempfile.TemporaryDirectory()
rhs = tempfile.TemporaryDirectory()

a.to_zarr(f"{lhs.name}/array.zarr", mode="w", zarr_format=2)
b.to_zarr(f"{rhs.name}/array.zarr", mode="w", zarr_format=2)
with open(f"{lhs.name}/nested.json", "w") as fh:
    json.dump(c, fh)
with open(f"{rhs.name}/nested.json", "w") as fh:
    json.dump(d, fh)

from recursive_diff import recursive_open

display_diffs(recursive_open(lhs.name), recursive_open(rhs.name))

[array.zarr][data_vars][v1]

      lhs  rhs  abs_delta  rel_delta
c r
0 r2    3  3.1        0.1   0.033333
1 r1    2  5.0        3.0   1.500000

[array.zarr][data_vars][v2]

    lhs  rhs
r
r1  foo  bar

Other differences

  • [array.zarr][attrs][some_tag]: Hello != World
  • [array.zarr][index]: Dimension extra is in LHS only
  • [nested.json]: Pair bar:6 is in RHS only
  • [nested.json][foo][1]: 2 != 5 (abs: 3.0e+00, rel: 1.5e+00)
  • [nested.json][foo][2]: LHS has 1 more elements than RHS: [4]