Indexing
Boost-histogram implements the UHI indexing protocol. You can read more about it on the UHI Indexing page.
Boost-histogram specific details
Boost-histogram implements bh.loc, builtins.sum, bh.rebin, bh.underflow, and bh.overflow from the UHI spec. A bh.tag.at locator is provided as well, which simulates the Boost.Histogram C++ .at() indexing using the UHI locator protocol.
Boost-histogram allows “picking” with lists, similar to NumPy. Unlike NumPy, however, multiple lists are applied independently, one per axis; NumPy would instead pair the lists up element-wise and collapse the selected axes into a single one. You can use bh.loc(...) inside these lists.
Example:
import boost_histogram as bh

h = bh.Histogram(
    bh.axis.Regular(10, 0, 1),
    bh.axis.StrCategory(["a", "b", "c"]),
    bh.axis.IntCategory([5, 6, 7]),
)

# Keep every Regular bin, categories "a" and "c", and IntCategory bins 0 and 2
minihist = h[:, [bh.loc("a"), bh.loc("c")], [0, 2]]
# Produces a 3D histogram with Regular(10, 0, 1) x StrCategory(["a", "c"]) x IntCategory([5, 7])
This feature is considered experimental. Bins removed by picking are currently dropped; their contents are not added to the overflow bin.
Vectorized indexing
While lists select per-axis (as described above), NumPy integer arrays are treated as vectorized fancy indexing, following NumPy’s broadcasting rules. This gathers (or, when assigning, scatters) many individual cells in a single operation instead of building a new histogram. It avoids a Python-level loop over scalar indices, which matters for high-dimensional histograms with many categories.
Example:
import boost_histogram as bh
import numpy as np

h = bh.Histogram(
    bh.axis.IntCategory(range(100)),
    bh.axis.IntCategory(range(100)),
    bh.axis.Regular(50, 0, 1),
)

datasets = np.array([3, 7, 42])
categories = np.array([1, 1, 9])

# Gather the Regular-axis contents for three (dataset, category) pairs at once
values = h[datasets, categories, :]  # shape (3, 50)

# Assignment works the same way
h[datasets, categories, :] = 0
The array indices use the same numbering as scalar indexing (flow bins are not counted), and you can mix them with integers, bh.loc(...), and plain integer slices. This path covers the common case directly; for anything more advanced (rebinning, sum-projection, or locator-based slices alongside arrays), index .view() explicitly instead.
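For readers less familiar with NumPy's fancy-indexing rules, the sketch below shows the difference between paired (vectorized) selection and per-axis selection on a plain NumPy array. The `counts` array is only a stand-in for the kind of array `.view()` returns; its shape and contents are invented for this example.

```python
import numpy as np

# Stand-in for a 3D array of bin contents, shape (4, 5, 3).
counts = np.arange(4 * 5 * 3).reshape(4, 5, 3)

datasets = np.array([0, 2, 3])
categories = np.array([1, 1, 4])

# Vectorized gather: the index arrays are paired element-wise, so this reads
# the cells (0, 1), (2, 1) and (3, 4) and keeps the last axis whole.
gathered = counts[datasets, categories, :]   # shape (3, 3)

# Per-axis selection, for contrast: np.ix_ builds the outer product of the
# two index lists, keeping both selected axes.
per_axis = counts[np.ix_(datasets, [1, 4])]  # shape (3, 2, 3)

# Scatter: assigning through the same index arrays writes all paired cells.
cleared = counts.copy()
cleared[datasets, categories, :] = 0
```

The `np.ix_` form corresponds to how lists behave in boost-histogram picking, while the paired form corresponds to the array indexing described in this section.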