Accumulators#

Common properties#

All accumulators can be filled like a histogram: call .fill with values, which looks and behaves like filling a single-bin (“scalar”) histogram. As with histograms, the fill happens in place.

All accumulators have a .value property as well, which gives the primary value being accumulated.

Types#

There are several accumulators.

Sum#

This is the simplest accumulator, and it is never returned from a histogram. It is used internally by the Double and Unlimited storages to perform sums when needed. It uses a Neumaier sum, which keeps a correction term alongside the floating-point sum to improve accuracy. Because a histogram never returns this accumulator, it is not available in view form, only as a single accumulator for comparison and for access to the algorithm. The following example (Python 3.8 or later, for the f-string = specifier) shows how less accurate sums fail to produce the obvious answer, 2.0:

import math
import numpy as np
import boost_histogram as bh

values = [1.0, 1e100, 1.0, -1e100]
print(f"{sum(values) = } (simple)")
print(f"{math.fsum(values) = }")
print(f"{np.sum(values) = } (pairwise)")
print(f"{bh.accumulators.Sum().fill(values) = }")

sum(values) = 0.0 (simple)
math.fsum(values) = 2.0
np.sum(values) = 0.0 (pairwise)
bh.accumulators.Sum().fill(values) = Sum(0 + 2)

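Note that this accumulator is still designed for performance and does not guarantee correctness the way math.fsum does. In general, the values should not fall into more than two widely separated scales of magnitude; the following values span three (1, 1e50, and 1e100):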

values = [1., 1e100, 1e50, 1., -1e50, -1e100]
print(f"{math.fsum(values) = }")
print(f"{bh.accumulators.Sum().fill(values) = }")

math.fsum(values) = 2.0
bh.accumulators.Sum().fill(values) = Sum(0 + 0)

Keep in mind that this is a highly contrived example; the Sum accumulator should still be more accurate than simple and pairwise summation, at a minimal performance cost. Most notably, the failure requires large cancellations with negative values, which histograms generally do not have.
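To make the correction term concrete, here is a minimal pure-Python sketch of Neumaier summation. It is not the library's implementation, and `neumaier_sum` is a hypothetical helper name; it only illustrates the technique and reproduces the two results shown above.

```python
import math


def neumaier_sum(values):
    """Compensated (Neumaier) summation; hypothetical helper for illustration."""
    total = 0.0       # running floating-point sum
    correction = 0.0  # accumulated low-order bits that `total` could not hold
    for x in values:
        t = total + x
        if abs(total) >= abs(x):
            # The low-order digits of x were lost in the addition.
            correction += (total - t) + x
        else:
            # The low-order digits of total were lost instead.
            correction += (x - t) + total
        total = t
    return total + correction


two_scales = [1.0, 1e100, 1.0, -1e100]
three_scales = [1.0, 1e100, 1e50, 1.0, -1e50, -1e100]

print(neumaier_sum(two_scales))    # 2.0, the same as math.fsum
print(neumaier_sum(three_scales))  # 0.0, while math.fsum gives 2.0
```

In the second case the single correction term briefly has to hold both 1e50 and 1.0, so the small values are lost; this is the trade-off for doing only a few extra operations per value.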

You can use += with a float value or a Sum to fill as well.

WeightedSum#

This accumulator is contained in the Weight storage, and supports Views. It provides two values: .value and .variance. The value is the sum of the weights, and the variance is the sum of the squared weights.

For example, you could sum the following values:

import boost_histogram as bh

values = [10]*10
smooth = bh.accumulators.WeightedSum().fill(values)
print(f"{smooth = }")

values = [1]*9 + [91]
rough = bh.accumulators.WeightedSum().fill(values)
print(f"{rough =  }")

smooth = WeightedSum(value=100, variance=1000)
rough =  WeightedSum(value=100, variance=8290)

When filling, you can optionally provide a variance= keyword, with either a single value or a matching length array of values.

You can also fill with += on a value or another WeightedSum.
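The value and variance in the example output follow directly from the two definitions given for this accumulator. The pure-Python sketch below recomputes them by hand; `weighted_sum` is a hypothetical helper, not part of boost-histogram.

```python
def weighted_sum(weights):
    """Return (value, variance) as defined above; hypothetical helper."""
    value = sum(weights)                    # sum of the weights
    variance = sum(w * w for w in weights)  # sum of the squared weights
    return value, variance


smooth = weighted_sum([10] * 10)
rough = weighted_sum([1] * 9 + [91])

print(smooth)  # (100, 1000)
print(rough)   # (100, 8290)
```

Both inputs have the same total, but the single large entry dominates the sum of squares, which is why the second variance is so much larger.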

Mean#

This accumulator is contained in the Mean storage, and supports Views. It provides three values: .count, .value, and .variance. Internally, it stores _sum_of_deltas_squared, from which the variance is computed.

For example, you could compute the mean of the following values:

import boost_histogram as bh

values = [10]*10
smooth = bh.accumulators.Mean().fill(values)
print(f"{smooth = }")

values = [1]*9 + [91]
rough = bh.accumulators.Mean().fill(values)
print(f"{rough =  }")

smooth = Mean(count=10, value=10, variance=0)
rough =  Mean(count=10, value=10, variance=810)

You can add a weight= keyword when filling, with either a single value or a matching length array of values.

You can also call a Mean with a value or with another Mean to fill in place.
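One way to see how a count, a mean, and a sum of squared deltas can be updated one value at a time is Welford's running-mean algorithm, sketched below in pure Python. This is an illustration rather than the library's code: `running_mean` is a hypothetical helper, and the `count - 1` denominator is an assumption chosen because it reproduces the variance of 810 shown above.

```python
def running_mean(values):
    """Welford-style running mean; hypothetical helper for illustration."""
    count = 0
    mean = 0.0
    sum_of_deltas_squared = 0.0
    for x in values:
        count += 1
        delta = x - mean                 # distance from the old mean
        mean += delta / count            # move the mean toward x
        sum_of_deltas_squared += delta * (x - mean)
    # Assumed sample-variance denominator; needs at least two values.
    variance = sum_of_deltas_squared / (count - 1)
    return count, mean, variance


print(running_mean([10] * 10))        # (10, 10.0, 0.0)
print(running_mean([1] * 9 + [91]))   # (10, 10.0, 810.0)
```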

WeightedMean#

This accumulator is contained in the WeightedMean storage, and supports Views. It provides four values: .sum_of_weights, .sum_of_weights_squared, .value, and .variance. Internally, it stores _sum_of_weighted_deltas_squared, from which the variance is computed.

For example, you could compute the mean of the following values:

import boost_histogram as bh

values = [1]*9 + [91]
wm = bh.accumulators.WeightedMean().fill(values, weight=2)
print(f"{wm = }")

wm = WeightedMean(sum_of_weights=20, sum_of_weights_squared=40, value=10, variance=810)

You can add a weight= keyword when filling, with either a single value or a matching length array of values.

You can also call a WeightedMean with a value or with another WeightedMean to fill in place.
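The four reported quantities can be reproduced with a weighted variant of the running-mean update. The pure-Python sketch below is an illustration under stated assumptions, not the library's code: `running_weighted_mean` is a hypothetical helper, and the variance denominator `sum_of_weights - sum_of_weights_squared / sum_of_weights` is an assumption chosen because it reproduces the variance of 810 in the output above.

```python
def running_weighted_mean(values, weight):
    """Weighted running mean with one shared weight; hypothetical helper."""
    sum_of_weights = 0.0
    sum_of_weights_squared = 0.0
    mean = 0.0
    sum_of_weighted_deltas_squared = 0.0
    for x in values:
        sum_of_weights += weight
        sum_of_weights_squared += weight * weight
        delta = x - mean                          # distance from the old mean
        mean += weight * delta / sum_of_weights   # weighted step toward x
        sum_of_weighted_deltas_squared += weight * delta * (x - mean)
    # Assumed denominator; see the note above the code.
    variance = sum_of_weighted_deltas_squared / (
        sum_of_weights - sum_of_weights_squared / sum_of_weights
    )
    return sum_of_weights, sum_of_weights_squared, mean, variance


wm = running_weighted_mean([1] * 9 + [91], weight=2)
print(wm)  # (20.0, 40.0, 10.0, 810.0)
```

With a constant weight the mean and variance match the unweighted Mean example; only the two weight sums change.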

Views#

Most of the accumulators (all except Sum) support a View, which is what a histogram returns when .view() is requested. A view is a structured NumPy ndarray with a few small additions to make it easier to work with. Like a NumPy recarray, it lets you access fields as attributes; you can even access (but not set) computed attributes like .variance. Selecting a single item from a view returns an accumulator instance. You can set a view’s contents with a stacked array, where each item in the stack is used for the (computed) value that the normal constructor would take. For example, WeightedMean can take an array whose final dimension has length four, holding the sum_of_weights, sum_of_weights_squared, value, and variance elements, even though several of these values are computed from the internal representation.
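For readers unfamiliar with structured arrays, the sketch below uses plain NumPy to show the underlying ideas: named fields, attribute access through a recarray, and a stacked array whose final dimension holds one entry per field. It is only an analogy. The dtype and field names here are made up for illustration, and boost-histogram's own view classes add computed attributes and accumulator-returning item access that this sketch does not reproduce.

```python
import numpy as np

# Hypothetical two-field layout, loosely modeled on a WeightedSum-like view.
dtype = np.dtype([("value", float), ("variance", float)])
view = np.zeros(3, dtype=dtype).view(np.recarray)

# Fields can be set by name and read back as attributes.
view["value"] = [1.0, 2.0, 3.0]
view["variance"] = [0.5, 0.5, 2.0]
print(view.value)

# Selecting one item gives a record with the same attribute access.
first = view[0]
print(first.value, first.variance)

# A stacked array puts the fields along a new final dimension.
stacked = np.stack([view.value, view.variance], axis=-1)
print(stacked.shape)  # (3, 2)
```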