Threaded Fills

[1]:
import numpy as np
import boost_histogram as bh
from concurrent.futures import ThreadPoolExecutor
from functools import reduce
from operator import add
from numpy.testing import assert_array_equal

This notebook explores parallel filling by hand (not using the threads= argument).

[2]:
hist_linear = bh.Histogram(bh.axis.Regular(100, 0, 1))
hist_atomic = bh.Histogram(bh.axis.Regular(100, 0, 1),
                                storage=bh.storage.AtomicInt64())

vals = np.random.rand(10_000_000)
hist_answer = hist_linear.fill(vals).copy()

This is a traditional fill.

[3]:
%%timeit
hist_linear.reset()
hist_linear.fill(vals)
assert_array_equal(hist_answer, hist_linear)
25.5 ms ± 774 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

This is a single threaded atomic fill.

[4]:
%%timeit
hist_atomic.reset()
hist_atomic.fill(vals)
assert_array_equal(hist_answer, hist_atomic)
59.9 ms ± 2.19 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

This is a threaded fill (storage not threadsafe, so will get the wrong result; just for comparison)

This is a threaded fill, this time with atomics. It may not be faster, but is useful in situations where you are filling from multiple places in your code.

[5]:
%%timeit
hist_atomic.reset()
threads = 4
with ThreadPoolExecutor(threads) as pool:
    for chunk in np.array_split(vals, threads):
        pool.submit(hist_atomic.fill, chunk)
assert_array_equal(hist_answer, hist_atomic)
61.2 ms ± 1.57 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

This makes four seperate histograms, then fills them and adds at the end.

[6]:
def fun(x):
    hist = bh.Histogram(bh.axis.Regular(100, 0, 1))
    return hist.fill(x)
[7]:
%%timeit
threads = 4
with ThreadPoolExecutor(threads) as pool:
    results = pool.map(fun, np.array_split(vals, threads))
hist_quad = reduce(add, results)
assert_array_equal(hist_answer, hist_quad)
8.12 ms ± 90 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

The expense of creating the histogram and summing them must be significanly less than the cost of filling for this to be faster.