Performance Comparison

This notebook compares the histogram-filling performance of boost-histogram with that of NumPy.

[1]:
import boost_histogram as bh
import numpy as np
from numpy.testing import assert_allclose
[2]:
import os
# Use half of the logical CPUs; fall back to 1 if the count cannot be determined
threads = max(1, (os.cpu_count() or 1) // 2)
print(f"threads: {threads}")
threads: 8

Testing setup

The performance runs use simple 1D and 2D datasets of 10 million normally distributed points each, filled into 100 regular bins per axis over the range (-3, 3). The testing setup is the same as “MBP” in this post, a dual-core MacBook Pro 2015.

[3]:
bins = (100, 100)
ranges = ((-3, 3), (-3, 3))
bins = np.asarray(bins).astype(np.int64)
ranges = np.asarray(ranges).astype(np.float64)

edges = (np.linspace(*ranges[0,:], bins[0]+1),
         np.linspace(*ranges[1,:], bins[1]+1))
[4]:
np.random.seed(42)
vals = np.random.normal(size=[2, 10_000_000]).astype(np.float32)
vals1d = np.random.normal(size=[10_000_000]).astype(np.float32)

Traditional 1D NumPy histogram

NumPy's 1D histogram is reasonably well optimized, so it should provide good performance.

[5]:
answer, e = np.histogram(vals1d, bins=bins[0], range=ranges[0])
[6]:
%%timeit
h, _ = np.histogram(vals1d, bins=bins[0], range=ranges[0])
assert_allclose(h, answer, atol=1)
74.5 ms ± 2.37 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

Boost histogram 1D

[7]:
%%timeit
hist = bh.Histogram(bh.axis.Regular(bins[0], *ranges[0]), storage=bh.storage.Int64())
hist.fill(vals1d)
assert_allclose(hist, answer, atol=1)
41.6 ms ± 712 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

Boost histogram 1D NumPy clone

[8]:
%%timeit
h, _ = bh.numpy.histogram(vals1d, bins=bins[0], range=ranges[0])
assert_allclose(h, answer, atol=1)
43.1 ms ± 769 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

Boost histogram in 1D, threaded

[9]:
%%timeit
hist = bh.Histogram(bh.axis.Regular(bins[0], *ranges[0]), storage=bh.storage.Int64())

hist.fill(vals1d, threads=threads)
assert_allclose(hist, answer, atol=1)
13.3 ms ± 153 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
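
One plausible way a threaded fill can work is to split the input into chunks, fill one partial histogram per chunk in a worker thread, and add the partial counts together. This is only a sketch of that idea, not a statement of what boost-histogram does internally. The example below uses plain NumPy and the standard library; `chunked_histogram` is a hypothetical helper introduced here for illustration and is not part of either library.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def chunked_histogram(values, bins, value_range, threads):
    """Fill a 1D histogram by summing per-chunk histograms (illustrative only)."""
    # Split the input into roughly equal pieces, one per worker thread
    chunks = np.array_split(values, threads)

    def fill(chunk):
        # Every chunk uses the same fixed binning, so the counts are additive
        counts, _ = np.histogram(chunk, bins=bins, range=value_range)
        return counts

    with ThreadPoolExecutor(max_workers=threads) as pool:
        partial_counts = list(pool.map(fill, chunks))

    # Combine the partial histograms into the final result
    return np.sum(partial_counts, axis=0)


rng = np.random.default_rng(42)
data = rng.normal(size=100_000)

serial, _ = np.histogram(data, bins=100, range=(-3, 3))
parallel = chunked_histogram(data, bins=100, value_range=(-3, 3), threads=4)
```

Because the binning is fixed in advance, the summed counts match a single-pass fill. Any actual speedup from this pattern depends on how much of the fill runs outside the Python GIL.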

Boost histogram 1D NumPy clone, threaded

[10]:
%%timeit
h, _ = bh.numpy.histogram(vals1d, bins=bins[0], range=ranges[0], threads=threads)
assert_allclose(h, answer, atol=1)
13.8 ms ± 238 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

Traditional 2D NumPy histogram

NumPy's 2D histogram is not as well optimized for filling regularly spaced bins.
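
To see why regular binning leaves room for optimization, note that the bin index on a regular axis can be computed with arithmetic alone, with no search through an array of edges. The following is a minimal sketch of that general idea; it is not the implementation used by NumPy or boost-histogram, and `regular_index` and `fill_regular_2d` are hypothetical names chosen for this example.

```python
import numpy as np


def regular_index(x, nbins, lo, hi):
    """Bin index on a regular axis, computed directly from the value."""
    # Scale to [0, nbins) and truncate; inputs are assumed to satisfy lo <= x < hi
    idx = ((x - lo) * (nbins / (hi - lo))).astype(np.int64)
    # Guard against floating-point rounding pushing a value past the last bin
    return np.minimum(idx, nbins - 1)


def fill_regular_2d(x, y, bins, ranges):
    """Fill a 2D histogram with regular bins using arithmetic indexing."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    nx, ny = bins
    (xlo, xhi), (ylo, yhi) = ranges

    # Drop out-of-range points. Note: the upper edge is excluded here,
    # whereas np.histogram2d counts it in the last bin.
    inside = (x >= xlo) & (x < xhi) & (y >= ylo) & (y < yhi)
    ix = regular_index(x[inside], nx, xlo, xhi)
    iy = regular_index(y[inside], ny, ylo, yhi)

    # Flatten the 2D index and count occurrences in a single pass
    flat = ix * ny + iy
    return np.bincount(flat, minlength=nx * ny).reshape(nx, ny)


x = np.array([-2.5, 0.1, 3.5])
y = np.array([-0.1, 2.9, 0.0])
H_sketch = fill_regular_2d(x, y, bins=(6, 6), ranges=((-3, 3), (-3, 3)))
```

Values that land very close to a bin edge can end up in a neighbouring bin depending on floating-point rounding, which is one reason a comparison between implementations may need a small tolerance such as `atol=1`.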

[11]:
answer2, *ledges = np.histogram2d(*vals, bins=bins, range=ranges)
[12]:
%%timeit
H, *ledges = np.histogram2d(*vals, bins=bins, range=ranges)
assert_allclose(H, answer2, atol=1)
874 ms ± 22.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Boost histogram in 2D

[13]:
%%timeit
hist = bh.Histogram(bh.axis.Regular(bins[0], *ranges[0]),
                    bh.axis.Regular(bins[1], *ranges[1]))
hist.fill(*vals)
assert_allclose(hist, answer2, atol=1)
77.6 ms ± 615 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

Boost histogram 2D NumPy clone

[14]:
%%timeit
H, *ledges = bh.numpy.histogram2d(*vals, bins=bins, range=ranges)
assert_allclose(H, answer2, atol=1)
84.7 ms ± 2.78 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

Boost histogram in 2D, threaded

[15]:
%%timeit
hist = bh.Histogram(bh.axis.Regular(bins[0], *ranges[0]),
                    bh.axis.Regular(bins[1], *ranges[1]))

hist.fill(*vals, threads=threads)
assert_allclose(hist, answer2, atol=1)
28.7 ms ± 708 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

Boost histogram 2D NumPy clone, threaded

[16]:
%%timeit
H, *ledges = bh.numpy.histogram2d(*vals, bins=bins, range=ranges, threads=threads)
assert_allclose(H, answer2, atol=1)
29.6 ms ± 503 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
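
The timings above can be condensed into speedup factors relative to NumPy. The snippet below is a small sketch that copies the mean values reported by `%%timeit` in this notebook and divides the NumPy time by each measurement. The dictionary labels and the `speedups` helper are illustrative names, and the numbers apply only to the machine and thread count used for this run.

```python
# Mean timings in milliseconds, copied from the %%timeit outputs in this notebook
timings_ms = {
    "1D": {
        "np.histogram": 74.5,
        "bh.Histogram": 41.6,
        "bh.numpy.histogram": 43.1,
        "bh.Histogram, threaded": 13.3,
        "bh.numpy.histogram, threaded": 13.8,
    },
    "2D": {
        "np.histogram2d": 874.0,
        "bh.Histogram": 77.6,
        "bh.numpy.histogram2d": 84.7,
        "bh.Histogram, threaded": 28.7,
        "bh.numpy.histogram2d, threaded": 29.6,
    },
}


def speedups(timings, baseline):
    """Return baseline_time / time for every entry (larger means faster)."""
    return {name: timings[baseline] / t for name, t in timings.items()}


speedup_1d = speedups(timings_ms["1D"], "np.histogram")
speedup_2d = speedups(timings_ms["2D"], "np.histogram2d")

for label, table in (("1D", speedup_1d), ("2D", speedup_2d)):
    for name, factor in table.items():
        print(f"{label} {name}: {factor:.1f}x")
```

On these numbers, single-threaded boost-histogram is roughly 1.8 times faster than NumPy in 1D and roughly 11 times faster in 2D, with threading widening the gap further in both cases.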