Performance of `select-keys` variants

Measure `select-keys` variants, contains all keys
Measure `select-keys` variants, contains half keys
Measure `select-keys` variants, contains no keys

How does Clojure's select-keys performance compare to alternatives?

Observation: A straightforward, idiomatic implementation with reduce, conj!, and transients offers performance improvements across a wide range of conditions.

Clojure's select-keys is implemented with recursive loop pattern. Let's write six variants that mix-and-match components and compare their performance.

name machine
select-keys loop & recur
fselect-keys-1 reduce & assoc
fselect-keys-2 reduce & assoc! transients
fselect-keys-3 reduce & conj a MapEntry
fselect-keys-4 reduce & conj! transients
fselect-keys-5 reduce & conj a MapEntry
fselect-keys-6 reduce & conj! a MapEntry transients

We'll use the Criterium benchmarking facility to objectively measure the evaluation times of a range of conditions. We'll explore two regimes of hashmap sizes: zero to sixteen key/values pairs stepping by ones, and one to one-million key/value pairs stepping by decades. Further, we'll investigate the effects of the different flavors of key sequences (labeled all keys, half keys, and no keys, explained later).

Overall, three variants demonstrated improved performance over Clojure's select-keys. Variant six, labeled as fselect-keys-6, displayed 5–20% improved performance over a wide range of hashmap sizes and fractions of key sequence matches. This implementation is straightforward and maintains idiomatic style. It is provided by see-sharp's core namespace, named fselect-keys.

Measure `select-keys` variants, contains all keys

Let's consider selecting keys from hashmaps where the elements of the key sequence are all present in the hashmap. For example, this hashmap…

{:a 11 :b 22 :c 33 :d 44 :e 55 :f 66}

…fully contains the elements of this key sequence.

[:a :b :c :d :e :f]

If we stack them, we can see the matches.

{:a 11 :b 22 :c 33 :d 44 :e 55 :f 66} ;; hashmap
[:a :b :c :d :e :f] ;; key sequence

The hashmap contains every keyword contained in the key sequence, :a through :f. The output hashmap would be identical to the input hashmap. Though we are using keywords as keys for this illustration, during benchmarking, the keys are integers.

We'll test hashmaps in two size regimes. The first regime is 'large': hashmaps containing one, ten, one-hundred, …up to one-million key-value pairs. The second regime is 'small': hashmaps containing one to sixteen key/value pairs. We will compare the Clojure's select-keys (magenta diamonds) to six alternative implementations.

The upper chart shows that variants 2 (orange diamonds), 4 (green squares), and 6 (gold circles) demonstrate 20-25% speedups across a wide range of hashmap sizes. The lower chart shows that most of the variants perform similarly, with 2, 4, and 6 again performing among the speediest by a small margin.

(fn [n] ((select-keys-tactic) (get-in large-maps [n :map]) (get-in large-maps [n :all-keys])))

Benchmark measurements for expression `(fn [n] ((select-keys-tactic) (get-in large-maps [n :map]) (get-in large-maps [n :all-keys])))`, time versus 'n' arguments, comparing different versions.

(fn [n] ((select-keys-tactic) (get-in small-maps [n :map]) (get-in small-maps [n :all-keys])))

Benchmark measurements for expression `(fn [n] ((select-keys-tactic) (get-in small-maps [n :map]) (get-in small-maps [n :all-keys])))`, time versus 'n' arguments, comparing different versions.
times in seconds, mean±std
arg, n
version 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
fselect-keys-1 1.3e-04±1.1e-06 1.3e-04±6.1e-07 1.4e-04±2.8e-06 1.4e-04±9.9e-07 1.4e-04±3.0e-06 1.4e-04±6.9e-07 1.4e-04±2.4e-06 1.4e-04±1.1e-06 1.4e-04±1.3e-06 1.4e-04±3.6e-06 1.4e-04±1.1e-06 1.4e-04±1.3e-06 1.4e-04±6.6e-07 1.4e-04±1.1e-06 1.4e-04±1.3e-06 1.4e-04±2.5e-06 1.4e-04±7.5e-07
fselect-keys-2 1.4e-04±2.2e-06 1.4e-04±1.5e-06 1.4e-04±9.9e-07 1.4e-04±1.1e-06 1.4e-04±1.5e-06 1.4e-04±9.8e-07 1.4e-04±1.8e-06 1.4e-04±1.1e-06 1.4e-04±1.0e-06 1.4e-04±3.3e-06 1.4e-04±8.2e-07 1.4e-04±1.4e-06 1.4e-04±2.0e-06 1.4e-04±3.7e-06 1.4e-04±1.1e-06 1.4e-04±2.2e-06 1.4e-04±8.1e-07
fselect-keys-3 1.4e-04±9.2e-07 1.4e-04±4.0e-06 1.4e-04±9.1e-07 1.4e-04±8.8e-07 1.4e-04±8.7e-07 1.4e-04±1.2e-06 1.5e-04±1.7e-06 1.4e-04±8.3e-07 1.4e-04±8.4e-07 1.5e-04±1.4e-06 1.5e-04±1.2e-06 1.5e-04±4.3e-06 1.5e-04±1.4e-06 1.5e-04±1.0e-06 1.5e-04±1.0e-06 1.5e-04±1.1e-06 1.5e-04±4.0e-07
fselect-keys-4 1.4e-04±1.1e-05 1.4e-04±1.1e-06 1.4e-04±8.9e-06 1.3e-04±1.8e-06 1.4e-04±1.6e-06 1.4e-04±1.0e-06 1.3e-04±1.6e-06 1.3e-04±1.7e-06 1.3e-04±2.9e-06 1.4e-04±1.7e-06 1.4e-04±7.1e-07 1.4e-04±3.4e-06 1.4e-04±2.7e-06 1.4e-04±1.1e-06 1.4e-04±4.0e-06 1.3e-04±2.2e-06 1.4e-04±6.2e-06
fselect-keys-5 1.3e-04±2.4e-06 1.3e-04±2.3e-06 1.4e-04±3.3e-06 1.3e-04±2.3e-06 1.4e-04±5.0e-06 1.4e-04±9.9e-07 1.3e-04±1.4e-06 1.3e-04±7.9e-07 1.4e-04±2.1e-06 1.4e-04±5.1e-06 1.4e-04±3.3e-06 1.4e-04±1.7e-06 1.4e-04±1.5e-06 1.4e-04±1.7e-06 1.4e-04±9.4e-07 1.4e-04±2.3e-06 1.4e-04±2.2e-06
fselect-keys-6 1.3e-04±1.2e-06 1.3e-04±1.1e-06 1.3e-04±7.8e-07 1.3e-04±7.1e-07 1.3e-04±2.5e-06 1.3e-04±4.3e-06 1.3e-04±1.5e-06 1.3e-04±5.5e-07 1.3e-04±5.8e-07 1.3e-04±8.5e-07 1.4e-04±9.4e-07 1.4e-04±1.3e-06 1.4e-04±9.1e-07 1.4e-04±1.5e-06 1.4e-04±4.0e-06 1.4e-04±1.5e-07 1.4e-04±1.1e-06
select-keys 1.3e-04±1.3e-06 1.3e-04±8.8e-07 1.3e-04±1.2e-06 1.3e-04±1.4e-06 1.3e-04±1.2e-06 1.3e-04±1.2e-06 1.4e-04±4.1e-06 1.3e-04±1.9e-06 1.3e-04±1.1e-06 1.4e-04±1.7e-06 1.4e-04±8.7e-07 1.4e-04±1.4e-06 1.4e-04±9.8e-07 1.4e-04±8.5e-07 1.4e-04±1.3e-06 1.4e-04±1.4e-06 1.4e-04±6.7e-07

Measure `select-keys` variants, contains half keys

The next two charts involve scenarios where the hashmap contains half of the elements of the key sequence. For example, the following hashmap contains only half the keys of the key sequence.

{:a 11 :b 22 :c 33 :d 44 :e 55 :f 66}          ;; hashmap
[:a :c :e :g :h :i] ;; key sequence

The hashmap contains keywords :a, :c, and :e, but not :b, :d, nor :f. Furthermore, the key sequence contains three keywords that are not in the hashmap. The output hashmap would be half the size of the input hashmap.

Variants 4 (green triangles) and 6 (gold circles) show 15-20% speedups for both large hashmaps (first chart below) and small hashmaps (second chart below).

(fn [n] ((select-keys-tactic) (get-in large-maps [n :map]) (get-in large-maps [n :half-keys])))

Benchmark measurements for expression `(fn [n] ((select-keys-tactic) (get-in large-maps [n :map]) (get-in large-maps [n :half-keys])))`, time versus 'n' arguments, comparing different versions.

(fn [n] ((select-keys-tactic) (get-in small-maps [n :map]) (get-in small-maps [n :half-keys])))

Benchmark measurements for expression `(fn [n] ((select-keys-tactic) (get-in small-maps [n :map]) (get-in small-maps [n :half-keys])))`, time versus 'n' arguments, comparing different versions.
times in seconds, mean±std
arg, n
version 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
fselect-keys-1 1.4e-04±1.0e-05 1.4e-04±3.8e-06 1.3e-04±1.1e-06 1.4e-04±2.1e-06 1.4e-04±1.4e-06 1.4e-04±1.2e-06 1.4e-04±8.4e-07 1.4e-04±9.6e-07 1.4e-04±9.3e-07 1.4e-04±9.1e-07 1.4e-04±1.2e-06 1.4e-04±9.9e-07 1.4e-04±7.2e-07 1.4e-04±1.2e-06 1.4e-04±2.0e-06 1.4e-04±1.1e-06 1.4e-04±1.1e-06
fselect-keys-2 1.4e-04±1.4e-06 1.4e-04±1.5e-06 1.4e-04±1.1e-06 1.4e-04±1.2e-06 1.4e-04±1.5e-06 1.4e-04±9.5e-07 1.4e-04±2.1e-06 1.4e-04±1.2e-06 1.4e-04±1.3e-06 1.4e-04±1.8e-06 1.4e-04±2.6e-06 1.4e-04±1.1e-06 1.4e-04±1.5e-06 1.5e-04±4.6e-06 1.4e-04±3.5e-06 1.5e-04±3.0e-06 1.5e-04±6.4e-06
fselect-keys-3 1.4e-04±6.6e-07 1.4e-04±1.2e-06 1.5e-04±4.0e-06 1.4e-04±8.6e-07 1.4e-04±1.3e-06 1.4e-04±1.1e-06 1.4e-04±1.3e-06 1.4e-04±8.5e-07 1.4e-04±1.3e-06 1.5e-04±4.9e-06 1.4e-04±2.7e-06 1.4e-04±9.9e-07 1.4e-04±1.7e-06 1.5e-04±1.1e-06 1.4e-04±4.6e-07 1.5e-04±1.7e-06 1.5e-04±3.4e-06
fselect-keys-4 1.3e-04±2.0e-06 1.3e-04±1.9e-06 1.3e-04±1.8e-06 1.3e-04±1.5e-06 1.3e-04±1.2e-06 1.3e-04±6.3e-07 1.3e-04±1.6e-06 1.3e-04±9.0e-07 1.3e-04±1.9e-06 1.3e-04±9.5e-07 1.3e-04±1.1e-06 1.3e-04±9.4e-07 1.3e-04±1.9e-06 1.3e-04±1.5e-06 1.3e-04±4.3e-06 1.3e-04±2.1e-06 1.3e-04±1.7e-06
fselect-keys-5 1.4e-04±9.9e-07 1.4e-04±3.3e-06 1.4e-04±4.8e-06 1.3e-04±1.1e-06 1.3e-04±2.5e-06 1.4e-04±1.4e-06 1.4e-04±2.5e-06 1.4e-04±3.6e-06 1.4e-04±1.1e-06 1.4e-04±3.9e-06 1.4e-04±2.5e-06 1.3e-04±1.1e-06 1.4e-04±2.3e-06 1.4e-04±9.4e-07 1.4e-04±3.8e-06 1.4e-04±2.0e-06 1.4e-04±8.4e-07
fselect-keys-6 1.4e-04±2.0e-06 1.3e-04±2.2e-06 1.3e-04±2.6e-06 1.3e-04±1.5e-06 1.3e-04±3.9e-06 1.3e-04±3.1e-06 1.3e-04±2.1e-06 1.3e-04±6.7e-07 1.3e-04±1.2e-06 1.3e-04±9.7e-07 1.3e-04±1.5e-06 1.4e-04±1.2e-06 1.3e-04±1.6e-06 1.3e-04±1.7e-06 1.3e-04±1.0e-06 1.3e-04±1.7e-06 1.3e-04±1.1e-06
select-keys 1.3e-04±9.3e-07 1.3e-04±1.3e-06 1.3e-04±1.3e-06 1.3e-04±2.5e-06 1.3e-04±4.1e-06 1.3e-04±1.2e-06 1.3e-04±1.0e-06 1.3e-04±9.8e-07 1.3e-04±2.3e-06 1.3e-04±1.9e-06 1.3e-04±4.1e-06 1.3e-04±1.2e-06 1.3e-04±9.4e-07 1.3e-04±1.9e-06 1.3e-04±9.6e-07 1.3e-04±1.3e-06 1.4e-04±2.8e-06

Measure `select-keys` variants, contains no keys

These final two charts show scenarios when none of elements of the key sequence are keys of the hashmap. For example, the following hashmap and key sequence contain zero keys in common.

{:a 11 :b 22 :c 33 :d 44 :e 55 :f 66}                   ;; hashmap
[ :g :h :i :j :k :l] ;; key sequence

The hashmap contains keys :a through :f, while the key sequence contains keys :g through :l. In this case, though the select-keys variants must handle all the elements of the key sequence, the output hashmap would be empty.

Variant 6 (gold circles) shows about a 5% speedup over Clojure's built-in select-keys (magenta diamonds) for large hashmaps (first chart) and pretty much matches it performance for small hashmaps (second chart).

(fn [n] ((select-keys-tactic) (get-in large-maps [n :map]) (get-in large-maps [n :none-keys])))

Benchmark measurements for expression `(fn [n] ((select-keys-tactic) (get-in large-maps [n :map]) (get-in large-maps [n :none-keys])))`, time versus 'n' arguments, comparing different versions.

(fn [n] ((select-keys-tactic) (get-in small-maps [n :map]) (get-in small-maps [n :none-keys])))

Benchmark measurements for expression `(fn [n] ((select-keys-tactic) (get-in small-maps [n :map]) (get-in small-maps [n :none-keys])))`, time versus 'n' arguments, comparing different versions.
times in seconds, mean±std
arg, n
version 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
fselect-keys-1 1.4e-04±1.2e-06 1.3e-04±9.4e-07 1.4e-04±1.4e-06 1.3e-04±1.1e-06 1.3e-04±9.7e-07 1.4e-04±6.2e-07 1.4e-04±2.3e-06 1.4e-04±3.3e-06 1.4e-04±1.0e-06 1.4e-04±1.2e-06 1.4e-04±1.0e-06 1.4e-04±1.1e-06 1.4e-04±9.2e-07 1.4e-04±1.3e-06 1.4e-04±7.6e-07 1.4e-04±2.4e-06 1.4e-04±1.9e-06
fselect-keys-2 1.4e-04±1.4e-06 1.4e-04±3.5e-06 1.4e-04±1.2e-06 1.4e-04±4.6e-06 1.4e-04±1.1e-06 1.4e-04±9.6e-07 1.4e-04±1.1e-06 1.4e-04±1.5e-06 1.4e-04±5.1e-07 1.4e-04±1.6e-06 1.4e-04±7.1e-07 1.4e-04±1.1e-06 1.4e-04±1.1e-06 1.4e-04±1.1e-06 1.4e-04±6.2e-07 1.5e-04±3.1e-06 1.4e-04±2.0e-06
fselect-keys-3 1.4e-04±9.6e-07 1.4e-04±1.1e-06 1.4e-04±8.2e-07 1.5e-04±2.2e-06 1.4e-04±1.4e-06 1.4e-04±7.5e-07 1.4e-04±7.3e-07 1.4e-04±1.9e-06 1.4e-04±1.5e-06 1.4e-04±1.3e-06 1.5e-04±1.6e-06 1.4e-04±1.1e-06 1.4e-04±9.8e-07 1.4e-04±9.1e-07 1.4e-04±1.2e-06 1.4e-04±1.1e-06 1.5e-04±2.3e-06
fselect-keys-4 1.3e-04±1.5e-06 1.3e-04±2.0e-06 1.3e-04±1.4e-06 1.3e-04±1.9e-06 1.3e-04±1.3e-06 1.3e-04±1.7e-06 1.3e-04±3.5e-06 1.3e-04±2.9e-06 1.3e-04±1.2e-06 1.3e-04±3.7e-06 1.3e-04±9.2e-07 1.3e-04±1.2e-06 1.4e-04±2.6e-06 1.4e-04±3.2e-06 1.3e-04±1.1e-06 1.4e-04±3.8e-06 1.3e-04±1.5e-06
fselect-keys-5 1.4e-04±5.8e-06 1.4e-04±2.4e-06 1.3e-04±1.4e-06 1.4e-04±2.1e-06 1.4e-04±1.9e-06 1.4e-04±2.2e-06 1.4e-04±2.7e-06 1.4e-04±1.4e-06 1.3e-04±8.6e-07 1.4e-04±1.7e-06 1.4e-04±3.7e-06 1.4e-04±2.7e-06 1.4e-04±2.6e-06 1.3e-04±1.6e-06 1.3e-04±2.6e-06 1.3e-04±1.9e-06 1.4e-04±1.5e-06
fselect-keys-6 1.3e-04±2.3e-06 1.3e-04±1.1e-06 1.3e-04±7.6e-07 1.3e-04±1.0e-06 1.3e-04±1.2e-06 1.3e-04±6.7e-07 1.3e-04±9.5e-07 1.3e-04±9.4e-07 1.4e-04±5.5e-06 1.3e-04±8.4e-07 1.3e-04±1.3e-06 1.3e-04±7.5e-07 1.3e-04±4.5e-06 1.3e-04±9.4e-07 1.3e-04±9.5e-07 1.3e-04±1.4e-06 1.3e-04±1.5e-06
select-keys 1.3e-04±1.0e-06 1.3e-04±1.3e-06 1.3e-04±1.1e-06 1.3e-04±1.1e-06 1.3e-04±1.2e-06 1.4e-04±3.8e-06 1.3e-04±1.5e-06 1.3e-04±1.0e-06 1.3e-04±8.8e-07 1.3e-04±1.7e-06 1.3e-04±1.5e-06 1.3e-04±1.3e-06 1.3e-04±7.9e-07 1.3e-04±1.0e-06 1.3e-04±9.5e-07 1.4e-04±2.7e-06 1.3e-04±9.8e-07