speculoos.core

This namespace provides functions to validate Clojure data. They operate on any heterogeneous, arbitrarily-nested data structure.

Terminology:

  • data: A heterogeneous, arbitrarily-nested Clojure data structure that represents information.
  • scalar: A non-divisible datum, such as a number, string, boolean, etc.
  • collection: A composite data structure, such as a vector, list, map, set, lazy-sequence, etc., composed of scalars and other collections.
  • specification: A human- and machine-readable description of data.
  • validate: To systematically apply predicates to datums.
  • valid: All datums satisfy their corresponding predicates; more specifically, zero invalid datum-predicate pairs.
  • path: A vector of indexes/keys that uniquely locate a datum.
  • predicate: A function that returns true/false, usually 1-arity, though particular circumstances may call for more arguments.
  • ordinal: A mode of operation wherein a nested collection’s path considers only its ordering relative to other collections.

Remember three Mottos:

  1. Validate scalars separately from validating collections.
  2. Make the specification mimic the shape of the data.
  3. Validation ignores un-paired predicates and un-paired datums.

all-paths

(all-paths form)
(all-paths form accumulator path container-type)

Returns a vector of {:path _ :value _} maps, one for each value in form, a heterogeneous, arbitrarily-nested data structure, including both scalars (e.g., numbers, strings, etc.) and collections (e.g., lists, vectors, maps, sets). Paths are suitable for consumption by get-in*, update-in*, assoc-in*, and the like. The outermost root element is located by MapEntry [:path []].

Note: The 4-arity version is a recursion target and not intended to be called.

Examples:

;; vector path elements are zero-indexed integers
(all-paths [42 :foo 22/7])
;; => [{:path [], :value [42 :foo 22/7]}
;;     {:path [0], :value 42}
;;     {:path [1], :value :foo}
;;     {:path [2], :value 22/7}]

;; map path elements are keys (often keywords, but not always)
(all-paths {:a 11 :b 22})
;; => [{:path [], :value {:a 11, :b 22}}
;;     {:path [:a], :value 11}
;;     {:path [:b], :value 22}]

;; list path elements are zero-indexed integers
(all-paths (list 11 22))
;; => [{:path [], :value (11 22)}
;;     {:path [0], :value 11}
;;     {:path [1], :value 22}]

;; set path elements are the values themselves
(all-paths #{:red :blue})
;; => [{:path [], :value #{:red :blue}}
;;     {:path [:red], :value :red}
;;     {:path [:blue], :value :blue}]

;; heterogeneous, nested collections; multi-element paths composed of integer indexes and keys
(all-paths [42 {:a 'foo}])
;; => [{:path [], :value [42 {:a foo}]}
;;     {:path [0], :value 42}
;;     {:path [1], :value {:a foo}}
;;     {:path [1 :a], :value foo}]

(all-paths {:x #{99}})
;; => [{:path [], :value {:x #{99}}}
;;     {:path [:x], :value #{99}}
;;     {:path [:x 99], :value 99}]
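
The traversal idea can be sketched with clojure.core alone. The name sketch-all-paths is hypothetical and this is not the library's implementation: it assumes terminating collections and ignores the accumulator and container-type arguments of the 4-arity version.

```clojure
;; Hypothetical, simplified stand-in for all-paths; not speculoos' actual code.
;; Assumes every collection terminates.
(defn sketch-all-paths
  ([form] (sketch-all-paths form []))
  ([form path]
   (let [children (cond
                    (map? form)        (seq form)                    ;; [key value] entries
                    (set? form)        (map (fn [x] [x x]) form)     ;; set members address themselves
                    (sequential? form) (map-indexed vector form)     ;; [index element] pairs
                    :else              nil)]                         ;; scalars have no children
     (into [{:path path :value form}]
           (mapcat (fn [[k v]] (sketch-all-paths v (conj path k))))
           children))))

(sketch-all-paths [42 {:a 'foo}])
;; => [{:path [], :value [42 {:a foo}]}
;;     {:path [0], :value 42}
;;     {:path [1], :value {:a foo}}
;;     {:path [1 :a], :value foo}]
```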

clamp

(clamp c1 c2)

Given two sequences c1 and c2, if either (isa? (type _) :speculoos/non-terminating), clamp its size at the count of the other, stuff its contents into a vector, and return both as [new-c1 new-c2]. If neither is :speculoos/non-terminating, return [c1 c2] unchanged. Supplying two non-terminating sequences throws.

Examples:

(clamp [:a :b :c :d :e] (range)) ;; => [[:a :b :c :d :e] [0 1 2 3 4]]
(clamp [] (repeat 42)) ;; => [[] []]
(clamp (iterate dec 0) (list 'foo 'bar 'baz)) ;; => [[0 -1 -2] (foo bar baz)]

;; neither non-terminating; args pass through unchanged
(clamp [1 2 3] (list :a \z)) ;; => [[1 2 3] (:a \z)]

;; sequence is only possibly non-terminating; actual argument `c2` is shorter
(clamp [1 2 3 4 5] (range 3)) ;; => [[1 2 3 4 5] [0 1 2]]

only-invalid

(only-invalid validations)

Returns only validation entries where :datum does not satisfy :predicate, i.e., :valid? is false or nil.

Examples:

(only-invalid (validate-scalars [42   :foo     22/7   ]
                                [int? keyword? symbol?]))
;; => ({:path [2], :datum 22/7, :predicate symbol?, :valid? false})

(only-invalid (validate-collections [42    (list :foo)]
                                    [list? [list?]    ]))
;; => ({:path [0], :value list?, :datum [42 (:foo)], :ordinal-parent-path [], :valid? false})

only-valid

(only-valid validations)

Returns only validation entries where :datum satisfies :predicate, i.e., :valid? is neither false nor nil.

Examples:

(only-valid (validate-scalars [42       :foo    22/7  ]
                              [decimal? symbol? ratio?]))
;; => ({:path [2], :datum 22/7, :predicate ratio?, :valid? true})

(only-valid (validate-collections [42    (list :foo)]
                                  [list? [list?]    ]))
;; => ({:path [1 0], :value list?, :datum (:foo), :ordinal-parent-path [0], :valid? true})

ordinal-get

multimethod

(ordinal-get coll i)

Performs the same task as get*, but when encountering a vector or list, considers only elements that are collections. The element is addressed by an ordinal path. Map elements are addressed by keys, as usual. (Keys may themselves be integers, or a composite value.) Set elements are addressed by their identities.

Examples:

(ordinal-get [11 [22] 33 [44] [55] 66 [77]] 0) ;; => [22]
(ordinal-get [11 [22] 33 [44] [55] 66 [77]] 2) ;; => [55]
(ordinal-get [11 [22] 33 [44] [55] 66 [77]] 3) ;; => [77]
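
For a single vector or list, ordinal addressing amounts to counting only the elements that are collections. A minimal sketch with clojure.core follows; sketch-ordinal-get is a hypothetical name, and the sketch does not cover the map and set dispatches of the multimethod.

```clojure
;; Hypothetical one-level sketch of ordinal addressing for vectors and lists.
;; Scalars are skipped; only nested collections are counted.
(defn sketch-ordinal-get [coll i]
  (nth (filter coll? coll) i nil))

(sketch-ordinal-get [11 [22] 33 [44] [55] 66 [77]] 0) ;; => [22]
(sketch-ordinal-get [11 [22] 33 [44] [55] 66 [77]] 2) ;; => [55]
```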

ordinal-get-in

(ordinal-get-in c keys)

A get-in* that, when encountering a vector or list, considers only elements that are collections. Map and set addresses work as usual.

Examples:

(ordinal-get-in [42 [:foo] 99 [:bar] 33 [:baz]] [2]) ;; => [:baz]
(ordinal-get-in {:a [[42] [77] ['hello]]} [:a 2]) ;; => [hello]

recover-literal-path

(recover-literal-path form ord-path)

Given heterogeneous, arbitrarily-nested data structure form and ordinal collection path ord-path, returns the literal path to the nested child collection.

Examples:

(recover-literal-path [11 [22] 33 [44] 55 [66]] [2])
;; => [5]

(recover-literal-path {:a {:b [11 [22] 33 [44]]}}
                      [:a :b 1])
;; => [:a :b 3]

(recover-literal-path (list 11 22 [33 [44] [55] [66]])
                      [0 2])
;; => [2 3]
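
At a single level of a vector or list, converting an ordinal index to a literal index can be sketched as below. sketch-literal-index is a hypothetical helper built only on clojure.core; it handles one level and does not descend through maps or nested paths as recover-literal-path does.

```clojure
;; Hypothetical one-level helper: literal index of the nested collection
;; found at ordinal position `ordinal` within a vector or list.
(defn sketch-literal-index [coll ordinal]
  (->> coll
       (keep-indexed (fn [idx x] (when (coll? x) idx))) ;; literal indexes of nested collections
       (drop ordinal)
       first))

(sketch-literal-index [11 [22] 33 [44] 55 [66]] 2) ;; => 5
```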

reduce-indexed

(reduce-indexed f coll)
(reduce-indexed f val coll)

Systematically apply f to elements of coll, carrying an index along with the accumulating value, analogous to how map-indexed relates to map. Function f should be a function of 3 arguments: zero-indexed integer, the accumulating value, and the next element of coll. coll may be any Clojure collection.

If val is not supplied:

  • reduce-indexed returns the result of applying f to the first 2 items in coll, then applying f to that result and the 3rd item, etc.
  • If coll contains zero items, f must accept no arguments as well, and reduce-indexed returns the result of calling f with no arguments.
  • If coll has only one item, it is returned and f is not called.

If val is supplied:

  • reduce-indexed returns the result of applying f to val and the first item in coll, then applying f to that result and the 2nd item, etc.
  • If coll contains zero items, returns val and f is not called.

Note:

  1. Implemented with only first, next, etc., and therefore currently makes no consideration for performance.
  2. Hash-map and array-map elements are peeled off as instances of clojure.lang.MapEntry, a pseudo-vector of [key value].
  3. Set elements are consumed in an un-defined order.

Examples:

(reduce-indexed #(conj %2 (vector %1 %3)) [:initial] [:item1 :item2 :item3])
;; => [:initial [0 :item1] [1 :item2] [2 :item3]]

(reduce-indexed #(assoc %2 %3 %1) {:init-val 99} [:a :b :c])
;; => {:init-val 99, :a 0, :b 1, :c 2}
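
The 3-arity behavior can be approximated with reduce and map-indexed. This is a sketch under the argument order described above (index, accumulator, element); sketch-reduce-indexed is a hypothetical name, and the real function is implemented differently, per the first note.

```clojure
;; Hypothetical equivalent of the (reduce-indexed f val coll) arity.
;; f receives: zero-based index, accumulating value, next element.
(defn sketch-reduce-indexed [f val coll]
  (reduce (fn [acc [idx x]] (f idx acc x))
          val
          (map-indexed vector coll)))

(sketch-reduce-indexed #(conj %2 (vector %1 %3)) [:initial] [:item1 :item2 :item3])
;; => [:initial [0 :item1] [1 :item2] [2 :item3]]
```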

valid-collections?

(valid-collections? data spec)

Following validation with validate-collections, returns true if every collection element in data satisfies every corresponding predicate in collection specification spec, false otherwise.

Note: valid-collections? returns true if validation returns zero {:valid? falsey} results.

Note: If a corresponding specification predicate does not exist, that element of data will not be checked. Use collections-without-predicates to locate elements of data that lack corresponding predicates in spec. Use thoroughly-valid-collections? to require that every collection in data is validated.

See validate-collections for details on the mechanics of collection validation.

Examples:

(valid-collections? [      42 [        :foo]]  ;; <-- data
                    [list?    [vector?     ]]) ;; <-- specification
;; => false

(valid-collections? {                 :a 42 :b {                 :c 'foo}} ;; <-- data
                    {:outer-coll map?       :b {:inner-coll map?        }});; <-- specification
;; => true

;; un-paired datum; nested vector in data not tested
(valid-collections? [        11     [22]]
                    [vector?            ])
;; => true

valid-macro?

(valid-macro? macro-args spec)

Returns true if macroexpansion fully satisfies scalar specification spec. Supply macro-args as if to macroexpand-1 itself, i.e., `(macro-name arg1 arg2 ...).

Note 1: Many entities that appear to be a function in a macro expansion are in fact symbols.

Note 2: Macro expansion behaves subtly differently in the CIDER nREPL than from the CLI, e.g., $ lein test. Use syntax quote ` as a workaround.

Use validate-macro-with to produce a detailed validation report.

Example:

(defmacro example-macro [f & args] `(~f ~@args))

(macroexpand-1 `(example-macro + 1 2 3)) ;; => (clojure.core/+ 1 2 3)

(def example-macro-spec (list symbol? number? number? number?))

(valid-macro? `(example-macro + 1 2 3) example-macro-spec) ;; => true

valid-scalars?

(valid-scalars? data spec)

Following validation with validate-scalars, returns true if every scalar element in data satisfies every corresponding predicate in scalar specification spec, false otherwise.

Note: valid-scalars? returns true if validation returns zero {:valid? falsey} results.

Note: If a corresponding specification predicate does not exist, that element of data will not be checked. Use scalars-without-predicates to locate elements of data that lack corresponding predicates in spec. Use thoroughly-valid-scalars? to require that every scalar in data is validated.

See validate-scalars for details on the mechanics of scalar validation.

Examples:

(valid-scalars? [42   :foo     22/7  ]  ;; <-- data
                [int? keyword? ratio?]) ;; <-- specification
;; => true

(valid-scalars? {:a 42 :b 'foo}
                {:a string? :b symbol?}) ;; => false

;; un-paired datums
(valid-scalars? [42 :foo 22/7]
                [int?        ]) ;; => true

;; un-paired predicates
(valid-scalars? {:a 42     }
                {:b symbol?}) ;; => true
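
The summary rule from the first note (true when there are zero {:valid? falsey} results) can be sketched over hand-written validation entries. The entries and the name sketch-all-valid? below are illustrative assumptions, not output captured from the library.

```clojure
;; Hypothetical summary over a sequence of validation entries.
;; Truthy :valid? values (e.g., a regex match string) count as valid.
(defn sketch-all-valid? [validations]
  (every? :valid? validations))

(sketch-all-valid? [{:path [0] :datum 42 :valid? true}
                    {:path [1] :datum "foo" :valid? "foo"}]) ;; => true

(sketch-all-valid? [{:path [0] :datum 42 :valid? false}])    ;; => false

;; zero datum-predicate pairs means zero invalid results
(sketch-all-valid? [])                                        ;; => true
```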

valid?

(valid? data scalar-spec collection-spec)

Following validations with validate-scalars and then with validate-collections, returns true if data satisfies every corresponding predicate in scalar specification scalar-spec and every corresponding predicate in collection specification collection-spec, false otherwise.

valid? provides a combined interface to valid-scalars? and valid-collections?. Scalar validation and collection validation are performed in completely distinct operations. Their results are merely combined into a single true/false high-level summary. Use validate to generate a detailed report of scalar and collection validation.

Note: valid? returns true if validations return zero {:valid? falsey} results.

See validate-scalars and validate-collections for details on the mechanics of validation.

Examples:

(valid? [42 [:foo [22/7]]]             ;; data
        [int? [keyword? [ratio?]]]     ;; scalar specification
        [vector? [vector? [vector?]]]) ;; collections specification
;; => true

(valid? {:a 42 :b {:c ['foo true]}}           ;; data
        {:a int? :b {:c [keyword? boolean?]}} ;; scalar specification
        {:root-coll map? :b {:c [vector?]}})  ;; collection specification
;; => false

validate

(validate data scalar-spec collection-spec)

Perform a scalar validation of data using scalar specification scalar-spec, then immediately perform a collection validation of data using collection specification collection-spec, then return the merged vector of each result. See validate-scalars and validate-collections.

Remember three Mottos:

  1. Validate scalars separately from validating collections.
  2. Make the specification mimic the shape of the data.
  3. Validation ignores un-paired predicates and un-paired datums.

validate performs two separate validations, in two distinct steps, then returns a single summary that merges both results. First, data’s scalars are validated, then data’s collections are validated. Finally, the results of those two distinct validations are merged into a comprehensive summary.

Examples:

;; only scalar validation with `validate-scalars`
(validate-scalars [42]    ;; data
                  [int?]) ;; scalar specification
;; => [{:path [0], :datum 42, :predicate int?, :valid? true}]

;; only collection validation with `validate-collections`
(validate-collections [42]       ;; data
                      [vector?]) ;; collection specification
;; => ({:path [0], :value vector?, :datum [42], :ordinal-parent-path [], :valid? true})

;; scalar validation, then collection validation, with a single invocation
(validate [42]       ;; data
          [int?]     ;; scalar specification
          [vector?]) ;; collection specification
;; => ({:path [0], :datum 42, :predicate int?, :valid? true}
;;     {:path [0], :value vector?, :datum [42], :ordinal-parent-path [], :valid? true})


;; only scalar validation with `validate-scalars`
(validate-scalars {:a 11}       ;; data
                  {:a string?}) ;; scalar specification
;; => [{:path [:a], :datum 11, :predicate string?, :valid? false}]

;; only collection validation with `validate-collections`
(validate-collections {                 :a 11}  ;; data
                      {:coll-type? map?      }) ;; collection specification
;; => ({:path [:coll-type?], :value map?, :datum {:a 11}, :ordinal-parent-path [], :valid? true})

;; scalar validation, then collection validation, with a single invocation
(validate {:a 11}             ;; data
          {:a string?}        ;; scalar specification
          {:coll-type? map?}) ;; collection specification
;; => ({:path [:a], :datum 11, :predicate string?, :valid? false}
;;     {:path [:coll-type?], :value map?, :datum {:a 11}, :ordinal-parent-path [], :valid? true})

validate-collections

(validate-collections data spec)

Returns a sequence of {:path-predicate _ :predicate _ :path-datum _ :datum _ :ordinal-path-datum _ :valid? _} hash-maps for every collection datum in data with a corresponding predicate in collection specification spec. data is an arbitrarily-nested, heterogeneous data structure. spec is a corresponding ‘shape’, i.e., all nested structures are of the same type and position. Only elements of spec that satisfy fn? are used. validate-collections descends into all nested collections. validate-collections only validates complete datum-predicate pairs, i.e., only collections in data that have a corresponding predicate in spec. See valid-collections? and thoroughly-valid-collections? for high-level summaries of collection validation.

  • :path-predicate is a vector suitable for sending to get-in*, assoc-in*, update-in*, etc., that locates the predicate within the specification.
  • :path-datum is the literal path to the datum within data to which the collection predicate is applied.
  • :ordinal-path-datum is a vector suitable for sending to ordinal-get and ordinal-get-in, which locates the collection within data to which the collection predicate is applied.
  • :predicate is a 1-arity function which returns truthy/falsey.
  • :datum is the collection entity in data.
  • :valid? is the result of invoking the predicate with the collection datum.

The ordering of results is an implementation detail and not specified.

Remember three Mottos:

  1. Validate scalars separately from validating collections.
  2. Make the specification mimic the shape of the data.
  3. Validation ignores un-paired predicates and un-paired datums.

Predicates at path within the collection specification are applied to the collection located at (drop-last path) within data. Generally, the predicate is applied to the ‘parent’ collection that contains it.

Examples:

(validate-collections [42      [99     ]]  ;; <-- data
                      [vector? [vector?]]) ;; <-- specification
;; => ({:datum [42 [99]], :valid? true, :path-predicate [0], :predicate vector?, :ordinal-path-datum [], :path-datum []}
;;     {:datum [99], :valid? true, :path-predicate [1 0], :predicate vector?, :ordinal-path-datum [0], :path-datum [1]})

;; predicate `vector?` at path [0] in specification is applied to the collection at path (drop-last [0]) in data
;; predicate `vector?` at path [1 0] in specification is applied to the collection at path (drop-last [1 0]) in data

(validate-collections {                :a 42 :b {                   :c 99}}
                      {:root-coll map?       :b {:child-coll? list?      }})
;; => ({:datum {:a 42, :b {:c 99}}, :valid? true, :path-predicate [:root-coll], :predicate map?, :ordinal-path-datum [], :path-datum []}
;;     {:datum {:c 99}, :valid? false, :path-predicate [:b :child-coll?], :predicate list?, :ordinal-path-datum [:b], :path-datum [:b]})

;; predicate `map?` at path [:root-coll] in specification is applied to the collection at path (drop-last [:root-coll]) in data
;; predicate `list?` at path [:b :child-coll?] in specification is applied to the collection at path (drop-last [:b :child-coll?]) in data

Only complete collection-predicate pairs are validated. Un-paired collections and un-paired predicates are ignored.

;; nested vector in data is not paired with a corresponding predicate in specification
(validate-collections [11 22 33 [44 55 66]] ;;  <-- data
                      [vector?  [        ]]) ;; <-- specification
;; => ({:datum [11 22 33 [44 55 66]], :valid? true, :path-predicate [0], :predicate vector?, :ordinal-path-datum [], :path-datum []})

;; specification's map does not contain a predicate that corresponds to data's outer map
(validate-collections {:a 11 :b [22 33]}  ;; <-- data
                      {      :b [list?]}) ;; <-- specification
;; => ({:datum [22 33], :valid? false, :path-predicate [:b 0], :predicate list?, :ordinal-path-datum [:b], :path-datum [:b]})

;; un-paired list? and set? predicates in collection specification are ignored
(validate-collections [99] [vector? [list?] [set?]])
;; => ({:datum [99], :valid? true, :path-predicate [0], :predicate vector?, :ordinal-path-datum [], :path-datum []})

Note: (Possibly) non-terminating sequences are clamped at the length of the corresponding element in the other collection. Therefore, if there are fewer spec predicates than elements to be tested in data, you must pad the collection specification, e.g., with (constantly true), to force validation of those datums. (I don’t like this policy, but I don’t have a better heuristic by which to clamp a non-terminating sequence with in-band information.)

;; non-terminating specification is clamped at the length of the data
(validate-collections [[11] [22] [33]]
                      (repeat [vector?]))
;; => ({:datum [11], :valid? true, :path-predicate [0 0], :predicate vector?, :ordinal-path-datum [0], :path-datum [0]}
;;     {:datum [22], :valid? true, :path-predicate [1 0], :predicate vector?, :ordinal-path-datum [1], :path-datum [1]}
;;     {:datum [33], :valid? true, :path-predicate [2 0], :predicate vector?, :ordinal-path-datum [2], :path-datum [2]})

;; non-terminating data is clamped at the length of the specification
(validate-collections (cycle [[11] [22] [33]])
                      [[vector?]])
;; => ({:datum [11], :valid? true, :path-predicate [0 0], :predicate vector?, :ordinal-path-datum [0], :path-datum [0]})

;; only the first nested vector is validated because the data's non-terminating sequence
;; was clamped to the length of the specification

;; padding the specification to catch the full cycle of the data
(validate-collections (cycle [[11] [22] [33]])
                      [[vector?] [any?] [any?]])
;; => ({:datum [11], :valid? true, :path-predicate [0 0], :predicate vector?, :ordinal-path-datum [0], :path-datum [0]}
;;     {:datum [22], :valid? true, :path-predicate [1 0], :predicate any?, :ordinal-path-datum [1], :path-datum [1]}
;;     {:datum [33], :valid? true, :path-predicate [2 0], :predicate any?, :ordinal-path-datum [2], :path-datum [2]})

Overview of the algorithm.

  1. Run all-paths on the data.
    (all-paths [11 {:b 22} [[33]]])
    ;; => [{:path [], :value [11 {:b 22} [[33]]]}
    ;;     {:path [0], :value 11}
    ;;     {:path [1], :value {:b 22}}
    ;;     {:path [1 :b], :value 22}
    ;;     {:path [2], :value [[33]]}
    ;;     {:path [2 0], :value [33]}
    ;;     {:path [2 0 0], :value 33}]
    
    ;; Seven total elements: four collections, three scalars.
    
  2. Run all-paths on the specification.
    (all-paths [vector? {:coll-type? map?} [[list?]]])
    ;; => [{:path [], :value [vector? {:coll-type? map?} [[list?]]]}
    ;;     {:path [0], :value vector?}
    ;;     {:path [1], :value {:coll-type? map?}}
    ;;     {:path [1 :coll-type?], :value map?}
    ;;     {:path [2], :value [[list?]]}
    ;;     {:path [2 0], :value [list?]}
    ;;     {:path [2 0 0], :value list?}]
    
    ;; Seven total elements: four collections, three predicates.
    
  3. Remove scalar elements from the data.

    (filter #(coll? (:value %)) (all-paths [11 {:b 22} [[33]]]))
    ;; => ({:path [], :value [11 {:b 22} [[33]]]}
    ;;     {:path [1], :value {:b 22}}
    ;;     {:path [2], :value [[33]]}
    ;;     {:path [2 0], :value [33]})
    
    ;; Four collections in data.
    
  4. Remove collections from the specification.

    (only-fns (all-paths [vector? {:coll-type? map?} [[list?]]]))
    ;; => [{:path [0], :value vector?}
    ;;     {:path [1 :coll-type?], :value map?}
    ;;     {:path [2 0 0], :value list?}]
    
    ;; Three predicates in specification.
    
  5. Associate predicates in the specification with collections in the data. The collections in the data correspond to the container/parent of the predicate within the specification. Basically, the predicate at path in the specification will be applied to the collection at (drop-last path) in the data.

    ;; predicate `vector?` at path [0] in spec is paired with entity at path (drop-last [0]) in data
    ;; predicate `map?` at path [1 :coll-type?] in spec is paired with entity at path (drop-last [1 :coll-type?]) in data
    ;; predicate `list?` at path [2 0 0] in spec is paired with entity at path (drop-last [2 0 0]) in data
    ;; the nested vector at path [2] in data does not have a corresponding predicate in specification; it will not be validated
    
  6. For each collection-predicate pair, apply the predicate to the collection.

    (vector? [11 {:b 22} [[33]]]) ;; => true
    (map? {:b 22}) ;; => true
    (list? [33]) ;; => false
    
    ;; or, all at once             v-----------------------v-----v--------- these scalars in data are ignored
    (validate-collections [        11 {                 :b 22} [[33   ]]]  ;; <-- data
                          [vector?    {:coll-type? map?      } [[list?]]]) ;; <-- specification
    ;;                     ^-----------------------^-------------^--------- these predicates in spec are applied
    ;;                                                                      to the parent containers in data
    
    ;; => ({:datum [11 {:b 22} [[33]]], :valid? true, :path-predicate [0], :predicate vector?, :ordinal-path-datum [], :path-datum []}
    ;;     {:datum {:b 22}, :valid? true, :path-predicate [1 :coll-type?], :predicate map?, :ordinal-path-datum [0], :path-datum [1]}
    ;;     {:datum [33], :valid? false, :path-predicate [2 0 0], :predicate list?, :ordinal-path-datum [1 0], :path-datum [2 0]})
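
Steps 5 and 6 can be sketched with clojure.core alone. The names below are hypothetical, and plain get-in suffices only because this particular data is built from vectors and maps; the library itself refers to get-in* for the general case.

```clojure
(def sketch-coll-data [11 {:b 22} [[33]]])

;; [path-in-specification predicate] pairs, as found in step 4
(def sketch-predicate-paths [[[0] vector?]
                             [[1 :coll-type?] map?]
                             [[2 0 0] list?]])

;; Apply each predicate to the collection at (drop-last path) in the data.
(def sketch-coll-results
  (mapv (fn [[path pred]]
          (let [parent-path (vec (drop-last path))]
            {:path-predicate path
             :path-datum     parent-path
             :valid?         (pred (get-in sketch-coll-data parent-path))}))
        sketch-predicate-paths))

(mapv (juxt :path-datum :valid?) sketch-coll-results)
;; => [[[] true] [[1] true] [[2 0] false]]
```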
    

validate-macro-with

(validate-macro-with macro-args spec)

Returns results of validating the macroexpansion of a macro and arguments against scalar specification spec. Supply macro-args as if to macroexpand-1 itself, i.e., `(macro-name arg1 arg2 ...).

Note: Many entities that appear to be a function in a macro expansion are, in fact, symbols.

Use valid-macro? to produce a high-level summary result.

Example:

(defmacro example-macro [f & args] `(~f ~@args))

(macroexpand-1 `(example-macro + 1 2 3)) ;; => (clojure.core/+ 1 2 3)

(def example-macro-spec (list symbol? number? number? number?))

(validate-macro-with `(example-macro + 1 2 3) example-macro-spec)
;; => [{:path [0], :datum clojure.core/+, :predicate symbol?, :valid? true}
;;     {:path [1], :datum 1, :predicate number?, :valid? true}
;;     {:path [2], :datum 2, :predicate number?, :valid? true}
;;     {:path [3], :datum 3, :predicate number?, :valid? true}]

validate-scalars

multimethod

Returns a sequence of {:path _ :datum _ :predicate _ :valid? _} hash-maps for every scalar datum in data with a corresponding predicate in scalar specification spec. data is a heterogeneous, arbitrarily-nested data structure of arbitrary values. spec is a corresponding ‘shape’, i.e., all nested structures are of the same type and length, containing predicates to test against. validate-scalars recursively descends into all nested collections. Only validates complete datum-predicate pairs, i.e., only nodes that are in both data and in spec. See valid-scalars? and thoroughly-valid-scalars? for high-level summaries of scalar validation.

  • :path is a vector suitable for sending to get-in*, assoc-in*, update-in*, and friends.
  • :datum is the scalar entity in data.
  • :predicate is a 1-arity function-like thing that returns truthy/falsey, i.e., regexes and sets may serve as predicates.
  • :valid? is the result of invoking the predicate with the scalar datum.

The ordering of results is an implementation detail and not specified.

Remember three Mottos:

  1. Validate scalars separately from validating collections.
  2. Make the specification mimic the shape of the data.
  3. Validation ignores un-paired predicates and un-paired datums.

Examples:

(validate-scalars [42   :foo     \c   ] ;; <-- data
                  [int? keyword? char?]) ;; <-- specification
;; => [{:path [0], :datum 42, :predicate int?, :valid? true}
;;     {:path [1], :datum :foo, :predicate keyword?, :valid? true}
;;     {:path [2], :datum \c, :predicate char?, :valid? true}]

(validate-scalars {:a 42    :b 'foo  }  ;; <-- data
                  {:a int? :b symbol?}) ;; <-- specification
;; => [{:path [:a], :datum 42, :predicate int?, :valid? true}
;;     {:path [:b], :datum foo, :predicate symbol?, :valid? true}]

;; nested data and specification
(validate-scalars [42     {:z 'baz}    ]
                  [ratio? {:z keyword?}])
;; => [{:path [0], :datum 42, :predicate ratio?, :valid? false}
;;     {:path [1 :z], :datum baz, :predicate keyword?, :valid? false}]

;; data and specification not same length
(validate-scalars [42 :foo 22/7]
                  [int?        ])
;; => [{:path [0], :datum 42, :predicate int?, :valid? true}]

(validate-scalars [42                     ]
                  [decimal? keyword? char?])
;; => [{:path [0], :datum 42, :predicate decimal?, :valid? false}]

;; regular expression predicate
(validate-scalars [ "foo"]
                  [#"f.."])
;; => [{:path [0], :datum "foo", :predicate #"f..", :valid? "foo"}]

;; set as a membership predicate
(validate-scalars [:green              ]
                  [#{:red :green :blue}])
;; => [{:path [0], :datum :green, :predicate #{:green :red :blue}, :valid? :green}]

Within a scalar specification, a bare regular expression literal #"..." is automatically treated as #(re-matches #"..." %).

(validate-scalars ["abc" "xyz"]
                  [#"a.c" #"^[wxyz]{3}$"])
;; => [{:path [0], :datum "abc", :predicate #"a.c", :valid? "abc"}
;;     {:path [1], :datum "xyz", :predicate #"^[wxyz]{3}$", :valid? "xyz"}]
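
The :valid? values above are strings rather than true because re-matches returns the matched string on success and nil on failure. The following shows that clojure.core behavior directly, independent of the library:

```clojure
;; re-matches succeeds only when the entire string matches the pattern
(re-matches #"a.c" "abc")            ;; => "abc"  (truthy)
(re-matches #"a.c" "abcd")           ;; => nil    (falsey)
(re-matches #"^[wxyz]{3}$" "xyz")    ;; => "xyz"
```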

Within a scalar specification, a set is treated as a membership predicate when the data at that same path in the data contains a scalar…

(validate-scalars [11    :red]
                  [int? #{:blue :green :red}]) ;; <-- set in specification, but not in data
;; => [{:path [1], :datum :red, :predicate #{:green :red :blue}, :valid? :red}
;;     {:path [0], :datum 11, :predicate int?, :valid? true}]

… whereas a set in the scalar specification at the same path as a set in the data is treated as a regular nested collection: The set mimics a container in the data. Any predicates within the specification set are applied to all scalars contained in the data as if #(every? keyword? %). The key is changed from :datum to :datums-set to emphasize this behavior.

(validate-scalars [11   #{:tea :coffee :water}]  ;; <-- sets in both data...
                  [int? #{keyword?}           ]) ;; <-- ... and in specification
;; => ({:path [0], :datum 11, :predicate int?, :valid? true}
;;     {:path [1], :datums-set #{:coffee :tea :water}, :predicate keyword?, :valid? true})

Non-terminating sequences in data are acceptable as long as the corresponding sequence (i.e., at the same path) in spec terminates, and vice versa.

(validate-scalars (cycle [42 'foo 22/7])   ;; <-- data is an infinite sequence
                  [int? keyword? ratio?])  ;; <-- specification terminates
;; => [{:path [0], :datum 42, :predicate int?, :valid? true}
;;     {:path [1], :datum foo, :predicate keyword?, :valid? false}
;;     {:path [2], :datum 22/7, :predicate ratio?, :valid? true}]

(validate-scalars [11 22 33]     ;; <-- data terminates
                  (repeat int?)) ;; <-- specification is an infinite sequence
;; => [{:path [0], :datum 11, :predicate int?, :valid? true}
;;     {:path [1], :datum 22, :predicate int?, :valid? true}
;;     {:path [2], :datum 33, :predicate int?, :valid? true}]

Overview of the algorithm.

  1. Run all-paths on the data.

    ;; data is a three-element vector composed of an integer, a symbol, and a ratio
    (all-paths [42 'foo 22/7])
    ;; => [{:path [], :value [42 foo 22/7]}
    ;;     {:path [0], :value 42}
    ;;     {:path [1], :value foo}
    ;;     {:path [2], :value 22/7}]
    
    ;; Four total elements: one vector and three scalars.
    
  2. Run all-paths on the specification.

    ;; specification is a two-element vector composed of an `int?` predicate and a `keyword?` predicate
    ;; will only validate first and second elements of data
    (all-paths [int? keyword?])
    ;; => [{:path [], :value [int? keyword?]}
    ;;     {:path [0], :value int?}
    ;;     {:path [1], :value keyword?}
    
    ;; Three total elements: one vector and two scalars (i.e., predicate functions).
    
  3. Remove collection elements from each result.

    ;; remove collections from data's all-paths sequence, keeping only non-collections
    (only-non-collections [{:path [], :value [42 'foo 22/7]}
                           {:path [0], :value 42}
                           {:path [1], :value 'foo}
                           {:path [2], :value 22/7}])
    ;; => [{:path [0], :value 42}
    ;;     {:path [1], :value foo}
    ;;     {:path [2], :value 22/7}]
    
    ;; remove collections from specification's all-paths sequence, keeping only non-collections
    (only-non-collections [{:path [], :value [int? keyword?]}
                           {:path [0], :value int?}
                           {:path [1], :value keyword?}])
    ;; => [{:path [0], :value int?}
    ;;     {:path [1], :value keyword?}]
    
  4. Remove from data scalars that lack a predicate in the specification.

    ;; => [{:path [0], :value 42}
    ;;     {:path [1], :value foo}]
    
    ;; third element of data vector does not have a corresponding predicate
    
  5. Remove from specification predicates that lack a scalar in the data.

    ;; => [{:path [0], :value int?}
    ;;     {:path [1], :value keyword?}]
    
    ;; both predicates contained in specification have a corresponding scalar
    
  6. For each scalar-predicate pair, apply the predicate to the scalar.

    (int? 42) ;; => true
    (keyword? 'foo) ;; => false
    
    ;; or, all at once
    (validate-scalars [42   'foo     22/7]  ;; <-- data, a three-element vector
                      [int? keyword?     ]) ;; <-- specification, a two-element vector
    ;; => [{:path [0], :datum 42, :predicate int?, :valid? true}
    ;;     {:path [1], :datum foo, :predicate keyword?, :valid? false}]
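
Steps 4 through 6 amount to joining the two path sequences on :path and applying each predicate to its scalar. A minimal sketch with clojure.core follows; sketch-pair-and-validate is a hypothetical name and the inputs are written by hand, mirroring the results of step 3.

```clojure
;; Hypothetical join of data scalars and specification predicates on :path.
;; Scalars without a predicate, and predicates without a scalar, drop out.
(defn sketch-pair-and-validate [scalars predicates]
  (let [pred-at (into {} (map (juxt :path :value)) predicates)]
    (vec (for [{:keys [path value]} scalars
               :let [pred (pred-at path)]
               :when pred]
           {:path path :datum value :valid? (pred value)}))))

(sketch-pair-and-validate
 [{:path [0] :value 42} {:path [1] :value 'foo} {:path [2] :value 22/7}]
 [{:path [0] :value int?} {:path [1] :value keyword?}])
;; => [{:path [0], :datum 42, :valid? true}
;;     {:path [1], :datum foo, :valid? false}]
```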
    

validate-with-path-spec

(validate-with-path-spec data spec)

Given a heterogeneous, arbitrarily-nested structure data, validate against path specification vector spec. Each entry in spec is a map with keys :paths and :predicate. :paths is a vector of get-in* paths to elements (scalar and/or collections) in data, supplied in-order to the function associated with :predicate, whose arity matches the number of paths in :paths. The function should return truthy or falsey values (strict true/false is recommended, but not required).

Examples:

;; relating one scalar to another (predicate is 2-arity)
(validate-with-path-spec [11 :foo 22] [{:paths [[2] [0]] :predicate #(= %2 (/ %1 2))}])
;; => ({:args (22 11), :valid? true, :paths [[2] [0]], :predicate fn--47025})

;; relating one scalar to another, different depths of the sequence (predicate is 3-arity)
(validate-with-path-spec {:a 42 :b [42 {:c 42}]} [{:paths [[:b 0] [:a] [:b 1 :c]] :predicate #(= %1 %2 %3)}])
;; => ({:args (42 42 42), :valid? true, :paths [[:b 0] [:a] [:b 1 :c]], :predicate fn--47045})

;; specification containing two validations
(validate-with-path-spec [:foo [42 22/7]] [{:paths [[1 0]] :predicate int?} {:paths [[1]] :predicate vector?}])
;; => ({:args (42), :valid? true, :paths [[1 0]], :predicate int?}
;;     {:args ([42 22/7]), :valid? true, :paths [[1]], :predicate vector?})
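
The mechanics of a single path-spec entry can be sketched as: look up each path, then apply the predicate to the collected arguments. sketch-validate-entry is a hypothetical name, and plain get-in stands in for get-in* on the assumption that the data holds only vectors and maps.

```clojure
;; Hypothetical evaluation of one {:paths _ :predicate _} entry.
(defn sketch-validate-entry [data {:keys [paths predicate]}]
  (let [args (map (fn [path] (get-in data path)) paths)]
    {:args args
     :paths paths
     :valid? (apply predicate args)}))

(sketch-validate-entry {:a 42 :b [42 {:c 42}]}
                       {:paths [[:b 0] [:a] [:b 1 :c]]
                        :predicate (fn [x y z] (= x y z))})
;; => {:args (42 42 42), :paths [[:b 0] [:a] [:b 1 :c]], :valid? true}
```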