minhashMatch.php
Table of Contents
Functions
- minhashMatch() : string
  Match documents with an approximate Jaccard similarity of at least a threshold.
Functions
minhashMatch()
Match documents with an approximate Jaccard similarity of at least a threshold.
minhashMatch(string $path, string $target, string $analyzer[, float|null $threshold = null ]) : string
Wraps the ArangoDB AQL function MINHASH_MATCH(path, target, threshold, analyzer).
The similarity is approximated with the given minhash Analyzer — an
efficient first pass for entity resolution (duplicate detection) before an
exact JACCARD() computation.
Argument order notice: in AQL the optional threshold sits before the
mandatory analyzer. PHP deprecates (since 8.0) a required parameter declared
after an optional one, so this helper takes the analyzer third and the
optional threshold last, then re-orders the emitted AQL arguments.
Example AQL usage:
MINHASH_MATCH(doc.text, "the quick brown fox", 0.5, "myMinHash")
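The re-ordering described above can be illustrated with a minimal sketch. This is not the helper's actual source: the name `minhashMatchSketch`, the use of `json_encode()` for quoting, the range check, and the choice to omit the threshold from the emitted AQL when it is `null` are all assumptions made for illustration.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical sketch of a MINHASH_MATCH builder; NOT the library's code.
 * Assumption: a null threshold is simply left out of the emitted call.
 */
function minhashMatchSketch(
    string $path,
    string $target,
    string $analyzer,
    ?float $threshold = null
): string {
    // Assumption: out-of-range thresholds are rejected up front.
    if ($threshold !== null && ($threshold < 0.0 || $threshold > 1.0)) {
        throw new InvalidArgumentException('Threshold must lie in [0.0, 1.0].');
    }

    // Assumption: a JSON-encoded string serves as the quoted AQL string literal.
    $quote = static fn (string $s): string => json_encode(
        $s,
        JSON_UNESCAPED_SLASHES | JSON_UNESCAPED_UNICODE | JSON_THROW_ON_ERROR
    );

    // PHP order is (path, target, analyzer, threshold);
    // AQL order is (path, target, threshold, analyzer).
    $args = [$path, $quote($target)];   // the path is kept raw
    if ($threshold !== null) {
        $args[] = (string) $threshold;
    }
    $args[] = $quote($analyzer);

    return 'MINHASH_MATCH(' . implode(', ', $args) . ')';
}

echo minhashMatchSketch('doc.text', 'the quick brown fox', 'myMinHash', 0.5), PHP_EOL;
// prints: MINHASH_MATCH(doc.text, "the quick brown fox", 0.5, "myMinHash")
```

Collecting the arguments in an array and appending the analyzer last keeps the PHP signature valid while still emitting the threshold in the position AQL expects.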
Parameters
- $path : string
  Attribute path expression to test (kept raw).
- $target : string
  String to hash and compare against (emitted as a quoted string literal).
- $analyzer : string
  Name of the minhash Analyzer (emitted as a quoted string literal).
- $threshold : float|null = null
  Optional similarity threshold in [0.0, 1.0].
Return values
string: The formatted AQL expression.