analyzer
Table of Contents
Namespaces
- enums
Interfaces
- AnalyzerOptions
- Common contract for every analyzer definition consumable by
`\oihana\arango\clients\Database::createAnalyzer()` and
`Analyzer::create()`.
Classes
- Analyzer
- Operations scoped to a single ArangoSearch analyzer on the server.
- IdentityAnalyzer
- Pass-through analyzer — emits its input verbatim, with no
transformation. Useful as the default analyzer on every link and
on every field that does not need language-aware normalisation.
- NgramAnalyzer
- N-gram analyzer — emits every substring (n-gram) of its input whose
length is between `min` and `max` characters, optionally keeping the
original token. It is the building block of substring / "as-you-type"
autocomplete search: indexing a field with an n-gram analyzer lets a
partial term (`ate`) match a longer value (`Atelier`).
- NormAnalyzer
- Locale-aware normaliser. Converts the input to lower or upper case and optionally strips diacritics.
- PipelineAnalyzer
- Pipeline analyzer: runs an **ordered** chain of sub-analyzers, each fed
the output of the previous one. It is the typed way to compose analyzers
that the server otherwise exposes only individually (`NormAnalyzer`,
`NgramAnalyzer`, `StemAnalyzer`, …); without it, the only escape
hatch would be an untyped `new RawAnalyzer( 'pipeline' , … )`.
- RawAnalyzer
- Raw, type-agnostic analyzer options: carries a verbatim `type`
discriminator and `properties` map instead of the named arguments of a
typed value object (`TextAnalyzer`, `NormAnalyzer`, `StemAnalyzer`,
`IdentityAnalyzer`).
- StemAnalyzer
- Locale-aware stemmer. Reduces inflected forms of a word to a
common root (e.g. `running` → `run`). Single-token input only —
compose with a tokenising analyzer upstream when working on full
sentences.
- TextAnalyzer
- Full-text analyzer — tokenises on word boundaries, optionally lower-cases, removes stopwords,
applies stemming and accent folding, and optionally emits edge n-grams for prefix search.
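
The substring matching described for NgramAnalyzer, and the chaining described for PipelineAnalyzer, can be illustrated with a minimal sketch in plain PHP. It does not use this library's classes or the server: the `ngrams()` helper and its `$preserveOriginal` parameter are hypothetical names chosen for illustration, and the `mbstring` extension is assumed to be available. It only shows why lower-casing first and then emitting n-grams lets the partial term `ate` match the value `Atelier`.

```php
<?php

declare(strict_types=1);

/**
 * Illustrative helper (not part of the library): returns every substring
 * of $input whose length is between $min and $max characters, shortest
 * sizes first. Optionally appends the original input if it is not
 * already among the n-grams.
 */
function ngrams(string $input, int $min, int $max, bool $preserveOriginal = false): array
{
    $length = mb_strlen($input);
    $grams  = [];

    for ($size = $min; $size <= $max; $size++) {
        // Slide a window of $size characters across the input.
        for ($start = 0; $start + $size <= $length; $start++) {
            $grams[] = mb_substr($input, $start, $size);
        }
    }

    if ($preserveOriginal && !in_array($input, $grams, true)) {
        $grams[] = $input;
    }

    return $grams;
}

// Two ordered steps, in the spirit of a pipeline: normalise case, then n-gram.
$tokens = ngrams(mb_strtolower('Atelier'), 3, 3);

// 'ate' is one of the emitted trigrams, so a search for it can hit 'Atelier'.
var_dump(in_array('ate', $tokens, true));
```

Without the lower-casing step the first trigram would be `Ate`, which is why the order of the two steps matters in this sketch.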