AnalyzerField uses ConstantsTrait
JSON field names exchanged with the ArangoDB analyzer API (`/_api/analyzer`), on both the request side (body of `POST /_api/analyzer`) and the response side (wrapper of `GET /_api/analyzer/{name}` and entries of `GET /_api/analyzer?force=true`).
Two families coexist:
- Top-level fields (`name`, `type`, `features`, `properties`) that frame every analyzer payload regardless of its type.
- Type-specific properties (`locale`, `case`, `accent`, `stemming`, `stopwords`, `stopwordsPath`, `edgeNgram`, `min`, `max`, `preserveOriginal`, `startMarker`, `endMarker`, `streamType`, `pipeline`) that nest inside the `properties` wrapper for `text`, `norm`, `stem`, `ngram` and `pipeline` analyzers.
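How the two families combine can be sketched with a request body for a `text` analyzer. This is a minimal illustration, not the library's own code: the `AnalyzerField` class below is a stand-in that reproduces only the constants documented on this page, and the analyzer name and feature strings are assumed values.

```php
<?php
declare(strict_types=1);

// Stand-in for the documented class: only the constants used below,
// with the values listed on this page. The real namespace is not shown here.
final class AnalyzerField
{
    public const NAME       = 'name';
    public const TYPE       = 'type';
    public const FEATURES   = 'features';
    public const PROPERTIES = 'properties';
    public const LOCALE     = 'locale';
    public const CASE       = 'case';
    public const ACCENT     = 'accent';
    public const STEMMING   = 'stemming';
}

$body = [
    // Top-level family: present on every analyzer payload.
    AnalyzerField::NAME       => 'text_en', // hypothetical analyzer name
    AnalyzerField::TYPE       => 'text',
    AnalyzerField::FEATURES   => ['frequency', 'norm', 'position'], // assumed feature strings
    // Type-specific family: nested inside the `properties` wrapper.
    AnalyzerField::PROPERTIES => [
        AnalyzerField::LOCALE   => 'en',
        AnalyzerField::CASE     => 'lower',
        AnalyzerField::ACCENT   => false,
        AnalyzerField::STEMMING => true,
    ],
];

echo json_encode($body, JSON_PRETTY_PRINT), PHP_EOL;
```

Using the constants as array keys keeps the field spelling in one place, so a typo surfaces as an undefined-constant error rather than as a silently ignored JSON field.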
Table of Contents
Constants
- ACCENT : string = 'accent'
- Whether the analyzer should keep diacritics on the input (`text` / `norm` only).
- CASE : string = 'case'
- Case folding strategy applied to the input (`text` / `norm` only). Recognised values: `"lower"`, `"upper"`, `"none"`.
- EDGE_NGRAM : string = 'edgeNgram'
- Edge n-gram options nested inside the `properties` of a `text` analyzer — carries `min`, `max`, `preserveOriginal` sub-fields.
- END_MARKER : string = 'endMarker'
- String appended to the end of the input before n-gram emission (`ngram` only), so end-of-token n-grams can be distinguished.
- FEATURES : string = 'features'
- List of analyzer feature toggles — entries of {@see AnalyzerFeature}.
- LOCALE : string = 'locale'
- BCP 47 / ICU locale tag (e.g. `"en"`, `"fr.utf-8"`) driving the language-aware behaviour of the analyzer (`text` / `norm` / `stem`).
- MAX : string = 'max'
- Upper bound of the n-gram window (inclusive). Lives under the {@see self::EDGE_NGRAM} wrapper for a `text` analyzer, or at the top level of the `properties` for an `ngram` analyzer.
- MIN : string = 'min'
- Lower bound of the n-gram window (inclusive). Lives under the {@see self::EDGE_NGRAM} wrapper for a `text` analyzer, or at the top level of the `properties` for an `ngram` analyzer.
- NAME : string = 'name'
- Top-level analyzer name. Must be prefixed with the database name when shared across databases (`mydb::myanalyzer`).
- PIPELINE : string = 'pipeline'
- Ordered list of sub-analyzers run as a chain (`pipeline` only).
- PRESERVE_ORIGINAL : string = 'preserveOriginal'
- Whether the n-gram emitter should also keep the original (un-trimmed) token in the output stream. Lives under the {@see self::EDGE_NGRAM} wrapper for a `text` analyzer, or at the top level of the `properties` for an `ngram` analyzer.
- PROPERTIES : string = 'properties'
- Wrapper field carrying the type-specific options of an analyzer. Always an object — empty (`{}`) for the {@see AnalyzerType::IDENTITY} analyzer.
- RESULT : string = 'result'
- Wrapper field carrying the list of analyzers in the response of `GET /_api/analyzer`.
- START_MARKER : string = 'startMarker'
- String prepended to the start of the input before n-gram emission (`ngram` only), so start-of-token n-grams can be distinguished. Lives at the top level of the `ngram` `properties`.
- STEMMING : string = 'stemming'
- Whether the `text` analyzer should apply Snowball-style stemming on the tokens it emits.
- STOPWORDS : string = 'stopwords'
- List of stopwords to drop from the token stream (`text` only).
- STOPWORDS_PATH : string = 'stopwordsPath'
- Filesystem path to a newline-separated stopwords file (`text` only). The path is resolved server-side.
- STREAM_TYPE : string = 'streamType'
- Input encoding the `ngram` analyzer operates on — `"binary"` (byte-wise, the server default) or `"utf8"` (codepoint-wise).
- TYPE : string = 'type'
- Analyzer type discriminator — entries of {@see AnalyzerType}.
Constants
ACCENT
Whether the analyzer should keep diacritics on the input (`text` / `norm` only).
public string ACCENT = 'accent'
CASE
Case folding strategy applied to the input (`text` / `norm` only). Recognised values: `"lower"`, `"upper"`, `"none"`.
public string CASE = 'case'
EDGE_NGRAM
Edge n-gram options nested inside the `properties` of a `text` analyzer — carries `min`, `max`, `preserveOriginal` sub-fields.
public string EDGE_NGRAM = 'edgeNgram'
END_MARKER
String appended to the end of the input before n-gram emission (`ngram` only), so end-of-token n-grams can be distinguished.
public string END_MARKER = 'endMarker'
Lives at the top level of the `ngram` `properties`.
FEATURES
List of analyzer feature toggles — entries of {@see AnalyzerFeature}.
public string FEATURES = 'features'
Top-level field on every analyzer payload.
LOCALE
BCP 47 / ICU locale tag (e.g. `"en"`, `"fr.utf-8"`) driving the language-aware behaviour of the analyzer (`text` / `norm` / `stem`).
public string LOCALE = 'locale'
MAX
Upper bound of the n-gram window (inclusive). Lives under the {@see self::EDGE_NGRAM} wrapper for a `text` analyzer, or at the top level of the `properties` for an `ngram` analyzer.
public string MAX = 'max'
MIN
Lower bound of the n-gram window (inclusive). Lives under the {@see self::EDGE_NGRAM} wrapper for a `text` analyzer, or at the top level of the `properties` for an `ngram` analyzer.
public string MIN = 'min'
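Because `min` and `max` sit in different places depending on the analyzer type, code that reads them has to branch on `type`. The sketch below shows one way to do that, assuming PHP 8 (for `match`); `ngramWindow()` is a hypothetical helper and the `AnalyzerField` class is a stand-in carrying only the constants documented on this page.

```php
<?php
declare(strict_types=1);

// Stand-in for the documented class, limited to the constants used here.
final class AnalyzerField
{
    public const TYPE       = 'type';
    public const PROPERTIES = 'properties';
    public const EDGE_NGRAM = 'edgeNgram';
    public const MIN        = 'min';
    public const MAX        = 'max';
}

/**
 * Hypothetical helper: returns [min, max] for a decoded analyzer payload,
 * or null when the analyzer carries no n-gram window.
 */
function ngramWindow(array $analyzer): ?array
{
    $type  = $analyzer[AnalyzerField::TYPE] ?? null;
    $props = $analyzer[AnalyzerField::PROPERTIES] ?? [];

    // `text` keeps the window under `edgeNgram`;
    // `ngram` keeps it directly in `properties`.
    $holder = match ($type) {
        'text'  => $props[AnalyzerField::EDGE_NGRAM] ?? null,
        'ngram' => $props,
        default => null,
    };

    if (!is_array($holder) || !isset($holder[AnalyzerField::MIN], $holder[AnalyzerField::MAX])) {
        return null;
    }

    return [$holder[AnalyzerField::MIN], $holder[AnalyzerField::MAX]];
}

$text = [
    AnalyzerField::TYPE       => 'text',
    AnalyzerField::PROPERTIES => [
        AnalyzerField::EDGE_NGRAM => [AnalyzerField::MIN => 2, AnalyzerField::MAX => 4],
    ],
];
$ngram = [
    AnalyzerField::TYPE       => 'ngram',
    AnalyzerField::PROPERTIES => [AnalyzerField::MIN => 3, AnalyzerField::MAX => 5],
];

var_dump(ngramWindow($text));  // window read from properties.edgeNgram
var_dump(ngramWindow($ngram)); // window read from properties
```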
NAME
Top-level analyzer name. Must be prefixed with the database name when shared across databases (`mydb::myanalyzer`).
public string NAME = 'name'
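The database prefix can be applied with a small helper before the name is written into the payload. This is only a sketch, assuming PHP 8 (for `str_contains`); `qualifyAnalyzerName()` is a hypothetical function, not part of the documented class, and the stand-in `AnalyzerField` carries just the `NAME` constant.

```php
<?php
declare(strict_types=1);

// Stand-in for the documented class, limited to the constant used here.
final class AnalyzerField
{
    public const NAME = 'name';
}

/**
 * Hypothetical helper: prefixes an analyzer name with its database,
 * leaving names that already contain `::` untouched.
 */
function qualifyAnalyzerName(string $database, string $name): string
{
    return str_contains($name, '::') ? $name : $database . '::' . $name;
}

$payload = [
    AnalyzerField::NAME => qualifyAnalyzerName('mydb', 'myanalyzer'),
];

echo $payload[AnalyzerField::NAME], PHP_EOL; // mydb::myanalyzer
```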
PIPELINE
Ordered list of sub-analyzers run as a chain (`pipeline` only).
public string PIPELINE = 'pipeline'
Lives at the top level of the `pipeline` `properties`; each entry is itself a `{ type, properties }` analyzer fragment, fed the output of the previous one. See PipelineAnalyzer.
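A pipeline payload built from these field names might look like the sketch below, which chains a `norm` step into an `ngram` step. The analyzer name and the option values are illustrative assumptions, and the `AnalyzerField` class is a stand-in carrying only the constants documented on this page.

```php
<?php
declare(strict_types=1);

// Stand-in for the documented class, limited to the constants used here.
final class AnalyzerField
{
    public const NAME              = 'name';
    public const TYPE              = 'type';
    public const FEATURES          = 'features';
    public const PROPERTIES        = 'properties';
    public const PIPELINE          = 'pipeline';
    public const LOCALE            = 'locale';
    public const CASE              = 'case';
    public const MIN               = 'min';
    public const MAX               = 'max';
    public const PRESERVE_ORIGINAL = 'preserveOriginal';
    public const STREAM_TYPE       = 'streamType';
}

$body = [
    AnalyzerField::NAME       => 'norm_then_ngram', // hypothetical analyzer name
    AnalyzerField::TYPE       => 'pipeline',
    AnalyzerField::FEATURES   => [],
    AnalyzerField::PROPERTIES => [
        // Order matters: each fragment receives the output of the one before it.
        AnalyzerField::PIPELINE => [
            [
                AnalyzerField::TYPE       => 'norm',
                AnalyzerField::PROPERTIES => [
                    AnalyzerField::LOCALE => 'en',
                    AnalyzerField::CASE   => 'lower',
                ],
            ],
            [
                AnalyzerField::TYPE       => 'ngram',
                AnalyzerField::PROPERTIES => [
                    AnalyzerField::MIN               => 2,
                    AnalyzerField::MAX               => 3,
                    AnalyzerField::PRESERVE_ORIGINAL => false,
                    AnalyzerField::STREAM_TYPE       => 'utf8',
                ],
            ],
        ],
    ],
];

echo json_encode($body, JSON_PRETTY_PRINT), PHP_EOL;
```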
PRESERVE_ORIGINAL
Whether the n-gram emitter should also keep the original (un-trimmed) token in the output stream. Lives under the {@see self::EDGE_NGRAM} wrapper for a `text` analyzer, or at the top level of the `properties` for an `ngram` analyzer.
public string PRESERVE_ORIGINAL = 'preserveOriginal'
PROPERTIES
Wrapper field carrying the type-specific options of an analyzer. Always an object — empty (`{}`) for the {@see AnalyzerType::IDENTITY} analyzer.
public string PROPERTIES = 'properties'
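The "always an object" requirement has a practical consequence in PHP: `json_encode()` serialises an empty array as `[]`, not `{}`. The sketch below shows the difference and one way around it, using `stdClass`. The analyzer name is a hypothetical value and the `AnalyzerField` class is a stand-in carrying only the constants documented on this page.

```php
<?php
declare(strict_types=1);

// Stand-in for the documented class, limited to the constants used here.
final class AnalyzerField
{
    public const NAME       = 'name';
    public const TYPE       = 'type';
    public const FEATURES   = 'features';
    public const PROPERTIES = 'properties';
}

// An empty PHP array encodes as a JSON array, which is not an object.
$asArray = json_encode([AnalyzerField::PROPERTIES => []]);

// An empty stdClass encodes as a JSON object.
$asObject = json_encode([AnalyzerField::PROPERTIES => new stdClass()]);

echo $asArray, PHP_EOL;  // {"properties":[]}
echo $asObject, PHP_EOL; // {"properties":{}}

// Applied to a full identity analyzer body:
$identity = json_encode([
    AnalyzerField::NAME       => 'my_identity', // hypothetical analyzer name
    AnalyzerField::TYPE       => 'identity',
    AnalyzerField::FEATURES   => [],
    AnalyzerField::PROPERTIES => new stdClass(),
]);

echo $identity, PHP_EOL;
```

Passing `JSON_FORCE_OBJECT` to `json_encode()` would also produce `{}`, but it turns every list in the payload into an object as well, including `features`, so it is usually the wrong tool here.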
RESULT
Wrapper field carrying the list of analyzers in the response of `GET /_api/analyzer`.
public string RESULT = 'result'
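Reading the list back out of a response can be sketched as follows. The JSON string is a hand-written sample reduced to the `result` wrapper, not output captured from a server, and the `AnalyzerField` class is a stand-in carrying only the constants documented on this page.

```php
<?php
declare(strict_types=1);

// Stand-in for the documented class, limited to the constants used here.
final class AnalyzerField
{
    public const RESULT = 'result';
    public const NAME   = 'name';
}

// Hand-written sample shaped like a list response (illustrative only).
$json = '{"result":['
      . '{"name":"identity","type":"identity","properties":{},"features":[]},'
      . '{"name":"mydb::text_en","type":"text","properties":{"locale":"en"},"features":[]}'
      . ']}';

$response = json_decode($json, true, 512, JSON_THROW_ON_ERROR);

// Unwrap the list, then collect the `name` of every entry.
$analyzers = $response[AnalyzerField::RESULT] ?? [];
$names     = array_column($analyzers, AnalyzerField::NAME);

print_r($names); // identity, mydb::text_en
```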
START_MARKER
String prepended to the start of the input before n-gram emission (`ngram` only), so start-of-token n-grams can be distinguished. Lives at the top level of the `ngram` `properties`.
public string START_MARKER = 'startMarker'
STEMMING
Whether the `text` analyzer should apply Snowball-style stemming on the tokens it emits.
public string STEMMING = 'stemming'
STOPWORDS
List of stopwords to drop from the token stream (`text` only).
public string STOPWORDS = 'stopwords'
STOPWORDS_PATH
Filesystem path to a newline-separated stopwords file (`text` only). The path is resolved server-side.
public string STOPWORDS_PATH = 'stopwordsPath'
STREAM_TYPE
Input encoding the `ngram` analyzer operates on — `"binary"` (byte-wise, the server default) or `"utf8"` (codepoint-wise).
public string STREAM_TYPE = 'streamType'
Lives at the top level of the `ngram` `properties`.
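An `ngram` body that sets the stream type together with the two markers could be assembled as in this sketch. The analyzer name, marker characters and window bounds are illustrative assumptions, and the `AnalyzerField` class is a stand-in carrying only the constants documented on this page.

```php
<?php
declare(strict_types=1);

// Stand-in for the documented class, limited to the constants used here.
final class AnalyzerField
{
    public const NAME              = 'name';
    public const TYPE              = 'type';
    public const FEATURES          = 'features';
    public const PROPERTIES        = 'properties';
    public const MIN               = 'min';
    public const MAX               = 'max';
    public const PRESERVE_ORIGINAL = 'preserveOriginal';
    public const START_MARKER      = 'startMarker';
    public const END_MARKER        = 'endMarker';
    public const STREAM_TYPE       = 'streamType';
}

$body = [
    AnalyzerField::NAME       => 'trigram_utf8', // hypothetical analyzer name
    AnalyzerField::TYPE       => 'ngram',
    AnalyzerField::FEATURES   => [],
    // For `ngram`, every option sits directly in `properties`.
    AnalyzerField::PROPERTIES => [
        AnalyzerField::MIN               => 3,
        AnalyzerField::MAX               => 3,
        AnalyzerField::PRESERVE_ORIGINAL => false,
        AnalyzerField::START_MARKER      => '^',
        AnalyzerField::END_MARKER        => '$',    // single quotes: no variable interpolation
        AnalyzerField::STREAM_TYPE       => 'utf8', // codepoint-wise instead of byte-wise
    ],
];

echo json_encode($body, JSON_PRETTY_PRINT), PHP_EOL;
```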
TYPE
Analyzer type discriminator — entries of {@see AnalyzerType}.
public string TYPE = 'type'