| Class | Description |
|---|---|
| TibAffixedFilter | Removes the affixes འི, འོ, འིའོ, འམ, འང and འིས from the end of a token. |
| TibCharFilter | |
| TibetanAnalyzer | An Analyzer that uses TibSyllableTokenizer and filters with StopFilter. Derived from Lucene 6.4.1 analysis.core.WhitespaceAnalyzer.java. |
| TibEwtsFilter | A filter that converts EWTS input into Tibetan Unicode. Partially inspired by Lucene 6 org.apache.lucene.analysis.charfilter.MappingCharFilter. |
| TibSyllableTokenizer | Divides text into sequences of Tibetan letter and/or digit characters, separated by sequences of all other characters. The separators are typically some sort of white space, but other punctuation and characters from other scripts are likewise not treated as constituents of tokens for search and indexing. |
| TibWordTokenizer | A maximal-matching word tokenizer for Tibetan that uses a Trie. |
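
The syllable splitting, affix removal and maximal matching described in the table can be illustrated with a small stand-alone sketch. It does not use Lucene and is not the code of the classes above: the class `TibetanSketch`, its method names, the code point ranges treated as token characters (U+0F40..U+0FBC and U+0F20..U+0F33), and the rule that a token consisting only of an affix is left untouched are all assumptions made for illustration. Affixes are tried longest first so that the four-character affix is not mistaken for its two-character ending.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of the three techniques; not the library's implementation. */
final class TibetanSketch {

    // Assumption: token characters are Tibetan letters/signs and Tibetan digits.
    static boolean isTokenChar(int cp) {
        return (cp >= 0x0F40 && cp <= 0x0FBC) || (cp >= 0x0F20 && cp <= 0x0F33);
    }

    /** Splits text into runs of token characters; every other character is a separator. */
    static List<String> syllables(String text) {
        List<String> out = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            if (isTokenChar(cp)) {
                cur.appendCodePoint(cp);
            } else if (cur.length() > 0) {
                out.add(cur.toString());
                cur.setLength(0);
            }
        }
        if (cur.length() > 0) out.add(cur.toString());
        return out;
    }

    // The six affixes from the table (EWTS in comments), longest first.
    private static final String[] AFFIXES = {
        "\u0F60\u0F72\u0F60\u0F7C", // 'i'o
        "\u0F60\u0F72\u0F66",       // 'is
        "\u0F60\u0F72",             // 'i
        "\u0F60\u0F7C",             // 'o
        "\u0F60\u0F58",             // 'am
        "\u0F60\u0F44",             // 'ang
    };

    /** Removes one trailing affix; a token that is only an affix is kept (assumption). */
    static String stripAffix(String token) {
        for (String affix : AFFIXES) {
            if (token.length() > affix.length() && token.endsWith(affix)) {
                return token.substring(0, token.length() - affix.length());
            }
        }
        return token;
    }

    /** Trie node keyed by syllable. */
    static final class Node {
        final Map<String, Node> children = new HashMap<>();
        boolean isWord;
    }

    static void addWord(Node root, String... sylls) {
        Node n = root;
        for (String s : sylls) n = n.children.computeIfAbsent(s, k -> new Node());
        n.isWord = true;
    }

    /** Greedy maximal matching: take the longest lexicon word at each position. */
    static List<String> maximalMatch(Node root, List<String> sylls) {
        List<String> words = new ArrayList<>();
        int i = 0;
        while (i < sylls.size()) {
            Node n = root;
            int best = i + 1; // fallback: an unknown syllable is its own word
            for (int j = i; j < sylls.size(); j++) {
                n = n.children.get(sylls.get(j));
                if (n == null) break;
                if (n.isWord) best = j + 1;
            }
            // Re-join the syllables of one word with the tsheg (U+0F0B).
            words.add(String.join("\u0F0B", sylls.subList(i, best)));
            i = best;
        }
        return words;
    }

    public static void main(String[] args) {
        String bod = "\u0F56\u0F7C\u0F51", yig = "\u0F61\u0F72\u0F42"; // "bod", "yig"
        Node lexicon = new Node();
        addWord(lexicon, bod, yig);
        addWord(lexicon, bod);
        List<String> sylls = syllables(bod + "\u0F0B" + yig + "\u0F0B" + bod + "\u0F0D");
        System.out.println(sylls);
        System.out.println(maximalMatch(lexicon, sylls));
        System.out.println(stripAffix("\u0F44\u0F60\u0F72")); // nga'i
    }
}
```

Keying the trie by whole syllables rather than by single characters keeps the matcher aligned with the syllable boundaries produced by the first step; a character-level trie would work as well but would need to track the tsheg explicitly.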
Copyright © 2018. All rights reserved.