| Class | Description |
|---|---|
| BuildCompiledTrie | |
| CommonHelpers | |
| PaBaFilter | Transforms བ and བོ into པ and པོ. |
| TibAffixedFilter | Removes the affixes འི, འོ, འིའོ, འམ, འང and འིས from the end of a token. |
| TibCharFilter | |
| TibetanAnalyzer | An Analyzer that uses TibSyllableTokenizer and filters with StopFilter. Derived from Lucene 6.4.1 analysis.core.WhitespaceAnalyzer.java. |
| TibEwtsFilter | A filter that converts EWTS input into Tibetan Unicode. Partially inspired by Lucene 6 org.apache.lucene.analysis.charfilter.MappingCharFilter. |
| TibSyllableTokenizer | Divides text into sequences of Tibetan letter and/or digit characters, separated by sequences of all other characters. The separators are typically some sort of white space, but other punctuation and characters from other language code pages are likewise not treated as constituents of tokens for the purposes of search and indexing. |
| TibWordTokenizer | A maximal-matching word tokenizer for Tibetan that uses a Trie. |
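
To illustrate what "maximal matching with a Trie" means for TibWordTokenizer, here is a minimal sketch of the general technique. It is not the library's implementation: the class `MaxMatchSketch`, its methods, and the sample lexicon are hypothetical. It assumes the input has already been split into syllables, written here in EWTS-style ASCII for readability.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of maximal matching; not part of the library.
final class MaxMatchSketch {
    // Trie node whose edges are whole syllables rather than single characters.
    private static final class Node {
        final Map<String, Node> children = new HashMap<>();
        boolean isWord;
    }

    private final Node root = new Node();

    // Adds one lexicon entry, given as its sequence of syllables.
    public void addWord(String... syllables) {
        Node node = root;
        for (String syllable : syllables) {
            node = node.children.computeIfAbsent(syllable, k -> new Node());
        }
        node.isWord = true;
    }

    // Greedily emits the longest lexicon entry starting at each position.
    // A syllable that starts no known word is emitted as its own token.
    public List<String> tokenize(List<String> syllables) {
        List<String> words = new ArrayList<>();
        int start = 0;
        while (start < syllables.size()) {
            Node node = root;
            int longestEnd = -1;
            for (int j = start; j < syllables.size(); j++) {
                node = node.children.get(syllables.get(j));
                if (node == null) {
                    break; // no lexicon entry continues with this syllable
                }
                if (node.isWord) {
                    longestEnd = j + 1; // remember the longest match so far
                }
            }
            if (longestEnd < 0) {
                longestEnd = start + 1; // fallback: single-syllable token
            }
            // Join with a space, which stands for the tsheg in EWTS.
            words.add(String.join(" ", syllables.subList(start, longestEnd)));
            start = longestEnd;
        }
        return words;
    }
}
```

With a lexicon containing `bkra`, `bkra shis` and `bde legs`, the syllables `bkra`, `shis`, `bde`, `legs`, `so` are grouped into the tokens `bkra shis`, `bde legs` and `so`: the longer entry `bkra shis` wins over `bkra`, and the unknown syllable `so` falls through as a token of its own.
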
Copyright © 2019. All rights reserved.