public final class TibetanAnalyzer
extends org.apache.lucene.analysis.Analyzer
An Analyzer that uses TibSyllableTokenizer and filters with StopFilter.
Derived from Lucene 6.4.1 analysis.core.WhitespaceAnalyzer.java

| Modifier and Type | Field and Description |
|---|---|
| static String | INPUT_METHOD_ALALC |
| static String | INPUT_METHOD_DEFAULT |
| static String | INPUT_METHOD_DTS |
| static String | INPUT_METHOD_EWTS |
| static String | INPUT_METHOD_UNICODE |

| Constructor | Description |
|---|---|
| TibetanAnalyzer() | Creates a new TibetanAnalyzer with the default values |
| TibetanAnalyzer(boolean segmentInWords, boolean lemmatize, boolean filterChars, String inputMethod, String stopFilename) | Creates a new TibetanAnalyzer |

| Modifier and Type | Method and Description |
|---|---|
| protected org.apache.lucene.analysis.Analyzer.TokenStreamComponents | createComponents(String fieldName) |
| static ArrayList<String> | getWordList(InputStream inputStream, String comment) |
| protected Reader | initReader(String fieldName, Reader reader) |

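
To illustrate what a stopword loader with the shape of getWordList(InputStream inputStream, String comment) might do, here is a minimal sketch. It is not the library's implementation: the class name WordListSketch, the method name readWordList, the UTF-8 encoding, and the rule that everything after the comment string on a line is ignored are all assumptions made for the example.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;

// Hypothetical helper for illustration only; not part of TibetanAnalyzer.
class WordListSketch {

    // Reads one word per line, skipping blank lines and comment text.
    static ArrayList<String> readWordList(InputStream inputStream, String comment)
            throws IOException {
        ArrayList<String> words = new ArrayList<>();
        // Assumption: the stopword file is encoded as UTF-8.
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(inputStream, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                // Assumption: the comment marker hides the rest of the line.
                if (comment != null && !comment.isEmpty()) {
                    int commentStart = line.indexOf(comment);
                    if (commentStart >= 0) {
                        line = line.substring(0, commentStart);
                    }
                }
                // trim() removes ASCII whitespace only, so a trailing tsheg (U+0F0B) is kept.
                String word = line.trim();
                if (!word.isEmpty()) {
                    words.add(word);
                }
            }
        }
        return words;
    }

    public static void main(String[] args) throws IOException {
        // Two Tibetan entries written as Unicode escapes, plus comments and a blank line.
        String data = "# sample list\n\u0F53\u0F72\n\u0F51\u0F44\u0F0B # trailing note\n\n";
        InputStream in = new ByteArrayInputStream(data.getBytes(StandardCharsets.UTF_8));
        System.out.println(readWordList(in, "#").size() + " words loaded");
    }
}
```

Returning a fresh ArrayList keeps the sketch close to the documented return type; the real method may treat comments or whitespace differently.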
public static final String INPUT_METHOD_UNICODE
public static final String INPUT_METHOD_DTS
public static final String INPUT_METHOD_EWTS
public static final String INPUT_METHOD_ALALC
public static final String INPUT_METHOD_DEFAULT
public TibetanAnalyzer(boolean segmentInWords,
boolean lemmatize,
boolean filterChars,
String inputMethod,
String stopFilename)
throws IOException
Creates a new TibetanAnalyzer
Parameters:
segmentInWords - if the segmentation is on words instead of syllables
lemmatize - if the analyzer should remove affixed particles, and normalize words in words mode
filterChars - if the text should be converted to NFD (necessary for texts containing NFC strings)
inputMethod - if the text should be converted from EWTS to Unicode
stopFilename - a file name with a stop word list
Throws:
IOException - if the file containing stopwords can't be opened

public TibetanAnalyzer()
throws IOException
Creates a new TibetanAnalyzer with the default values
Throws:
IOException - if the file containing stopwords can't be opened

public static ArrayList<String> getWordList(InputStream inputStream, String comment) throws IOException
Parameters:
inputStream - stream to the list of stopwords
comment - The string representing a comment
Returns:
ArrayList to fill with the reader's words
Throws:
IOException - if the file containing stopwords can't be opened

protected Reader initReader(String fieldName, Reader reader)
Overrides:
initReader in class org.apache.lucene.analysis.Analyzer

protected org.apache.lucene.analysis.Analyzer.TokenStreamComponents createComponents(String fieldName)
Specified by:
createComponents in class org.apache.lucene.analysis.Analyzer

Copyright © 2018. All rights reserved.
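
The filterChars parameter is documented as converting text to NFD. As a rough illustration of what NFD conversion means for Tibetan text, the sketch below uses the standard java.text.Normalizer class. It shows Unicode normalization only; the class name NfdSketch and the method toNfd are hypothetical, and the analyzer's own character filter may do more than this.

```java
import java.text.Normalizer;

// Hypothetical helper for illustration only; not part of TibetanAnalyzer.
class NfdSketch {

    // Converts text to Unicode Normalization Form D (canonical decomposition).
    static String toNfd(String text) {
        return Normalizer.normalize(text, Normalizer.Form.NFD);
    }

    public static void main(String[] args) {
        // U+0F43 TIBETAN LETTER GHA is a precomposed character whose canonical
        // decomposition is U+0F42 followed by U+0FB7.
        String precomposed = "\u0F43";
        String decomposed = toNfd(precomposed);

        // Print each code point of the normalized string.
        decomposed.codePoints().forEach(cp -> System.out.printf("U+%04X%n", cp));
    }
}
```

Normalizing before tokenization means that precomposed and decomposed spellings of the same letter are compared as identical sequences of code points.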