A TibSyllableTokenizer splits text into tokens, where each token is a
maximal sequence of Tibetan Letter and/or Digit characters. Sequences of
all other characters separate the tokens. These are typically some sort
of white space, but punctuation and characters from other language
code pages are likewise not considered constituents of tokens for the
purpose of search and indexing.
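
The splitting rule described above can be illustrated with a minimal,
self-contained sketch. It is not the actual TibSyllableTokenizer code:
the class name TibSyllableSketch and the helpers isTokenChar and tokenize
are hypothetical, and the exact definition of a "Tibetan Letter or Digit"
is an assumption. Here a token character is taken to be any code point in
the Tibetan script that is a letter, a decimal digit, or a combining mark,
so that vowel signs and subjoined consonants stay attached to their base
letter.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the splitting rule; not the real tokenizer class.
class TibSyllableSketch {

    // Assumption: token characters are Tibetan-script code points that are
    // letters, decimal digits, or combining marks (vowel signs, subjoined
    // consonants). The real tokenizer may draw this line differently.
    static boolean isTokenChar(int cp) {
        if (Character.UnicodeScript.of(cp) != Character.UnicodeScript.TIBETAN) {
            return false;
        }
        int type = Character.getType(cp);
        return Character.isLetterOrDigit(cp)
                || type == Character.NON_SPACING_MARK
                || type == Character.COMBINING_SPACING_MARK;
    }

    // Collect each maximal run of token characters; everything else
    // (white space, tsheg, shad, Latin text, ...) only ends a run.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (isTokenChar(cp)) {
                current.appendCodePoint(cp);
            } else if (current.length() > 0) {
                tokens.add(current.toString());
                current.setLength(0);
            }
            i += Character.charCount(cp);
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Two syllables joined by a tsheg (U+0F0B), a Latin word,
        // three Tibetan digits, and a closing shad (U+0F0D).
        String text = "\u0F56\u0F7C\u0F51\u0F0B\u0F61\u0F72\u0F42 abc "
                + "\u0F21\u0F22\u0F23\u0F0D";
        for (String token : tokenize(text)) {
            System.out.println(token);
        }
    }
}
```

Under these assumptions the sample input yields three tokens: the two
syllables and the digit run. The tsheg, the spaces, the Latin word "abc"
and the shad are all treated as separators and never appear in a token.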