pythainlp.morpheme

The pythainlp.morpheme module collects functions for morpheme analysis, word formation, and related tasks for the Thai language.

pythainlp.morpheme.nighit(w1: str, w2: str) -> str

Nighit (นิคหิต, written ํ ) is the niggahita, a sign used in Thai to form new words from Pali. This function uses a simple method to combine two words of Pali origin into a new Thai word.

Read more: https://www.trueplookpanya.com/learning/detail/1180

Parameters:
  • w1 (str) – A Thai word that has a nighit.

  • w2 (str) – A Thai word.

Returns:

Thai word.

Return type:

str

Example:

from pythainlp.morpheme import nighit

assert nighit("สํ","คีต")=="สังคีต"
assert nighit("สํ","จร")=="สัญจร"
assert nighit("สํ","ฐาน")=="สัณฐาน"
assert nighit("สํ","นิษฐาน")=="สันนิษฐาน"
assert nighit("สํ","ปทา")=="สัมปทา"
assert nighit("สํ","โยค")=="สังโยค"
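
The examples above are consistent with a simple assimilation rule: the nighit becomes the nasal consonant of the same Pali consonant class as the first consonant of the second word. The following is a minimal sketch of that idea, not the PyThaiNLP implementation. The name nighit_sketch, the class table, and the fallback to ง for consonants outside the five classes (inferred from the "โยค" example) are assumptions made for illustration.

```python
# Illustrative sketch only; this is NOT the PyThaiNLP implementation.
NIGHIT = "\u0e4d"        # ํ
MAI_HAN_AKAT = "\u0e31"  # ั

# Assumed table: each Pali consonant class mapped to its class nasal.
CLASS_NASAL = {
    "กขคฆง": "ง",
    "จฉชฌญ": "ญ",
    "ฏฐฑฒณ": "ณ",
    "ตถทธน": "น",
    "ปผพภม": "ม",
}


def nighit_sketch(w1: str, w2: str) -> str:
    """Join w1 (one consonant plus nighit) with w2 (hypothetical helper)."""
    if len(w1) != 2 or w1[1] != NIGHIT:
        raise ValueError("w1 must be a single consonant followed by a nighit")
    # Find the first consonant of w2, skipping a leading vowel such as โ.
    first = next((ch for ch in w2 if "\u0e01" <= ch <= "\u0e2e"), None)
    if first is None:
        raise ValueError("w2 must contain a Thai consonant")
    nasal = "ง"  # assumed fallback for consonants outside the five classes
    for members, class_nasal in CLASS_NASAL.items():
        if first in members:
            nasal = class_nasal
            break
    return w1[0] + MAI_HAN_AKAT + nasal + w2


print(nighit_sketch("ส" + NIGHIT, "จร"))
```

The sketch only handles a first word made of exactly one consonant plus a nighit, and it does not cover a second word that begins with a vowel sound.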

pythainlp.morpheme.is_native_thai(word: str) -> bool

Check whether a word is a “native Thai word” (Thai: “คำไทยแท้”). This function is based on a simple heuristic and is not entirely reliable.

Parameters:

word (str) – The word to check.

Returns:

True if the word is judged to be a native Thai word, False otherwise.

Return type:

bool

Example:

English word:

from pythainlp.morpheme import is_native_thai

is_native_thai("Avocado")
# output: False

Native Thai word:

is_native_thai("มะม่วง")
# output: True
is_native_thai("ตะวัน")
# output: True

Non-native Thai word:

is_native_thai("สามารถ")
# output: False
is_native_thai("อิสริยาภรณ์")
# output: False

The is_native_thai function checks whether a single word is a native Thai word. It is not a general language detector: as the examples show, non-native words written in Thai script, such as สามารถ, also return False.
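
To give a sense of what a simple heuristic for native Thai words might look like, here is a rough sketch built on three common textbook rules: native words tend to avoid certain letters typical of Pali and Sanskrit loanwords, do not use the thanthakhat ( ์ ), and end in a final consonant that spells its sound directly. This is an assumption-laden illustration and not the algorithm PyThaiNLP uses. The name looks_native_thai and both letter sets are hypothetical.

```python
import re

# Illustrative sketch only; this is NOT the PyThaiNLP algorithm.
# Assumed: letters that mostly appear in Pali/Sanskrit loanwords.
LOAN_LETTERS = set("ฆณฌฎฏฐฑฒธศษฬ")
# Assumed: final consonants that spell their final sound directly.
TRUE_FINALS = set("กดบนงมยว")
THANTHAKHAT = "\u0e4c"  # ์, marks a silent letter


def looks_native_thai(word: str) -> bool:
    """Crude guess at whether a word is native Thai (hypothetical helper)."""
    word = word.strip()
    # Reject empty input and anything containing non-Thai characters.
    if not word or not re.fullmatch(r"[\u0e01-\u0e4e]+", word):
        return False
    # Reject words with a silenced letter or a loanword-typical letter.
    if THANTHAKHAT in word or LOAN_LETTERS & set(word):
        return False
    # If the word ends in a consonant, it must be one of the true finals.
    last = word[-1]
    ends_in_consonant = "\u0e01" <= last <= "\u0e2e"
    return not ends_in_consonant or last in TRUE_FINALS


for w in ["Avocado", "มะม่วง", "ตะวัน", "สามารถ", "อิสริยาภรณ์"]:
    print(w, looks_native_thai(w))
```

A rule set this small misjudges many real words. For example, พ่อ ends in อ used as a vowel carrier and would be rejected, so a practical implementation would likely need an exception list as well.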