pythainlp.generate

The pythainlp.generate module provides Thai text generation in PyThaiNLP.

Modules

class pythainlp.generate.Unigram(name: str = 'tnc')

Text generator using Unigram

Parameters

name (str) – corpus name
  • tnc - Thai National Corpus (default)
  • ttc - Thai Textbook Corpus (TTC)
  • oscar - OSCAR Corpus

__init__(name: str = 'tnc')
gen_sentence(start_seq: Optional[str] = None, N: int = 3, prob: float = 0.001, output_str: bool = True, duplicate: bool = False) → Union[List[str], str]
Parameters
  • start_seq (str) – word to start the sentence with.

  • N (int) – number of words.

  • output_str (bool) – whether to return the output as a str.

  • duplicate (bool) – whether to allow duplicate words in the sentence.

Returns

a list of words, or the words as a single str

Return type

List[str], str

Example

from pythainlp.generate import Unigram

gen = Unigram()

gen.gen_sentence("แมว")
# output: 'แมวเวลานะนั้น'
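
The parameters of gen_sentence are described only briefly above, so the following is a minimal sketch of how a unigram generator with such parameters could behave. It is not PyThaiNLP's implementation: the function toy_unigram_sentence, the TOY_COUNTS table, and the rng argument are hypothetical, and the readings of prob (a minimum relative frequency, here called min_prob), N (words sampled after the start word), and duplicate are assumptions.

```python
import random
from typing import Dict, List, Optional, Union

# Hypothetical toy word counts; real corpora such as tnc are far larger.
TOY_COUNTS: Dict[str, int] = {"แมว": 5, "กิน": 4, "ปลา": 3, "นอน": 2, "บ้าน": 1}


def toy_unigram_sentence(
    counts: Dict[str, int],
    start_seq: Optional[str] = None,
    n: int = 3,
    min_prob: float = 0.001,
    output_str: bool = True,
    duplicate: bool = False,
    rng: Optional[random.Random] = None,
) -> Union[List[str], str]:
    """Sample up to n words by unigram frequency (illustrative only)."""
    rng = rng or random.Random()
    total = sum(counts.values())
    # Assumption: min_prob drops words whose relative frequency is too low.
    candidates = [w for w, c in counts.items() if c / total >= min_prob]
    words: List[str] = [start_seq] if start_seq is not None else []
    for _ in range(n):
        # Assumption: duplicate=False means a word may appear at most once.
        pool = candidates if duplicate else [w for w in candidates if w not in words]
        if not pool:
            break  # nothing left to sample
        weights = [counts[w] for w in pool]
        words.append(rng.choices(pool, weights=weights, k=1)[0])
    # Thai is written without spaces between words, so join with "".
    return "".join(words) if output_str else words
```

Calling toy_unigram_sentence(TOY_COUNTS, start_seq="แมว", output_str=False) returns a list that begins with "แมว" followed by up to three distinct sampled words.
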
class pythainlp.generate.Bigram(name: str = 'tnc')

Text generator using Bigram

Parameters

name (str) – corpus name
  • tnc - Thai National Corpus (default)

__init__(name: str = 'tnc')
prob(t1: str, t2: str) → float

Probability of the word pair (t1, t2).

Parameters
  • t1 (str) – first word

  • t2 (str) – second word

Returns

probability value

Return type

float
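
The reference above does not state how prob is computed. One common estimate for a bigram model is the conditional relative frequency count(t1, t2) / count(t1); the sketch below shows that estimate on a toy token list. The function conditional_bigram_prob and the sample tokens are hypothetical, and the library may use a different definition (for example, a joint relative frequency over all bigrams).

```python
from collections import Counter
from typing import List


def conditional_bigram_prob(t1: str, t2: str, tokens: List[str]) -> float:
    """Estimate P(t2 | t1) as count(t1, t2) / count(t1) from a token list."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    if unigrams[t1] == 0:
        return 0.0  # unseen first word: no estimate available
    return bigrams[(t1, t2)] / unigrams[t1]


# Hypothetical pre-tokenized toy corpus.
tokens = ["แมว", "กิน", "ปลา", "แมว", "นอน"]
print(conditional_bigram_prob("แมว", "กิน", tokens))  # 0.5
```
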

gen_sentence(start_seq: Optional[str] = None, N: int = 4, prob: float = 0.001, output_str: bool = True, duplicate: bool = False) → Union[List[str], str]
Parameters
  • start_seq (str) – word to start the sentence with.

  • N (int) – number of words.

  • output_str (bool) – whether to return the output as a str.

  • duplicate (bool) – whether to allow duplicate words in the sentence.

Returns

a list of words, or the words as a single str

Return type

List[str], str

Example

from pythainlp.generate import Bigram

gen = Bigram()

gen.gen_sentence("แมว")
# output: 'แมวไม่ได้รับเชื้อมัน'
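
To illustrate the general idea behind bigram-based generation, here is a minimal sketch in which each next word is sampled from the words observed after the current one. It is a generic Markov-chain illustration under stated assumptions, not the Bigram class's actual algorithm; build_successors, toy_bigram_chain, and the toy token list are hypothetical names introduced here.

```python
import random
from collections import Counter, defaultdict
from typing import Dict, List, Optional


def build_successors(tokens: List[str]) -> Dict[str, Counter]:
    """Map each word to a Counter of the words that follow it."""
    successors: Dict[str, Counter] = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        successors[prev][nxt] += 1
    return successors


def toy_bigram_chain(
    successors: Dict[str, Counter],
    start: str,
    n: int = 4,
    rng: Optional[random.Random] = None,
) -> List[str]:
    """Extend start by up to n words, sampling by bigram frequency."""
    rng = rng or random.Random()
    words = [start]
    for _ in range(n):
        options = successors.get(words[-1])
        if not options:
            break  # dead end: the last word was never seen with a successor
        choices = list(options)
        weights = [options[w] for w in choices]
        words.append(rng.choices(choices, weights=weights, k=1)[0])
    return words


# Hypothetical pre-tokenized toy corpus.
toy_tokens = ["แมว", "กิน", "ปลา", "แมว", "นอน"]
toy_successors = build_successors(toy_tokens)
```

With this toy corpus, toy_bigram_chain(toy_successors, "กิน", n=2) can only produce ["กิน", "ปลา", "แมว"], because each of those words has a single observed successor.
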
class pythainlp.generate.Trigram(name: str = 'tnc')

Text generator using Trigram

Parameters

name (str) – corpus name
  • tnc - Thai National Corpus (default)

__init__(name: str = 'tnc')
prob(t1: str, t2: str, t3: str) → float

Probability of the word triple (t1, t2, t3).

Parameters
  • t1 (str) – first word

  • t2 (str) – second word

  • t3 (str) – third word

Returns

probability value

Return type

float
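
As with the bigram case, the formula behind this value is not given here. A common trigram estimate conditions on a two-word context, P(t3 | t1, t2) = count(t1, t2, t3) / count(t1, t2). The sketch below computes that estimate on toy data; conditional_trigram_prob and the token list are hypothetical, and the library's own definition may differ.

```python
from collections import Counter
from typing import List


def conditional_trigram_prob(t1: str, t2: str, t3: str, tokens: List[str]) -> float:
    """Estimate P(t3 | t1, t2) as count(t1, t2, t3) / count(t1, t2)."""
    contexts = Counter(zip(tokens, tokens[1:]))
    trigrams = Counter(zip(tokens, tokens[1:], tokens[2:]))
    context_count = contexts[(t1, t2)]
    if context_count == 0:
        return 0.0  # unseen two-word context
    return trigrams[(t1, t2, t3)] / context_count


# Hypothetical pre-tokenized toy corpus.
tri_tokens = ["แมว", "กิน", "ปลา", "แมว", "กิน", "ข้าว"]
print(conditional_trigram_prob("แมว", "กิน", "ปลา", tri_tokens))  # 0.5
```
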

gen_sentence(start_seq: Optional[str] = None, N: int = 4, prob: float = 0.001, output_str: bool = True, duplicate: bool = False) → Union[List[str], str]
Parameters
  • start_seq (str) – word to start the sentence with.

  • N (int) – number of words.

  • output_str (bool) – whether to return the output as a str.

  • duplicate (bool) – whether to allow duplicate words in the sentence.

Returns

a list of words, or the words as a single str

Return type

List[str], str

Example

from pythainlp.generate import Trigram

gen = Trigram()

gen.gen_sentence()
# output: 'ยังทำตัวเป็นเซิร์ฟเวอร์คือ'
pythainlp.generate.thai2fit.gen_sentence(start_seq: Optional[str] = None, N: int = 4, prob: float = 0.001, output_str: bool = True) → Union[List[str], str]

Text generator using Thai2fit

Parameters
  • start_seq (str) – word to start the sentence with.

  • N (int) – number of words.

  • output_str (bool) – whether to return the output as a str.

Returns

a list of words, or the words as a single str

Return type

List[str], str

Example

from pythainlp.generate.thai2fit import gen_sentence

gen_sentence()
# output: 'แคทรียา อิงลิช  (นักแสดง'

gen_sentence("แมว")
# output: 'แมว คุณหลวง '
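
Because every gen_sentence variant documented here returns Union[List[str], str] depending on output_str, calling code that accepts either form may want to normalize the result. The helper below is a small sketch of that; as_text is a hypothetical name, and the default empty separator is an assumption based on Thai being written without spaces between words.

```python
from typing import List, Union


def as_text(result: Union[List[str], str], sep: str = "") -> str:
    """Return result unchanged if it is a str, otherwise join the word list."""
    if isinstance(result, str):
        return result
    return sep.join(result)


print(as_text(["แมว", "กิน", "ปลา"]))  # แมวกินปลา
```
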