Thai G2P

Thai Grapheme-to-Phoneme (Thai G2P) conversion based on deep learning (Seq2Seq model).

v0.1

Model Details

Intended Use

Grapheme-to-Phoneme conversion tool.

Factors

  • Based on Thai grapheme-to-phoneme conversion problems.

Metrics

F1 score (macro-average) and exact match (EM), at word level and character level.
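The results in this card report exact match (EM) and a character-level EM alongside F1. This card does not define how those scores are computed, so the following is only a minimal sketch of one plausible reading: word-level EM as the fraction of words whose predicted phoneme string equals the reference, and character-level agreement as a position-by-position comparison. The function names and the placeholder strings are hypothetical and are not taken from this project's code.

```python
from typing import Sequence


def exact_match(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Fraction of items whose prediction equals the reference exactly."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    hits = sum(p == r for p, r in zip(predictions, references))
    return hits / len(references)


def char_level_match(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Position-wise character agreement over all pairs.

    Assumed definition: characters are compared at the same index, and the
    denominator is the longer of the two strings, so missing or extra
    characters count as errors.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    total = 0
    correct = 0
    for pred, ref in zip(predictions, references):
        total += max(len(pred), len(ref))
        correct += sum(a == b for a, b in zip(pred, ref))
    return correct / total if total else 0.0


# Placeholder strings stand in for phoneme sequences; they are not real data.
preds = ["ab", "cd"]
refs = ["ab", "ce"]
print(exact_match(preds, refs))       # one of two words matches exactly
print(char_level_match(preds, refs))  # three of four characters match
```

A position-wise comparison penalizes a single insertion or deletion heavily, because every later character shifts. An edit-distance-based score would be a reasonable alternative if that behavior is not wanted.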

Training Data

Wiktionary training set

Evaluation Data

Wiktionary test set

Quantitative Analyses

F1 (macro-average) =  0.9415941561267093
EM =  0.71
EM (Character-level) =  0.8660247630539959
save best model em score=0.71 at epoch=1148
Save model at epoch  1148
Epoch: 1149 | Time: 2m 55s
    Train Loss: 0.352 | Train PPL:   1.422
     Val. Loss: 0.512 |  Val. PPL:   1.669
epoch=1149, teacher_forcing_ratio=0.4

Ethical Considerations

This model is based on a Thai Wiktionary dump and may therefore include biases present in Thai Wiktionary.

Caveats and Recommendations

  • The model handles one Thai word per input only.
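Because of the one-word limit, longer text would need to be split into words before conversion. The sketch below shows one way to apply a converter word by word. It assumes the text has already been segmented by a separate Thai word tokenizer, and `g2p` is a hypothetical callable standing in for whatever function wraps this model; neither is part of this card.

```python
from typing import Callable, Iterable, List


def convert_words(words: Iterable[str], g2p: Callable[[str], str]) -> List[str]:
    """Apply a single-word G2P callable to each pre-segmented word.

    `g2p` is a hypothetical stand-in for the model's conversion function.
    Empty items are skipped; items containing whitespace are rejected,
    since they are likely more than one word.
    """
    results = []
    for word in words:
        word = word.strip()
        if not word:
            continue
        if any(ch.isspace() for ch in word):
            raise ValueError(f"expected a single word, got: {word!r}")
        results.append(g2p(word))
    return results


# Demonstration with a dummy converter (str.upper), not the real model.
print(convert_words(["a", " b ", ""], str.upper))
```

Rejecting multi-word items up front makes violations of the one-word limit visible instead of silently producing unreliable output.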