pythainlp.coref

Introduction

The pythainlp.coref module is dedicated to Coreference Resolution for the Thai language. Coreference resolution is a crucial task in natural language processing (NLP) that deals with identifying and linking expressions (such as pronouns) in a text to the entities or concepts they refer to. This module provides tools to tackle coreference resolution challenges in the context of the Thai language.

Coreference Resolution Function

The primary component of the pythainlp.coref module is the coreference_resolution function. This function is designed to analyze text and identify instances of coreference, helping NLP systems understand when different expressions in the text refer to the same entity. Here’s how you can use it:

The pythainlp.coref is Coreference Resolution for Thai.

pythainlp.coref.coreference_resolution(texts: List[str], model_name: str = 'han-coref-v1.0', device: str = 'cpu')[source]

Coreference Resolution

Parameters:
  • texts (List[str]) – list of texts to apply coreference resolution to

  • model_name (str) – coreference resolution model

  • device (str) – device for running coreference resolution model on (“cpu”, “cuda”, and others)

Returns:

List of texts with coreference resolution

Return type:

List[dict]

Options for model_name:
  • han-coref-v1.0 - (default) Han-Coref: Thai coreference resolution by PyThaiNLP v1.0

Example:

from pythainlp.coref import coreference_resolution

print(
    coreference_resolution(
        ["Bill Gates ได้รับวัคซีน COVID-19 เข็มแรกแล้ว ระบุ ผมรู้สึกสบายมาก"]
    )
)
# output:
# [
# {'text': 'Bill Gates ได้รับวัคซีน COVID-19 เข็มแรกแล้ว ระบุ ผมรู้สึกสบายมาก',
# 'clusters_string': [['Bill Gates', 'ผม']],
# 'clusters': [[(0, 10), (50, 52)]]}
# ]

Usage

To use the coreference_resolution function effectively, follow these steps:

  1. Import the coreference_resolution function from the pythainlp.coref module.

  2. Pass the Thai text you want to analyze for coreferences as input to the function.

  3. The function will process the text and return information about coreference relationships within the text.

Example:

::

from pythainlp.coref import coreference_resolution

text = “นาย A มาจาก กรุงเทพ และเขา มีความรักต่อ บางกิจ ของเขา” coreferences = coreference_resolution(text)

print(coreferences)