Discourse markers have been the subject of several studies in computational pragmatics and natural language processing, especially in applications concerned with detecting document structure for automatic summarization or with interpreting and generating speech acts in speech corpora and dialogue systems (Kawahara & Hasegawa, 2002). However, most of these studies focused either on the automatic disambiguation of only certain discourse markers (Zufferey & Popescu-Belis, 2004) or on the classification of markers in monolingual corpora. Moreover, the majority of these studies treat the ambiguity from a purely technical perspective and do not examine in depth the linguistic ambiguity of discourse markers. This ambiguity can be categorial, discursive, or both.

In addition, the state of the art reveals a lack of studies that consider Arabic discourse markers from a computational perspective. To our knowledge, they have only been briefly mentioned in the annotation tools provided by the LDC for the annotation of Arabic speech corpora (Strassel & Walker, 2004), while the remaining studies adopted a purely theoretical linguistic point of view (?).

Given the above, the novelty of our study lies in two main aspects. First, it offers a comprehensive multilingual treatment of discourse markers. Second, it addresses written Arabic from a computational pragmatic perspective.
