Introduction

Pragmatics is the study of language use in context. In the second half of the 20th century, linguists and philosophers began to study the role of context in the interpretation of statements. Progress in this field has come in two areas: the study of spoken language and the conception of verbal communication as an inferential process. A speaker encodes only the most relevant part of what they actually want to express. The hearer, in turn, does not decode the speaker's statements but interprets them. As a result, some linguists believe that it is impossible to define the meaning of a statement a priori, since this meaning results from a negotiation carried out by the participants during the interaction. This dependence of meaning on context is the main reason why pragmatic knowledge is so difficult to objectify, systematize and implement.

The first step is to make implicit information explicit. Many research groups have started to introduce pragmatic knowledge into their natural language processing systems. To understand language, it is more important to recognize the speech acts of a verbal interaction than the grammatical categories of its constituent words. This is particularly true for machines and for L2 students, who do not yet know how to use the language. For this reason, the most important initiatives in this area have aimed to improve human-machine interaction and second language learning. Some of these initiatives have focused on annotating corpora with pragmatic information. The phenomena tagged are co-reference relations, discourse structure, speech acts, emotional language and discourse markers. The tagset design normally depends on the needs of the system. These efforts present three limitations for the current study: the corpora used are domain-specific; pragmatic phenomena relevant to statement interpretation (e.g. discursive modalization) are not represented; and this research has generally not been carried out on spoken Spanish.

This paper presents PRAGMATEXT, a pragmatic annotation model designed for C-ORAL-ROM, a Spanish spoken language corpus of 300,000 words representing a wide range of communicative situations. The paper is divided into four sections. The first section presents the most relevant features of the C-ORAL-ROM corpus. The second describes the pragmatic-discursive annotation model. The phenomena tagged are: emotional discourse, argumentative operations, modalization operations, evidentiality, phraseological units with metaphoric meaning and speech acts in interrogative clauses. In developing the model, discourse particles have been the object of our study. The third section addresses the three challenges involved in implementing such an annotation model in XML: (1) pragmatic-discursive operations are expressed at different grammatical levels (lexicon, prosody, syntax, etc.); (2) a single linguistic unit can carry different types of pragmatic information as attributes; (3) pragmatic knowledge is not expressed by a closed word class. The fourth section discusses the uses of a corpus tagged with pragmatic knowledge in human-machine conversational systems and in the teaching of Spanish as a foreign language.
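To make the three XML challenges more concrete, the following is a minimal sketch of how such annotations might be encoded. It is not the PRAGMATEXT tagset: the element names (`utterance`, `unit`), the attribute names (`level`, `modality`, `argumentation`, `speech_act`), their values and the example phrases are all hypothetical, chosen only for illustration. The sketch uses Python's standard `xml.etree.ElementTree` module.

```python
import xml.etree.ElementTree as ET


def make_unit(text, level, **info):
    """Build a hypothetical <unit> element (not the actual PRAGMATEXT tag).

    `level` records the grammatical level at which the operation is
    expressed (challenge 1); each keyword argument becomes one attribute,
    so a single unit can carry several kinds of pragmatic information
    (challenge 2).
    """
    unit = ET.Element("unit", {"level": level, **info})
    unit.text = text
    return unit


utterance = ET.Element("utterance")

# A multi-word span, not a member of any closed word class (challenge 3),
# carrying two illustrative kinds of information at once.
utterance.append(
    make_unit("claro que sí", "lexicon",
              modality="epistemic", argumentation="reinforcement")
)

# A whole interrogative clause tagged at the syntactic level.
utterance.append(make_unit("¿me pasas la sal?", "syntax", speech_act="request"))

xml_text = ET.tostring(utterance, encoding="unicode")
print(xml_text)
```

Because the tag wraps an arbitrary span of text instead of looking words up in a fixed list, the same mechanism would apply to single particles, multi-word expressions or whole clauses.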
