ERPAePRINTS

Building a document genre corpus: a profile of the KRYS I corpus

Berninger, Ms Vera and Kim, Dr Yunhyong and Ross, Prof Seamus (2008) Building a document genre corpus: a profile of the KRYS I corpus. In Proceedings Corpus Profiling Workshop at IIiX 2008.

Abstract

This paper describes the KRYS I corpus, which consists of documents classified into 70 genre classes. It has been constructed as part of an effort to automate document genre classification as distinct from topic detection. Previously there has been very little work on building corpora of texts that have been classified using a nontopical genre palette. This is partly because genre, as a concept, is rooted in philosophy, rhetoric and literature, and is highly complex and domain dependent in its interpretation ([11]). The usefulness of genre in everyday information search is only now starting to be recognised, and no genre classification schema has yet been consolidated to the point of having practical value for this purpose. By presenting our experiences in constructing the KRYS I corpus, we hope to shed light on information gathering and seeking behaviour and on the role of genre in these activities, and to suggest a way forward for creating a better corpus for testing automated genre classification tasks and for applying these tasks to other domains.

Item Type: Conference Poster
Subjects: E Data Description, Documentation and Standards > EE Description
E Data Description, Documentation and Standards > ED Contextual Information
E Data Description, Documentation and Standards > EB Identification
E Data Description, Documentation and Standards > EC Cataloguing
E Data Description, Documentation and Standards > EA Metadata
Document Language: English
ID Code: 155
Deposited By: Kim, Dr Yunhyong
Deposited On: 19 November 2008