ERPAePRINTS

Variations of word frequencies in Genre classification tasks.

Kim, Dr Yunhyong and Ross, Professor Seamus (2007) Variations of word frequencies in Genre classification tasks. In Proceedings DELOS conference on Digital Libraries, Tirrenia, Italy.


Abstract

This paper examines automated genre classification of text documents and its role in enabling the effective management of digital documents by digital libraries and other repositories. Genre classification, which narrows down the possible structure of a document, is a valuable step in realising the general automatic extraction of semantic metadata essential to the efficient management and use of digital objects. In this report, we analyse word frequencies in different genre classes in an effort to understand the distinction between independent classification tasks. In particular, we examine automated experiments on thirty-one genre classes to determine the relationship between word frequency metrics and the degree of their significance in carrying out classification in varying environments.

Item Type: Conference Paper
Keywords: automated metadata extraction, management, genre classification
Subjects: L Digital Repository, Digital Archive and Digital Library Models > LA Ingest
          L Digital Repository, Digital Archive and Digital Library Models > LB Management
          E Data Description, Documentation and Standards > EA Metadata
Document Language: English
ID Code: 134
Deposited By: Kim, Dr Yunhyong
Deposited On: 19 November 2007