A Document Descriptor Extractor Based on Relevant Expressions
{ Wed, 20 Jan 2010, 14h00 }

By: Joaquim Ferreira da Silva

People are often asked to associate keywords with documents, since keywords give applications access to a summary of each document's core content. This was the main motivation for working on an approach that may help replace this manual procedure with an automatic one. Since Relevant Expressions (REs), or multi-word term expressions, can be extracted automatically using the LocalMaxs algorithm, the most relevant ones can be used to describe the core content of each document. In this work we present a language-independent approach for the automatic generation of document descriptors. Results are shown for three different European languages, and different metrics for selecting the most informative REs of each document are compared.

Hosted by: MultiModal Systems

Location: DI seminars room (FCT/UNL campus)

