Parameterization of the RLC Metric Data Structure (Parametrização da Estrutura de Dados Métrica RLC)
In many applications, there is a need to search for objects that are similar or close to a given one. Examples of such objects include medical or face images, protein or DNA sequences, natural-language words, and hurricane trajectories. Proximity searches can be formalised in the metric space setting, where the similarity between two elements of the domain is measured by a distance function. Because databases generally hold large amounts of information and evaluating distances is very costly, several data structures, called metric data structures, have been developed to minimise the number of distance computations performed in searches of this type. In this thesis, we survey the metric spaces most commonly used to evaluate the performance of metric data structures. We then describe the evolution of the Recursive Lists of Clusters (RLC) metric data structure and characterise its variants. The performance of RLC, like that of any parameterised metric data structure, depends strongly on the values of its parameters. The problem is that the most suitable values for each metric space have been found by observing experimental results, which makes the process unreliable and very time-consuming. To tackle this issue, a new RLC version is proposed in which the parameter values are defined by functions that depend on the intrinsic dimensionality of the metric space. Experimental results involving fifteen metric spaces from different domains show that the new variant outperforms the previous one.
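As a rough illustration of the ideas in the abstract, the sketch below shows how a metric data structure can save distance computations during a range search, and one common way to estimate intrinsic dimensionality. It is not the RLC structure itself, which the abstract does not specify. It assumes a single-pivot table with triangle-inequality pruning, Euclidean distance on small tuples, and the estimate mean² / (2 · variance) of pairwise distances, a definition frequently used in the metric-search literature that may differ from the one adopted in the thesis. All names (`CountingMetric`, `build_pivot_table`, `range_search`, `intrinsic_dimensionality`) are hypothetical.

```python
import math
from itertools import combinations
from statistics import mean, pvariance


def euclidean(a, b):
    """Euclidean distance between two equal-length tuples (assumed metric)."""
    return math.dist(a, b)


class CountingMetric:
    """Wraps a distance function and counts how often it is evaluated."""

    def __init__(self, dist):
        self.dist = dist
        self.calls = 0

    def __call__(self, a, b):
        self.calls += 1
        return self.dist(a, b)


def build_pivot_table(points, pivot, dist):
    """Precompute d(pivot, p) for every stored point (done once, offline)."""
    return [dist(pivot, p) for p in points]


def range_search(points, pivot, table, query, radius, dist):
    """Return all points within `radius` of `query`.

    The triangle inequality gives |d(q, pivot) - d(p, pivot)| <= d(q, p),
    so any point whose lower bound exceeds `radius` is discarded without
    evaluating d(q, p).
    """
    d_query_pivot = dist(query, pivot)
    result = []
    for p, d_point_pivot in zip(points, table):
        if abs(d_query_pivot - d_point_pivot) > radius:
            continue  # pruned: no distance computation needed
        if dist(query, p) <= radius:
            result.append(p)
    return result


def intrinsic_dimensionality(points, dist):
    """Estimate mean^2 / (2 * variance) over all pairwise distances."""
    distances = [dist(a, b) for a, b in combinations(points, 2)]
    return mean(distances) ** 2 / (2 * pvariance(distances))


if __name__ == "__main__":
    points = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (10.0, 10.0), (0.0, 1.0)]
    pivot = points[0]
    table = build_pivot_table(points, pivot, euclidean)

    metric = CountingMetric(euclidean)
    hits = range_search(points, pivot, table, (0.5, 0.5), 1.0, metric)
    print("hits:", hits)
    print("distance computations at query time:", metric.calls)
    print("intrinsic dimensionality estimate:",
          intrinsic_dimensionality(points, euclidean))
```

On this toy data the search evaluates the distance function four times at query time (one for the pivot plus three unpruned candidates), whereas a linear scan would need five. In a parameterised structure, values such as a cluster radius or capacity would be chosen from an estimate like the one returned by `intrinsic_dimensionality`; the functions that map the estimate to those values are the subject of the thesis and are not reproduced here.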
Start Date: 2010-09-28
End Date: 2012-06-22