SVMTEC-2 is an improvement of SVMTEC that is described in Kavitha Mahesh, Luís Gomes, Gabriel Pereira Lopes, 2011, "Using SVMs for Filtering Translation Tables", in: Rui Prada, Sofia Pinto and Luís Antunes (Eds.), "Proceedings of the 15th Portuguese Conference in Artificial Intelligence, EPIA 2011, Lisbon, October, 2011" (ISBN 978-989-95618-4-7), pages 690-702, Instituto Superior Técnico (Portugal). It is based on the idea that translation lexicons can improve the quality of parallel corpora alignment at sub-sentence granularity, the quality of newly extracted translations and, as a consequence, the quality of machine translation output. The bilingual pairs (entries) in such lexicons must be correct if they are to improve the quality of the applications that use them. This prototype classifies bilingual entries automatically extracted from aligned parallel corpora as correct or incorrect. It uses a Support Vector Machine (SVM) based classifier trained on previously validated bilingual lexical entries that were manually classified as correct or incorrect. The features used are described in the paper above. Experimental results show that this classification approach achieved a micro F-measure higher than 85% for the English-Portuguese language pair.
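
The micro F-measure mentioned above can be illustrated with a minimal sketch. It assumes the common definition, in which true positives, false positives and false negatives are pooled over both classes before precision and recall are computed; the exact evaluation protocol of the paper is not reproduced here. The function name `micro_f_measure`, the label strings and the sample data are hypothetical and are not taken from SVMTEC-2.

```python
from collections import Counter


def micro_f_measure(gold, predicted, labels=("correct", "incorrect")):
    """Return (precision, recall, F-measure), micro-averaged over `labels`.

    Assumption: counts are pooled over all classes before the scores are
    computed. This is a common definition, not necessarily the paper's.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1   # entry assigned to its true class
        else:
            fp[p] += 1   # wrongly assigned to class p
            fn[g] += 1   # missed for its true class g

    total_tp = sum(tp[label] for label in labels)
    total_fp = sum(fp[label] for label in labels)
    total_fn = sum(fn[label] for label in labels)

    precision = total_tp / (total_tp + total_fp) if total_tp + total_fp else 0.0
    recall = total_tp / (total_tp + total_fn) if total_tp + total_fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Hypothetical manual judgements and classifier outputs for five entries.
gold = ["correct", "correct", "incorrect", "incorrect", "correct"]
predicted = ["correct", "incorrect", "incorrect", "correct", "correct"]
precision, recall, f_measure = micro_f_measure(gold, predicted)
print(f"P={precision:.2f} R={recall:.2f} F={f_measure:.2f}")
```

When every entry receives exactly one of the two labels, each misclassification counts once as a false positive and once as a false negative, so the micro-averaged precision, recall and F-measure all coincide with plain accuracy.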

Date: June, 2011

Authors: Gabriel Pereira Lopes, Kavitha Mahesh, Luís Gomes
