Bilingually Learning Word Senses for Translation

All words in every natural language are ambiguous, especially when translation is at stake. Translation tasks require finding adequate translations for such words in the contexts where they occur. This article describes a bilingual strategy for clustering words according to their meanings. A publicly available, sentence-aligned parallel corpus is used. Word senses are discriminated by their translations and by the words occurring in a window, in both the source and target language parallel sentences. The strategy is language independent and uses a correlation algorithm to filter out irrelevant features. The clusters obtained were evaluated in terms of F-measure (average of 94%), and their homogeneity and completeness were determined using V-measure (average of 83%). The learned clusters are then used to train a support vector machine that tags ambiguous words with their translations in the contexts where they occur. This task was also evaluated in terms of F-measure and compared against a baseline.
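As an illustration of the cluster evaluation mentioned in the abstract, the following is a minimal sketch of how homogeneity, completeness, and V-measure can be computed from gold sense labels and predicted cluster labels. It follows the standard entropy-based definition of V-measure with equal weighting of the two components; it is not the authors' code, and the function names and the toy labels are assumptions made for this example.

```python
import math
from collections import Counter


def _entropy(labels):
    """Shannon entropy H(labels) in nats."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def _conditional_entropy(target, given):
    """Conditional entropy H(target | given) in nats."""
    n = len(target)
    joint = Counter(zip(given, target))
    given_counts = Counter(given)
    return -sum(
        (c / n) * math.log(c / given_counts[g]) for (g, _), c in joint.items()
    )


def v_measure(gold, predicted):
    """Return (homogeneity, completeness, v) for two equal-length label lists.

    homogeneity  = 1 - H(gold | predicted) / H(gold)
    completeness = 1 - H(predicted | gold) / H(predicted)
    v            = harmonic mean of the two (equal weighting assumed)
    """
    h_gold = _entropy(gold)
    h_pred = _entropy(predicted)
    # By convention a component is 1.0 when its normalising entropy is zero.
    homogeneity = (
        1.0 if h_gold == 0 else 1.0 - _conditional_entropy(gold, predicted) / h_gold
    )
    completeness = (
        1.0 if h_pred == 0 else 1.0 - _conditional_entropy(predicted, gold) / h_pred
    )
    total = homogeneity + completeness
    v = 0.0 if total == 0 else 2 * homogeneity * completeness / total
    return homogeneity, completeness, v


# Hypothetical toy data: two senses of one ambiguous word, four occurrences.
gold_senses = ["finance", "finance", "river", "river"]
clusters = [0, 0, 1, 1]
homogeneity, completeness, v = v_measure(gold_senses, clusters)
```

In this sketch a clustering that puts every occurrence into a single cluster is fully complete but not homogeneous, so its V-measure drops to zero, which is the behaviour that motivates reporting both components.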


In: Computational Linguistics and Intelligent Text Processing, 15th International Conference, CICLing 2014, Kathmandu, Nepal, April 6-12, 2014, Proceedings, Part II

Editors: Alexander Gelbukh

Series: Lecture Notes in Computer Science

Number: 8404

Publisher: Springer (Germany)

Pages: 283 to 295

Date: April, 2014

