
Optimization of word alignment clues

Published online by Cambridge University Press: 21 September 2005

JÖRG TIEDEMANN
Affiliation:
Alfa-Informatica, University of Groningen, Groningen, The Netherlands e-mail: tiedeman@let.rug.nl

Abstract

Statistical, linguistic, and heuristic clues can be used for the alignment of words and multi-word units in parallel texts. This article describes the clue alignment approach and the optimization of its parameters using a genetic algorithm. Word alignment clues can come from various sources such as statistical alignment models, co-occurrence tests, string similarity scores and static dictionaries. A genetic algorithm implementing an evolutionary procedure can be used to optimize the parameters necessary for combining available clues. Experiments on English/Swedish bitext show a significant improvement of about 6% in F-scores compared to the baseline produced by statistical word alignment.

Most of the work described in this paper was carried out at the Department of Linguistics and Philology at Uppsala University. I would like to acknowledge technical and scientific support by people at the department in Uppsala.
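
The idea summarized in the abstract, tuning the parameters that combine several clues with a genetic algorithm and scoring the result by F-score, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the article's implementation: the toy clue scores and gold links are invented, the weighted-disjunction rule in `combine` is one plausible way to merge clue scores, and the names `combine`, `f_score` and `optimize`, the threshold, and the selection, crossover and mutation settings are all hypothetical.

```python
import random
from math import prod

# Invented toy data: clue scores in [0, 1] for candidate word links.
# Columns (hypothetical): co-occurrence, string similarity, dictionary.
CANDIDATES = {
    ("house", "hus"): (0.6, 0.7, 1.0),
    ("house", "huset"): (0.5, 0.6, 0.0),
    ("the", "huset"): (0.4, 0.1, 0.0),
    ("red", "röda"): (0.5, 0.3, 1.0),
    ("red", "hus"): (0.3, 0.0, 0.0),
    ("car", "bil"): (0.6, 0.0, 1.0),
    ("car", "röda"): (0.4, 0.2, 0.0),
}
# Invented gold-standard links used to score an alignment.
GOLD = {("house", "hus"), ("the", "huset"), ("red", "röda"), ("car", "bil")}


def combine(scores, weights):
    """Merge weighted clue scores as a disjunction (assumed rule)."""
    return 1.0 - prod(1.0 - w * s for w, s in zip(weights, scores))


def f_score(weights, threshold=0.5):
    """F-score of the links whose combined score reaches the threshold."""
    predicted = {link for link, scores in CANDIDATES.items()
                 if combine(scores, weights) >= threshold}
    correct = len(predicted & GOLD)
    if correct == 0:
        return 0.0
    precision = correct / len(predicted)
    recall = correct / len(GOLD)
    return 2 * precision * recall / (precision + recall)


def optimize(generations=30, pop_size=20, seed=0):
    """Evolve clue weights; the best individuals always survive."""
    rng = random.Random(seed)
    n = len(next(iter(CANDIDATES.values())))
    # Start from the unweighted baseline plus random weight vectors.
    population = [[1.0] * n]
    population += [[rng.random() for _ in range(n)]
                   for _ in range(pop_size - 1)]
    for _ in range(generations):
        population.sort(key=f_score, reverse=True)
        parents = population[: pop_size // 2]  # truncation selection
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Uniform crossover: take each weight from either parent.
            child = [rng.choice(pair) for pair in zip(a, b)]
            # Mutation: perturb one weight and clip it to [0, 1].
            i = rng.randrange(n)
            child[i] = min(1.0, max(0.0, child[i] + rng.gauss(0.0, 0.2)))
            children.append(child)
        population = parents + children
    return max(population, key=f_score)


if __name__ == "__main__":
    best = optimize()
    print("baseline F:", f_score([1.0, 1.0, 1.0]))
    print("best weights:", best, "F:", f_score(best))
```

Because the unweighted baseline is part of the initial population and the top half of each generation is carried over unchanged, the returned weights never score below that baseline on the toy data. A real experiment would evaluate on held-out gold alignments rather than the data used for tuning.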

Type
Papers
Copyright
2005 Cambridge University Press
