Support tools and machine translation engine evaluation methodology
Collecting and processing texts for machine translation training requires several support tools. We will develop tools for personal data anonymization, for (semi-)automatic bilingual text alignment, and for the extraction of suitable texts from larger databases.
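As an illustration of the anonymization step, a first pass could be rule-based, replacing obvious personal identifiers with placeholder tags before the text enters the training corpus. The patterns and tag names below are illustrative assumptions, not the project's actual specification:

```python
import re

# Illustrative patterns only; a production anonymizer would cover
# names, addresses, IDs, etc., likely with NER rather than regexes.
PATTERNS = [
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"), "<EMAIL>"),   # e-mail addresses
    (re.compile(r"\+?\d[\d /-]{7,}\d"), "<PHONE>"),          # phone-like sequences
]

def anonymize(text: str) -> str:
    """Replace matched personal data with placeholder tags."""
    for pattern, tag in PATTERNS:
        text = pattern.sub(tag, text)
    return text
```

A rule-based pass like this is cheap and auditable, which is why it is often run before (or alongside) statistical named-entity recognition in corpus-cleaning pipelines.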
In addition to the support tools, we will develop a methodology for evaluating the machine translation engine. The primary measure of engine quality, as reflected in the quality of its translations, will be the automatic BLEU metric. However, since BLEU does not always provide comprehensive qualitative insight, we will develop an additional evaluation method based on manual review. The manual evaluation will be performed by MA students in translation, who will be suitably trained for the task. The coordinator will review and analyse the results of the evaluation.
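For context, BLEU scores a candidate translation by modified n-gram precision against a reference translation, combined with a brevity penalty for overly short output. The following is a minimal single-reference, unsmoothed sketch of the idea; in practice evaluation would rely on an established tool such as sacreBLEU rather than this simplified code:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(candidate: str, reference: str, max_n: int = 4) -> float:
    """Simplified sentence-level BLEU: single reference, uniform
    weights, no smoothing (returns 0.0 if any n-gram order has
    zero overlap)."""
    cand, ref = candidate.split(), reference.split()
    precisions = []
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(cand, n))
        ref_counts = Counter(ngrams(ref, n))
        # Clipped (modified) n-gram precision
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = max(1, sum(cand_counts.values()))
        if overlap == 0:
            return 0.0
        precisions.append(overlap / total)
    # Brevity penalty punishes candidates shorter than the reference
    bp = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)
```

An identical candidate and reference score 1.0; any missing 4-gram overlap drives this unsmoothed variant to 0.0, which is one reason corpus-level BLEU (or a smoothed sentence-level variant) is preferred in real evaluations.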
Prior to developing our machine translation engine, we will repeat the evaluation of the reference engine developed at the Jožef Stefan Institute within the TraMOOC project, funded by the European Union's Horizon 2020 programme. The engine is openly accessible at www.translexy.com.