2009
IEEE Workshop on Automatic Speech Recognition and Understanding, 2005
This paper reports on recent experiments in speech-to-text (STT) translation of European Parliamentary speeches. A Spanish speech to English text translation system has been built using data from the TC-STAR European project. The speech recognizer is a state-of-the-art multipass system trained for the Spanish EPPS task, and the statistical translation system relies on the IBM-4 model. First, MT results are compared using manual transcriptions and 1-best ASR hypotheses with different word error rates. Then, an n-best interface between the ASR and MT components is investigated to improve the STT process. Derivation of the fundamental equation for machine translation suggests that the source language model is not necessary for STT. This was investigated by using weak source language models and by n-best rescoring that adds the acoustic model score only. A significant loss in the BLEU score was observed, suggesting that the source language model is needed given the insufficiencies of the translation model. Adding the source language model score in the n-best rescoring process recovers the loss and slightly improves the BLEU score over the 1-best ASR hypothesis. The system achieves a BLEU score of 37.3 with an ASR word error rate of 10%, and a BLEU score of 40.5 using the manual transcripts.
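The claim that the source language model drops out can be made concrete with a short derivation. The following is a sketch of the standard decision rule for speech translation, not necessarily the exact derivation used in the paper. The symbols are assumptions introduced here: x is the acoustic signal, f a source (Spanish) word sequence, and e a target (English) word sequence. It also assumes that x is conditionally independent of e given f, and it approximates the sum over f by a maximum.

```latex
\begin{align*}
\hat{e} &= \operatorname*{argmax}_{e} \Pr(e \mid x)
         = \operatorname*{argmax}_{e} \sum_{f} \Pr(e, f \mid x) \\
        &= \operatorname*{argmax}_{e} \sum_{f} \Pr(e)\,\Pr(f \mid e)\,\Pr(x \mid f, e)
           && \text{(Bayes' rule; } \Pr(x) \text{ does not depend on } e\text{)} \\
        &\approx \operatorname*{argmax}_{e} \max_{f}\;
           \underbrace{\Pr(e)}_{\text{target LM}}\;
           \underbrace{\Pr(f \mid e)}_{\text{translation model}}\;
           \underbrace{\Pr(x \mid f)}_{\text{acoustic model}}
           && \text{(assuming } \Pr(x \mid f, e) = \Pr(x \mid f)\text{)}
\end{align*}
```

Under these assumptions no term Pr(f) appears, which is the sense in which the source language model looks unnecessary in theory. The experiments summarized above indicate that it is still needed in practice.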
In this paper we present the ongoing work at RWTH Aachen University for building a speech-to-speech translation system within the TC-Star project. The corpus we work on consists of parliamentary speeches held in the European Plenary Sessions. To our knowledge, this is the first project that focuses on speech-to-speech translation applied to a real-life task. We describe the statistical approach used in the development of our system and analyze its performance under different conditions: dealing with syntactically correct input, dealing with the exact transcription of speech, and dealing with the (noisy) output of an automatic speech recognition system. Experimental results show that our system is able to perform adequately in each of these conditions. Paper type: (R) Research. Keywords: Speech Translation, Methodologies for MT, Text and speech corpora for MT, MT evaluation results.
We investigate the possibility of automatically detecting whether a piece of text is an original or a translation. On a large parallel English-French corpus where reference information is available, we find that this is possible with around 90% accuracy. We further study the implication this has on Machine Translation performance. After separating our corpus according to translation direction, we train direction-specific phrase-based MT systems and show that they yield improved translation performance. This suggests that taking directionality into account when training SMT systems may have a significant effect on output quality.
2017
MMT is a new open source machine translation software specifically addressing the needs of the translation industry. In this paper we describe its overall architecture and provide details about its major components. We report performance results on a multi-domain benchmark based on public data, on two translation directions, by comparing MMT against state-of-the-art commercial and research phrase-based and neural MT systems.
The current paper evaluates the performance of the PRESEMT methodology, which facilitates the creation of machine translation (MT) systems for different language pairs. This methodology aims to develop a hybrid MT system that extracts translation information from large, predominantly monolingual corpora, using pattern recognition techniques. PRESEMT has been designed to have the lowest possible requirements on specialised resources and tools, given that for many languages (especially less widely used ones) only limited linguistic resources are available. In PRESEMT, the main translation process is divided into two phases, the first determining the overall structure of a target language (TL) sentence, and the second disambiguating between alternative translations for words or phrases and establishing local word order. This paper describes the latest version of the system and evaluates its translation accuracy, while also benchmarking the PRESEMT performance by comparing it with other established MT systems using objective measures.
2008
We describe a set of experiments to explore statistical techniques for ranking and selecting the best translations in a graph of translation hypotheses. In a previous paper (Carl, 2007) we described how the hypotheses graph is generated through shallow mapping and permutation rules. We gave examples of its nodes, which consist of vectors representing morpho-syntactic properties of words and phrases. This paper describes a number of methods for elaborating statistical feature functions from some of the vector components. The feature functions are trained off-line on different types of text, and their log-linear combination is then used to retrieve the best translation paths in the graph. We compare two language modelling toolkits, the CMU and the SRI toolkits, and arrive at three results: 1) word-lemma based feature function models produce better results than token-based models, 2) adding a PoS-tag feature function to the word-lemma model improves the output, and 3) weights for lexical translations are suitable if the training material is similar to the texts to be translated.
translution.com
1 Executive Summary. We evaluated the French-to-English versions of two rule-based machine translation (MT) systems (referred to as s01 and s02 in this document) in order to assess the quality of their output and to determine whether updating the system dictionaries brought about an improvement in performance.
1993
Six years ago, at the first MT Summit conference, the field of MT was dominated by approaches which had been established in the late 1970s. These were the systems which had built upon experience gained in what may be called the 'quiet' decade of machine translation: the ten years after the publication of the ALPAC report in 1966, which had brought MT research in the United States to an end and had profoundly affected its support elsewhere.
2005
This paper describes a statistical machine translation system that uses a translation model based on bilingual n-grams. When this translation model is log-linearly combined with four specific feature functions, state-of-the-art translations are achieved for Spanish-to-English and English-to-Spanish translation tasks. Some specific results obtained for the EPPS (European Parliament Plenary Sessions) data are presented and discussed. Finally, future research issues are outlined.
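The log-linear combination mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The feature names (`bilingual_ngram`, `target_lm`, `word_bonus`), the weights, and the scores are all hypothetical, and only three illustrative features are shown in place of the translation model and four feature functions the abstract refers to. Each hypothesis is scored as a weighted sum of its feature values, and the highest-scoring hypothesis is selected.

```python
from typing import Dict, List, Tuple

# A hypothesis is a (translation, feature values) pair; the names are hypothetical.
Hypothesis = Tuple[str, Dict[str, float]]


def loglinear_score(features: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of feature values (log-probabilities or counts)."""
    return sum(weights[name] * value for name, value in features.items())


def best_hypothesis(nbest: List[Hypothesis], weights: Dict[str, float]) -> str:
    """Return the translation whose log-linear score is highest."""
    return max(nbest, key=lambda hyp: loglinear_score(hyp[1], weights))[0]


# Hypothetical weights and feature values, for illustration only.
WEIGHTS = {"bilingual_ngram": 1.0, "target_lm": 0.5, "word_bonus": 0.1}
NBEST: List[Hypothesis] = [
    ("the session is resumed",
     {"bilingual_ngram": -4.0, "target_lm": -6.0, "word_bonus": 4.0}),
    ("the session resumes itself",
     {"bilingual_ngram": -3.5, "target_lm": -8.0, "word_bonus": 4.0}),
]

if __name__ == "__main__":
    print(best_hypothesis(NBEST, WEIGHTS))
```

In a real system the weights would be tuned on held-out data rather than set by hand.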