With so many different machine translation (MT) software options on the market, it can be hard to understand everything that is on offer, much less evaluate the pros and cons of each one.
Much work has been done over the years to assess the quality of MT. One early example is round-trip translation, in which text is translated from the source language to the target language and back again with the same engine, but research shows this method is a poor predictor of performance. Meanwhile, extensive studies in which people compared the output of human and machine translators enabled the creation of highly reliable automated metrics for evaluating MT quality. More than a hundred such metrics are now available. Moreover, as MT technology improves, still more metrics are needed to account for the nuances of different approaches to quality assessment.
To assess the competency of various AI engines as part of its development of Leveraged AI, TBSJ required a set of tools that provided access to the desired metrics and eased the process of quality evaluation.
When nothing on the market fit the bill, the technology team began building its own software, says Paul O’Hare, CTO and co-founder. The goal was to include not only specific automated metrics but also human evaluation tools, since human feedback is crucial for MT evaluation.
The resulting automatic tool is Sanbi, which TBSJ is making public this month. Developed by technologists who are also experienced translators and linguists, the tool automatically evaluates and compares the quality of one or more machine-translated documents. The name Sanbi comes from the Japanese words san (算), meaning calculation, and bi (比), meaning comparison, because the tool generates its comparative scores through calculation. It is pronounced like a portmanteau of “sun” and “be” in English.
Sanbi’s sister tool, Ginbi, is similarly named based on ginmi (吟味), meaning close examination, and the same bi for comparison, as this tool provides for careful human evaluation of MT output. We hope to release Ginbi in the upcoming months.
How does Sanbi work?
First, prepare the files you want to evaluate, making sure to include a translation by a human translator as well as one or more machine-translated versions. You can upload them into Sanbi with peace of mind because neither the tool nor TBSJ records or stores your files or your calculated scores. Every part of the process is anonymous, private, and secure.
Once the calculations are done, a report is presented for each machine translation, with scores on a scale from 0 (worst) to 1 (best) for a number of quality evaluation metrics. In its initial release, Sanbi provides scores based on the BLEU and RIBES metrics, and we will expand to include other important metrics in the coming weeks.
Evaluating with multiple metrics gives a more comprehensive picture because each metric reflects different aspects of translation quality, says TBSJ Chief Scientist Yury Sharshov. BLEU, for example, is the industry standard and the most widely used MT quality assessment metric. RIBES, meanwhile, was designed for translation between languages with very different word orders, making it particularly suited to assessing translations involving Asian languages.
Sanbi also provides scores at the sentence level, in case users wish to delve deeper into how each metric has assessed particular segments.
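To make the 0-to-1 scoring scale concrete, here is a minimal sketch of the idea behind BLEU-style scoring for a single sentence pair: clipped n-gram precision combined with a brevity penalty. This is a simplified illustration using only the Python standard library, not Sanbi’s implementation (real BLEU is computed over whole corpora with careful tokenization), and the function and variable names are our own.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Return a Counter of all n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def simple_bleu(reference, hypothesis, max_n=4):
    """Single-sentence BLEU sketch: clipped n-gram precision for
    n = 1..max_n, combined by geometric mean and scaled by a brevity
    penalty. Returns a score between 0 (worst) and 1 (best)."""
    ref, hyp = reference.split(), hypothesis.split()
    log_precisions = []
    for n in range(1, max_n + 1):
        ref_counts, hyp_counts = ngrams(ref, n), ngrams(hyp, n)
        overlap = sum((hyp_counts & ref_counts).values())  # clipped matches
        total = max(sum(hyp_counts.values()), 1)
        if overlap == 0:
            return 0.0  # any zero precision drives the geometric mean to 0
        log_precisions.append(math.log(overlap / total))
    # The brevity penalty punishes hypotheses shorter than the reference.
    bp = 1.0 if len(hyp) >= len(ref) else math.exp(1 - len(ref) / len(hyp))
    return bp * math.exp(sum(log_precisions) / max_n)

reference = "the cat sat on the mat"
print(simple_bleu(reference, "the cat sat on the mat"))  # 1.0
print(simple_bleu(reference, "the cat sat on a mat"))    # partial credit, between 0 and 1
```

An exact match scores 1.0, a near match scores somewhere in between, and a translation sharing no n-grams with the reference scores 0.0, mirroring the worst-to-best scale of Sanbi’s reports.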
For those less familiar with assessing MT software, Sanbi includes a short description of each metric along with links to further information.
Who is Sanbi for?
With applications in academia, the translation industry, and international business, we expect the tool to be in demand among anyone who needs to understand the quality of different machine translations.
Perhaps you need to choose MT software for your company or you want to know if it is worth investing in a software update rather than staying with the current version. You may be developing your own MT technology and want to check if you are moving in the right direction with the changes you are making.
Whether you are an expert or a newcomer, if you are shopping for providers, Sanbi can be a useful asset. As it provides a fast, consistent, and objective evaluation of machine-translated documents, it is the first step to comparing the quality of software in the marketplace.
Of course, MT software is only the start of the process, as skilled translators hold the key to polishing MT output to perfection. But choosing the best software from the outset with the help of TBSJ technology can give you the best start in producing high-quality translation.