
Computer-based Assistance for the Translator: Machine Translation

Marcos Chiquetto mar 28 2023

A lot of clients ask about the current state of machine translation (MT). In this post, I’ll perform a quick practical analysis of this resource, using Google Translate. The analysis will focus on three aspects: confidentiality, quality and cost.

Confidentiality

With any MT process, the text is sent to the service provider’s system (Google, in this example) to be translated. This means that the entire text travels over the Internet and is stored on another company’s servers. Technically speaking, there is nothing to stop someone at that company from using the text for any purpose, or to prevent a hacker from intercepting the communication. In practice, users all over the world trust these providers and believe that their communications will not be intercepted, but it is important to understand the process and its inherent risks. If a translator has signed a confidentiality agreement with a client and intends to use MT, he or she must disclose this to the client. Otherwise, the confidentiality agreement is being broken.

Quality

In order to evaluate the current level of quality of MT, let’s analyze some practical cases using the following procedure:

1. Translate a text in English into another language;

2. Back-translate the result, or in other words, translate it back into English.

If the final result is exactly the same as the original, we can conclude that the tool worked perfectly for that text. If there are differences, we can identify the kinds of problems that occurred.
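The procedure above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `round_trip`, `normalize` and `fake_translate` are hypothetical names, and `fake_translate` is a lookup table standing in for a real MT service. The Portuguese sentence in it is my own illustrative translation, not output captured from Google Translate.

```python
from typing import Callable

# Hypothetical signature: translate(text, source_lang, target_lang) -> str.
Translator = Callable[[str, str, str], str]


def normalize(text: str) -> str:
    """Collapse whitespace so that line wrapping alone is not a difference."""
    return " ".join(text.split())


def round_trip(text: str, translate: Translator,
               source: str = "en", pivot: str = "pt") -> dict:
    """Translate source -> pivot, then pivot -> source, and compare."""
    forward = translate(text, source, pivot)   # step 1: translation
    back = translate(forward, pivot, source)   # step 2: back-translation
    return {
        "original": text,
        "forward": forward,
        "back": back,
        "identical": normalize(text) == normalize(back),
    }


ORIGINAL = "When you see the result on the screen, press the “Enter” key. Then turn the machine off."
# Assumption: an illustrative Portuguese rendering, not real MT output.
PORTUGUESE = "Quando você vir o resultado na tela, pressione a tecla “Enter”. Em seguida, desligue a máquina."

# Stand-in for an MT service: a fixed lookup table in both directions.
_TABLE = {
    ("en", "pt"): {ORIGINAL: PORTUGUESE},
    ("pt", "en"): {PORTUGUESE: ORIGINAL},
}


def fake_translate(text: str, source_lang: str, target_lang: str) -> str:
    return _TABLE[(source_lang, target_lang)][text]


result = round_trip(ORIGINAL, fake_translate)
print(result["identical"])
```

To run the same check against a real service, `fake_translate` would be replaced by a function with the same three arguments that calls that service.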

We will use Portuguese as the target language.

We start by translating a very simple passage of technical text, where the essential elements such as the subject, verb and object are explicit and the words are used in accordance with their most common meanings.

Let’s try the following text:

When you see the result on the screen, press the “Enter” key. Then turn the machine off.

The following table shows the two steps involved in our MT translation experiment:

The result turned out perfectly. The back-translation from Portuguese is the same as the original text. Therefore, we can formulate our first observation:

MT can translate very simple texts perfectly.

In practice, not all texts are so simple. For example, marketing fliers for high-tech products are particularly difficult to translate, because they mix the informal, everyday language of marketing with complex technological concepts.

Let’s see how MT does with an old press release, from 2019, promoting the launch of a line of products from a major notebook manufacturer. I’ve changed the names of the company and the product to fictitious ones, and left the rest of the text unchanged:

Stargraphics Announces PowerNote, a Full Product Portfolio Designed for Creators.
PowerNote is a new brand of premium high-end desktops, notebook PCs, and monitors designed for professional and amateur creators. Timeless design language, silenced processing, and extremely color-accurate displays distinguish these creator PCs and monitors. The PowerNote 500 high-end desktop has an 8-core and 16-thread 9th Gen Intel® Core™ i9-9900K processor that hits up to 5.0GHZ, and up to NVIDIA Quadro RTX 4000 GPUs.

Just as we did with the previous analysis, let’s translate the English text into Portuguese and back-translate it into English, using MT both times.

First, let’s look at the sentence that opens the text:

The result was excellent. The final sentence contains just one word that is different from the original: instead of “full” we have “complete”, which, in this case, is a synonym. The work done by MT was perfect, confirming that this tool works well with simple texts.
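A comparison like this one can also be made mechanically. The sketch below uses Python's standard `difflib` module to list the words that differ between an original and its back-translation. The function name `word_changes` is hypothetical, and the back-translated sentence is reconstructed from the description above (one word changed), so its capitalization is an assumption; the comparison ignores case for that reason.

```python
import difflib


def word_changes(original: str, back: str) -> list:
    """Return (operation, original words, back-translated words) for each difference."""
    a = original.split()
    b = back.split()
    # Compare lowercased words so that capitalization is not reported as a change.
    matcher = difflib.SequenceMatcher(
        None, [w.lower() for w in a], [w.lower() for w in b], autojunk=False
    )
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            changes.append((tag, " ".join(a[i1:i2]), " ".join(b[j1:j2])))
    return changes


original = "Stargraphics Announces PowerNote, a Full Product Portfolio Designed for Creators."
# Assumption: reconstructed from the description, not copied from the MT output.
back_translation = "Stargraphics Announces PowerNote, a Complete Product Portfolio Designed for Creators."

for change in word_changes(original, back_translation):
    print(change)
```

A list like this only shows where the two texts differ. Deciding whether a difference is a harmless synonym or a real error still takes a human reader.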

Let’s go on to the next sentence:

The final back-translation is very true to the original. The term “notebook PCs” was translated into Portuguese as “notebooks”, which is correct, and it appears this way in the back-translation. However, the word “premium”, indicating something of very high quality, simply disappeared in the translation to Portuguese and, consequently, there is no reference to high quality in the final back-translated text. This is a translation error.

This part of the text is very simple, with no ambiguity that might lead MT to make a mistake. Thus, we can make a second observation:

Even with simple texts, MT can make a mistake.

Another point worth noting is the term “high-end”, which was translated into Portuguese as “última geração” and reappears in the back-translation as “next generation”. How did this happen? I suppose that many translators have used “última geração”, literally “latest generation”, as a translation for “high-end”, and MT simply learned this. We can’t call this a mistake, but a revision would be required to check whether this is really the desired choice.

Therefore, here we have another important observation:

MT tends to adopt patterns based on the most commonly used translations, and these patterns don’t always coincide with the choices a translator would make in a given translation.

Let’s go on to the next sentence:

Here we encounter two significant errors:

– The term “extremely color-accurate displays” was translated into Portuguese as “monitores extremamente precisos em cores”. The meaning can be recovered, but the phrase is poorly constructed and ambiguous: it is not clear whether “em cores” refers to “monitores” or to “precisos”. In the back-translation, MT assumed that “em cores” referred to “monitores” and produced “extremely accurate color displays”, which gives the passage an entirely different meaning from the English original. This problem would not have slipped through if the first translation had been revised and the poorly constructed phrase rewritten.

This leads to another observation:

MT doesn’t always produce well-written texts. In general, a human translator is needed to improve the quality of the written text.

Also in this sentence, the word “creator” was used unclearly in the original: “creator PCs and monitors”. Rereading the beginning of the document (Stargraphics Announces PowerNote …Designed for Creators), we understand that the products are designed for creators, but in this phrase that relationship isn’t clear. MT (so far) doesn’t have the intelligence required to consider the entire text in order to contextualize the sentence and translate this phrase with the understanding that these PCs and monitors are tools that enable people to create. In the back-translation, the PCs and monitors themselves appear to be creative, which is not the original idea.

Thus:

When translating a passage of text, MT can’t look for context throughout the rest of the text. If a phrase is poorly written, MT will probably make a mistake.

Now let’s look at the translation of this sentence:

In this case, the original sentence was extremely succinct, with few verbs, many adjectives modifying each noun, and a number of new technical terms. Note that “core” was translated as “testemunho”, which can mean testimony or demonstration, perhaps based on the use of the term “core” in mining, where it means a sample. The resulting translation is completely confusing. A human translator revising it would probably delete the entire translation and start from scratch.

In very dense texts that involve technological concepts that have not yet been established in dictionaries, the quality of an MT translation can be so poor that it ends up being of no help to the translator at all.

Below, we list the main points to be learned from our brief analysis:

· MT will generally translate simple texts correctly, that is, texts written in a direct way using words according to their most common meanings. However, it may make mistakes even in these kinds of texts;

· MT uses patterns in its translations based on the most common usages of words. When a word is used in a way that is different from its most common use, MT will probably translate it according to the way it is most commonly used;

· MT doesn’t always produce well-written texts. In general, a human translator needs to revise the text to make sure that the sentences are well written;

· When translating a section of a text, MT cannot look through the rest of the document for context. If a passage is poorly written, and a general comprehension of the text is required to make it clear, MT will probably make a mistake;

· In very dense texts that refer to technological concepts that have not yet been established in dictionaries, the quality of a translation using MT can be quite bad.

These conclusions were drawn from an analysis of a single marketing flier. If we had analyzed a translation of a literary work or a pharmaceutical protocol, perhaps some details of the conclusions presented here would have been different, but, in general, the overall picture would have been quite similar. It appears to me that these points sum up the current status of this type of tool pretty well.

Cost

For small texts, MT is free. For large volumes, like the ones we see at translation companies, the charge is calculated per word. Today, when using Google, this amounts to a fraction of a cent per word, which is extremely low compared with the cost of a human translator.

Conclusion

MT is available at a very low cost, but it is still no substitute for a human translator. Nevertheless, a large part of what it produces can be used, primarily with simple texts. Thus, these days, a hybrid solution is often used: run MT to get an initial translation, and then pay a regular team of translators to correct and revise the work. Since the professionals start with something that is already partially done, the cost to the client is lower.

Depending on the kind of text, the reduction in cost achieved by using MT will be more or less significant.

· For materials with simpler text, such as technical manuals or contracts, MT is very useful, and results in a significant reduction in cost.

· For more complex materials, such as marketing texts, MT can require so much revision that the amount of work and the cost can end up being as much as a traditional translation.
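The trade-off described above can be put into a rough formula. The sketch below is only an illustration: `hybrid_cost`, `human_cost` and every rate in it are hypothetical values chosen for the example, not prices quoted by Google or by any translation company. The parameter `revision_effort` is an assumed share of a full human translation's work that the revision still requires.

```python
def human_cost(words: int, human_rate: float) -> float:
    """Cost of a traditional translation, charged per word."""
    return words * human_rate


def hybrid_cost(words: int, human_rate: float, mt_rate: float,
                revision_effort: float) -> float:
    """Cost of MT followed by human revision.

    revision_effort: 0.0 means nothing needs fixing,
    1.0 means the reviser redoes the whole translation.
    """
    if not 0.0 <= revision_effort <= 1.0:
        raise ValueError("revision_effort must be between 0 and 1")
    return words * (mt_rate + human_rate * revision_effort)


# All numbers below are hypothetical, for illustration only.
WORDS = 10_000
HUMAN_RATE = 0.10   # currency units per word, assumed
MT_RATE = 0.0001    # "a fraction of a cent" per word, assumed

scenarios = {
    "simple text (manual, contract)": 0.4,
    "complex text (marketing)": 1.0,
}

print("traditional translation:", human_cost(WORDS, HUMAN_RATE))
for label, effort in scenarios.items():
    print(label, hybrid_cost(WORDS, HUMAN_RATE, MT_RATE, effort))
```

With these assumed numbers, the saving comes almost entirely from reduced revision effort. When the revision effort approaches 1.0, the hybrid route costs about the same as a traditional translation, which matches the second bullet above.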

In addition, every company should evaluate the impact of a possible loss of confidentiality before it decides to use MT or allows its translation supplier to use it. If a translation supplier has signed a contract with a confidentiality clause and intends to use MT, it must explain the process to the client and request authorization to use this tool.

Well, with that we end our brief analysis. Any comments are more than welcome.
