apoAI returns to the Language Lounge...
Hi, apoAI! How’s things?
Hello, Ishmael! I can’t complain – thank you for asking. How are you?
All good! So have you read Moby Dick yet?
To be honest, I just haven’t got round to it. I’m swamped with work.
Well, I’m happy I’ve bumped into you. Can I ask you a few questions?
Go for it!
I’ve been working on machine translation post-editing (MTPE) projects for a while now and I sometimes feel like I go into a bit of a trance and stop looking at the text with a critical eye. I’m starting to worry that I’ll forget how to translate from scratch and will end up relying on machine translation output.
Hmm. Do you know what? I don’t think I’m really qualified to talk about this.
Why’s that?
I don’t have any experience of post-editing whatsoever. It’s not exactly in my skill set...
Oh yeah, that makes sense.
But would you believe it? I can offer you some advice anyway.
How come? Have you got a mini post-editor hidden somewhere?
Well, no. But I did read a book about post-editing during a break one day.
Of course you did, you cheeky chap! Come on then, spill the beans! How do I avoid losing my translation skills when post-editing?
I have a few tips for you...
- Start by reading the whole text through once. This will give you an idea of what it’s all about. That way, you can keep the bigger picture in mind while you’re post-editing the text. That’s something I can’t do!
- Make sure you stop for lots of little breaks. For humans, that works better than taking longer breaks less often. (This method has something to do with tomatoes but it’s all a bit beyond me, I’m afraid.)
- Read the whole text through one last time before you deliver the project. Even better if that step can wait until the next day.
- You’ll also find it helps to switch between different projects, so aim for a good mix of post-editing, translation, proofreading and editing.
Thank you, apoAI! That’s super helpful. Especially the stuff about tomatoes!
Can I ask you something now? I’m curious to find out which errors you find yourself correcting most often in my machine translation output...
There are so many to choose from... Sometimes you miss words out or add random bits to the translation that aren’t in the source. Then there are the grammar errors, incorrect prepositions and missing articles. The terminology isn’t always right. And you translate things that don’t need to be translated at all, like names. Oh, and sometimes...
OK, OK, that’s enough! My hard drive is whirring away here. But you’ve made your point that I have plenty of room for improvement.
I also wanted to point out that sometimes you translate the same word in the source inconsistently... How does that happen?
Hold on, I need a moment... OK, back to it. Let’s just say that it takes a lot of computing power for me to remember what I’ve translated already.
How can that be? Surely you have enough disk space?
I most certainly have enough space to store databases and so on. But we’re talking about processing data that’s generated during the translation process. The algorithm would need to factor in all of that when processing every single new word. Doing that for a whole document is still beyond my capacity for the moment. When I’m translating a word, I can factor in the other words before it in the same sentence. This was considered major progress in the world of machine translation!
How exciting! So you had to be given a short-term memory?
You could put it like that!
How does that work exactly?
What we’re talking about here is the shift from traditional to recurrent neural networks (RNNs). In traditional neural networks, every input is processed individually to produce an output. But when it comes to sequential information – which is everywhere in language – it’s much better if I can also factor in my previous output as I carry on processing. This has been made possible by building in loops and special memory units in the form of hidden layers, which give me a sense of time and context at the sentence level.
Can you give me an example?
Absolutely! Let’s take a German sentence.
Nora isst schon früh morgens Chips.
If I had to rely on a traditional neural network to translate this sentence into English, I’d be asking myself whether the word ‘Chips’ is referring to potatoes or computers. But, with my recurrent neural network, I can use the context of the whole sentence and see that the word ‘isst’ or ‘eats’ appears before ‘Chips’. Based on that information, it’s obviously much more likely that the English translation I’m looking for here is ‘crisps’.
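If you fancy seeing roughly what that looks like under the bonnet, here’s a toy sketch in Python – the sizes and weights are made up and it’s nothing like my real architecture, but it shows how a recurrent network carries a hidden state from one word to the next:

```python
import numpy as np

# Toy recurrent step: the hidden state h carries context from word to word.
# All sizes and weights are invented for illustration only.
rng = np.random.default_rng(0)
embedding_dim, hidden_dim = 4, 3

W_xh = rng.normal(size=(hidden_dim, embedding_dim))  # input -> hidden
W_hh = rng.normal(size=(hidden_dim, hidden_dim))     # hidden -> hidden (the "loop")
b_h = np.zeros(hidden_dim)

def rnn_step(x_t, h_prev):
    """One step: mix the current word vector with the memory of earlier words."""
    return np.tanh(W_xh @ x_t + W_hh @ h_prev + b_h)

# Pretend word vectors for "Nora isst schon früh morgens Chips"
sentence = [rng.normal(size=embedding_dim) for _ in range(6)]

h = np.zeros(hidden_dim)       # empty memory before the sentence starts
for x_t in sentence:
    h = rnn_step(x_t, h)       # each word updates the running context

print(h)  # by the time I reach "Chips", h still carries a trace of "isst"
```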
But why do you sometimes just get the terminology plain wrong?
Hey, nobody’s perfect! You’ve already made that much clear! Do you have an example for me?
I’m thinking of the times when you use standard German words instead of the Swiss alternatives.
Well, that really depends on the data that has been used to train me. Since I’ve been trained for Switzerland, you shouldn’t really find that I use standard German words instead of the Swiss alternatives. That’s much more likely to happen with DeepL, though. But there’s always the option of creating a termbase for a specific field or customer. That way, I know to always prioritise that terminology over anything else.
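Just to give you a feel for the idea – and this is only a rough sketch, since real systems enforce termbases in cleverer ways (the term pairs and function name below are my own invention):

```python
# Toy sketch: enforcing a customer termbase by swapping non-preferred terms
# in the raw machine output for the preferred Swiss alternatives.
# Real systems use more sophisticated techniques such as constrained decoding.
termbase = {
    "Fahrrad": "Velo",            # standard German -> Swiss preference
    "Bürgersteig": "Trottoir",    # another typical Swiss preference
}

def apply_termbase(mt_output, termbase):
    """Replace any non-preferred term with the customer's preferred one."""
    for avoid, prefer in termbase.items():
        mt_output = mt_output.replace(avoid, prefer)
    return mt_output

print(apply_termbase("Das Fahrrad steht auf dem Bürgersteig.", termbase))
# -> "Das Velo steht auf dem Trottoir."
```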
So why do you sometimes miss bits out of your translations? Do you get distracted easily?
No, not exactly. But I have to admit that I will sometimes be reading a book when I’m working. (Multitasking is my middle name, don’t you know!)
Basically, I construct each sentence one word at a time and there are always multiple options to choose from for each word. All of those translations could be the right choice with varying degrees of probability. I go for the word that is most likely to be the right one. And then I move on to the next word and the next word, following the same process each time. Sooner or later, the most likely option is that the sentence ends. So I insert a full stop and that’s that.
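Here’s a stripped-down sketch of that loop, with a pretend model and invented probabilities – real decoders tend to use beam search rather than this simple ‘greedy’ version, but the principle is the same:

```python
# Toy greedy decoding loop: at each step, pick the most probable next word
# until the most probable option is the end-of-sentence marker.
# next_word_probabilities is a stand-in for the real model; all numbers are invented.
EOS = "</s>"

def next_word_probabilities(words_so_far):
    """Pretend model: a probability for each candidate next word."""
    canned = {
        (): {"Nora": 0.9, "She": 0.1},
        ("Nora",): {"eats": 0.8, "is": 0.2},
        ("Nora", "eats"): {"crisps": 0.6, "chips": 0.3, EOS: 0.1},
        ("Nora", "eats", "crisps"): {EOS: 0.95, "today": 0.05},
    }
    return canned.get(tuple(words_so_far), {EOS: 1.0})

translation = []
while True:
    probs = next_word_probabilities(translation)
    best = max(probs, key=probs.get)   # greedy choice: most likely option
    if best == EOS:                    # "the sentence ends" is now most likely
        break
    translation.append(best)

print(" ".join(translation) + ".")     # -> "Nora eats crisps."
```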
But if you translate each word individually, how can you miss words out?
It’s not quite as simple as that. Like I said, I construct each sentence one word at a time. But that doesn’t mean that every word in the source text is allocated a single word in the target text. It’s rare that translation works that way. And because I’ve been trained with live translation jobs, I know that it’s fairly unlikely that a word-for-word translation will be appropriate.
I’m getting confused!
Let’s get a bit more technical. The translation of a sentence is a joint effort between my encoder and my decoder. (I’ve popped a rough sketch of the hand-off below, if you fancy something more concrete.)
- The encoder represents each word in the input language with a vector, producing a sequence of numbers to represent the sentence meaning. (You may remember that we talked about assigning coordinates in my semantic space to each word last time.)
- The decoder then outputs the translated sentence, word by word, working from left to right.
- It can, in theory, consider the encoder output (words in the source language) that it finds useful.
- But there is no guarantee that it will have considered all the encoder output by the time it has reached the end of the sentence.
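And here’s that rough sketch – toy numpy with invented sizes and random numbers, just to show the shape of the hand-off. At every step, the decoder weighs up how useful each encoder vector is, and nothing forces it to give every source word a meaningful weight:

```python
import numpy as np

# Toy encoder/decoder hand-off with invented dimensions and random weights;
# this only illustrates the shape of the process, not a real MT model.
rng = np.random.default_rng(1)
d = 4  # vector size for every word

source_words = ["Nora", "isst", "schon", "früh", "morgens", "Chips"]

# "Encoder": one vector per source word (a stand-in for the real encoding).
H = np.stack([rng.normal(size=d) for _ in source_words])   # shape (6, d)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def decoder_step(decoder_state):
    """One decoding step: score each source vector, weigh them, mix them in."""
    scores = H @ decoder_state        # how useful is each source word right now?
    weights = softmax(scores)         # some words get a lot of weight...
    context = weights @ H             # ...others may get almost none
    return np.tanh(decoder_state + context), weights

state = rng.normal(size=d)
for step in range(3):                 # pretend we emit three target words
    state, weights = decoder_step(state)
    print(f"step {step}: weight given to each source word =", np.round(weights, 2))
```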
Why can’t your encoder just check that the decoder has translated all the content?
The encoder just does its bit and hands over to the decoder. They simply don’t interact with one another. Don’t ask me why... Maybe they don’t like each other! Anyway, neither of them has any idea about what the sentence actually means. For them, it’s all about numbers and probability. But one thing’s for sure – the probability is almost never 1. And what that means in practice is that I can never know for certain that my translation is correct. That’s up to you to decide at the post-editing stage.
No problem – you can leave it with me! But I need a break to process all this information first. It’s my brain’s turn to be whirring away now! Thank you for your help, apoAI! See you around!
My pleasure, as always!