Here we’ll analyze the results our model produced and discuss the potential of a DL-based approach for universal translators.
Introduction
Google Translate works so well, it often seems like magic. But it’s not magic - it’s Deep Learning!
In this series of articles we’ll show you how to use Deep Learning to create an automatic translation system. This series can be viewed as a step-by-step tutorial that helps you understand and build a Neural Machine Translation system.
This series assumes that you are familiar with the concepts of Machine Learning: model training, supervised learning, neural networks, as well as artificial neurons, layers, and backpropagation.
What we achieved in the previous article was good, even if it's not quite ready to replace Google Translate. In this article, we’ll train and test our translation models on additional languages.
Translating multiple languages
The functions we create here will support automatic translation from one language to another using the model we developed earlier in this series.
You are welcome to download the code as a Google Colab file.
Our application will be able to create a model from a tab-separated parallel corpus, such as the ones from the Tatoeba project.
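If you’d like to peek at such a corpus before training, here is a minimal loading sketch. It assumes the usual layout of the Tatoeba/Anki files (English sentence, a tab, the translation, and optionally an attribution column); the load_corpus helper is just an illustration, not part of our training code.

# Load a tab-separated parallel corpus (e.g. rus.txt) into sentence pairs.
def load_corpus(path, max_pairs=None):
    pairs = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip("\n").split("\t")
            if len(parts) >= 2:
                # keep only the source sentence and its translation
                pairs.append((parts[0], parts[1]))
            if max_pairs and len(pairs) >= max_pairs:
                break
    return pairs

pairs = load_corpus("rus.txt")
print(len(pairs), "sentence pairs")
print(pairs[:3])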
Looking at our code, you'll see we've grouped everything needed to train the model into a train_model function, and everything needed to translate into a translate function that takes a file of text in the model's input language and translates it into the model's output language.
Let’s run our tool with a file containing some English text we wish to translate: test.txt, which contains the following lines (the snippet below shows one way to create the file):
this is a test
hello
can you give me the bill please
where is the main street
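If you’re following along in Colab, you can create this file in a couple of lines (a minimal sketch; the file name just has to match the one passed to translate below):

# Write the four English test sentences to test.txt for translate() to read.
test_sentences = [
    "this is a test",
    "hello",
    "can you give me the bill please",
    "where is the main street",
]
with open("test.txt", "w", encoding="utf-8") as f:
    f.write("\n".join(test_sentences))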
translate("rus.txt","test.txt","model12")
We get the following output:
   input                            model translation
0  this is a test                   это тест
1  hello                            привет
2  can you give me the bill please  не можете мне пожалуйста
3  where is the main street         где здесь улице
The result is correct except for the third line.
Now let’s train and then use a French translator:
train_model("fra.txt","model_fr")
translate("fra.txt","test.txt","model_fr")
   input                            model translation
0  this is a test                   c'est un d'un
1  hello
2  can you give me the bill please  tu me donner la s'il te prie
3  where is the main street         où est la rue est rue
The result is pretty bad. Only the fourth sentence is translated in a somewhat intelligible way. The likely reasons are the complexity of the French language and the fact that the training dataset was relatively small compared to the Russian one.
Here is the result of automatic translation from English to German:
   input                            model translation
0  this is a test                   das ist eine test
1  hello
2  can you give me the bill please  könntest sie mir die rechnung geben
3  where is the main street         wo ist die straße
This is almost 100% perfect.
Finally, let’s see how well the same approach translates English to Dutch:
   input                            model translation
0  this is a test                   dit is een nationale
1  hello                            hallo
2  can you give me the bill please  kunt je me instapkaart geven
3  where is the main street         waar is de bushalt
It’s not perfect: "Where is the main street" is translated as "Where is the bus stop?", and "can you give me the bill please" is translated as "can you give me the boarding pass".
As you can see, we have very different results depending on the language and the size of the training dataset.
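One quick way to see this is to count the sentence pairs in each corpus, since every line of a Tatoeba file is one pair (rus.txt and fra.txt appear above; deu.txt and nld.txt follow the usual Tatoeba naming and are assumptions here):

# Count sentence pairs per training corpus; more pairs generally
# means more material for the model to learn from.
for path in ["rus.txt", "fra.txt", "deu.txt", "nld.txt"]:
    try:
        with open(path, encoding="utf-8") as f:
            print(path, sum(1 for _ in f), "pairs")
    except FileNotFoundError:
        print(path, "not downloaded")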
Next Steps
As we've seen, we were able to build a very good machine translation system without writing thousands of lines of code or spending thousands of dollars on GPUs to train our model. Of course, as with most deep learning tasks, the bigger your training dataset (and the more time you can spend training), the more accurate your translation models will be.
There are many ways to build ML systems for machine translation. We just explored one of them. Alternatively, you can use convolutional neural networks (CNNs) instead of RNNs, or software like Moses, which combines statistical machine translation with deep learning models.
Now that you've seen AI language translation in action, you might want to try AI translation using Transformers. Transformers are a state-of-the-art, fully attention-based approach to natural language processing tasks: unlike the recurrent models we've been creating, they don't process the input one token at a time. Although Transformers are newer and aren't backed by as much research as sequence-based AI translation, it's starting to look like they will be the future of many natural language processing tasks.
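If you want a quick taste before diving in, here is a minimal sketch using the Hugging Face transformers library and one of its pretrained translation models; neither the library nor the Helsinki-NLP/opus-mt-en-de model is part of the code we built in this series.

# pip install transformers sentencepiece
from transformers import pipeline

# Load a pretrained English-to-German Transformer translation model.
translator = pipeline("translation", model="Helsinki-NLP/opus-mt-en-de")

result = translator("can you give me the bill please")
print(result[0]["translation_text"])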
Finally, if you've enjoyed what you've learned why not take your new skills, create something great, then write about it and share it on CodeProject?