Swedish Language Translation Into English – Breaking down language barriers through machine translation (MT) is one of the most important ways to bring people together, provide authoritative information about COVID-19, and keep people safe from harmful content. Today, we power an average of 20 billion translations every day on the Facebook News Feed, thanks to our recent developments in low-resource machine translation and recent advances in translation quality evaluation.
Typical MT systems require building separate AI models for each language and each task, but this approach does not scale effectively on Facebook, where people post content in over 160 languages across billions of posts. Advanced multilingual systems can handle multiple languages at once, but they compromise on accuracy by relying on English data to bridge the gap between the source and target languages. We need a multilingual machine translation (MMT) model that can translate between any pair of languages to better serve our community, nearly two-thirds of which use a language other than English.
As the culmination of years of MT research at Facebook, we’re excited to announce a major milestone: the first single massive MMT model that can directly translate between any pair of 100 languages, in either direction, without relying solely on English-centric data. Our single multilingual model performs as well as traditional bilingual models and achieves an improvement of 10 BLEU points over English-centric multilingual models. Using new mining strategies to create translation data, we built the first truly many-to-many dataset, with 7.5 billion sentences for 100 languages. We used several scaling techniques to build a universal model with 15 billion parameters that captures information from related languages and reflects a more diverse set of scripts and morphology. We are releasing this work as open source.
One of the biggest hurdles in building a many-to-many MMT model is curating large volumes of high-quality sentence pairs (also known as parallel sentences) for arbitrary translation directions that do not involve English. It is much easier to find translations for Chinese to English and English to French than for, say, French to Chinese. In addition, the volume of data required for training grows quadratically with the number of languages we support. For example, if we need 10 million sentence pairs for each direction, then we need to mine roughly 1 billion sentence pairs for 10 languages and roughly 100 billion sentence pairs for 100 languages.
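The quadratic growth described above can be checked with a few lines of arithmetic. This is a minimal sketch that assumes every ordered (source, target) direction needs its own 10 million sentence pairs; the function names are illustrative and not part of any released tooling.

```python
def directions(num_languages: int) -> int:
    """Number of ordered (source, target) translation directions."""
    return num_languages * (num_languages - 1)


def pairs_needed(num_languages: int, pairs_per_direction: int) -> int:
    """Total sentence pairs if every direction needs its own training data."""
    return directions(num_languages) * pairs_per_direction


PAIRS_PER_DIRECTION = 10_000_000  # the 10 million figure used in the text

for n in (10, 100):
    total = pairs_needed(n, PAIRS_PER_DIRECTION)
    print(f"{n} languages: {directions(n)} directions, {total:,} sentence pairs")
```

The exact totals are 900 million pairs for 10 languages and 99 billion pairs for 100 languages, which the text rounds to 1 billion and 100 billion.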
We took on the ambitious challenge of building the most diverse many-to-many MMT dataset to date: 7.5 billion sentence pairs across 100 languages. This was made possible by combining complementary data mining resources that have been years in the making, including CCAligned, CCMatrix, and LASER. As part of this effort, we created the new LASER 2.0 and improved fastText language identification, which together improve mining quality and come with open source training and evaluation scripts. All of our data mining resources use publicly available data and are open source.
Facebook’s new many-to-many multilingual model is the culmination of several years of pioneering work in MT across breakthrough models, data mining resources, and optimization techniques. This timeline highlights several noteworthy achievements. In addition, we created our massive training dataset by mining ccNET, which builds on fastText, our pioneering work on processing word representations; our LASER library for CCMatrix, which embeds sentences in a multilingual embedding space; and CCAligned, our method for aligning documents based on URL matches. As part of this effort, we created LASER 2.0, which improves on previous results.
However, even with advanced underlying technologies such as LASER 2.0, mining large-scale training data for arbitrary pairs of 100 different languages (or 4,950 possible language pairs) is highly computationally intensive. To make this type of mining more manageable, we focused first on the languages with the most translation requests. Accordingly, we prioritized the mining directions with the highest quality data and the largest quantity of data. We avoided directions for which translation is statistically rarely needed, such as Icelandic-Nepali or Sinhala-Javanese.
Next, we introduced a new bridge mining strategy in which we group languages into 14 language groups based on linguistic classification, geography, and cultural similarities. People living in countries with languages of the same family tend to communicate more often and would benefit from high-quality translations. For example, one group would include languages spoken in India, such as Bengali, Hindi, Marathi, Nepali, Tamil, and Urdu. Within each group, we systematically mined all possible language pairs. To connect the languages of different groups, we identified a small number of bridge languages, which are usually one to three major languages of each group. In the example above, Hindi, Bengali, and Tamil would be the bridge languages for the Indo-Aryan languages. We then mined parallel training data for all possible combinations of these bridge languages. Using this technique, our training dataset ended up with 7.5 billion parallel sentences, corresponding to 2,200 directions. Since each mined language pair can be used to train both directions (e.g. en->fr and fr->en), our mining strategy lets us mine sparsely while still covering all 100×100 (9,900 in total) directions in one model.
To supplement the parallel data for low-resource languages with low translation quality, we used the popular back-translation method, which helped us win first place at the WMT 2018 and 2019 international machine translation competitions. For example, if our goal is to train a Chinese-to-French translation model, we first train a French-to-Chinese model and translate all of the monolingual French data to create synthetic, back-translated Chinese. We have found this method to be particularly effective at large scale, when hundreds of millions of monolingual sentences are translated into parallel datasets.
In our research setting, we used back-translation to supplement the training of directions we had already mined, adding the synthetic back-translated data to the mined parallel data. We also used back-translation to create data for previously unsupervised directions.
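The back-translation step can be sketched as follows. This is a toy illustration only: `toy_fr_to_zh` is a hypothetical stand-in for a trained French-to-Chinese model, and the sentences are made up. The point is the data flow, in which genuine target-side text is paired with a machine-generated source.

```python
from typing import Callable, List, Tuple


def back_translate(
    target_monolingual: List[str],
    reverse_model: Callable[[str], str],
) -> List[Tuple[str, str]]:
    """Build synthetic (source, target) pairs from target-side monolingual text.

    reverse_model translates target -> source (here French -> Chinese).
    Each synthetic source is paired with the genuine target sentence.
    """
    return [(reverse_model(sentence), sentence) for sentence in target_monolingual]


def toy_fr_to_zh(sentence: str) -> str:
    """Hypothetical stand-in for a trained French -> Chinese model."""
    lookup = {"bonjour": "你好", "merci": "谢谢"}
    return lookup.get(sentence, "<unk>")


mined_pairs = [("再见", "au revoir")]      # genuine mined zh -> fr data
french_monolingual = ["bonjour", "merci"]  # target-side monolingual text

synthetic_pairs = back_translate(french_monolingual, toy_fr_to_zh)
training_pairs = mined_pairs + synthetic_pairs  # supplement the mined data
```

The Chinese-to-French model is then trained on `training_pairs`. For a previously unsupervised direction, `mined_pairs` would simply be empty.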
Overall, the combination of our bridge strategy and back-translated data improved performance on the 100 back-translated directions by 1.7 BLEU on average compared with training on mined data alone. With a more robust, efficient, and high-quality training set, we had a strong foundation for building and scaling our many-to-many model.
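To make the bridge mining strategy concrete, the sketch below enumerates which language pairs would be mined for two small groups. The first group uses the Indian languages and bridge languages named in the text. The second group, its members, and its bridge language are hypothetical, since the actual 14 groupings are not listed here.

```python
from itertools import combinations
from typing import Dict, List, Set, Tuple


def bridge_mining_pairs(
    groups: Dict[str, List[str]],
    bridges: Dict[str, List[str]],
) -> Set[Tuple[str, str]]:
    """Return the unordered language pairs selected for mining."""
    pairs: Set[Tuple[str, str]] = set()
    # 1. Mine every pair of languages inside each group.
    for langs in groups.values():
        pairs.update(tuple(sorted(p)) for p in combinations(langs, 2))
    # 2. Mine every pair of bridge languages, which connects the groups.
    all_bridges = [lang for group in bridges.values() for lang in group]
    pairs.update(tuple(sorted(p)) for p in combinations(all_bridges, 2))
    return pairs


groups = {
    "indic": ["bn", "hi", "mr", "ne", "ta", "ur"],  # example from the text
    "germanic": ["da", "de", "nl", "sv"],           # hypothetical group
}
bridges = {
    "indic": ["hi", "bn", "ta"],  # bridge languages named in the text
    "germanic": ["de"],           # hypothetical bridge language
}

mined = bridge_mining_pairs(groups, bridges)
all_pairs = len(groups["indic"] + groups["germanic"]) * 9 // 2  # C(10, 2)
print(f"mined {len(mined)} of {all_pairs} possible pairs")
```

In this toy setup, Hindi-German is mined because both are bridge languages, while Marathi-Swedish is skipped. Each mined pair still yields training data for both directions.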
We also found impressive results in zero-shot settings, where no training data is available for a language pair. For example, if a model is trained on French-English and German-Swedish, it can translate directly between French and Swedish. In settings where our many-to-many model must translate zero-shot between non-English directions, it performed significantly better than English-centric multilingual models.
One of the challenges of multilingual translation is that a single model must capture information in many different languages and diverse scripts. To address this, we saw a clear benefit in scaling the capacity of our model and adding language-specific parameters. Scaling the model size is particularly helpful for high-resource language pairs because they have the most data with which to train the additional model capacity. Ultimately, we saw an average improvement of 1.2 BLEU across language directions when densely scaling the model size to 12 billion parameters, after which further dense scaling gave diminishing returns. The combination of dense scaling and sparse language-specific parameters (3.2 billion) allowed us to build an even better model with 15 billion parameters.
We compare our model with bilingual baselines and English-centric multilingual models. We start with a 1.2 billion parameter baseline with 24 encoder layers and 24 decoder layers and compare the English-centric models with our M2M-100 model. Then, moving from 1.2 billion to 12 billion parameters gives an improvement of 1.2 BLEU points.
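To give a feel for how layer count and layer width translate into parameter counts, here is a rough estimate for an encoder-decoder Transformer. Only the 24 encoder and 24 decoder layers come from the text. The model width, feed-forward width, and vocabulary size below are illustrative assumptions, and the formula ignores biases and layer normalization.

```python
def transformer_params(
    enc_layers: int,
    dec_layers: int,
    d_model: int,
    d_ff: int,
    vocab_size: int,
) -> int:
    """Approximate parameter count, ignoring biases and layer norms."""
    attention = 4 * d_model * d_model       # Q, K, V and output projections
    feed_forward = 2 * d_model * d_ff       # two linear maps per layer
    encoder_layer = attention + feed_forward
    decoder_layer = 2 * attention + feed_forward  # self- plus cross-attention
    embeddings = vocab_size * d_model       # one shared embedding table
    return enc_layers * encoder_layer + dec_layers * decoder_layer + embeddings


# Assumed dimensions (not stated in this article).
baseline = transformer_params(24, 24, d_model=1024, d_ff=8192, vocab_size=128_000)
wider = transformer_params(24, 24, d_model=4096, d_ff=16384, vocab_size=128_000)

print(f"baseline: {baseline / 1e9:.2f}B parameters")
print(f"wider:    {wider / 1e9:.2f}B parameters")
```

Under these assumptions the baseline lands close to the 1.2 billion figure, and widening the layers alone moves the count into the range of the 12 billion parameter model, because the per-layer cost grows with the square of the width.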
To increase the size of the model, we increased the number of layers in our Transformer networks and the width of each layer. We found that large models converged quickly and trained with high data efficiency. Notably, this many-to-many system is the first to use Fairscale, a new PyTorch library specifically designed to support pipeline and tensor parallelism. We built this general infrastructure to accommodate large-scale models that don’t fit on a single GPU by using model parallelism in Fairscale. We built on the ZeRO optimizer, intra-layer model parallelism, and pipeline model parallelism to train large models.
But it is not enough to simply scale models to billions of parameters. To be able to put this model into production in the future, we need to scale models as efficiently as possible with high-speed training. For example, much existing work uses multi-model ensembling, where multiple models are trained and applied to the same source sentence to produce a translation. To reduce the complexity and computation required to train multiple models, we explored multi-source self-ensembling, which translates the source sentence into multiple languages to improve translation quality.
We also built on our work with LayerDrop and Depth-Adaptive to jointly train a model with a common trunk and different sets of language-specific parameters. This approach suits many-to-many models because it provides a natural way to partition parts of the model by language pairs or language families. By combining dense scaling of model capacity with language-specific parameters (3B in total), we get the benefits of large models as well as the ability to learn specialized layers for different languages.
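The idea of jointly training shared layers with language-specific parameters can be illustrated with a toy routing example. Everything here is a simplification: the "layers" are plain scaling functions rather than Transformer layers, the class name is hypothetical, and routing by the target language's family is an assumption about one possible partitioning.

```python
from typing import Callable, Dict, List

Vector = List[float]
Layer = Callable[[Vector], Vector]


def scale_layer(factor: float) -> Layer:
    """Toy 'layer' that multiplies every activation by a fixed factor."""
    return lambda x: [factor * v for v in x]


class SharedTrunkModel:
    """Shared layers for all languages plus one specialized layer per family."""

    def __init__(
        self,
        trunk: List[Layer],
        family_layers: Dict[str, Layer],
        lang_to_family: Dict[str, str],
    ) -> None:
        self.trunk = trunk
        self.family_layers = family_layers
        self.lang_to_family = lang_to_family

    def forward(self, x: Vector, target_lang: str) -> Vector:
        for layer in self.trunk:  # parameters shared by every language
            x = layer(x)
        family = self.lang_to_family[target_lang]
        return self.family_layers[family](x)  # language-specific parameters


model = SharedTrunkModel(
    trunk=[scale_layer(2.0), scale_layer(3.0)],
    family_layers={"germanic": scale_layer(10.0), "indic": scale_layer(0.5)},
    lang_to_family={"sv": "germanic", "de": "germanic", "hi": "indic"},
)

swedish_out = model.forward([1.0, 2.0], target_lang="sv")
hindi_out = model.forward([1.0, 2.0], target_lang="hi")
```

Only the selected family layer runs for a given input, so the language-specific parameters add capacity without adding to the computation of every translation.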