DeepL

History

Let's start with Gruenderszene's 2013 interview with Gereon Frahling, Linguee's co-founder.

Frahling received his university diploma in mathematics in 2002, completed his doctorate in theoretical computer science in 2006, and worked as a post-doc at Google in New York in 2007 before holing up for a year and a half with Leonard Fink to build Linguee. The Cologne-based company now processes more than five million queries a day and employs 14 full-time staff; it also recently reported reaching break-even.

I am the founder and CEO of Linguee. I'm a mathematician by background and have spent a lot of my life working on statistical analysis of extremely large amounts of data, first at university and then at Google after my PhD.

From 2007 to 2008, I worked in Google's research department in New York. Naturally, I had to use online dictionaries very often at the time. What bothered me was that translations were never displayed in context and you had access to very little information. That's when I realized how valuable a search engine would be in which all the translations in the world could be searched. Because building such a search engine fit my expertise so well, I quit Google and implemented it together with a friend.

I founded Linguee at the end of 2007. If someone has trouble translating a certain group of words, they can check Linguee to see whether a translator somewhere in the world has already translated that exact phrase, and use the translation as a guide.

Timeline

Secret Sauce

DeepL remains tight-lipped about its secret sauce. The company shies away from attending conferences or releasing research papers, which is unsurprising given that keeping its technology a trade secret is key to maintaining its competitive edge against bigger rivals. Google, Microsoft, and Facebook, by contrast, while also maintaining some degree of secrecy, engage more actively with the academic community. Google, for instance, has published 107 papers on machine translation, including eight in 2021.

Early DeepL marketing materials provided some hints, stating that DeepL was built on convolutional neural networks (CNNs), a type of neural network more commonly used for image analysis. Meanwhile, the first version of Google Translate's NMT, released in 2016, ran on recurrent neural networks (RNNs). The transformer model (arising from Google research published in 2017) is widely recognized as producing superior results and is now the dominant paradigm.
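
To make the architectural shift concrete: the core operation that distinguishes the transformer from CNN- and RNN-based translators is scaled dot-product attention, which can be written in a few lines. The NumPy sketch below illustrates only the published 2017 technique, not any proprietary DeepL variant.

    import numpy as np

    def scaled_dot_product_attention(Q, K, V):
        # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V, per Vaswani et al. (2017).
        d_k = Q.shape[-1]
        scores = Q @ K.swapaxes(-2, -1) / np.sqrt(d_k)       # query-key similarities
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)       # softmax over keys
        return weights @ V                                   # weighted sum of values

    # Toy example: 4 tokens, 64-dimensional representations.
    rng = np.random.default_rng(0)
    Q, K, V = (rng.normal(size=(4, 64)) for _ in range(3))
    print(scaled_dot_product_attention(Q, K, V).shape)       # (4, 64)

Each output token is a mixture of all input tokens, weighted by learned relevance, which is what lets transformers model long-range context that RNNs can only process sequentially.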

Indeed, DeepL blogged about a major change to its approach to neural networks in February 2020. While disclosing no details, it stated that its AI researchers had achieved another breakthrough in quality, yielding more precise target texts. Here is an excerpt from the company blog:

It is well known that most publicly available translation systems are direct modifications of the Transformer architecture. Of course, DeepL's neural networks also contain parts of this architecture, such as attention mechanisms. However, there are also significant differences in the topology of our networks that lead to an overall significant improvement in translation quality over the publicly known state of research. We see this quality difference clearly when we internally train our own architectures and the best-known Transformer architectures on the same data and compare them.

Most of our direct competitors are major tech companies, which have spent many years developing web crawlers. They therefore have a distinct advantage in the amount of training data available. We, on the other hand, place great emphasis on the targeted acquisition of special training data that helps our network achieve higher translation quality. For this purpose, we have developed, among other things, special crawlers that automatically find translations on the internet and assess their quality.

In public research, translation networks are usually trained using the "supervised learning" method. The network is shown different examples over and over again, and it repeatedly compares its own translations with the translations from the training data. If there are discrepancies, the weights of the network are adjusted accordingly. When training our neural networks, we also use techniques from other areas of machine learning, which allows us to achieve further significant improvements.

Meanwhile, we (like our largest competitors) train translation networks with many billions of parameters. These networks are so large that they can only be trained in a distributed fashion on very large dedicated compute clusters. In our research, however, we attach great importance to using the network's parameters very efficiently. This is how we have managed to achieve similar translation quality even with our smaller and faster networks, which means we can also offer very high translation quality to users of our free service.
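
The blog gives no detail on how those crawlers "assess their quality". A common baseline in the parallel-data literature combines a sentence-length-ratio check with bilingual-lexicon coverage; the Python sketch below is a hypothetical illustration of that idea (the lexicon, weighting, and function name are invented), not DeepL's actual pipeline.

    def quality_score(src, tgt, lexicon):
        # Score a candidate sentence pair in [0, 1]; higher = more likely a
        # genuine translation. lexicon maps source words to sets of known
        # target-language translations.
        src_tokens = src.lower().split()
        tgt_tokens = tgt.lower().split()
        if not src_tokens or not tgt_tokens:
            return 0.0
        # Genuine translations rarely differ wildly in length.
        ratio = min(len(src_tokens), len(tgt_tokens)) / max(len(src_tokens), len(tgt_tokens))
        # Fraction of source words whose known translation appears in the target.
        tgt_set = set(tgt_tokens)
        hits = sum(1 for w in src_tokens if lexicon.get(w, set()) & tgt_set)
        coverage = hits / len(src_tokens)
        return 0.5 * ratio + 0.5 * coverage          # invented weighting

    # Toy German-English lexicon and candidate pair.
    lexicon = {"das": {"the"}, "haus": {"house"}, "ist": {"is"}, "gross": {"big", "large"}}
    print(quality_score("das haus ist gross", "the house is big", lexicon))   # 1.0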
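
The supervised-learning loop the post describes, comparing the network's own translations with the reference translations and adjusting the weights, corresponds to the standard teacher-forced cross-entropy step for sequence-to-sequence models. A generic PyTorch sketch, assuming model is any encoder-decoder that maps a source batch and a target prefix to per-token vocabulary logits:

    import torch
    import torch.nn.functional as F

    def training_step(model, optimizer, src, tgt):
        # One supervised step: predict each reference token from the source and
        # the reference tokens preceding it (teacher forcing), then adjust the
        # weights to reduce the discrepancy.
        optimizer.zero_grad()
        logits = model(src, tgt[:, :-1])             # (batch, tgt_len - 1, vocab)
        loss = F.cross_entropy(
            logits.reshape(-1, logits.size(-1)),     # flatten token positions
            tgt[:, 1:].reshape(-1),                  # reference tokens, shifted by one
        )
        loss.backward()                              # "weights are adjusted accordingly"
        optimizer.step()
        return loss.item()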
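
DeepL does not say how its smaller, faster networks reach similar quality to the large ones. One widely used technique for exactly that goal in the literature, not confirmed by DeepL, is knowledge distillation, in which a compact student network is trained to mimic a large teacher's output distribution. A sketch under that assumption:

    import torch
    import torch.nn.functional as F

    def distillation_step(student, teacher, optimizer, src, tgt, T=2.0, alpha=0.5):
        # The small student fits both the reference translation (hard targets)
        # and the large teacher's temperature-smoothed distribution (soft targets).
        optimizer.zero_grad()
        with torch.no_grad():
            teacher_logits = teacher(src, tgt[:, :-1])
        student_logits = student(src, tgt[:, :-1])
        soft = F.kl_div(
            F.log_softmax(student_logits / T, dim=-1),
            F.softmax(teacher_logits / T, dim=-1),
            reduction="batchmean",
        ) * T * T
        hard = F.cross_entropy(
            student_logits.reshape(-1, student_logits.size(-1)),
            tgt[:, 1:].reshape(-1),
        )
        loss = alpha * soft + (1 - alpha) * hard
        loss.backward()
        optimizer.step()
        return loss.item()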

And here is their claim about blind-test results:

We are pleased to inform you that we have launched a completely new translation system that represents another quantum leap in translation quality. This has prompted us to conduct new blind tests. We translated 119 lengthy passages from a wide variety of subjects using DeepL Translator and several competing systems, then asked professional translators to evaluate these translations and choose the best one, without being told which system produced which translation. The translators selected DeepL's translations four times more often than those from any other system (Google, Amazon, Microsoft).
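
The protocol described is straightforward to reproduce in outline: hide system identities, randomize presentation order, and tally rater preferences. A hypothetical Python sketch (the systems mapping and the rate callback are invented for illustration):

    import random
    from collections import Counter

    def run_blind_test(passages, systems, rate):
        # systems: dict mapping system name -> translate(text) function.
        # rate: callback that sees only the source passage and the anonymous
        # candidate translations and returns the index of the best one.
        wins = Counter()
        for passage in passages:
            candidates = list(systems.items())
            random.shuffle(candidates)               # hide any positional cue
            outputs = [(name, translate(passage)) for name, translate in candidates]
            best = rate(passage, [text for _, text in outputs])
            wins[outputs[best][0]] += 1              # un-blind only after the choice
        return wins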

DeepL CEO Jaroslaw Kutylowski gave away very little about the model the company uses to run its neural machine translation. He stressed that the team reads much of what is published in the space and combines that information with its own ideas and insights in developing DeepL. Regardless of the model used, DeepL's access to Linguee's curated translation data is an important asset, as high-quality bilingual translation data has become a sought-after commodity.


References

  1. Magdalena Räth (December 11, 2013). "Linguee: Wir haben uns 18 Monate vergraben" ["Linguee: We buried ourselves for 18 months"]. Gruenderszene.
  2. Anna Wyndham (September 15, 2021). "Inside DeepL: The World's Fastest-Growing, Most Secretive Machine Translation Company". Slator.
  3. DeepL (November 1, 2021). "How does DeepL work?"
  4. DeepL (February 6, 2020). "Another breakthrough in AI translation quality".
  5. Florian Faes (October 2017). "Why DeepL Got into Machine Translation and How It Plans to Make Money". Slator.
  6. DeepL. "Company Profile".