Lilt Labs

Learn and explore everything you need to know about global experience

Case Study: First Large-Scale Application of Auto-Adaptive MT

Combining Machine Translation (MT) with auto-adaptive Machine Learning (ML) enables a new paradigm of machine assistance. Such systems learn from the experience, intelligence, and insights of their human users, improving productivity by working in partnership, making suggestions, and improving accuracy over time. The net result is that human reviewers produce far higher volumes of content, at nearly the same level of quality, in a fraction of the time and at a fraction of the cost. Machine assistance can save customers half or more of the price of traditional high-quality human translation services. And if you have relied on machine translation alone and been unhappy with the results, you can watch your translation quality rise dramatically for a marginal increase in price.

Advanced Terminology Management in Lilt

Our new advanced termbase editor lets you manage terminology more effectively by keeping terms organized with customizable meta information. Import terminology with meta fields or add your own. Your terms will appear in both the Lexicon and the Editor's suggestions, helping you increase consistency and quality.

Keeping Your Data Secure in Lilt

In a world where data hacks and breaches make front-page news more often than we'd like, a common question translators and businesses ask about Lilt is: is my data safe? No need to worry. Lilt was built with that concern in mind. Read the answers below to some common questions about security in Lilt.

Is my data shared with anyone? Your data is private to your Lilt account. It is never shared with other accounts or users. When you upload a translation memory or translate a document, those translations are associated only with your account. For Business customers, translation memories can be shared across your projects, but they are not shared with other users or third parties.

What We’re Reading: Domain Attention with an Ensemble of Experts

A major problem in deploying machine learning systems effectively in practice is domain adaptation: given a large auxiliary supervised dataset and a smaller dataset of interest, use the auxiliary dataset to increase performance on the smaller one. This paper considers the case where we have K datasets from distinct domains and must adapt quickly to a new domain. It learns a separate model on each of the K datasets and treats each as an expert. Given a new domain, it creates another model for that domain, but in addition computes attention over the experts: a dot product measuring the similarity of the new domain's hidden representation with each of the K domains' representations.
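The dot-product attention over experts described above can be sketched in a few lines. This is our own illustration, not the paper's code; the function name, dimensions, and random vectors are all assumptions made for the example.

```python
import numpy as np

def domain_attention(new_hidden, expert_hiddens):
    """Weight K expert representations by dot-product similarity
    with the new domain's hidden representation (illustrative sketch)."""
    # Similarity of the new domain's hidden state to each expert's.
    scores = expert_hiddens @ new_hidden           # shape (K,)
    # Softmax turns similarities into attention weights over experts.
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    # Mix the expert representations by their attention weights.
    return weights, weights @ expert_hiddens

K, d = 3, 4
rng = np.random.default_rng(0)
experts = rng.normal(size=(K, d))   # one hidden vector per expert model
h_new = rng.normal(size=d)          # the new domain's representation
w, mixed = domain_attention(h_new, experts)
print(w)  # attention weights over the K experts, summing to 1
```

The key design point is that the new domain does not start from scratch: its representation is compared against every existing expert, so similar domains contribute more to the mixed representation.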

What We’re Reading: Learning to Decode for Future Success

When doing beam search in sequence-to-sequence models, one explores next words in order of their likelihood. During decoding, however, there may be other constraints to satisfy or objectives to maximize: for example, sequence length, BLEU score, or the mutual information between the target and source sentences. To accommodate these additional desiderata, the authors add a term Q to the likelihood that captures the appropriate criterion, and then choose words based on this combined objective.
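The combined objective can be sketched as follows. This is a toy stand-in, not the paper's learned Q: here Q is a simple length bonus, whereas the paper's Q would estimate a future property such as sequence length, BLEU, or mutual information.

```python
import math

def combined_score(log_prob, hypothesis, lam=0.5):
    """Score = log-likelihood + lambda * Q (toy sketch, names are ours)."""
    # Q: a length bonus standing in for the learned future-cost term.
    q = len(hypothesis)
    return log_prob + lam * q

# Two hypothetical beam candidates: (log-probability, tokens so far).
candidates = [
    (math.log(0.6), ["the", "cat"]),
    (math.log(0.3), ["the", "cat", "sat", "down"]),
]
# Plain likelihood would pick the short candidate; the combined
# objective prefers the longer one because Q rewards its length.
best = max(candidates, key=lambda c: combined_score(*c))
print(best[1])
```

Even this toy version shows the mechanism: the extra term shifts which hypothesis wins without changing how candidates are generated.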

FREE THE TRANSLATORS! How Adaptive MT turns post-editing janitors into cultural consultants

Originally posted on LinkedIn by Greg Rosner. I saw the phrase "linguistic janitorial work" in this Deloitte whitepaper on "AI-augmented government, using cognitive technologies to redesign public sector work," used to describe the drudgery of translation work that so many translators are required to do today through post-editing of machine translation. And then it hit me what's really going on. The sad reality of the past several years is that many professional linguists, who have decades of industry experience, expertise in professional translation, and degrees in writing, have had their jobs reduced to sentence-by-sentence clean-up of translations that flood out of Google Translate and other Machine Translation (MT) systems.

The Augmented Translator: How Their Jobs are Changing and What They Think About It

Written by Kelly Messori. The idea that robots are taking over human jobs is by no means a new one. Over the last century, the automation of tasks has done everything from making a farmer's job easier with tractors to replacing the need for cashiers with self-serve kiosks. More recently, as machines have gotten smarter, the discussion has shifted to robots taking over more skilled positions, namely that of the translator. A simple search on the question-and-answer site Quora reveals dozens of inquiries on this very issue, and a recent survey shows that AI experts predict machines will take over the task of translating languages by 2024. Everyone wants to know if they'll be replaced by a machine and, more importantly, when that will happen.

What We’re Reading: Neural Machine Translation with Reconstruction

Neural MT systems generate translations one word at a time. They can still produce fluent translations because each word is chosen based on all of the words generated so far. Typically, these systems are trained only to generate the next word correctly given all previous words. One systematic problem with this word-by-word approach to training and translating is that the translations are often too short and omit important content. In the paper Neural Machine Translation with Reconstruction, the authors describe a clever new way to train and translate. During training, their system is encouraged not only to generate each next word correctly but also to correctly regenerate the original source sentence from the translation it produced. In this way, the model is rewarded for generating a translation that is sufficient to describe all of the content in the original source.
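The shape of that training objective can be shown with toy numbers. This is a schematic sketch of the combined loss, assuming a weight lambda on the reconstruction term; the loss values below are invented for illustration.

```python
def total_loss(translation_nll, reconstruction_nll, lam=1.0):
    """Combined objective (sketch): penalize both a poor translation and a
    translation from which the source sentence cannot be recovered."""
    return translation_nll + lam * reconstruction_nll

# A short translation may score well on translation loss alone, yet lose
# overall because the omitted content makes the source hard to reconstruct.
short_hyp = total_loss(translation_nll=1.0, reconstruction_nll=4.0)
full_hyp = total_loss(translation_nll=1.2, reconstruction_nll=0.8)
print(full_hyp < short_hyp)  # True: reconstruction favors complete content
```

The reconstruction term is what counteracts the too-short, content-dropping translations the paragraph describes: dropped content cannot be reconstructed, so it is penalized.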

What We’re Reading: Single-Queue Decoding for Neural Machine Translation

The most popular way of finding a translation for a source sentence with a neural sequence-to-sequence model is a simple beam search. The target sentence is predicted one word at a time, and after each prediction a fixed number of possibilities (typically between 4 and 10) is retained for further exploration. This strategy can be suboptimal, as these local hard decisions do not take the remainder of the translation into account and cannot be reverted later on.
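The fixed-beam strategy described above can be sketched in a few lines. This is our own minimal illustration over a toy next-word distribution, not the paper's proposed alternative; the model and vocabulary are invented for the example.

```python
import math

def beam_search(next_probs, beam_size=2, max_len=3):
    """Minimal beam search sketch. next_probs(prefix) -> {word: probability}."""
    beams = [([], 0.0)]  # (prefix, cumulative log-probability)
    for _ in range(max_len):
        expanded = []
        for prefix, score in beams:
            for word, p in next_probs(prefix).items():
                expanded.append((prefix + [word], score + math.log(p)))
        # The local hard decision: keep only the top `beam_size`
        # hypotheses; everything pruned here can never be revisited.
        expanded.sort(key=lambda b: b[1], reverse=True)
        beams = expanded[:beam_size]
    return beams[0][0]

def toy_model(prefix):
    # A fixed toy distribution; a real model conditions on the full prefix.
    return {"a": 0.6, "b": 0.4} if len(prefix) % 2 == 0 else {"c": 0.9, "d": 0.1}

print(beam_search(toy_model))  # prints ['a', 'c', 'a']
```

The pruning line is exactly the weakness the teaser points out: once a hypothesis falls outside the top `beam_size`, no later evidence can bring it back.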