I was chatting about translation with Andrew Ng at an event several weeks ago. It's fun to see him post a workflow with the seed of the AI Review concept that's live in LILT today.

===

This is along the path to what many in the industry are doing. Two comments:

1. Replace step 2 with a formalized reflection approach such as AutoMQM.
2. BLEU isn't discriminative at the accuracy levels modern MT systems achieve. Use a trained metric like BLEURT or COMET to see real differences between systems.

20 years ago we called this idea automatic post-editing, where you'd train a second string transducer to rewrite initial MT output. However, those systems were very difficult to train and cascade. Using LLMs in cascades is significantly more effective.

I think AI agentic machine translation has huge potential for improving over traditional neural machine translation, and am releasing as open-source a demonstration I'd been playing with as a fun weekend project.

Using an agentic workflow, this demonstration
(i) Prompts an LLM to translate from one language to another,
(ii) Reflects on the translation to come up with constructive suggestions,
(iii) Uses the suggestions to refine the translation.

In our limited testing, this is sometimes competitive with, and sometimes worse than, leading commercial providers. But it gives a highly steerable translation system where, by simply changing the prompt, you can specify the tone (formal/informal), regional variation (do you want Spanish as spoken in Spain or as spoken in Latin America?), and ensure consistent translation of terms (by providing a glossary).

This is not mature software. But I hope the open-source community can make agentic translation work much better. Given how a simple reflection workflow already gives decent results, I think there's significant headroom to make agentic translation much better.

Releasing an early software prototype like this is something new I decided to try to see if it is helpful to the developer community. I'd love any feedback on this.

Thanks to Joaquin Dominguez, Nedelina Teneva, PhD, and John Santerre, PhD, for help with this.

https://lnkd.in/gjGANH6H
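
For readers who want to see the shape of the translate, reflect, refine loop, here is a minimal sketch. It is not the code from the released demonstration. The function names, the prompt wording, and the `llm` callable (any function that takes a prompt string and returns the model's reply) are all assumptions made for illustration.

```python
from typing import Callable

# Hypothetical interface: any function mapping a prompt string to the model's reply.
LLM = Callable[[str], str]


def translate(llm: LLM, text: str, src: str, tgt: str, instructions: str = "") -> str:
    """Step (i): ask the model for an initial translation."""
    prompt = (
        f"Translate the following text from {src} to {tgt}.\n"
        f"{instructions}\n"
        f"Text:\n{text}\n"
        "Return only the translation."
    )
    return llm(prompt)


def reflect(llm: LLM, text: str, draft: str, src: str, tgt: str,
            instructions: str = "") -> str:
    """Step (ii): ask the model for constructive suggestions on the draft."""
    prompt = (
        f"You are reviewing a translation from {src} to {tgt}.\n"
        f"{instructions}\n"
        f"Source text:\n{text}\n"
        f"Draft translation:\n{draft}\n"
        "List specific, constructive suggestions to improve accuracy, "
        "fluency, style, and terminology."
    )
    return llm(prompt)


def refine(llm: LLM, text: str, draft: str, suggestions: str, src: str, tgt: str,
           instructions: str = "") -> str:
    """Step (iii): ask the model to rewrite the draft using the suggestions."""
    prompt = (
        f"Improve this translation from {src} to {tgt}.\n"
        f"{instructions}\n"
        f"Source text:\n{text}\n"
        f"Draft translation:\n{draft}\n"
        f"Suggestions:\n{suggestions}\n"
        "Return only the improved translation."
    )
    return llm(prompt)


def agentic_translate(llm: LLM, text: str, src: str, tgt: str,
                      instructions: str = "") -> str:
    """Run the three steps in sequence and return the refined translation."""
    draft = translate(llm, text, src, tgt, instructions)
    suggestions = reflect(llm, text, draft, src, tgt, instructions)
    return refine(llm, text, draft, suggestions, src, tgt, instructions)
```

In this sketch, the steering described in the post (tone, regional variety, a glossary) would go in the `instructions` string, which is passed to all three prompts, for example `instructions="Use Spanish as spoken in Latin America, informal tone."`.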