News
Facebook Open Sources LASER Natural Language Processing Toolkit
Facebook announced Tuesday that it is open sourcing LASER (Language-Agnostic SEntence Representations), a toolkit created by Facebook Research that is "the first successful exploration of massively multilingual sentence representations to be shared publicly with the [natural language processing] community," the company said.
LASER originally worked with only a few Romance and Germanic languages but has since been expanded to cover 90 languages and 28 alphabets, all within a single model.
The toolkit's multilingual encoder and PyTorch code are available for download on GitHub. Facebook has also included test sets for almost 100 languages.
"LASER opens the door to performing zero-shot transfer of NLP models from one language, such as English, to scores of others -- including languages where training data is extremely limited," the company said in its blog post announcing the release. "LASER is the first such library to use one single model to handle this variety of languages, including low-resource languages, like Kabyle and Uighur, as well as dialects such as Wu Chinese. The work could one day help Facebook and others launch a particular NLP feature, such as classifying movie reviews as positive or negative, in one language and then instantly deploy it in more than 100 other languages."
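The zero-shot transfer described in that quote can be illustrated with a toy sketch. This is not LASER's code or API: `toy_embed` and the `CONCEPTS` lexicon are hypothetical stand-ins for a multilingual sentence encoder, assumed here only to show why a shared embedding space lets a classifier trained on English reviews label reviews in other languages it never saw during training.

```python
# Toy illustration of zero-shot cross-lingual transfer.
# NOTE: this does NOT use LASER. `toy_embed` is a hypothetical stand-in for a
# multilingual encoder that maps sentences from any language into one space.

# Hypothetical shared lexicon: words with the same meaning in English,
# Spanish and German map to the same dimension of the vector space.
CONCEPTS = {
    "good": 0, "bueno": 0, "gut": 0,
    "great": 1, "excelente": 1, "toll": 1,
    "bad": 2, "malo": 2, "schlecht": 2,
    "boring": 3, "aburrido": 3, "langweilig": 3,
}
DIM = 4


def toy_embed(sentence):
    """Count known concepts in the sentence; unknown words are ignored."""
    vec = [0.0] * DIM
    for word in sentence.lower().split():
        idx = CONCEPTS.get(word)
        if idx is not None:
            vec[idx] += 1.0
    return vec


def centroid(vectors):
    """Element-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(DIM)]


def train(examples):
    """Nearest-centroid classifier: one mean vector per label."""
    by_label = {}
    for sentence, label in examples:
        by_label.setdefault(label, []).append(toy_embed(sentence))
    return {label: centroid(vecs) for label, vecs in by_label.items()}


def predict(model, sentence):
    """Return the label whose centroid is closest to the sentence vector."""
    vec = toy_embed(sentence)

    def sq_dist(center):
        return sum((a - b) ** 2 for a, b in zip(vec, center))

    return min(model, key=lambda label: sq_dist(model[label]))


# Train on English movie reviews only.
english_train = [
    ("a good film", "positive"),
    ("great acting", "positive"),
    ("a bad script", "negative"),
    ("boring plot", "negative"),
]
model = train(english_train)

# Apply the same classifier, unchanged, to Spanish and German reviews.
print(predict(model, "una historia excelente"))
print(predict(model, "der film war langweilig"))
```

In a real system the hand-built lexicon would be replaced by a learned encoder, and the classifier would typically be something stronger than nearest-centroid; the point of the sketch is only that the classifier itself never needs retraining per language.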
More information on exactly how LASER works can be found in Facebook's blog post announcing the release.
About the Author
Becky Nagel serves as vice president of AI for 1105 Media, specializing in developing media, events and training for companies around AI and generative AI technology. She also regularly writes and reports on AI news, and is the founding editor of PureAI.com. She's the author of "ChatGPT Prompt 101 Guide for Business Users" and other popular AI resources with a real-world business perspective. She regularly speaks, writes and develops content around AI, generative AI and other business tech. Find her on X/Twitter @beckynagel.