Why AI doesn’t speak every language

It could learn them all. But will it?

Phil Edwards is a senior producer for the Vox video team.

Large language models are astonishingly good at understanding and producing language. But they carry an often-overlooked bias toward languages that are already well represented on the internet. That means some languages might lose out on AI’s big technical advances.

Researchers are studying how that bias works and how to shift the balance from these “high-resource” languages toward ones that don’t yet have a large online footprint. We spoke to a few of the researchers who are trying to make languages like Catalan and Jamaican Patois more accessible to AI language models. Their approaches range from creating original datasets to studying the outputs of large language models to training open-source alternatives.

You can find this video and the entire library of Vox’s videos on YouTube.
