Artificial intelligence tools may widen the gap between international students from different language backgrounds
Nov 23, 2024 — by Gengyan Tang. The original article was published on Postplagiarism.
The rise of generative AI tools has significantly narrowed the communication gap between speakers of different languages. In the postplagiarism era in particular, the language barriers that once impeded communication may be erased entirely. This opens up exciting possibilities, but before celebrating, it is worth considering some of the challenges that come with it.
Since coming to the University of Calgary, I’ve interacted with many international students. Nearly every one of them has mentioned using ChatGPT to complete assignments. Most rely on it to polish their language, aiming to make their assignments more understandable to native English speakers.
For international students, language is one of the biggest challenges of studying abroad. Using AI tools to meet assignment standards on complex tasks, with the instructor's approval, raises no ethical or integrity issues.
However, a subtler problem often goes unnoticed: do current generative AI tools work equally well for every language in terms of translation, refinement, and error-checking?
The Problem of Language Bias
Initially, I didn’t think much about this. But then a friend from Africa raised a valid point: Why was my translation of Chinese text into English using ChatGPT so accurate, while his translation from his language into English wasn’t nearly as good? This made me realize that not all languages are treated equally by AI tools.
A likely reason is the volume of high-quality training data available for different languages. Languages spoken by larger populations produce more text data, which means AI models are trained on richer resources for those languages. In contrast, languages spoken by fewer people generate less training data, leading to differences in how effectively AI tools handle them.
A recent news report showed that when researchers posed the same math problems in 16 different languages, GPT-4 performed better in some of them, such as English, German, and Spanish. This highlights a "language bias" in generative AI tools.
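The kind of cross-lingual comparison described above boils down to tallying per-language accuracy over the same set of problems. A minimal sketch in Python of how such a tally could be computed; the language names and results below are hypothetical illustrations, not the study's actual data:

```python
from collections import defaultdict

def accuracy_by_language(results):
    """Compute per-language accuracy from (language, correct) pairs.

    `results` is an iterable of (language, bool) tuples, one tuple
    per model answer; returns {language: fraction_correct}.
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for lang, ok in results:
        totals[lang] += 1
        if ok:
            correct[lang] += 1
    return {lang: correct[lang] / totals[lang] for lang in totals}

# Hypothetical outcomes for the same three problems posed in three languages.
results = [
    ("English", True), ("English", True), ("English", False),
    ("German", True), ("German", True), ("German", False),
    ("Swahili", True), ("Swahili", False), ("Swahili", False),
]
print(accuracy_by_language(results))
```

Comparing the resulting fractions across languages is what would reveal a performance gap of the sort the researchers reported.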
Another Form of Digital Divide
This language bias has a detrimental effect on education, felt most acutely by speakers of less widely spoken languages.
Imagine a classroom with students from different countries, all required to complete assignments in English. If they are allowed to use ChatGPT to polish their language, the results could vary in quality because of this hidden language bias. Even if their digital literacy levels are the same, their assignments may end up with different levels of language quality after being refined by generative AI tools.
Language quality may be just one factor in determining grades, but it’s still an important one. Clear expression of ideas is often crucial for earning high marks. Therefore, students using AI tools for languages that have more robust training data might end up submitting higher-quality work. This difference could impact their grades, future academic opportunities, and even their publishing prospects, as language quality plays a key role in academic publishing too.
This emerging form of digital divide, rooted in non-skill-related factors, is something we need to address alongside traditional digital inequalities.
Possible Solutions
One potential solution is for educators to offer more targeted support for international students whose native languages are at a disadvantage. This could include additional language training and feedback, helping them quickly improve their skills and close the gap.
Another way forward is to create a more inclusive assessment environment. In subjects that prioritize ideas over language proficiency, such as medicine or mathematics, teachers could focus on developing grading systems that support language development rather than penalize language errors.
In conclusion, while generative AI tools offer a welcome solution for international students facing language barriers, we should also reflect carefully on the biases rooted in training datasets and on how to bridge this non-traditional digital divide.
About the author: Gengyan Tang, MA, is a PhD student in the Werklund School of Education at the University of Calgary. His research interests include research integrity and academic integrity.