ChatGPT, Copilot, Grok, Mistral, NotebookLM: artificial intelligence tools are becoming widespread in schools, universities, and businesses. Until now, however, users have had no way of knowing how reliable these chatbots are when used for education. AI Score, a new tool developed by researchers at UNamur, fills this gap by measuring the educational reliability of chatbots. "AI Score is to chatbots what the speedometer was to cars," says Professor Michaël Lobet, one of the authors of the research. "The arrival of the automobile at the beginning of the 20th century revolutionized usage... but it was the invention of the speedometer that made it a controlled and reliable tool. Today, educational chatbots and other chatbots used in businesses in general are at a similar stage: powerful and exciting, but without reliable control instruments. The AI Score aims to be that speedometer," he explains.
Just as NutriScore, EcoScore, and PEB (energy performance of buildings) certification help citizens make informed choices, AI Score provides a simple, immediate reading of how much trust can be placed in a chatbot. "At a time when trust in generative AI is becoming a societal issue, AI Score guides teachers and companies in their choice of tools to put in the hands of their students or customers," says Dr. Miguël Dhyne, scientific collaborator at UNamur, educator, and physics researcher. "It can also help institutions evaluate AI solutions before deployment or verify their reliability over time," he adds.