Do Chatbots Belong More in the English or Math Department?

By: Benjamin He

AI seems unstoppable these days, and its growth looks inevitable. With the advancement of ChatGPT and many large companies (Grammarly, Wix, etc.) incorporating AI into their websites and products, AI is sure to expand to pretty much everywhere. Chatbots can write out a list of Egypt’s most prominent pharaohs, explain the stock market to 10-year-olds, and even write complete songs and decent poetry.

And yet, some are stumped by simple arithmetic that requires more than one step.

In case you’ve been in a stasis chamber for the past few years, let me explain AI. AI stands for “artificial intelligence.” IBM defines it as “technology that enables computers and machines to simulate human intelligence and problem-solving capabilities.” We can see examples of AI in our day-to-day lives, from virtual assistants like Siri to robots that are programmed to put together entire cars. Recently, though, there has been a significant boost in AI development. Suddenly there were chatbots like ChatGPT that could explain the War of 1812 to you, tell you how to adopt a pet, or even write an entire essay.

And all it takes is a few sentences typed onto a screen.

Chatbots today seem to be more fluent at writing stories and summarizing literature than they are at mathematics. Many of them struggle with math word problems.

“The A.I. chatbots have difficulty with math because they were never designed to do it,” said Kristian Hammond, a computer science professor and artificial intelligence researcher at Northwestern University.

Kristen DiCerbo is chief learning officer of Khan Academy, an education nonprofit that is experimenting with incorporating AI into its platform. The AI in question is Khanmigo, a chatbot meant for educational purposes. Videos from Khan Academy explain that Khanmigo is essentially a mini tutor meant to help learners with tough problems. While it seems useful, it has some problems.

According to Common Sense Media, Khanmigo does not grade student work and doesn’t accept any visual input (pictures and drawings). Khanmigo also employs a large language model (LLM), a type of AI that struggles with math.

Many AI systems currently do math by first working out from the word problem which numbers and equations to use, then plugging them into a separate calculator program. Khanmigo does this as well. While it waits for the program to finish, it shows the bot doing “math” above its bobbing head.
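The two-step handoff described above can be sketched in a few lines of Python. This is only an illustration, not how Khanmigo or any real chatbot is built: the function names (`interpret`, `calculate`, `solve`) and the sample word problem are made up for this example, and the "interpret" step is a hard-coded stand-in for the part a chatbot would do with its language skills.

```python
import ast
import operator

# The only operations this toy calculator is allowed to perform.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}

def calculate(expression):
    """The 'separate calculator program': evaluates +, -, *, / exactly."""
    def walk(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        raise ValueError("unsupported expression")
    return walk(ast.parse(expression, mode="eval").body)

def interpret(word_problem):
    """Hypothetical stand-in for the chatbot turning words into an equation.

    A real chatbot would generate the expression itself; here it is a
    hard-coded lookup so the example stays runnable.
    """
    known = {
        "Sam buys 3 packs of 12 pencils and gives away 5. How many are left?":
            "3 * 12 - 5",
    }
    return known[word_problem]

def solve(word_problem):
    # Step 1: language work (pick the numbers and the equation).
    expression = interpret(word_problem)
    # Step 2: hand the arithmetic to the calculator instead of guessing.
    return calculate(expression)

problem = "Sam buys 3 packs of 12 pencils and gives away 5. How many are left?"
print(solve(problem))  # 31
```

The point of the split is that the calculator never gets the arithmetic wrong. Any mistake has to come from the first step, where the words are turned into an equation.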

Some improvements have been made recently to address AI’s difficulty with math. OpenAI’s new version of ChatGPT has “achieved nearly 64 percent accuracy on a public database of thousands of problems requiring visual perception and mathematical reasoning,” up from 58 percent for the previous version, suggesting an increased familiarity with math.

Still, chatbots mainly excel in the fields whose knowledge they’ve thoroughly absorbed, namely textbooks and novels. Maybe in the end, AI is just the embodiment of a robot English professor.

Image Credit by Kindel Media