Who performs better: students or ChatGPT?


Wednesday, 26 April, 2023


A large crowd-sourced study has used more than 25,000 questions from accounting assessments at 186 institutions to determine whether ChatGPT outperforms students.

The study found that students did better than the artificial intelligence tool. It also found that ChatGPT sometimes made up facts, made nonsensical errors such as adding two numbers in a subtraction problem, and often provided descriptive explanations for its answers even when they were incorrect.

The study’s 328 co-authors from around the world, including University of Auckland accounting and finance academics Ruth Dimes and David Hay, entered assessment questions into ChatGPT-3 and evaluated the accuracy of its responses between December 2022 and January 2023.

Dimes, who directs the Business School’s Business Masters programme, used two recent exams from the ‘analysing financial statements’ course.

“I entered the exam questions into ChatGPT and recorded how it performed compared to the students’ grades. My findings were consistent with the study overall and I was surprised that ChatGPT didn't perform as well as I thought it might have,” she said.

Meanwhile, David Hay, Professor of Auditing, used exam and test questions from the auditing course and found that the bot performed slightly better on auditing questions than on financial accounting ones, though still not as well as the students.

The study, led by Professor David Wood of Brigham Young University in Utah, includes a total of 25,817 questions (25,181 gradable by ChatGPT) that appeared across 869 different class assessments, as well as 2,268 questions from textbook test banks covering topics such as accounting information systems (AIS), auditing, financial accounting, managerial accounting and tax.

The co-authors evaluated ChatGPT’s answers to the questions they entered and graded each as correct, partially correct or incorrect. Across all assessments, students scored an average of 76.7%, while ChatGPT scored 47.4% when only fully correct answers were counted. With some credit for partially correct answers, ChatGPT’s average rose to 56.5%, enough to scrape through many courses.
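
For readers who want to see how partial credit moves the average, the following is a minimal Python sketch of this kind of scoring, not the study’s actual grading code. It assumes each question is worth one point and a partially correct answer earns half credit, and the grade counts below are hypothetical, chosen only so that the strict and partial-credit averages line up with the 47.4% and 56.5% figures reported above.

```python
# Hypothetical sketch of partial-credit scoring; not the study's own code.

def average_score(results, partial_credit=0.5):
    """Percentage score for a list of graded answers.

    Each entry is "correct", "partial" or "incorrect"; a partially
    correct answer earns `partial_credit` of a point (an assumption,
    as the study's exact weighting is not stated in this article).
    """
    points = {"correct": 1.0, "partial": partial_credit, "incorrect": 0.0}
    if not results:
        return 0.0
    return 100 * sum(points[r] for r in results) / len(results)

# Hypothetical grade counts, chosen to reproduce the reported averages.
graded = ["correct"] * 474 + ["partial"] * 182 + ["incorrect"] * 344

print(average_score(graded, partial_credit=0.0))  # 47.4 — fully correct answers only
print(average_score(graded, partial_credit=0.5))  # 56.5 — with half credit for partials
```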

The study also revealed differences in ChatGPT’s performance based on the topic area of the assessment. Specifically, the chatbot performed relatively better on AIS and auditing assessments compared to tax, financial and managerial assessments.

Dimes said she was interested in seeing how newer versions of ChatGPT and other AI tools would perform if a similar study were undertaken at another point in time.

“These tools will perform better over time and the study highlights the importance of thinking carefully about what universities assess and how. Are we assessing critical thinking as opposed to something that can be rote learned and regurgitated?

“ChatGPT has already changed how we teach and learn. Many teaching staff run our assessments through the tool so we’re aware of what it might come up with.”

Dimes said the study, believed to be the first of its kind in the accounting field, was a unique experience.

“One of the most interesting parts of this for me was the process of gathering the data. It was amazing to see the speed at which researchers all over the world collated their data and trusted in the process. It was a really collaborative and effective way to do research,” she said.

Image credit: iStock.com/Laurence Dutton
