  • An economics professor was stunned by the progress ChatGPT made in an exam in just three months.
  • Bryan Caplan of George Mason University said the chatbot got a D in his economics test in January.

An economics professor says he’s stunned by the progress made by ChatGPT, which improved its score from a D to an A on his economics test in just three months.

Bryan Caplan, an economics professor at George Mason University, told Insider the latest version of ChatGPT could now be responsible for the first big bet he’s ever lost.

ChatGPT-3.5 didn’t understand basic theory

Writing in a blog post on his Substack “Bet On It” in January, Caplan said he posed ChatGPT questions from his fall midterms.

Caplan says his exam questions are meant to test students’ understanding of economics, rather than serve as memory exercises that have them regurgitate the textbook.

This is where the old version of ChatGPT tripped up. The bot scored 31/100 on his test, equivalent to a D and well below the class median of 50%.

Caplan told Insider the bot failed to understand basic concepts, such as the principle of comparative and absolute advantage. Its answers were also more political than economic, he said.

“ChatGPT does a fine job of imitating a very weak GMU econ student,” Caplan wrote in his withering January blog post.

He wasn’t the only academic to be disappointed by ChatGPT. While it passed a Wharton Business School exam in January, its professor said it made “surprising mistakes” on simple calculations.

Big bet

Caplan likes to bet. He’s previously placed 23 public bets and won them all. They’re usually for modest sums of about $100, and often on technical subjects like predicted unemployment rates and inflation readings.

He also narrowly won a 2008 bet that no member state would leave the European Union before 2020: the UK left in January of that year.

So underwhelmed was he by ChatGPT’s responses that Caplan bet an AI model wouldn’t score an A on 6 out of 7 of his exams before 2029.

But when ChatGPT-4 was released, Caplan was stunned by its progress. It scored 73% on the same midterm test, equivalent to an A and among the best scores in his class.

ChatGPT’s paywalled upgrade sought to fix some of the early issues with the beta version, GPT-3.5. This purportedly included making the new model 40% more likely to return accurate responses than its predecessor, as well as better able to handle nuanced instructions.

For Caplan, the improvements were obvious. The bot gave clear answers to his questions, understanding principles it previously struggled with. It also scored perfect marks explaining and evaluating concepts put forward by economists like Paul Krugman.

“The only thing I can say is it just seems a lot better,” Caplan said.

Caplan thought ChatGPT’s training data might have picked up his previous blog post where he explained his answers, but colleagues told him this was highly unlikely.

He added that he’s already fed the bot new tests it hasn’t seen before, where it did even better than its previous 73% grade. “I was very smug in my judgement, and I’m not smug anymore.”

Caplan is more confident he’ll win his next AI-related wager. He has a bet with Eliezer Yudkowsky, an AI doomer who has sparred with OpenAI CEO Sam Altman, on whether AI will lead to the end of the world before January 1, 2030. Caplan is betting that it won’t.

“I’m probably going to lose this AI bet but I am totally on board to do a bunch more end-of-the-world AI bets because I think these people are out of their minds.”

Tough to test

AI bots have caused headaches for examiners. Professors told Insider that plagiarism can be hard to prove when the text comes from ChatGPT because there is no concrete evidence of wrongdoing.

Caplan says he’s thinking of doing away with graded homework in the wake of ChatGPT’s rise. He hopes his habit of regularly changing questions will be enough to stop students from learning and regurgitating ChatGPT’s responses in an exam setting.
