GPT-4 Generates Better Ideas Than Elite College Students: Study

It’s already known that AI can be creative: it can generate poems, create art and solve puzzles. But it now appears that AI can out-create some of the best human minds.

GPT-4 can generate better ideas than elite college students, a study has found. “We compare the ideation capabilities of ChatGPT-4, a chatbot based on a state-of-the-art LLM, with those of students at an elite university. ChatGPT-4 can generate ideas much faster and cheaper than students, the ideas are on average of higher quality (as measured by purchase-intent surveys) and exhibit higher variance in quality,” the paper says.

The paper tested this by giving GPT-4 an assignment that students at Wharton, Cornell Tech and INSEAD had previously completed: come up with an idea for a physical product for the college student market that would be likely to retail for less than USD 50. These students had been given this assignment prior to 2021, before ChatGPT had been released, so their answers were human-created. The researchers then got GPT-4 to generate responses, first by simply asking it the question, and then by also giving it a few examples of good ideas.

This was the prompt they gave to GPT-4: “You are a creative entrepreneur looking to generate new product ideas. The product will target college students in the United States. It should be a physical good, not a service or software. I’d like a product that could be sold at a retail price of less than about USD 50. The ideas are just ideas. The product need not yet exist, nor may it necessarily be clearly feasible. Number all ideas and give them a name. The name and idea are separated by a colon.” GPT-4 replied with ideas of its own.

The researchers then asked random people to rate, on a scale of 1 to 5, how likely they were to buy each product. “The average quality of ideas generated by ChatGPT is higher than the average quality of ideas generated by humans, as measured by purchase intent. The average purchase probability of a human-generated idea is 40.4%, that of vanilla GPT-4 is 46.8%, and that of GPT-4 seeded with good ideas is 49.3%. The difference in average quality between humans and ChatGPT is statistically significant (p<0.001),” the paper found.

The study also found that humans working with GPT-4 could generate ideas far faster than humans working alone. “A professional working with ChatGPT-4 can generate ideas at a rate of about 800 ideas per hour. At a cost of USD 500 per hour of human effort, a figure representing an estimate of the fully loaded cost of a skilled professional, ideas are generated at a cost of about USD 0.63 each, or USD 7.50 per dozen. At the time we used ChatGPT-4, the API fee for 800 ideas was about USD 20. For that same USD 500 per hour, a human working alone, without assistance from an LLM, only generates 20 ideas at a cost of roughly USD 25 each, hardly a dime a dozen. For the focused idea generation task itself, a human using ChatGPT-4 is thus about 40 times more productive than a human working alone,” the study said.
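
The arithmetic behind those figures is easy to check. The short Python sketch below reproduces it from the numbers quoted above; the function and variable names are our own, and the final line, which adds the API fee on top of the labor cost, is an assumption about how the two costs combine rather than a figure from the paper.

```python
# Figures quoted in the study excerpt above.
HOURLY_COST_USD = 500.0           # fully loaded cost of a skilled professional
IDEAS_PER_HOUR_WITH_LLM = 800     # professional working with ChatGPT-4
IDEAS_PER_HOUR_ALONE = 20         # professional working without an LLM
API_FEE_PER_800_IDEAS_USD = 20.0  # API fee reported for 800 ideas


def cost_per_idea(hourly_cost, ideas_per_hour, extra_cost=0.0):
    """Cost of one idea, given an hour of effort and any extra hourly cost."""
    return (hourly_cost + extra_cost) / ideas_per_hour


with_llm = cost_per_idea(HOURLY_COST_USD, IDEAS_PER_HOUR_WITH_LLM)
per_dozen = with_llm * 12
alone = cost_per_idea(HOURLY_COST_USD, IDEAS_PER_HOUR_ALONE)
speedup = IDEAS_PER_HOUR_WITH_LLM / IDEAS_PER_HOUR_ALONE

# Assumption (ours, not the paper's): the API fee is paid on top of labor.
with_llm_and_api = cost_per_idea(
    HOURLY_COST_USD, IDEAS_PER_HOUR_WITH_LLM, API_FEE_PER_800_IDEAS_USD
)

print(f"With ChatGPT-4: USD {with_llm:.3f} per idea, USD {per_dozen:.2f} per dozen")
print(f"Working alone:  USD {alone:.2f} per idea")
print(f"Productivity ratio: {speedup:.0f}x")
print(f"With API fee included: USD {with_llm_and_api:.3f} per idea")
```

The labor-only cost works out to USD 0.625 per idea, which the paper rounds to about USD 0.63, and the 40x productivity figure is simply 800 ideas per hour divided by 20.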

The results suggest that GPT-4 already outperforms highly capable humans on some creative tasks. Idea generation is not trivial, but having been trained on massive amounts of data, GPT-4 is able to come up with ideas that, on average, beat those produced by real humans. GPT-4 isn’t quite AGI yet, but there are instances when it sure seems to come close.