
Andrej Karpathy Hints OpenAI Could Release Its Model Weights

OpenAI has copped a fair bit of criticism for its volte-face on creating open-source models, but a prominent voice within the company has suggested that its model weights could yet be made public.

Former Tesla AI Director Andrej Karpathy, who’d rejoined OpenAI earlier this year, has hinted that the company could ultimately release its model weights. Karpathy was replying to a Twitter user who’d commented on him building a simplified version of Meta’s Llama-2 while working at OpenAI. “Worth noting that all of this is quite generic to just transformer language models in general. If/when OpenAI was to release models as weights (which I can neither confirm nor deny!) then most of the code here would be very relevant,” he said.

Karpathy’s statement suggests that there’s at least some conversation within OpenAI about releasing the model weights. Thus far, OpenAI has resolutely kept its model weights a secret. It hasn’t even revealed how its models were created, though there has been speculation that GPT-4 is a 220-billion-parameter, eight-way mixture-of-experts model. Karpathy framed any release as “if/when,” and pointedly refused to confirm or deny whether it would happen.

Many in the AI community saw Karpathy’s statement as an indication that OpenAI would ultimately reveal its model weights. “Does anyone have the impression that a certain amazing person is trying to get OpenAI to release their weights? Because if so, we should all be cheering him on!” tweeted FastAI founder and AI educator Jeremy Howard.

“I think he’s trying to tell us something. Every foundational model will eventually opensource some subset of its weights. No private company wants to finetune an LLM with their private data on OpenAI’s server. They will do it on-premise, likely with an opensource pretrained LLM,” wrote another user.

Others speculated that Karpathy was trying to push the issue with his tweet. “Might be he’s trying to make OpenAI release something,” wrote another.

If OpenAI were to actually release the model weights of GPT-4, it could upend the LLM landscape that’s developing at the moment. GPT-4 is widely regarded as the best LLM out there, but there are several competent open-source models breathing down its neck, led by Meta’s Llama-2. If these other models can approach GPT-4 in capability while being free, compared to GPT-4’s $20-per-month charge, they could end up wresting away a large fraction of the market from GPT-4. But if GPT-4 were to be open-sourced as well, and OpenAI instead decided to monetize through different means, such as charging enterprise customers, it could once again throw the field wide open. It remains to be seen whether OpenAI ever does release its weights, but Andrej Karpathy’s statement is perhaps the strongest hint yet that it might end up doing so.