As Anthropic takes on OpenAI and other challengers in the growing artificial intelligence industry, there is also an existential question looming: Can large language models and the systems they enable continue growing in size and capability? CEO and co-founder Dario Amodei has a simple answer: yes.
Speaking onstage at TechCrunch Disrupt, Amodei explained that he doesn’t see any barriers on the horizon for his company’s key technology.
“The last 10 years, there’s been this remarkable increase in the scale that we’ve used to train neural nets and we keep scaling them up, and they keep working better and better,” he said. “That’s the basis of my feeling that what we’re going to see in the next 2, 3, 4 years… what we see today is going to pale in comparison to that.”
Asked whether he thought we would see a quadrillion-parameter model next year (rumor has it we will see hundred-trillion-parameter models this year), he said that is outside the expected scaling laws, which he described as roughly the square of compute. But certainly, he said, we can expect models to still grow.
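To give a rough sense of how a power-law scaling relationship of this kind behaves, here is a minimal sketch. The function, exponent, and constant below are arbitrary assumptions chosen for illustration, not figures from Amodei or Anthropic:

```python
# Toy illustration of a power-law scaling relationship between
# training compute and model size. The exponent (0.5) and constant (k)
# are hypothetical values for demonstration only.

def params_for_compute(compute: float, exponent: float = 0.5, k: float = 1.0) -> float:
    """Hypothetical power law: parameter count grows as compute ** exponent."""
    return k * compute ** exponent

# Under this toy law, a 100x increase in training compute buys
# only a 10x increase in model size (100 ** 0.5 == 10).
ratio = params_for_compute(100.0) / params_for_compute(1.0)
print(ratio)  # 10.0
```

The point of the sketch is that under any sublinear power law, each additional order of magnitude of model size demands disproportionately more compute, which is why a jump to quadrillion-parameter models would fall outside the scaling trends Amodei describes.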
Some researchers have suggested, however, that no matter how large these transformer-based models get, they may still find certain tasks difficult, if not impossible. Yejin Choi pointed out that some LLMs have a lot of trouble multiplying two three-digit numbers, which implies a certain incapacity deep in the heart of these otherwise highly capable models.
“Do you think that we should be trying to identify those sort of fundamental limits?” asked the moderator (myself).
“Yeah, so I’m not sure there are any,” Amodei responded.
“And also, to the extent that there are, I’m not sure that there’s a good way to measure them,” he continued. “I think those long years of scaling experience have taught me to be very skeptical, but also skeptical of the claim that an LLM can’t do anything. Or that if it wasn’t prompted or fine-tuned or trained in a slightly different way, that it wouldn’t be able to do anything. That’s not a claim that LLMs can do anything now, or that they’ll be able to do absolutely anything at some point in the future. I’m just skeptical of these hard limits; I’m skeptical of the skeptics.”
At the very least, Amodei suggested, we won’t see diminishing returns for the next three or four years; what happens beyond that point, you’d need more than a quadrillion-parameter AI to predict.