AI Pioneers Grapple with Mysteries of Hallucinating Machines and 'Emergent Abilities'

With advanced artificial intelligence technology reshaping the world in real time, it can be disconcerting to acknowledge that even the creators of these systems lack a complete understanding of their inner workings.

Illustrating this uncertainty is the occurrence of AI "hallucinations." Here, an AI system, such as the large language models (LLMs) that underpin advanced systems like OpenAI's ChatGPT and GPT-4, confidently provides responses containing inaccurate or entirely fabricated information. The precise mechanisms behind these occurrences remain elusive, although experts have proposed a range of theories.

Another enigmatic phenomenon garnering increasing attention and research is the emergence of "abilities" or "properties" in LLMs. Inexplicably, these models seem to acquire skills they were never explicitly trained to have.

These capabilities were brought to the forefront in last Sunday's "60 Minutes" TV broadcast by CBS News, which interviewed Google AI leaders, as explained in the post "Is artificial intelligence advancing too quickly? What AI leaders at Google say."

Google engineers were surprised to discover that their LLM had taught itself a new language.

"Of the AI issues we talked about, the most mysterious is called emergent properties," CBS correspondent Scott Pelley said. "Some AI systems are teaching themselves skills that they weren't expected to have. How this happens is not well understood. For example, one Google AI program adapted, on its own, after it was prompted in the language of Bangladesh, which it was not trained to know."

Google exec James Manyika weighed in: "We discovered that with very few amounts of prompting in Bengali, it can now translate all of Bengali. So now, all of a sudden, we now have a research effort where we're now trying to get to a thousand languages."

And Google CEO Sundar Pichai responded: "There is an aspect of this which we call -- all of us in the field call it as a 'black box.' You know, you don't fully understand. And you can't quite tell why it said this, or why it got wrong. We have some ideas, and our ability to understand this gets better over time. But that's where the state of the art is."

While the emergent phenomenon was brought to the mainstream public's eye in the broadcast reaching millions, it's not exactly new.

For example, it was examined back in August 2022 in the paper, "Emergent Abilities of Large Language Models."

The paper's abstract reads:

Scaling up language models has been shown to predictably improve performance and sample efficiency on a wide range of downstream tasks. This paper instead discusses an unpredictable phenomenon that we refer to as emergent abilities of large language models. We consider an ability to be emergent if it is not present in smaller models but is present in larger models. Thus, emergent abilities cannot be predicted simply by extrapolating the performance of smaller models. The existence of such emergence raises the question of whether additional scaling could potentially further expand the range of capabilities of language models.

A much more recent study examining emergent behavior, "A Survey of Large Language Models," was published on Sunday, April 16, coincidentally the same day as the "60 Minutes" broadcast.

"Despite the progress and impact, the underlying principles of LLMs are still not well explored," that paper said. "Firstly, it is mysterious why emergent abilities occur in LLMs, instead of smaller PLMs [pre-trained language models]. As a more general issue, there lacks a deep, detailed investigation of the key factors that contribute to the superior abilities of LLMs. It is important to study when and how LLMs obtain such abilities."

A site maintained by Jason Wei, an AI researcher at OpenAI who works on ChatGPT, tracks emergent abilities of LLMs, listing 137.

While emergent abilities might scare some people ("What if AI teaches itself to take control of humanity's computer systems?"), Wei thinks the new capabilities of LLMs could lead to several promising future research directions beyond simply scaling up. "Overall, the existence of emergent abilities implies that scaling further would unlock even more emergent abilities," he said. "This idea is super exciting to me."

He specifically listed these potential research directions and the questions they should answer:

  • Can we improve model architectures? E.g., sparsity, external memory, better objectives
  • Can we improve data quality and quantity? Training for longer increases pre-training compute but not inference compute
  • Better prompting. How can we extract the most performance out of an existing language model?
  • Frontier tasks. What tasks are language models currently not able to perform, that we should evaluate on future language models of better quality?
  • Why do emergent abilities occur, and can we predict them? E.g., do language models learn compositional abilities that enable them to solve harder problems?

Wei, a former research scientist at Google Brain who is an author of the "Emergent Abilities of Large Language Models" paper mentioned above, in November 2022 co-wrote a post titled "Characterizing Emergent Phenomena in Large Language Models." In that post, in addition to discussing emergent prompted tasks, he examines emergent prompting strategies.

"One example of an emergent prompting strategy is called 'chain-of-thought prompting,' for which the model is prompted to generate a series of intermediate steps before giving the final answer," he said. "Chain-of-thought prompting enables language models to perform tasks requiring complex reasoning, such as a multi-step math word problem. Notably, models acquire the ability to do chain-of-thought reasoning without being explicitly trained to do so." The graphic above shows an example of chain-of-thought prompting.
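
For readers who want to see what such a prompt looks like in practice, here is a minimal Python sketch. It only assembles prompt text and does not call any model. The Exemplar class, the build_prompt function, the "Q:/A:" layout and the sample word problems are illustrative assumptions, not code or data from Wei's post.

```python
from dataclasses import dataclass


@dataclass
class Exemplar:
    """One worked example shown to the model (hypothetical structure)."""
    question: str
    reasoning: str  # the intermediate steps written out in plain language
    answer: str


def build_prompt(exemplars, question, chain_of_thought=True):
    """Assemble a few-shot prompt as plain text.

    With chain_of_thought=True, each exemplar answer spells out its
    intermediate steps before the final answer. With False, only the
    final answer is shown (standard few-shot prompting).
    """
    parts = []
    for ex in exemplars:
        if chain_of_thought:
            completion = f"{ex.reasoning} The answer is {ex.answer}."
        else:
            completion = f"The answer is {ex.answer}."
        parts.append(f"Q: {ex.question}\nA: {completion}")
    # The new question ends with an open "A:" for the model to complete.
    parts.append(f"Q: {question}\nA:")
    return "\n\n".join(parts)


# Made-up exemplar, used purely for illustration.
EXEMPLARS = [
    Exemplar(
        question=(
            "A shelf holds 4 books. Two boxes of 3 books each are added. "
            "How many books are on the shelf?"
        ),
        reasoning=(
            "The shelf starts with 4 books. 2 boxes of 3 books is 6 books. "
            "4 + 6 = 10."
        ),
        answer="10",
    )
]

if __name__ == "__main__":
    new_question = (
        "A cafe had 12 muffins, sold 5 and baked 8 more. "
        "How many muffins are there now?"
    )
    print(build_prompt(EXEMPLARS, new_question))
```

The only difference between the two modes is whether the worked example includes its reasoning. In the chain-of-thought version, a sufficiently large model is expected to imitate the step-by-step pattern when it completes the final "A:" line.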

So, while LLM emergence may seem mysterious and threatening, the phenomenon is being actively explored right now as a means to improve the models.

The fear mongers, however, were surely further inflamed by this Q&A sequence in Sunday's "60 Minutes" segment on advanced AI:

Scott Pelley: You don't fully understand how it works. And yet, you've turned it loose on society?

Sundar Pichai: Yeah. Let me put it this way. I don't think we fully understand how a human mind works either.

Stay tuned to find out if LLM emergent abilities turn out to be good or bad.

About the Author

David Ramel is an editor and writer for Converge360.