Quantum Tech Might Speed Up Reinforcement Learning

Reinforcement learning (RL) is a type of machine learning in widespread use in the artificial intelligence (AI) world. It's being applied to robotics, video games, finance, and even healthcare. But the decision-making entity in an RL algorithm, called an "agent," learns by interacting with its environment and adjusting its actions based on feedback--in other words, by trial and error. Consequently, the agents are what you might call "slow learners."
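
To make the trial-and-error idea concrete, here is a minimal classical sketch in Python. It is not the researchers' setup: the two-option environment, the reward probabilities, and the function name run_bandit are all hypothetical, chosen only to show an agent improving its choices from feedback.

```python
import random

def run_bandit(reward_probs, episodes=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent on a toy multi-armed bandit (illustrative only)."""
    rng = random.Random(seed)
    counts = [0] * len(reward_probs)    # how often each action was tried
    values = [0.0] * len(reward_probs)  # running estimate of each action's reward
    for _ in range(episodes):
        if rng.random() < epsilon:
            # Explore: try a random action.
            action = rng.randrange(len(reward_probs))
        else:
            # Exploit: pick the action with the best estimate so far.
            action = max(range(len(values)), key=values.__getitem__)
        # Feedback from the (hypothetical) environment: reward 1 or 0.
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0
        counts[action] += 1
        # Incremental average update of the estimate.
        values[action] += (reward - values[action]) / counts[action]
    return values, counts

values, counts = run_bandit([0.2, 0.8])
print("estimated rewards:", values)
print("times each action was tried:", counts)
```

The agent needs many interactions before its estimates settle, which is the "slow learner" behavior described above.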

The ever-present need for speed has led to investigations into the application of quantum computing to accelerate RL agent decision making, but so far without much of a reduction in learning time. However, a recent experiment conducted by an international collaboration of researchers might have hit on the right application of quantum tech to kick RL into another gear.

Led by Dr. Philip Walther at the University of Vienna, the researchers developed a quantum-enhanced hybrid agent capable of both quantum and classical information transfer, one that was able to take advantage of the superposition principle to screen several potential solutions simultaneously. This weird-universe multitasking ability of a quantum system to be in multiple states at the same time until it is measured enabled "a quantum speed-up" in the agent's learning time and "optimal control of the learning process," the researchers found.
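
The article does not say which quantum algorithm the agent runs, so the following Python sketch is only an assumption-laden illustration: it simulates Grover-style search, the textbook example of using superposition to find one marked item among many with far fewer queries than a classical guess-and-check loop. The function names are hypothetical, and the simulation runs on an ordinary computer.

```python
import math

def optimal_iterations(n_items):
    """Approximate number of Grover iterations for one marked item."""
    return math.floor(math.pi / 4 * math.sqrt(n_items))

def grover_success_probability(n_items, n_iterations, marked=0):
    """Simulate Grover search on a real-valued state vector."""
    # Start in a uniform superposition over all candidate items.
    amp = [1 / math.sqrt(n_items)] * n_items
    for _ in range(n_iterations):
        # Oracle: flip the sign of the marked item's amplitude.
        amp[marked] = -amp[marked]
        # Diffusion: reflect every amplitude about the mean.
        mean = sum(amp) / n_items
        amp = [2 * mean - a for a in amp]
    # Probability of measuring the marked item.
    return amp[marked] ** 2

n = 64
k = optimal_iterations(n)
print("quantum iterations:", k)
print("success probability:", grover_success_probability(n, k))
print("expected classical guesses:", (n + 1) / 2)
```

In this toy case a handful of quantum iterations does the work of dozens of classical guesses on average, with the gap growing as the square root of the number of candidates.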

The results of this experiment were recently published in the journal Nature ("Experimental Quantum Speed-up in Reinforcement Learning Agents"). In their paper, the researchers explained how they implemented this learning protocol on a compact and fully programmable nanophotonic processor. The device interfaced with telecommunication-wavelength photons and featured a fast active-feedback mechanism, which demonstrated the agent's systematic quantum advantage in a setup the researchers believe could be integrated within future large-scale quantum communication networks.

"In general, it has been shown that granting agents access to quantum hardware (while still considering classical communication) does not reduce the learning time," the researchers wrote, "although it allows actions to be output quadratically faster. To achieve reductions in learning times, quantum communication becomes necessary. We therefore consider an environment and a quantum-enhanced hybrid agent with access to internal quantum (as well as classical) hardware interacting by exchanging quantum states…. Such agents may behave 'classically,' that is, use a classical channel, or 'quantumly,' meaning that communication is no longer limited to a fixed preferred basis, but allows for exchanges of arbitrary superpositions via a quantum channel…."

In other words, the hybrid agent was able to use classical computing for simple decision making, and quantum computing when the decisions were more complex. In this experiment the hybrid agent learned a solution 63% faster than a traditional reinforcement learning agent, cutting its learning effort from 270 guesses to 100.
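
The 63% figure follows directly from the two guess counts quoted above. A quick check in Python (the function name relative_reduction is just a label for this illustration):

```python
def relative_reduction(before, after):
    """Fraction by which a quantity shrank going from `before` to `after`."""
    return (before - after) / before

# Guess counts as reported in the article: 270 classical, 100 hybrid.
reduction = relative_reduction(270, 100)
print(f"{reduction:.0%}")  # prints "63%"
```

Put differently, the hybrid agent needed roughly 37% of the guesses the classical agent did.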

"Our setup allows the agents to choose the favorable strategy by switching from quantum to classical when the latter becomes advantageous," the researchers wrote.

The researchers expect the protocol that emerged from this experiment to help specifically with problems that require frequent searching. And they speculate that the development of superconducting detectors, on-demand single-photon sources, or the large-scale integration of artificial atoms within photonic circuits will lead to scalable multiphoton applications. "Although photonic architectures are particularly suitable for such learning algorithms," they wrote, "our theoretical background is applicable to different platforms, for example, trapped ions or superconducting qubits. Here future realizations can feature the implementation of agent and environment as spatially separated systems, and a light–matter quantum interface for coherent exchange between them."

The growing demand for data-intensive, learning-based ML methods is leading to the adoption of RL across a range of industries, from healthcare and education to manufacturing and marketing, according to a study from Research and Markets ("Reinforcement Learning: An Introduction to the Technology") published last year. In that study, market watchers found an increasing demand for a general framework for deep reinforcement learning (DRL--also known as a semi-supervised learning model in the ML paradigm).

About the Author

John K. Waters is the editor in chief of a number of sites, with a focus on high-end development, AI and future tech. He's been writing about cutting-edge technologies and the culture of Silicon Valley for more than two decades, and he's written more than a dozen books. He also co-scripted the documentary film Silicon Valley: A 100 Year Renaissance, which aired on PBS.