Reinforcement learning based quantum circuit optimization via ZX-Calculus

Riu Vicente, Jordi; Nogué Gómez, Jan; Vilaplana, Gerard; García Sáez, Artur; Pascual Estarellas, Marta

dc.contributor

Universitat Politècnica de Catalunya. Doctorat en Física Computacional i Aplicada

dc.contributor

Universitat Politècnica de Catalunya. Doctorat en Arquitectura de Computadors

dc.contributor

Barcelona Supercomputing Center

dc.contributor.author

Riu Vicente, Jordi

dc.contributor.author

Nogué Gómez, Jan

dc.contributor.author

Vilaplana, Gerard

dc.contributor.author

García Sáez, Artur

dc.contributor.author

Pascual Estarellas, Marta

dc.date.issued

2025-05-28

dc.identifier

Riu, J. [et al.]. Reinforcement learning based quantum circuit optimization via ZX-Calculus. "Quantum", 28 May 2025, vol. 9, article 1758.

dc.identifier

2521-327X

dc.identifier

https://hdl.handle.net/2117/432671

dc.identifier

10.22331/q-2025-05-28-1758

dc.description.abstract

We propose a novel Reinforcement Learning (RL) method for optimizing quantum circuits using graph-theoretic simplification rules of ZX-diagrams. The agent, trained using the Proximal Policy Optimization (PPO) algorithm, employs Graph Neural Networks to approximate the policy and value functions. We demonstrate the capacity of our approach by comparing it against the best-performing ZX-Calculus-based algorithm for the problem at hand. After training on small Clifford+T circuits of 5 qubits and a few tens of gates, the agent consistently improves on the state of the art for this type of circuit, up to at least 80 qubits and 2100 gates, whilst remaining competitive in terms of computational performance. Additionally, we illustrate the versatility of the agent by incorporating additional optimization routines into the workflow during training, improving on the state-of-the-art two-qubit gate count for multiple structured quantum circuits for relevant applications, of much larger dimension and with different gate distributions than the circuits the agent trains on. This conveys the potential of tailoring the reward function to the specific characteristics of each application and hardware backend. Our approach is a valuable tool for the implementation of quantum algorithms in the near-term noisy intermediate-scale quantum (NISQ) regime.

dc.description.abstract

A.G-S. received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 951911 (AI4Media). This work was supported by the Agència de Gestió d’Ajuts Universitaris i de Recerca through the DI grant (No. 2020-DI00063) and by MICIU/AEI/10.13039/501100011033/FEDER, UE.

dc.description.abstract

Peer Reviewed

dc.description.abstract

Postprint (published version)

dc.format

22 p.

dc.format

application/pdf

dc.language

eng

dc.relation

https://quantum-journal.org/papers/q-2025-05-28-1758/

dc.relation

info:eu-repo/grantAgreement/EC/H2020/951911/EU/A European Excellence Centre for Media, Society and Democracy/AI4Media

dc.rights

http://creativecommons.org/licenses/by/4.0/

dc.rights

Open Access

dc.rights

Attribution 4.0 International

dc.subject

UPC subject areas::Computer science::Computer applications::Computer applications in physics and engineering

dc.subject

UPC subject areas::Computer science::Artificial intelligence::Machine learning

dc.subject

Reinforcement Learning (RL)

dc.subject

Quantum circuits

dc.subject

Proximal Policy Optimization (PPO)

dc.subject

Graph Neural Networks

dc.title

Reinforcement learning based quantum circuit optimization via ZX-Calculus

dc.type

Article

This item appears in the following collection(s)

E-prints [73026]